aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/net/bonding
AgeCommit message (Collapse)Author
2021-11-26bonding: Fix a use-after-free problem when bond_sysfs_slave_add() failedHuang Guobin
[ Upstream commit b93c6a911a3fe926b00add28f3b932007827c4ca ] When I do fuzz test for bonding device interface, I got the following use-after-free Calltrace: ================================================================== BUG: KASAN: use-after-free in bond_enslave+0x1521/0x24f0 Read of size 8 at addr ffff88825bc11c00 by task ifenslave/7365 CPU: 5 PID: 7365 Comm: ifenslave Tainted: G E 5.15.0-rc1+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014 Call Trace: dump_stack_lvl+0x6c/0x8b print_address_description.constprop.0+0x48/0x70 kasan_report.cold+0x82/0xdb __asan_load8+0x69/0x90 bond_enslave+0x1521/0x24f0 bond_do_ioctl+0x3e0/0x450 dev_ifsioc+0x2ba/0x970 dev_ioctl+0x112/0x710 sock_do_ioctl+0x118/0x1b0 sock_ioctl+0x2e0/0x490 __x64_sys_ioctl+0x118/0x150 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f19159cf577 Code: b3 66 90 48 8b 05 11 89 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 78 RSP: 002b:00007ffeb3083c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffeb3084bca RCX: 00007f19159cf577 RDX: 00007ffeb3083ce0 RSI: 0000000000008990 RDI: 0000000000000003 RBP: 00007ffeb3084bc4 R08: 0000000000000040 R09: 0000000000000000 R10: 00007ffeb3084bc0 R11: 0000000000000246 R12: 00007ffeb3083ce0 R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffeb3083cb0 Allocated by task 7365: kasan_save_stack+0x23/0x50 __kasan_kmalloc+0x83/0xa0 kmem_cache_alloc_trace+0x22e/0x470 bond_enslave+0x2e1/0x24f0 bond_do_ioctl+0x3e0/0x450 dev_ifsioc+0x2ba/0x970 dev_ioctl+0x112/0x710 sock_do_ioctl+0x118/0x1b0 sock_ioctl+0x2e0/0x490 __x64_sys_ioctl+0x118/0x150 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Freed by task 7365: kasan_save_stack+0x23/0x50 kasan_set_track+0x20/0x30 kasan_set_free_info+0x24/0x40 __kasan_slab_free+0xf2/0x130 kfree+0xd1/0x5c0 slave_kobj_release+0x61/0x90 kobject_put+0x102/0x180 bond_sysfs_slave_add+0x7a/0xa0 bond_enslave+0x11b6/0x24f0 bond_do_ioctl+0x3e0/0x450 dev_ifsioc+0x2ba/0x970 dev_ioctl+0x112/0x710 sock_do_ioctl+0x118/0x1b0 sock_ioctl+0x2e0/0x490 __x64_sys_ioctl+0x118/0x150 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Last potentially related work creation: kasan_save_stack+0x23/0x50 kasan_record_aux_stack+0xb7/0xd0 insert_work+0x43/0x190 __queue_work+0x2e3/0x970 delayed_work_timer_fn+0x3e/0x50 call_timer_fn+0x148/0x470 run_timer_softirq+0x8a8/0xc50 __do_softirq+0x107/0x55f Second to last potentially related work creation: kasan_save_stack+0x23/0x50 kasan_record_aux_stack+0xb7/0xd0 insert_work+0x43/0x190 __queue_work+0x2e3/0x970 __queue_delayed_work+0x130/0x180 queue_delayed_work_on+0xa7/0xb0 bond_enslave+0xe25/0x24f0 bond_do_ioctl+0x3e0/0x450 dev_ifsioc+0x2ba/0x970 dev_ioctl+0x112/0x710 sock_do_ioctl+0x118/0x1b0 sock_ioctl+0x2e0/0x490 __x64_sys_ioctl+0x118/0x150 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff88825bc11c00 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 0 bytes inside of 1024-byte region [ffff88825bc11c00, ffff88825bc12000) The buggy address belongs to the page: page:ffffea00096f0400 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x25bc10 head:ffffea00096f0400 order:3 compound_mapcount:0 compound_pincount:0 flags: 0x57ff00000010200(slab|head|node=1|zone=2|lastcpupid=0x7ff) raw: 057ff00000010200 ffffea0009a71c08 ffff888240001968 ffff88810004dbc0 raw: 0000000000000000 00000000000a000a 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88825bc11b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88825bc11b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88825bc11c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88825bc11c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88825bc11d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Put new_slave in bond_sysfs_slave_add() will cause use-after-free problems when new_slave is accessed in the subsequent error handling process. Since new_slave will be put in the subsequent error handling process, remove the unnecessary put to fix it. In addition, when sysfs_create_file() fails, if some files have been crea- ted successfully, we need to call sysfs_remove_file() to remove them. Since there are sysfs_create_files() & sysfs_remove_files() can be used, use these two functions instead. Fixes: 7afcaec49696 (bonding: use kobject_put instead of _del after kobject_add) Signed-off-by: Huang Guobin <huangguobin4@huawei.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-16bonding: init notify_work earlier to avoid uninitialized useJohannes Berg
[ Upstream commit 35d96e631860226d5dc4de0fad0a415362ec2457 ] If bond_kobj_init() or later kzalloc() in bond_alloc_slave() fail, then we call kobject_put() on the slave->kobj. This in turn calls the release function slave_kobj_release() which will always try to cancel_delayed_work_sync(&slave->notify_work), which shouldn't be done on an uninitialized work struct. Always initialize the work struct earlier to avoid problems here. Syzbot bisected this down to a completely pointless commit, some fault injection may have been at work here that caused the alloc failure in the first place, which may interact badly with bisect. Reported-by: syzbot+bfda097c12a00c8cae67@syzkaller.appspotmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-08bonding: wait for sysfs kobject destruction before freeing struct slaveJamie Iles
[ Upstream commit b9ad3e9f5a7a760ab068e33e1f18d240ba32ce92 ] syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a struct slave device could result in the following splat: kobject: 'bonding_slave' (00000000cecdd4fe): kobject_release, parent 0000000074ceb2b2 (delayed 1000) bond0 (unregistering): (slave bond_slave_1): Releasing backup interface ------------[ cut here ]------------ ODEBUG: free active (active state 0) object type: timer_list hint: workqueue_select_cpu_near kernel/workqueue.c:1549 [inline] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98 kernel/workqueue.c:1600 WARNING: CPU: 1 PID: 842 at lib/debugobjects.c:485 debug_print_object+0x180/0x240 lib/debugobjects.c:485 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 842 Comm: kworker/u4:4 Tainted: G S 5.9.0-rc8+ #96 Hardware name: linux,dummy-virt (DT) Workqueue: netns cleanup_net Call trace: dump_backtrace+0x0/0x4d8 include/linux/bitmap.h:239 show_stack+0x34/0x48 arch/arm64/kernel/traps.c:142 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x174/0x1f8 lib/dump_stack.c:118 panic+0x360/0x7a0 kernel/panic.c:231 __warn+0x244/0x2ec kernel/panic.c:600 report_bug+0x240/0x398 lib/bug.c:198 bug_handler+0x50/0xc0 arch/arm64/kernel/traps.c:974 call_break_hook+0x160/0x1d8 arch/arm64/kernel/debug-monitors.c:322 brk_handler+0x30/0xc0 arch/arm64/kernel/debug-monitors.c:329 do_debug_exception+0x184/0x340 arch/arm64/mm/fault.c:864 el1_dbg+0x48/0xb0 arch/arm64/kernel/entry-common.c:65 el1_sync_handler+0x170/0x1c8 arch/arm64/kernel/entry-common.c:93 el1_sync+0x80/0x100 arch/arm64/kernel/entry.S:594 debug_print_object+0x180/0x240 lib/debugobjects.c:485 __debug_check_no_obj_freed lib/debugobjects.c:967 [inline] debug_check_no_obj_freed+0x200/0x430 lib/debugobjects.c:998 slab_free_hook mm/slub.c:1536 [inline] slab_free_freelist_hook+0x190/0x210 mm/slub.c:1577 slab_free mm/slub.c:3138 [inline] kfree+0x13c/0x460 mm/slub.c:4119 bond_free_slave+0x8c/0xf8 drivers/net/bonding/bond_main.c:1492 __bond_release_one+0xe0c/0xec8 drivers/net/bonding/bond_main.c:2190 bond_slave_netdev_event drivers/net/bonding/bond_main.c:3309 [inline] bond_netdev_event+0x8f0/0xa70 drivers/net/bonding/bond_main.c:3420 notifier_call_chain+0xf0/0x200 kernel/notifier.c:83 __raw_notifier_call_chain kernel/notifier.c:361 [inline] raw_notifier_call_chain+0x44/0x58 kernel/notifier.c:368 call_netdevice_notifiers_info+0xbc/0x150 net/core/dev.c:2033 call_netdevice_notifiers_extack net/core/dev.c:2045 [inline] call_netdevice_notifiers net/core/dev.c:2059 [inline] rollback_registered_many+0x6a4/0xec0 net/core/dev.c:9347 unregister_netdevice_many.part.0+0x2c/0x1c0 net/core/dev.c:10509 unregister_netdevice_many net/core/dev.c:10508 [inline] default_device_exit_batch+0x294/0x338 net/core/dev.c:10992 ops_exit_list.isra.0+0xec/0x150 net/core/net_namespace.c:189 cleanup_net+0x44c/0x888 net/core/net_namespace.c:603 process_one_work+0x96c/0x18c0 kernel/workqueue.c:2269 worker_thread+0x3f0/0xc30 kernel/workqueue.c:2415 kthread+0x390/0x498 kernel/kthread.c:292 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:925 This is a potential use-after-free if the sysfs nodes are being accessed whilst removing the struct slave, so wait for the object destruction to complete before freeing the struct slave itself. Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") Fixes: a068aab42258 ("bonding: Fix reference count leak in bond_sysfs_slave_add.") Cc: Qiushi Wu <wu000273@umn.edu> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jamie Iles <jamie@nuviainc.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20201120142827.879226-1-jamie@nuviainc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-14bonding: set dev->needed_headroom in bond_setup_by_slave()Eric Dumazet
[ Upstream commit f32f19339596b214c208c0dba716f4b6cc4f6958 ] syzbot managed to crash a host by creating a bond with a GRE device. For non Ethernet device, bonding calls bond_setup_by_slave() instead of ether_setup(), and unfortunately dev->needed_headroom was not copied from the new added member. [ 171.243095] skbuff: skb_under_panic: text:ffffffffa184b9ea len:116 put:20 head:ffff883f84012dc0 data:ffff883f84012dbc tail:0x70 end:0xd00 dev:bond0 [ 171.243111] ------------[ cut here ]------------ [ 171.243112] kernel BUG at net/core/skbuff.c:112! [ 171.243117] invalid opcode: 0000 [#1] SMP KASAN PTI [ 171.243469] gsmi: Log Shutdown Reason 0x03 [ 171.243505] Call Trace: [ 171.243506] <IRQ> [ 171.243512] [<ffffffffa171be59>] skb_push+0x49/0x50 [ 171.243516] [<ffffffffa184b9ea>] ipgre_header+0x2a/0xf0 [ 171.243520] [<ffffffffa17452d7>] neigh_connected_output+0xb7/0x100 [ 171.243524] [<ffffffffa186f1d3>] ip6_finish_output2+0x383/0x490 [ 171.243528] [<ffffffffa186ede2>] __ip6_finish_output+0xa2/0x110 [ 171.243531] [<ffffffffa186acbc>] ip6_finish_output+0x2c/0xa0 [ 171.243534] [<ffffffffa186abe9>] ip6_output+0x69/0x110 [ 171.243537] [<ffffffffa186ac90>] ? ip6_output+0x110/0x110 [ 171.243541] [<ffffffffa189d952>] mld_sendpack+0x1b2/0x2d0 [ 171.243544] [<ffffffffa189d290>] ? mld_send_report+0xf0/0xf0 [ 171.243548] [<ffffffffa189c797>] mld_ifc_timer_expire+0x2d7/0x3b0 [ 171.243551] [<ffffffffa189c4c0>] ? mld_gq_timer_expire+0x50/0x50 [ 171.243556] [<ffffffffa0fea270>] call_timer_fn+0x30/0x130 [ 171.243559] [<ffffffffa0fea17c>] expire_timers+0x4c/0x110 [ 171.243563] [<ffffffffa0fea0e3>] __run_timers+0x213/0x260 [ 171.243566] [<ffffffffa0fecb7d>] ? ktime_get+0x3d/0xa0 [ 171.243570] [<ffffffffa0ff9c4e>] ? clockevents_program_event+0x7e/0xe0 [ 171.243574] [<ffffffffa0f7e5d5>] ? sched_clock_cpu+0x15/0x190 [ 171.243577] [<ffffffffa0fe973d>] run_timer_softirq+0x1d/0x40 [ 171.243581] [<ffffffffa1c00152>] __do_softirq+0x152/0x2f0 [ 171.243585] [<ffffffffa0f44e1f>] irq_exit+0x9f/0xb0 [ 171.243588] [<ffffffffa1a02e1d>] smp_apic_timer_interrupt+0xfd/0x1a0 [ 171.243591] [<ffffffffa1a01ea6>] apic_timer_interrupt+0x86/0x90 Fixes: f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26bonding: fix active-backup failover for current ARP slaveJiri Wiesner
[ Upstream commit 0410d07190961ac526f05085765a8d04d926545b ] When the ARP monitor is used for link detection, ARP replies are validated for all slaves (arp_validate=3) and fail_over_mac is set to active, two slaves of an active-backup bond may get stuck in a state where both of them are active and pass packets that they receive to the bond. This state makes IPv6 duplicate address detection fail. The state is reached thus: 1. The current active slave goes down because the ARP target is not reachable. 2. The current ARP slave is chosen and made active. 3. A new slave is enslaved. This new slave becomes the current active slave and can reach the ARP target. As a result, the current ARP slave stays active after the enslave action has finished and the log is littered with "PROBE BAD" messages: > bond0: PROBE: c_arp ens10 && cas ens11 BAD The workaround is to remove the slave with "going back" status from the bond and re-enslave it. This issue was encountered when DPDK PMD interfaces were being enslaved to an active-backup bond. I would be possible to fix the issue in bond_enslave() or bond_change_active_slave() but the ARP monitor was fixed instead to keep most of the actions changing the current ARP slave in the ARP monitor code. The current ARP slave is set as inactive and backup during the commit phase. A new state, BOND_LINK_FAIL, has been introduced for slaves in the context of the ARP monitor. This allows administrators to see how slaves are rotated for sending ARP requests and attempts are made to find a new active slave. Fixes: b2220cad583c9 ("bonding: refactor ARP active-backup monitor") Signed-off-by: Jiri Wiesner <jwiesner@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26bonding: fix a potential double-unregisterCong Wang
[ Upstream commit 832707021666411d04795c564a4adea5d6b94f17 ] When we tear down a network namespace, we unregister all the netdevices within it. So we may queue a slave device and a bonding device together in the same unregister queue. If the only slave device is non-ethernet, it would automatically unregister the bonding device as well. Thus, we may end up unregistering the bonding device twice. Workaround this special case by checking reg_state. Fixes: 9b5e383c11b0 ("net: Introduce unregister_netdevice_many()") Reported-by: syzbot+af23e7f3e0a7e10c8b67@syzkaller.appspotmail.com Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-26bonding: show saner speed for broadcast modeJarod Wilson
[ Upstream commit 4ca0d9ac3fd8f9f90b72a15d8da2aca3ffb58418 ] Broadcast mode bonds transmit a copy of all traffic simultaneously out of all interfaces, so the "speed" of the bond isn't really the aggregate of all interfaces, but rather, the speed of the slowest active interface. Also, the type of the speed field is u32, not unsigned long, so adjust that accordingly, as required to make min() function here without complaining about mismatching types. Fixes: bb5b052f751b ("bond: add support to read speed and duplex via ethtool") CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29bonding: check return value of register_netdevice() in bond_newlink()Cong Wang
[ Upstream commit c75d1d5248c0c97996051809ad0e9f154ba5d76e ] Very similar to commit 544f287b8495 ("bonding: check error value of register_netdevice() immediately"), we should immediately check the return value of register_netdevice() before doing anything else. Fixes: 005db31d5f5f ("bonding: set carrier off for devices created through netlink") Reported-and-tested-by: syzbot+bbc3a11c4da63c1b74d6@syzkaller.appspotmail.com Cc: Beniamino Galvani <bgalvani@redhat.com> Cc: Taehee Yoo <ap420073@gmail.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-29bonding: check error value of register_netdevice() immediatelyTaehee Yoo
[ Upstream commit 544f287b84959203367cd29e16e772717612fab4 ] If register_netdevice() is failed, net_device should not be used because variables are uninitialized or freed. So, the routine should be stopped immediately. But, bond_create() doesn't check return value of register_netdevice() immediately. That will result in a panic because of using uninitialized or freed memory. Test commands: modprobe netdev-notifier-error-inject echo -22 > /sys/kernel/debug/notifier-error-inject/netdev/\ actions/NETDEV_REGISTER/error modprobe bonding max_bonds=3 Splat looks like: [ 375.028492][ T193] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] SMP DEBUG_PAGEALLOC PTI [ 375.033207][ T193] CPU: 2 PID: 193 Comm: kworker/2:2 Not tainted 5.8.0-rc4+ #645 [ 375.036068][ T193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 375.039673][ T193] Workqueue: events linkwatch_event [ 375.041557][ T193] RIP: 0010:dev_activate+0x4a/0x340 [ 375.043381][ T193] Code: 40 a8 04 0f 85 db 00 00 00 8b 83 08 04 00 00 85 c0 0f 84 0d 01 00 00 31 d2 89 d0 48 8d 04 40 48 c1 e0 07 48 03 83 00 04 00 00 <48> 8b 48 10 f6 41 10 01 75 08 f0 80 a1 a0 01 00 00 fd 48 89 48 08 [ 375.050267][ T193] RSP: 0018:ffff9f8facfcfdd8 EFLAGS: 00010202 [ 375.052410][ T193] RAX: 6b6b6b6b6b6b6b6b RBX: ffff9f8fae6ea000 RCX: 0000000000000006 [ 375.055178][ T193] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f8fae6ea000 [ 375.057762][ T193] RBP: ffff9f8fae6ea000 R08: 0000000000000000 R09: 0000000000000000 [ 375.059810][ T193] R10: 0000000000000001 R11: 0000000000000000 R12: ffff9f8facfcfe08 [ 375.061892][ T193] R13: ffffffff883587e0 R14: 0000000000000000 R15: ffff9f8fae6ea580 [ 375.063931][ T193] FS: 0000000000000000(0000) GS:ffff9f8fbae00000(0000) knlGS:0000000000000000 [ 375.066239][ T193] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 375.067841][ T193] CR2: 00007f2f542167a0 CR3: 000000012cee6002 CR4: 00000000003606e0 [ 375.069657][ T193] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 375.071471][ T193] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 375.073269][ T193] Call Trace: [ 375.074005][ T193] linkwatch_do_dev+0x4d/0x50 [ 375.075052][ T193] __linkwatch_run_queue+0x10b/0x200 [ 375.076244][ T193] linkwatch_event+0x21/0x30 [ 375.077274][ T193] process_one_work+0x252/0x600 [ 375.078379][ T193] ? process_one_work+0x600/0x600 [ 375.079518][ T193] worker_thread+0x3c/0x380 [ 375.080534][ T193] ? process_one_work+0x600/0x600 [ 375.081668][ T193] kthread+0x139/0x150 [ 375.082567][ T193] ? kthread_park+0x90/0x90 [ 375.083567][ T193] ret_from_fork+0x22/0x30 Fixes: e826eafa65c6 ("bonding: Call netif_carrier_off after register_netdevice") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03bonding: Fix reference count leak in bond_sysfs_slave_add.Qiushi Wu
commit a068aab42258e25094bc2c159948d263ed7d7a77 upstream. kobject_init_and_add() takes reference even when it fails. If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Previous commit "b8eb718348b8" fixed a similar problem. Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-20bonding/alb: make sure arp header is pulled before accessing itEric Dumazet
commit b7469e83d2add567e4e0b063963db185f3167cea upstream. Similar to commit 38f88c454042 ("bonding/alb: properly access headers in bond_alb_xmit()"), we need to make sure arp header was pulled in skb->head before blindly accessing it in rlb_arp_xmit(). Remove arp_pkt() private helper, since it is more readable/obvious to have the following construct back to back : if (!pskb_network_may_pull(skb, sizeof(*arp))) return NULL; arp = (struct arp_pkt *)skb_network_header(skb); syzbot reported : BUG: KMSAN: uninit-value in bond_slave_has_mac_rx include/net/bonding.h:704 [inline] BUG: KMSAN: uninit-value in rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline] BUG: KMSAN: uninit-value in bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477 CPU: 0 PID: 12743 Comm: syz-executor.4 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 bond_slave_has_mac_rx include/net/bonding.h:704 [inline] rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline] bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477 __bond_start_xmit drivers/net/bonding/bond_main.c:4257 [inline] bond_start_xmit+0x85d/0x2f70 drivers/net/bonding/bond_main.c:4282 __netdev_start_xmit include/linux/netdevice.h:4524 [inline] netdev_start_xmit include/linux/netdevice.h:4538 [inline] xmit_one net/core/dev.c:3470 [inline] dev_hard_start_xmit+0x531/0xab0 net/core/dev.c:3486 __dev_queue_xmit+0x37de/0x4220 net/core/dev.c:4063 dev_queue_xmit+0x4b/0x60 net/core/dev.c:4096 packet_snd net/packet/af_packet.c:2967 [inline] packet_sendmsg+0x8347/0x93b0 net/packet/af_packet.c:2992 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] __sys_sendto+0xc1b/0xc50 net/socket.c:1998 __do_sys_sendto net/socket.c:2010 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2006 __x64_sys_sendto+0x6e/0x90 net/socket.c:2006 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45c479 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fc77ffbbc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007fc77ffbc6d4 RCX: 000000000045c479 RDX: 000000000000000e RSI: 00000000200004c0 RDI: 0000000000000003 RBP: 000000000076bf20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000a04 R14: 00000000004cc7b0 R15: 000000000076bf2c Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766 sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242 packet_alloc_skb net/packet/af_packet.c:2815 [inline] packet_snd net/packet/af_packet.c:2910 [inline] packet_sendmsg+0x66a0/0x93b0 net/packet/af_packet.c:2992 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] __sys_sendto+0xc1b/0xc50 net/socket.c:1998 __do_sys_sendto net/socket.c:2010 [inline] __se_sys_sendto+0x107/0x130 net/socket.c:2006 __x64_sys_sendto+0x6e/0x90 net/socket.c:2006 do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-14bonding/alb: properly access headers in bond_alb_xmit()Eric Dumazet
[ Upstream commit 38f88c45404293bbc027b956def6c10cbd45c616 ] syzbot managed to send an IPX packet through bond_alb_xmit() and af_packet and triggered a use-after-free. First, bond_alb_xmit() was using ipx_hdr() helper to reach the IPX header, but ipx_hdr() was using the transport offset instead of the network offset. In the particular syzbot report transport offset was 0xFFFF This patch removes ipx_hdr() since it was only (mis)used from bonding. Then we need to make sure IPv4/IPv6/IPX headers are pulled in skb->head before dereferencing anything. BUG: KASAN: use-after-free in bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452 Read of size 2 at addr ffff8801ce56dfff by task syz-executor.2/18108 (if (ipx_hdr(skb)->ipx_checksum != IPX_NO_CHECKSUM) ...) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: [<ffffffff8441fc42>] __dump_stack lib/dump_stack.c:17 [inline] [<ffffffff8441fc42>] dump_stack+0x14d/0x20b lib/dump_stack.c:53 [<ffffffff81a7dec4>] print_address_description+0x6f/0x20b mm/kasan/report.c:282 [<ffffffff81a7e0ec>] kasan_report_error mm/kasan/report.c:380 [inline] [<ffffffff81a7e0ec>] kasan_report mm/kasan/report.c:438 [inline] [<ffffffff81a7e0ec>] kasan_report.cold+0x8c/0x2a0 mm/kasan/report.c:422 [<ffffffff81a7dc4f>] __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:469 [<ffffffff82c8c00a>] bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452 [<ffffffff82c60c74>] __bond_start_xmit drivers/net/bonding/bond_main.c:4199 [inline] [<ffffffff82c60c74>] bond_start_xmit+0x4f4/0x1570 drivers/net/bonding/bond_main.c:4224 [<ffffffff83baa558>] __netdev_start_xmit include/linux/netdevice.h:4525 [inline] [<ffffffff83baa558>] netdev_start_xmit include/linux/netdevice.h:4539 [inline] [<ffffffff83baa558>] xmit_one net/core/dev.c:3611 [inline] [<ffffffff83baa558>] dev_hard_start_xmit+0x168/0x910 net/core/dev.c:3627 [<ffffffff83bacf35>] __dev_queue_xmit+0x1f55/0x33b0 net/core/dev.c:4238 [<ffffffff83bae3a8>] dev_queue_xmit+0x18/0x20 net/core/dev.c:4278 [<ffffffff84339189>] packet_snd net/packet/af_packet.c:3226 [inline] [<ffffffff84339189>] packet_sendmsg+0x4919/0x70b0 net/packet/af_packet.c:3252 [<ffffffff83b1ac0c>] sock_sendmsg_nosec net/socket.c:673 [inline] [<ffffffff83b1ac0c>] sock_sendmsg+0x12c/0x160 net/socket.c:684 [<ffffffff83b1f5a2>] __sys_sendto+0x262/0x380 net/socket.c:1996 [<ffffffff83b1f700>] SYSC_sendto net/socket.c:2008 [inline] [<ffffffff83b1f700>] SyS_sendto+0x40/0x60 net/socket.c:2004 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-04bonding: fix active-backup transition after link failureMahesh Bandewar
[ Upstream commit 5d485ed88d48f8101a2067348e267c0aaf4ed486 ] After the recent fix in commit 1899bb325149 ("bonding: fix state transition issue in link monitoring"), the active-backup mode with miimon initially come-up fine but after a link-failure, both members transition into backup state. Following steps to reproduce the scenario (eth1 and eth2 are the slaves of the bond): ip link set eth1 up ip link set eth2 down sleep 1 ip link set eth2 up ip link set eth1 down cat /sys/class/net/eth1/bonding_slave/state cat /sys/class/net/eth2/bonding_slave/state Fixes: 1899bb325149 ("bonding: fix state transition issue in link monitoring") CC: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12bonding: fix unexpected IFF_BONDING bit unsetTaehee Yoo
[ Upstream commit 65de65d9033750d2cf1b336c9d6e9da3a8b5cc6e ] The IFF_BONDING means bonding master or bonding slave device. ->ndo_add_slave() sets IFF_BONDING flag and ->ndo_del_slave() unsets IFF_BONDING flag. bond0<--bond1 Both bond0 and bond1 are bonding device and these should keep having IFF_BONDING flag until they are removed. But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that routine do not check whether the slave device is the bonding type or not. This patch adds the interface type check routine before removing IFF_BONDING flag. Test commands: ip link add bond0 type bond ip link add bond1 type bond ip link set bond1 master bond0 ip link set bond1 nomaster ip link del bond1 type bond ip link add bond1 type bond Splat looks like: [ 226.665555] proc_dir_entry 'bonding/bond1' already registered [ 226.666440] WARNING: CPU: 0 PID: 737 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0 [ 226.667571] Modules linked in: bonding af_packet sch_fq_codel ip_tables x_tables unix [ 226.668662] CPU: 0 PID: 737 Comm: ip Not tainted 5.4.0-rc3+ #96 [ 226.669508] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 226.670652] RIP: 0010:proc_register+0x2a9/0x3e0 [ 226.671612] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 0b 14 9f 48 8b b0 e 0 00 00 00 e8 07 e7 88 ff <0f> 0b 48 c7 c7 40 2d a5 9f e8 59 d6 23 01 48 8b 4c 24 10 48 b8 00 [ 226.675007] RSP: 0018:ffff888050e17078 EFLAGS: 00010282 [ 226.675761] RAX: dffffc0000000008 RBX: ffff88805fdd0f10 RCX: ffffffff9dd344e2 [ 226.676757] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f6b8c [ 226.677751] RBP: ffff8880507160f3 R08: ffffed100d940019 R09: ffffed100d940019 [ 226.678761] R10: 0000000000000001 R11: ffffed100d940018 R12: ffff888050716008 [ 226.679757] R13: ffff8880507160f2 R14: dffffc0000000000 R15: ffffed100a0e2c1e [ 226.680758] FS: 00007fdc217cc0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000 [ 226.681886] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 226.682719] CR2: 00007f49313424d0 CR3: 0000000050e46001 CR4: 00000000000606f0 [ 226.683727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 226.684725] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 226.685681] Call Trace: [ 226.687089] proc_create_seq_private+0xb3/0xf0 [ 226.687778] bond_create_proc_entry+0x1b3/0x3f0 [bonding] [ 226.691458] bond_netdev_event+0x433/0x970 [bonding] [ 226.692139] ? __module_text_address+0x13/0x140 [ 226.692779] notifier_call_chain+0x90/0x160 [ 226.693401] register_netdevice+0x9b3/0xd80 [ 226.694010] ? alloc_netdev_mqs+0x854/0xc10 [ 226.694629] ? netdev_change_features+0xa0/0xa0 [ 226.695278] ? rtnl_create_link+0x2ed/0xad0 [ 226.695849] bond_newlink+0x2a/0x60 [bonding] [ 226.696422] __rtnl_newlink+0xb9f/0x11b0 [ 226.696968] ? rtnl_link_unregister+0x220/0x220 [ ... ] Fixes: 0b680e753724 ("[PATCH] bonding: Add priv_flag to avoid event mishandling") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-12bonding: fix state transition issue in link monitoringJay Vosburgh
[ Upstream commit 1899bb325149e481de31a4f32b59ea6f24e176ea ] Since de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring"), the bonding driver has utilized two separate variables to indicate the next link state a particular slave should transition to. Each is used to communicate to a different portion of the link state change commit logic; one to the bond_miimon_commit function itself, and another to the state transition logic. Unfortunately, the two variables can become unsynchronized, resulting in incorrect link state transitions within bonding. This can cause slaves to become stuck in an incorrect link state until a subsequent carrier state transition. The issue occurs when a special case in bond_slave_netdev_event sets slave->link directly to BOND_LINK_FAIL. On the next pass through bond_miimon_inspect after the slave goes carrier up, the BOND_LINK_FAIL case will set the proposed next state (link_new_state) to BOND_LINK_UP, but the new_link to BOND_LINK_DOWN. The setting of the final link state from new_link comes after that from link_new_state, and so the slave will end up incorrectly in _DOWN state. Resolve this by combining the two variables into one. Reported-by: Aleksei Zakharov <zakharov.a.g@yandex.ru> Reported-by: Sha Zhang <zhangsha.zhang@huawei.com> Cc: Mahesh Bandewar <maheshb@google.com> Fixes: de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-11-06bonding: fix potential NULL deref in bond_update_slave_arrEric Dumazet
commit a7137534b597b7c303203e6bc3ed87e87a273bb8 upstream. syzbot got a NULL dereference in bond_update_slave_arr() [1], happening after a failure to allocate bond->slave_arr A workqueue (bond_slave_arr_handler) is supposed to retry the allocation later, but if the slave is removed before the workqueue had a chance to complete, bond->slave_arr can still be NULL. [1] Failed to build slave-array. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN PTI Modules linked in: Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:bond_update_slave_arr.cold+0xc6/0x198 drivers/net/bonding/bond_main.c:4039 RSP: 0018:ffff88018fe33678 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc9000290b000 RDX: 0000000000000000 RSI: ffffffff82b63037 RDI: ffff88019745ea20 RBP: ffff88018fe33760 R08: ffff880170754280 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff88019745ea00 R14: 0000000000000000 R15: ffff88018fe338b0 FS: 00007febd837d700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004540a0 CR3: 00000001c242e005 CR4: 00000000001626f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffff82b5b45e>] __bond_release_one+0x43e/0x500 drivers/net/bonding/bond_main.c:1923 [<ffffffff82b5b966>] bond_release drivers/net/bonding/bond_main.c:2039 [inline] [<ffffffff82b5b966>] bond_do_ioctl+0x416/0x870 drivers/net/bonding/bond_main.c:3562 [<ffffffff83ae25f4>] dev_ifsioc+0x6f4/0x940 net/core/dev_ioctl.c:328 [<ffffffff83ae2e58>] dev_ioctl+0x1b8/0xc70 net/core/dev_ioctl.c:495 [<ffffffff83995ffd>] sock_do_ioctl+0x1bd/0x300 net/socket.c:1088 [<ffffffff83996a80>] sock_ioctl+0x300/0x5d0 net/socket.c:1196 [<ffffffff81b124db>] vfs_ioctl fs/ioctl.c:47 [inline] [<ffffffff81b124db>] file_ioctl fs/ioctl.c:501 [inline] [<ffffffff81b124db>] do_vfs_ioctl+0xacb/0x1300 fs/ioctl.c:688 [<ffffffff81b12dc6>] SYSC_ioctl fs/ioctl.c:705 [inline] [<ffffffff81b12dc6>] SyS_ioctl+0xb6/0xe0 fs/ioctl.c:696 [<ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305 [<ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-08-29bonding: Force slave speed check after link state recovery for 802.3adThomas Falcon
[ Upstream commit 12185dfe44360f814ac4ead9d22ad2af7511b2e9 ] The following scenario was encountered during testing of logical partition mobility on pseries partitions with bonded ibmvnic adapters in LACP mode. 1. Driver receives a signal that the device has been swapped, and it needs to reset to initialize the new device. 2. Driver reports loss of carrier and begins initialization. 3. Bonding driver receives NETDEV_CHANGE notifier and checks the slave's current speed and duplex settings. Because these are unknown at the time, the bond sets its link state to BOND_LINK_FAIL and handles the speed update, clearing AD_PORT_LACP_ENABLE. 4. Driver finishes recovery and reports that the carrier is on. 5. Bond receives a new notification and checks the speed again. The speeds are valid but miimon has not altered the link state yet. AD_PORT_LACP_ENABLE remains off. Because the slave's link state is still BOND_LINK_FAIL, no further port checks are made when it recovers. Though the slave devices are operational and have valid speed and duplex settings, the bond will not send LACPDU's. The simplest fix I can see is to force another speed check in bond_miimon_commit. This way the bond will update AD_PORT_LACP_ENABLE if needed when transitioning from BOND_LINK_FAIL to BOND_LINK_UP. CC: Jarod Wilson <jarod@redhat.com> CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-25bonding: Add vlan tx offload to hw_enc_featuresYueHaibing
[ Upstream commit d595b03de2cb0bdf9bcdf35ff27840cc3a37158f ] As commit 30d8177e8ac7 ("bonding: Always enable vlan tx offload") said, we should always enable bonding's vlan tx offload, pass the vlan packets to the slave devices with vlan tci, let them to handle vlan implementation. Now if encapsulation protocols like VXLAN is used, skb->encapsulation may be set, then the packet is passed to vlan device which based on bonding device. However in netif_skb_features(), the check of hw_enc_features: if (skb->encapsulation) features &= dev->hw_enc_features; clears NETIF_F_HW_VLAN_CTAG_TX/NETIF_F_HW_VLAN_STAG_TX. This results in same issue in commit 30d8177e8ac7 like this: vlan_dev_hard_start_xmit -->dev_queue_xmit -->validate_xmit_skb -->netif_skb_features //NETIF_F_HW_VLAN_CTAG_TX is cleared -->validate_xmit_vlan -->__vlan_hwaccel_push_inside //skb->tci is cleared ... --> bond_start_xmit --> bond_xmit_hash //BOND_XMIT_POLICY_ENCAP34 --> __skb_flow_dissect // nhoff point to IP header --> case htons(ETH_P_8021Q) // skb_vlan_tag_present is false, so vlan = __skb_header_pointer(skb, nhoff, sizeof(_vlan), //vlan point to ip header wrongly Fixes: b2a103e6d0af ("bonding: convert to ndo_fix_features") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-31bonding: validate ip header before check IPPROTO_IGMPCong Wang
[ Upstream commit 9d1bc24b52fb8c5d859f9a47084bf1179470e04c ] bond_xmit_roundrobin() checks for IGMP packets but it parses the IP header even before checking skb->protocol. We should validate the IP header with pskb_may_pull() before using iph->protocol. Reported-and-tested-by: syzbot+e5be16aa39ad6e755391@syzkaller.appspotmail.com Fixes: a2fd940f4cff ("bonding: fix broken multicast with round-robin mode") Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-07-03bonding: Always enable vlan tx offloadYueHaibing
[ Upstream commit 30d8177e8ac776d89d387fad547af6a0f599210e ] We build vlan on top of bonding interface, which vlan offload is off, bond mode is 802.3ad (LACP) and xmit_hash_policy is BOND_XMIT_POLICY_ENCAP34. Because vlan tx offload is off, vlan tci is cleared and skb push the vlan header in validate_xmit_vlan() while sending from vlan devices. Then in bond_xmit_hash, __skb_flow_dissect() fails to get information from protocol headers encapsulated within vlan, because 'nhoff' is points to IP header, so bond hashing is based on layer 2 info, which fails to distribute packets across slaves. This patch always enable bonding's vlan tx offload, pass the vlan packets to the slave devices with vlan tci, let them to handle vlan implementation. Fixes: 278339a42a1b ("bonding: propogate vlan_features to bonding master") Suggested-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-16bonding: fix arp_validate toggling in active-backup modeJarod Wilson
[ Upstream commit a9b8a2b39ce65df45687cf9ef648885c2a99fe75 ] There's currently a problem with toggling arp_validate on and off with an active-backup bond. At the moment, you can start up a bond, like so: modprobe bonding mode=1 arp_interval=100 arp_validate=0 arp_ip_targets=192.168.1.1 ip link set bond0 down echo "ens4f0" > /sys/class/net/bond0/bonding/slaves echo "ens4f1" > /sys/class/net/bond0/bonding/slaves ip link set bond0 up ip addr add 192.168.1.2/24 dev bond0 Pings to 192.168.1.1 work just fine. Now turn on arp_validate: echo 1 > /sys/class/net/bond0/bonding/arp_validate Pings to 192.168.1.1 continue to work just fine. Now when you go to turn arp_validate off again, the link falls flat on it's face: echo 0 > /sys/class/net/bond0/bonding/arp_validate dmesg ... [133191.911987] bond0: Setting arp_validate to none (0) [133194.257793] bond0: bond_should_notify_peers: slave ens4f0 [133194.258031] bond0: link status definitely down for interface ens4f0, disabling it [133194.259000] bond0: making interface ens4f1 the new active one [133197.330130] bond0: link status definitely down for interface ens4f1, disabling it [133197.331191] bond0: now running without any active interface! The problem lies in bond_options.c, where passing in arp_validate=0 results in bond->recv_probe getting set to NULL. This flies directly in the face of commit 3fe68df97c7f, which says we need to set recv_probe = bond_arp_recv, even if we're not using arp_validate. Said commit fixed this in bond_option_arp_interval_set, but missed that we can get to that same state in bond_option_arp_validate_set as well. One solution would be to universally set recv_probe = bond_arp_recv here as well, but I don't think bond_option_arp_validate_set has any business touching recv_probe at all, and that should be left to the arp_interval code, so we can just make things much tidier here. Fixes: 3fe68df97c7f ("bonding: always set recv_probe to bond_arp_rcv in arp monitor") CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-08bonding: show full hw address in sysfs for slave entriesKonstantin Khorenko
[ Upstream commit 18bebc6dd3281955240062655a4df35eef2c46b3 ] Bond expects ethernet hwaddr for its slave, but it can be longer than 6 bytes - infiniband interface for example. # cat /sys/devices/<skipped>/net/ib0/address 80:00:02:08:fe:80:00:00:00:00:00:00:7c:fe:90:03:00:be:5d:e1 # cat /sys/devices/<skipped>/net/ib0/bonding_slave/perm_hwaddr 80:00:02:08:fe:80 So print full hwaddr in sysfs "bonding_slave/perm_hwaddr" as well. Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-27bonding: fix event handling for stacked bondsSabrina Dubroca
[ Upstream commit 92480b3977fd3884649d404cbbaf839b70035699 ] When a bond is enslaved to another bond, bond_netdev_event() only handles the event as if the bond is a master, and skips treating the bond as a slave. This leads to a refcount leak on the slave, since we don't remove the adjacency to its master and the master holds a reference on the slave. Reproducer: ip link add bondL type bond ip link add bondU type bond ip link set bondL master bondU ip link del bondL No "Fixes:" tag, this code is older than git history. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-19bonding: fix PACKET_ORIGDEV regressionMichal Soltys
[ Upstream commit 3c963a3306eada999be5ebf4f293dfa3d3945487 ] This patch fixes a subtle PACKET_ORIGDEV regression which was a side effect of fixes introduced by: 6a9e461f6fe4 bonding: pass link-local packets to bonding master also. ... to: b89f04c61efe bonding: deliver link-local packets with skb->dev set to link that packets arrived on While 6a9e461f6fe4 restored pre-b89f04c61efe presence of link-local packets on bonding masters (which is required e.g. by linux bridges participating in spanning tree or needed for lab-like setups created with group_fwd_mask) it also caused the originating device information to be lost due to cloning. Maciej Żenczykowski proposed another solution that doesn't require packet cloning and retains original device information - instead of returning RX_HANDLER_PASS for all link-local packets it's now limited only to packets from inactive slaves. At the same time, packets passed to bonding masters retain correct information about the originating device and PACKET_ORIGDEV can be used to determine it. This elegantly solves all issues so far: - link-local packets that were removed from bonding masters - LLDP daemons being forced to explicitly bind to slave interfaces - PACKET_ORIGDEV having no effect on bond interfaces Fixes: 6a9e461f6fe4 (bonding: pass link-local packets to bonding master also.) Reported-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-23bonding: update nest level on unlinkWillem de Bruijn
[ Upstream commit 001e465f09a18857443489a57e74314a3368c805 ] A network device stack with multiple layers of bonding devices can trigger a false positive lockdep warning. Adding lockdep nest levels fixes this. Update the level on both enslave and unlink, to avoid the following series of events .. ip netns add test ip netns exec test bash ip link set dev lo addr 00:11:22:33:44:55 ip link set dev lo down ip link add dev bond1 type bond ip link add dev bond2 type bond ip link set dev lo master bond1 ip link set dev bond1 master bond2 ip link set dev bond1 nomaster ip link set dev bond2 master bond1 .. from still generating a splat: [ 193.652127] ====================================================== [ 193.658231] WARNING: possible circular locking dependency detected [ 193.664350] 4.20.0 #8 Not tainted [ 193.668310] ------------------------------------------------------ [ 193.674417] ip/15577 is trying to acquire lock: [ 193.678897] 00000000a40e3b69 (&(&bond->stats_lock)->rlock#3/3){+.+.}, at: bond_get_stats+0x58/0x290 [ 193.687851] but task is already holding lock: [ 193.693625] 00000000807b9d9f (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0x58/0x290 [..] [ 193.851092] lock_acquire+0xa7/0x190 [ 193.855138] _raw_spin_lock_nested+0x2d/0x40 [ 193.859878] bond_get_stats+0x58/0x290 [ 193.864093] dev_get_stats+0x5a/0xc0 [ 193.868140] bond_get_stats+0x105/0x290 [ 193.872444] dev_get_stats+0x5a/0xc0 [ 193.876493] rtnl_fill_stats+0x40/0x130 [ 193.880797] rtnl_fill_ifinfo+0x6c5/0xdc0 [ 193.885271] rtmsg_ifinfo_build_skb+0x86/0xe0 [ 193.890091] rtnetlink_event+0x5b/0xa0 [ 193.894320] raw_notifier_call_chain+0x43/0x60 [ 193.899225] netdev_change_features+0x50/0xa0 [ 193.904044] bond_compute_features.isra.46+0x1ab/0x270 [ 193.909640] bond_enslave+0x141d/0x15b0 [ 193.913946] do_set_master+0x89/0xa0 [ 193.918016] do_setlink+0x37c/0xda0 [ 193.921980] __rtnl_newlink+0x499/0x890 [ 193.926281] rtnl_newlink+0x48/0x70 [ 193.930238] rtnetlink_rcv_msg+0x171/0x4b0 [ 193.934801] netlink_rcv_skb+0xd1/0x110 [ 193.939103] rtnetlink_rcv+0x15/0x20 [ 193.943151] netlink_unicast+0x3b5/0x520 [ 193.947544] netlink_sendmsg+0x2fd/0x3f0 [ 193.951942] sock_sendmsg+0x38/0x50 [ 193.955899] ___sys_sendmsg+0x2ba/0x2d0 [ 193.960205] __x64_sys_sendmsg+0xad/0x100 [ 193.964687] do_syscall_64+0x5a/0x460 [ 193.968823] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 7e2556e40026 ("bonding: avoid lockdep confusion in bond_get_stats()") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-21bonding: fix 802.3ad state sent to partner when unbinding slaveToni Peltonen
[ Upstream commit 3b5b3a3331d141e8f2a7aaae3a94dfa1e61ecbe4 ] Previously when unbinding a slave the 802.3ad implementation only told partner that the port is not suitable for aggregation by setting the port aggregation state from aggregatable to individual. This is not enough. If the physical layer still stays up and we only unbinded this port from the bond there is nothing in the aggregation status alone to prevent the partner from sending traffic towards us. To ensure that the partner doesn't consider this port at all anymore we should also disable collecting and distributing to signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to ensure partner exits collecting + distributing state. I have tested this behaviour againts Arista EOS switches with mlx5 cards (physical link stays up even when interface is down) and simulated the same situation virtually Linux <-> Linux with two network namespaces running two veth device pairs. In both cases setting aggregation to individual doesn't alone prevent traffic from being to sent towards this port given that the link stays up in partners end. Partner still keeps it's end in collecting + distributing state and continues until timeout is reached. In most cases this means we are losing the traffic partner sends towards our port while we wait for timeout. This is most visible with slow periodic time (LACP rate slow). Other open source implementations like Open VSwitch and libreswitch, and vendor implementations like Arista EOS, seem to disable collecting + distributing to when doing similar port disabling/detaching/removing change. With this patch kernel implementation would behave the same way and ensure partner doesn't consider our actor viable anymore. Signed-off-by: Toni Peltonen <peltzi@peltzi.fi> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-11-04bonding: fix length of actor systemTobias Jungel
[ Upstream commit 414dd6fb9a1a1b59983aea7bf0f79f0085ecc5b8 ] The attribute IFLA_BOND_AD_ACTOR_SYSTEM is sent to user space having the length of sizeof(bond->params.ad_actor_system) which is 8 byte. This patch aligns the length to ETH_ALEN to have the same MAC address exposed as using sysfs. Fixes: f87fda00b6ed2 ("bonding: prevent out of bound accesses") Signed-off-by: Tobias Jungel <tobias.jungel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18bonding: fix warning messageMahesh Bandewar
[ Upstream commit 0f3b914c9cfcd7bbedd445dc4ac5dd999fa213c2 ] RX queue config for bonding master could be different from its slave device(s). With the commit 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also."), the packet is reinjected into stack with skb->dev as bonding master. This potentially triggers the message: "bondX received packet on queue Y, but number of RX queues is Z" whenever the queue that packet is received on is higher than the numrxqueues on bonding master (Y > Z). Fixes: 6a9e461f6fe4 ("bonding: pass link-local packets to bonding master also.") Reported-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18bonding: pass link-local packets to bonding master also.Mahesh Bandewar
[ Upstream commit 6a9e461f6fe4434e6172304b69774daff9a3ac4c ] Commit b89f04c61efe ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") changed the behavior of how link-local-multicast packets are processed. The change in the behavior broke some legacy use cases where these packets are expected to arrive on bonding master device also. This patch passes the packet to the stack with the link it arrived on as well as passes to the bonding-master device to preserve the legacy use case. Fixes: b89f04c61efe ("bonding: deliver link-local packets with skb->dev set to link that packets arrived on") Reported-by: Michal Soltys <soltys@ziu.info> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-18bonding: avoid possible dead-lockMahesh Bandewar
[ Upstream commit d4859d749aa7090ffb743d15648adb962a1baeae ] Syzkaller reported this on a slightly older kernel but it's still applicable to the current kernel - ====================================================== WARNING: possible circular locking dependency detected 4.18.0-next-20180823+ #46 Not tainted ------------------------------------------------------ syz-executor4/26841 is trying to acquire lock: 00000000dd41ef48 ((wq_completion)bond_dev->name){+.+.}, at: flush_workqueue+0x2db/0x1e10 kernel/workqueue.c:2652 but task is already holding lock: 00000000768ab431 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline] 00000000768ab431 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4708 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (rtnl_mutex){+.+.}: __mutex_lock_common kernel/locking/mutex.c:925 [inline] __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77 bond_netdev_notify drivers/net/bonding/bond_main.c:1310 [inline] bond_netdev_notify_work+0x44/0xd0 drivers/net/bonding/bond_main.c:1320 process_one_work+0xc73/0x1aa0 kernel/workqueue.c:2153 worker_thread+0x189/0x13c0 kernel/workqueue.c:2296 kthread+0x35a/0x420 kernel/kthread.c:246 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415 -> #1 ((work_completion)(&(&nnw->work)->work)){+.+.}: process_one_work+0xc0b/0x1aa0 kernel/workqueue.c:2129 worker_thread+0x189/0x13c0 kernel/workqueue.c:2296 kthread+0x35a/0x420 kernel/kthread.c:246 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415 -> #0 ((wq_completion)bond_dev->name){+.+.}: lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734 register_netdevice+0x337/0x1100 net/core/dev.c:8410 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115 __sys_sendmsg+0x11d/0x290 net/socket.c:2153 __do_sys_sendmsg net/socket.c:2162 [inline] __se_sys_sendmsg net/socket.c:2160 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: (wq_completion)bond_dev->name --> (work_completion)(&(&nnw->work)->work) --> rtnl_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock((work_completion)(&(&nnw->work)->work)); lock(rtnl_mutex); lock((wq_completion)bond_dev->name); *** DEADLOCK *** 1 lock held by syz-executor4/26841: stack backtrace: CPU: 1 PID: 26841 Comm: syz-executor4 Not tainted 4.18.0-next-20180823+ #46 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_circular_bug.isra.34.cold.55+0x1bd/0x27d kernel/locking/lockdep.c:1222 check_prev_add kernel/locking/lockdep.c:1862 [inline] check_prevs_add kernel/locking/lockdep.c:1975 [inline] validate_chain kernel/locking/lockdep.c:2416 [inline] __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3412 lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734 register_netdevice+0x337/0x1100 net/core/dev.c:8410 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline] netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:622 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:632 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115 __sys_sendmsg+0x11d/0x290 net/socket.c:2153 __do_sys_sendmsg net/socket.c:2162 [inline] __se_sys_sendmsg net/socket.c:2160 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457089 Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f2df20a5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f2df20a66d4 RCX: 0000000000457089 RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003 RBP: 0000000000930140 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000004d40b8 R14: 00000000004c8ad8 R15: 0000000000000001 Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-06bonding: avoid lockdep confusion in bond_get_stats()Eric Dumazet
[ Upstream commit 7e2556e40026a1b0c16f37446ab398d5a5a892e4 ] syzbot found that the following sequence produces a LOCKDEP splat [1] ip link add bond10 type bond ip link add bond11 type bond ip link set bond11 master bond10 To fix this, we can use the already provided nest_level. This patch also provides correct nesting for dev->addr_list_lock [1] WARNING: possible recursive locking detected 4.18.0-rc6+ #167 Not tainted -------------------------------------------- syz-executor751/4439 is trying to acquire lock: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline] (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 but task is already holding lock: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline] (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&bond->stats_lock)->rlock); lock(&(&bond->stats_lock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by syz-executor751/4439: #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77 #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:310 [inline] #1: (____ptrval____) (&(&bond->stats_lock)->rlock){+.+.}, at: bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 #2: (____ptrval____) (rcu_read_lock){....}, at: bond_get_stats+0x0/0x560 include/linux/compiler.h:215 stack backtrace: CPU: 0 PID: 4439 Comm: syz-executor751 Not tainted 4.18.0-rc6+ #167 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 print_deadlock_bug kernel/locking/lockdep.c:1765 [inline] check_deadlock kernel/locking/lockdep.c:1809 [inline] validate_chain kernel/locking/lockdep.c:2405 [inline] __lock_acquire.cold.64+0x1fb/0x486 kernel/locking/lockdep.c:3435 lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:144 spin_lock include/linux/spinlock.h:310 [inline] bond_get_stats+0xb4/0x560 drivers/net/bonding/bond_main.c:3426 dev_get_stats+0x10f/0x470 net/core/dev.c:8316 bond_get_stats+0x232/0x560 drivers/net/bonding/bond_main.c:3432 dev_get_stats+0x10f/0x470 net/core/dev.c:8316 rtnl_fill_stats+0x4d/0xac0 net/core/rtnetlink.c:1169 rtnl_fill_ifinfo+0x1aa6/0x3fb0 net/core/rtnetlink.c:1611 rtmsg_ifinfo_build_skb+0xc8/0x190 net/core/rtnetlink.c:3268 rtmsg_ifinfo_event.part.30+0x45/0xe0 net/core/rtnetlink.c:3300 rtmsg_ifinfo_event net/core/rtnetlink.c:3297 [inline] rtnetlink_event+0x144/0x170 net/core/rtnetlink.c:4716 notifier_call_chain+0x180/0x390 kernel/notifier.c:93 __raw_notifier_call_chain kernel/notifier.c:394 [inline] raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401 call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1735 call_netdevice_notifiers net/core/dev.c:1753 [inline] netdev_features_change net/core/dev.c:1321 [inline] netdev_change_features+0xb3/0x110 net/core/dev.c:7759 bond_compute_features.isra.47+0x585/0xa50 drivers/net/bonding/bond_main.c:1120 bond_enslave+0x1b25/0x5da0 drivers/net/bonding/bond_main.c:1755 bond_do_ioctl+0x7cb/0xae0 drivers/net/bonding/bond_main.c:3528 dev_ifsioc+0x43c/0xb30 net/core/dev_ioctl.c:327 dev_ioctl+0x1b5/0xcc0 net/core/dev_ioctl.c:493 sock_do_ioctl+0x1d3/0x3e0 net/socket.c:992 sock_ioctl+0x30d/0x680 net/socket.c:1093 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:500 [inline] do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:684 ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 __do_sys_ioctl fs/ioctl.c:708 [inline] __se_sys_ioctl fs/ioctl.c:706 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x440859 Code: e8 2c af 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffc51a92878 EFLAGS: 00000213 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000440859 RDX: 0000000020000040 RSI: 0000000000008990 RDI: 0000000000000003 RBP: 0000000000000000 R08: 00000000004002c8 R09: 00000000004002c8 R10: 00000000022d5880 R11: 0000000000000213 R12: 0000000000007390 R13: 0000000000401db0 R14: 0000000000000000 R15: 0000000000000000 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-28bonding: set default miimon value for non-arp modes if not setJarod Wilson
[ Upstream commit c1f897ce186a529a494441642125479d38727a3d ] For some time now, if you load the bonding driver and configure bond parameters via sysfs using minimal config options, such as specifying nothing but the mode, relying on defaults for everything else, modes that cannot use arp monitoring (802.3ad, balance-tlb, balance-alb) all wind up with both arp_interval=0 (as it should be) and miimon=0, which means the miimon monitor thread never actually runs. This is particularly problematic for 802.3ad. For example, from an LNST recipe I've set up: $ modprobe bonding max_bonds=0" $ echo "+t_bond0" > /sys/class/net/bonding_masters" $ ip link set t_bond0 down" $ echo "802.3ad" > /sys/class/net/t_bond0/bonding/mode" $ ip link set ens1f1 down" $ echo "+ens1f1" > /sys/class/net/t_bond0/bonding/slaves" $ ip link set ens1f0 down" $ echo "+ens1f0" > /sys/class/net/t_bond0/bonding/slaves" $ ethtool -i t_bond0" $ ip link set ens1f1 up" $ ip link set ens1f0 up" $ ip link set t_bond0 up" $ ip addr add 192.168.9.1/24 dev t_bond0" $ ip addr add 2002::1/64 dev t_bond0" This bond comes up okay, but things look slightly suspect in /proc/net/bonding/t_bond0 output: $ grep -i mii /proc/net/bonding/t_bond0 MII Status: up MII Polling Interval (ms): 0 MII Status: up MII Status: up Now, pull a cable on one of the ports in the bond, then reconnect it, and you'll see: Slave Interface: ens1f0 MII Status: down Speed: 1000 Mbps Duplex: full I believe this became a major issue as of commit 4d2c0cda0744, which for 802.3ad bonds, sets slave->link = BOND_LINK_DOWN, with a comment about relying on link monitoring via miimon to set it correctly, but since the miimon work queue never runs, the link just stays marked down. If we simply tweak bond_option_mode_set() slightly, we can check for the non-arp modes having no miimon value set, and insert BOND_DEFAULT_MIIMON, which gets things back in full working order. This problem exists as far back as 4.14, and might be worth fixing in all stable trees since, though the work-around is to simply specify an miimon value yourself. Reported-by: Bob Ball <ball@umich.edu> Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-06-26bonding: re-evaluate force_primary when the primary slave name changesXiangning Yu
[ Upstream commit eb55bbf865d9979098c6a7a17cbdb41237ece951 ] There is a timing issue under active-standy mode, when bond_enslave() is called, bond->params.primary might not be initialized yet. Any time the primary slave string changes, bond->force_primary should be set to true to make sure the primary becomes the active slave. Signed-off-by: Xiangning Yu <yuxiangning@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30bonding: fix the err path for dev hwaddr sync in bond_enslaveXin Long
[ Upstream commit 5c78f6bfae2b10ff70e21d343e64584ea6280c26 ] vlan_vids_add_by_dev is called right after dev hwaddr sync, so on the err path it should unsync dev hwaddr. Otherwise, the slave dev's hwaddr will never be unsync when this err happens. Fixes: 1ff412ad7714 ("bonding: change the bond's vlan syncing functions with the standard ones") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-19bonding: send learning packets for vlans on slaveDebabrata Banerjee
[ Upstream commit 21706ee8a47d3ede7fdae0be6d7c0a0e31a83229 ] There was a regression at some point from the intended functionality of commit f60c3704e87d ("bonding: Fix alb mode to only use first level vlans.") Given the return value vlan_get_encap_level() we need to store the nest level of the bond device, and then compare the vlan's encap level to this. Without this, this check always fails and learning packets are never sent. In addition, this same commit caused a regression in the behavior of balance_alb, which requires learning packets be sent for all interfaces using the slave's mac in order to load balance properly. For vlan's that have not set a user mac, we can send after checking one bit. Otherwise we need send the set mac, albeit defeating rx load balancing for that vlan. Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-19bonding: do not allow rlb updates to invalid macDebabrata Banerjee
[ Upstream commit 4fa8667ca3989ce14cf66301fa251544fbddbdd0 ] Make sure multicast, broadcast, and zero mac's cannot be the output of rlb updates, which should all be directed arps. Receive load balancing will be collapsed if any of these happen, as the switch will broadcast. Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-29bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslaveXin Long
[ Upstream commit ddea788c63094f7c483783265563dd5b50052e28 ] After Commit 8a8efa22f51b ("bonding: sync netpoll code with bridge"), it would set slave_dev npinfo in slave_enable_netpoll when enslaving a dev if bond->dev->npinfo was set. However now slave_dev npinfo is set with bond->dev->npinfo before calling slave_enable_netpoll. With slave_dev npinfo set, __netpoll_setup called in slave_enable_netpoll will not call slave dev's .ndo_netpoll_setup(). It causes that the lower dev of this slave dev can't set its npinfo. One way to reproduce it: # modprobe bonding # brctl addbr br0 # brctl addif br0 eth1 # ifconfig bond0 192.168.122.1/24 up # ifenslave bond0 eth2 # systemctl restart netconsole # ifenslave bond0 br0 # ifconfig eth2 down # systemctl restart netconsole The netpoll won't really work. This patch is to remove that slave_dev npinfo setting in bond_enslave(). Fixes: 8a8efa22f51b ("bonding: sync netpoll code with bridge") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12bonding: process the err returned by dev_set_allmulti properly in bond_enslaveXin Long
[ Upstream commit 9f5a90c107741b864398f4ac0014711a8c1d8474 ] When dev_set_promiscuity(1) succeeds but dev_set_allmulti(1) fails, dev_set_promiscuity(-1) should be done before going to the err path. Otherwise, dev->promiscuity will leak. Fixes: 7e1a1ac1fbaa ("bonding: Check return of dev_set_promiscuity/allmulti") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12bonding: move dev_mc_sync after master_upper_dev_link in bond_enslaveXin Long
[ Upstream commit ae42cc62a9f07f1f6979054ed92606b9c30f4a2e ] Beniamino found a crash when adding vlan as slave of bond which is also the parent link: ip link add bond1 type bond ip link set bond1 up ip link add link bond1 vlan1 type vlan id 80 ip link set vlan1 master bond1 The call trace is as below: [<ffffffffa850842a>] queued_spin_lock_slowpath+0xb/0xf [<ffffffffa8515680>] _raw_spin_lock+0x20/0x30 [<ffffffffa83f6f07>] dev_mc_sync+0x37/0x80 [<ffffffffc08687dc>] vlan_dev_set_rx_mode+0x1c/0x30 [8021q] [<ffffffffa83efd2a>] __dev_set_rx_mode+0x5a/0xa0 [<ffffffffa83f7138>] dev_mc_sync_multiple+0x78/0x80 [<ffffffffc084127c>] bond_enslave+0x67c/0x1190 [bonding] [<ffffffffa8401909>] do_setlink+0x9c9/0xe50 [<ffffffffa8403bf2>] rtnl_newlink+0x522/0x880 [<ffffffffa8403ff7>] rtnetlink_rcv_msg+0xa7/0x260 [<ffffffffa8424ecb>] netlink_rcv_skb+0xab/0xc0 [<ffffffffa83fe498>] rtnetlink_rcv+0x28/0x30 [<ffffffffa8424850>] netlink_unicast+0x170/0x210 [<ffffffffa8424bf8>] netlink_sendmsg+0x308/0x420 [<ffffffffa83cc396>] sock_sendmsg+0xb6/0xf0 This is actually a dead lock caused by sync slave hwaddr from master when the master is the slave's 'slave'. This dead loop check is actually done by netdev_master_upper_dev_link. However, Commit 1f718f0f4f97 ("bonding: populate neighbour's private on enslave") moved it after dev_mc_sync. This patch is to fix it by moving dev_mc_sync after master_upper_dev_link, so that this loop check would be earlier than dev_mc_sync. It also moves if (mode == BOND_MODE_8023AD) into if (!bond_uses_primary) clause as an improvement. Note team driver also has this issue, I will fix it in another patch. Fixes: 1f718f0f4f97 ("bonding: populate neighbour's private on enslave") Reported-by: Beniamino Galvani <bgalvani@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-12bonding: fix the err path for dev hwaddr sync in bond_enslaveXin Long
[ Upstream commit 5c78f6bfae2b10ff70e21d343e64584ea6280c26 ] vlan_vids_add_by_dev is called right after dev hwaddr sync, so on the err path it should unsync dev hwaddr. Otherwise, the slave dev's hwaddr will never be unsync when this err happens. Fixes: 1ff412ad7714 ("bonding: change the bond's vlan syncing functions with the standard ones") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-08bonding: fix slave stuck in BOND_LINK_FAIL stateJay Vosburgh
The bonding miimon logic has a flaw, in that a failure of the rtnl_trylock can cause a slave to become permanently stuck in BOND_LINK_FAIL state. The sequence of events to cause this is as follows: 1) bond_miimon_inspect finds that a slave's link is down, and so calls bond_propose_link_state, setting slave->new_link_state to BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns non-zero. 2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is rescheduled. No change is committed. 3) bond_miimon_inspect is called again, but this time the slave from step 1 has recovered. slave->new_link is reset to NOCHANGE, and, as slave->link was never changed, the switch enters the BOND_LINK_UP case, and does nothing. The pending BOND_LINK_FAIL state from step 1 remains pending, as new_link_state is not reset. 4) The state from step 3 persists until another slave changes link state and causes bond_miimon_inspect to return non-zero. At this point, the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed, and the slave will remain stuck in BOND_LINK_FAIL state even though it is actually link up. The remedy for this is to initialize new_link_state on each entry to bond_miimon_inspect, as is already done with new_link. Fixes: fb9eb899a6dc ("bonding: handle link transition from FAIL to UP correctly") Reported-by: Alex Sidorenko <alexandre.sidorenko@hpe.com> Reviewed-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-08bonding: discard lowest hash bit for 802.3ad layer3+4Hangbin Liu
After commit 07f4c90062f8 ("tcp/dccp: try to not exhaust ip_local_port_range in connect()"), we will try to use even ports for connect(). Then if an application (seen clearly with iperf) opens multiple streams to the same destination IP and port, each stream will be given an even source port. So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing will always hash all these streams to the same interface. And the total throughput will limited to a single slave. Change the tcp code will impact the whole tcp behavior, only for bonding usage. Paolo Abeni suggested fix this by changing the bonding code only, which should be more reasonable, and less impact. Fix this by discarding the lowest hash bit because it contains little entropy. After the fix we can re-balance between slaves. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-12net: bonding: fix tlb_dynamic_lb default valueNikolay Aleksandrov
Commit 8b426dc54cf4 ("bonding: remove hardcoded value") changed the default value for tlb_dynamic_lb which lead to either broken ALB mode (since tlb_dynamic_lb can be changed only in TLB) or setting TLB mode with tlb_dynamic_lb equal to 0. The first issue was recently fixed by setting tlb_dynamic_lb to 1 always when switching to ALB mode, but the default value is still wrong and we'll enter TLB mode with tlb_dynamic_lb equal to 0 if the mode is changed via netlink or sysfs. In order to restore the previous behaviour and default value simply remove the mode check around the default param initialization for tlb_dynamic_lb which will always set it to 1 as before. Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-11net: bonding: Fix transmit load balancing in balance-alb mode if specified ↵Kosuke Tatsukawa
by sysfs Commit cbf5ecb30560 ("net: bonding: Fix transmit load balancing in balance-alb mode") tried to fix transmit dynamic load balancing in balance-alb mode, which wasn't working after commit 8b426dc54cf4 ("bonding: remove hardcoded value"). It turned out that my previous patch only fixed the case when balance-alb was specified as bonding module parameter, and not when balance-alb mode was set using /sys/class/net/*/bonding/mode (the most common usage). In the latter case, tlb_dynamic_lb was set up according to the default mode of the bonding interface, which happens to be balance-rr. This additional patch addresses this issue by setting up tlb_dynamic_lb to 1 if "mode" is set to balance-alb through the sysfs interface. I didn't add code to change tlb_balance_lb back to the default value for other modes, because "mode" is usually set up only once during initialization, and it's not worthwhile to change the static variable bonding_defaults in bond_main.c to a global variable just for this purpose. Commit 8b426dc54cf4 also changes the value of tlb_dynamic_lb for balance-tlb mode if it is set up using the sysfs interface. I didn't change that behavior, because the value of tlb_balance_lb can be changed using the sysfs interface for balance-tlb, and I didn't like changing the default value back and forth for balance-tlb. As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be written to. However, I think balance-alb with tlb_dynamic_lb set to 0 is not an intended usage, so there is little use making it writable at this moment. Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value") Reported-by: Reinis Rozitis <r@roze.lv> Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com> Cc: stable@vger.kernel.org # v4.12+ Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2017-08-13bonding: ratelimit failed speed/duplex update warningAndreas Born
bond_miimon_commit() handles the UP transition for each slave of a bond in the case of MII. It is triggered 10 times per second for the default MII Polling interval of 100ms. For device drivers that do not implement __ethtool_get_link_ksettings() the call to bond_update_speed_duplex() fails persistently while the MII status could remain UP. That is, in this and other cases where the speed/duplex update keeps failing over a longer period of time while the MII state is UP, a warning is printed every MII polling interval. To address these excessive warnings net_ratelimit() should be used. Printing a warning once would not be sufficient since the call to bond_update_speed_duplex() could recover to succeed and fail again later. In that case there would be no new indication what went wrong. Fixes: b5bf0f5b16b9c (bonding: correctly update link status during mii-commit phase) Signed-off-by: Andreas Born <futur.andy@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-11bonding: require speed/duplex only for 802.3ad, alb and tlbAndreas Born
The patch c4adfc822bf5 ("bonding: make speed, duplex setting consistent with link state") puts the link state to down if bond_update_speed_duplex() cannot retrieve speed and duplex settings. Assumably the patch was written with 802.3ad mode in mind which relies on link speed/duplex settings. For other modes like active-backup these settings are not required. Thus, only for these other modes, this patch reintroduces support for slaves that do not support reporting speed or duplex such as wireless devices. This fixes the regression reported in bug 196547 (https://bugzilla.kernel.org/show_bug.cgi?id=196547). Fixes: c4adfc822bf5 ("bonding: make speed, duplex setting consistent with link state") Signed-off-by: Andreas Born <futur.andy@googlemail.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two minor conflicts in virtio_net driver (bug fix overlapping addition of a helper) and MAINTAINERS (new driver edit overlapping revamp of PHY entry). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26bonding: commit link status change after proposeWANG Cong
Commit de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") moves link status commitment into bond_mii_monitor(), but it still relies on the return value of bond_miimon_inspect() as the hint. We need to return non-zero as long as we propose a link status change. Fixes: de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>