summaryrefslogtreecommitdiffstats
path: root/net/core/rtnetlink.c
AgeCommit message (Collapse)Author
2015-03-18net: do not use rcu in rtnl_dump_ifinfo()Eric Dumazet
[ Upstream commit cac5e65e8a7ea074f2626d2eaa53aa308452dec4 ] We did a failed attempt in the past to only use rcu in rtnl dump operations (commit e67f88dd12f6 "net: dont hold rtnl mutex during netlink dump callbacks") Now that dumps are holding RTNL anyway, there is no need to also use rcu locking, as it forbids any scheduling ability, like GFP_KERNEL allocations that controlling path should use instead of GFP_ATOMIC whenever possible. This should fix following splat Cong Wang reported : [ INFO: suspicious RCU usage. ] 3.19.0+ #805 Tainted: G W include/linux/rcupdate.h:538 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ip/771: #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff8182b8f4>] netlink_dump+0x21/0x26c #1: (rcu_read_lock){......}, at: [<ffffffff817d785b>] rcu_read_lock+0x0/0x6e stack backtrace: CPU: 3 PID: 771 Comm: ip Tainted: G W 3.19.0+ #805 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0000000000000001 ffff8800d51e7718 ffffffff81a27457 0000000029e729e6 ffff8800d6108000 ffff8800d51e7748 ffffffff810b539b ffffffff820013dd 00000000000001c8 0000000000000000 ffff8800d7448088 ffff8800d51e7758 Call Trace: [<ffffffff81a27457>] dump_stack+0x4c/0x65 [<ffffffff810b539b>] lockdep_rcu_suspicious+0x107/0x110 [<ffffffff8109796f>] rcu_preempt_sleep_check+0x45/0x47 [<ffffffff8109e457>] ___might_sleep+0x1d/0x1cb [<ffffffff8109e67d>] __might_sleep+0x78/0x80 [<ffffffff814b9b1f>] idr_alloc+0x45/0xd1 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff814b9f9d>] ? idr_for_each+0x53/0x101 [<ffffffff817c1383>] alloc_netid+0x61/0x69 [<ffffffff817c14c3>] __peernet2id+0x79/0x8d [<ffffffff817c1ab7>] peernet2id+0x13/0x1f [<ffffffff817d8673>] rtnl_fill_ifinfo+0xa8d/0xc20 [<ffffffff810b17d9>] ? __lock_is_held+0x39/0x52 [<ffffffff817d894f>] rtnl_dump_ifinfo+0x149/0x213 [<ffffffff8182b9c2>] netlink_dump+0xef/0x26c [<ffffffff8182bcba>] netlink_recvmsg+0x17b/0x2c5 [<ffffffff817b0adc>] __sock_recvmsg+0x4e/0x59 [<ffffffff817b1b40>] sock_recvmsg+0x3f/0x51 [<ffffffff817b1f9a>] ___sys_recvmsg+0xf6/0x1d9 [<ffffffff8115dc67>] ? handle_pte_fault+0x6e1/0xd3d [<ffffffff8100a3a0>] ? native_sched_clock+0x35/0x37 [<ffffffff8109f45b>] ? sched_clock_local+0x12/0x72 [<ffffffff8109f6ac>] ? sched_clock_cpu+0x9e/0xb7 [<ffffffff810cb7ab>] ? rcu_read_lock_held+0x3b/0x3d [<ffffffff811abde8>] ? __fcheck_files+0x4c/0x58 [<ffffffff811ac556>] ? __fget_light+0x2d/0x52 [<ffffffff817b376f>] __sys_recvmsg+0x42/0x60 [<ffffffff817b379f>] SyS_recvmsg+0x12/0x1c Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 0c7aecd4bde4b7302 ("netns: add rtnl cmd to add and get peer netns ids") Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18rtnetlink: call ->dellink on failure when ->newlink existsWANG Cong
[ Upstream commit 7afb8886a05be68e376655539a064ec672de8a8e ] Ignacy reported that when eth0 is down and add a vlan device on top of it like: ip link add link eth0 name eth0.1 up type vlan id 1 We will get a refcount leak: unregister_netdevice: waiting for eth0.1 to become free. Usage count = 2 The problem is when rtnl_configure_link() fails in rtnl_newlink(), we simply call unregister_device(), but for stacked device like vlan, we almost do nothing when we unregister the upper device, more work is done when we unregister the lower device, so call its ->dellink(). Reported-by: Ignacy Gawedzki <ignacy.gawedzki@green-communications.fr> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-03-18rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARYDaniel Borkmann
[ Upstream commit 364d5716a7adb91b731a35765d369602d68d2881 ] ifla_vf_policy[] is wrong in advertising its individual member types as NLA_BINARY since .type = NLA_BINARY in combination with .len declares the len member as *max* attribute length [0, len]. The issue is that when do_setvfinfo() is being called to set up a VF through ndo handler, we could set corrupted data if the attribute length is less than the size of the related structure itself. The intent is exactly the opposite, namely to make sure to pass at least data of minimum size of len. Fixes: ebc08a6f47ee ("rtnetlink: Add VF config code to rtnetlink") Cc: Mitch Williams <mitch.a.williams@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-01-28bridge: dont send notification when skb->len == 0 in rtnl_bridge_notifyRoopa Prabhu
Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081 This patch avoids calling rtnl_notify if the device ndo_bridge_getlink handler does not return any bytes in the skb. Alternately, the skb->len check can be moved inside rtnl_notify. For the bridge vlan case described in 92081, there is also a fix needed in bridge driver to generate a proper notification. Will fix that in subsequent patch. v2: rebase patch on net tree Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-16net: Disallow providing non zero VLAN ID for NIC drivers FDB add flowOr Gerlitz
The current implementations all use dev_uc_add_excl() and such whose API doesn't support vlans, so we can't make it with NICs HW for now. Fixes: f6f6424ba773 ('net: make vid as a parameter for ndo_fdb_add/ndo_fdb_del') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu. 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe. 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau. 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca. 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet. 7) Add a variable of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet. 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel. 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov. 10) Support TSO/LSO in sunvnet driver, from David L Stevens. 11) Allow controlling ECN usage via routing metrics, from Florian Westphal. 12) Remote checksum offload, from Tom Herbert. 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky. 14) Add MPLS support to openvswitch, from Simon Horman. 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert. 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic. 17) Allow userspace to ask for the CPU upon what a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet. 18) Limit GSO packets to half the current congestion window, from Eric Dumazet. 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet. 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan. 21) Add VLAN packet scheduler action, from Jiri Pirko. 22) Support configurable RSS hash functions via ethtool, from Eyal Perry. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits) Fix race condition between vxlan_sock_add and vxlan_sock_release net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header net/mlx4: Add support for A0 steering net/mlx4: Refactor QUERY_PORT net/mlx4_core: Add explicit error message when rule doesn't meet configuration net/mlx4: Add A0 hybrid steering net/mlx4: Add mlx4_bitmap zone allocator net/mlx4: Add a check if there are too many reserved QPs net/mlx4: Change QP allocation scheme net/mlx4_core: Use tasklet for user-space CQ completion events net/mlx4_core: Mask out host side virtualization features for guests net/mlx4_en: Set csum level for encapsulated packets be2net: Export tunnel offloads only when a VxLAN tunnel is created gianfar: Fix dma check map error when DMA_API_DEBUG is enabled cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call net: fec: only enable mdio interrupt before phy device link up net: fec: clear all interrupt events to support i.MX6SX net: fec: reset fep link status in suspend function net: sock: fix access via invalid file descriptor net: introduce helper macro for_each_cmsghdr ...
2014-12-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/ethernet/amd/xgbe/xgbe-desc.c drivers/net/ethernet/renesas/sh_eth.c Overlapping changes in both conflict cases. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09Merge branch 'sched-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle are: - 'Nested Sleep Debugging', activated when CONFIG_DEBUG_ATOMIC_SLEEP=y. This instruments might_sleep() checks to catch places that nest blocking primitives - such as mutex usage in a wait loop. Such bugs can result in hard to debug races/hangs. Another category of invalid nesting that this facility will detect is the calling of blocking functions from within schedule() -> sched_submit_work() -> blk_schedule_flush_plug(). There's some potential for false positives (if secondary blocking primitives themselves are not ready yet for this facility), but the kernel will warn once about such bugs per bootup, so the warning isn't much of a nuisance. This feature comes with a number of fixes, for problems uncovered with it, so no messages are expected normally. - Another round of sched/numa optimizations and refinements, for CONFIG_NUMA_BALANCING=y. - Another round of sched/dl fixes and refinements. Plus various smaller fixes and cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) sched: Add missing rcu protection to wake_up_all_idle_cpus sched/deadline: Introduce start_hrtick_dl() for !CONFIG_SCHED_HRTICK sched/numa: Init numa balancing fields of init_task sched/deadline: Remove unnecessary definitions in cpudeadline.h sched/cpupri: Remove unnecessary definitions in cpupri.h sched/deadline: Fix rq->dl.pushable_tasks bug in push_dl_task() sched/fair: Fix stale overloaded status in the busiest group finding logic sched: Move p->nr_cpus_allowed check to select_task_rq() sched/completion: Document when to use wait_for_completion_io_*() sched: Update comments about CLONE_NEWUTS and CLONE_NEWIPC sched/fair: Kill task_struct::numa_entry and numa_group::task_list sched: Refactor task_struct to use numa_faults instead of numa_* pointers sched/deadline: Don't check CONFIG_SMP in switched_from_dl() sched/deadline: Reschedule from switched_from_dl() after a successful pull sched/deadline: Push task away if the deadline is equal to curr during wakeup sched/deadline: Add deadline rq status print sched/deadline: Fix artificial overrun introduced by yield_task_dl() sched/rt: Clean up check_preempt_equal_prio() sched/core: Use dl_bw_of() under rcu_read_lock_sched() sched: Check if we got a shallowest_idle_cpu before searching for least_loaded_cpu ...
2014-12-09rocker: remove swdev modeRoopa Prabhu
Remove use of 'swdev' mode in rocker. rocker dev offloads can use the BRIDGE_FLAGS_SELF to indicate offload to hardware. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09rtnetlink: delay RTM_DELLINK notification until after ndo_uninit()Mahesh Bandewar
The commit 56bfa7ee7c ("unregister_netdevice : move RTM_DELLINK to until after ndo_uninit") tried to do this ealier but while doing so it created a problem. Unfortunately the delayed rtmsg_ifinfo() also delayed call to fill_info(). So this translated into asking driver to remove private state and then query it's private state. This could have catastropic consequences. This change breaks the rtmsg_ifinfo() into two parts - one takes the precise snapshot of the device by called fill_info() before calling the ndo_uninit() and the second part sends the notification using collected snapshot. It was brought to notice when last link is deleted from an ipvlan device when it has free-ed the port and the subsequent .fill_info() call is trying to get the info from the port. kernel: [ 255.139429] ------------[ cut here ]------------ kernel: [ 255.139439] WARNING: CPU: 12 PID: 11173 at net/core/rtnetlink.c:2238 rtmsg_ifinfo+0x100/0x110() kernel: [ 255.139493] Modules linked in: ipvlan bonding w1_therm ds2482 wire cdc_acm ehci_pci ehci_hcd i2c_dev i2c_i801 i2c_core msr cpuid bnx2x ptp pps_core mdio libcrc32c kernel: [ 255.139513] CPU: 12 PID: 11173 Comm: ip Not tainted 3.18.0-smp-DEV #167 kernel: [ 255.139514] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 1.0.10 05/15/2012 kernel: [ 255.139515] 0000000000000009 ffff880851b6b828 ffffffff815d87f4 00000000000000e0 kernel: [ 255.139516] 0000000000000000 ffff880851b6b868 ffffffff8109c29c 0000000000000000 kernel: [ 255.139518] 00000000ffffffa6 00000000000000d0 ffffffff81aaf580 0000000000000011 kernel: [ 255.139520] Call Trace: kernel: [ 255.139527] [<ffffffff815d87f4>] dump_stack+0x46/0x58 kernel: [ 255.139531] [<ffffffff8109c29c>] warn_slowpath_common+0x8c/0xc0 kernel: [ 255.139540] [<ffffffff8109c2ea>] warn_slowpath_null+0x1a/0x20 kernel: [ 255.139544] [<ffffffff8150d570>] rtmsg_ifinfo+0x100/0x110 kernel: [ 255.139547] [<ffffffff814f78b5>] rollback_registered_many+0x1d5/0x2d0 kernel: [ 255.139549] [<ffffffff814f79cf>] unregister_netdevice_many+0x1f/0xb0 kernel: [ 255.139551] [<ffffffff8150acab>] rtnl_dellink+0xbb/0x110 kernel: [ 255.139553] [<ffffffff8150da90>] rtnetlink_rcv_msg+0xa0/0x240 kernel: [ 255.139557] [<ffffffff81329283>] ? rhashtable_lookup_compare+0x43/0x80 kernel: [ 255.139558] [<ffffffff8150d9f0>] ? __rtnl_unlock+0x20/0x20 kernel: [ 255.139562] [<ffffffff8152cb11>] netlink_rcv_skb+0xb1/0xc0 kernel: [ 255.139563] [<ffffffff8150a495>] rtnetlink_rcv+0x25/0x40 kernel: [ 255.139565] [<ffffffff8152c398>] netlink_unicast+0x178/0x230 kernel: [ 255.139567] [<ffffffff8152c75f>] netlink_sendmsg+0x30f/0x420 kernel: [ 255.139571] [<ffffffff814e0b0c>] sock_sendmsg+0x9c/0xd0 kernel: [ 255.139575] [<ffffffff811d1d7f>] ? rw_copy_check_uvector+0x6f/0x130 kernel: [ 255.139577] [<ffffffff814e11c9>] ? copy_msghdr_from_user+0x139/0x1b0 kernel: [ 255.139578] [<ffffffff814e1774>] ___sys_sendmsg+0x304/0x310 kernel: [ 255.139581] [<ffffffff81198723>] ? handle_mm_fault+0xca3/0xde0 kernel: [ 255.139585] [<ffffffff811ebc4c>] ? destroy_inode+0x3c/0x70 kernel: [ 255.139589] [<ffffffff8108e6ec>] ? __do_page_fault+0x20c/0x500 kernel: [ 255.139597] [<ffffffff811e8336>] ? dput+0xb6/0x190 kernel: [ 255.139606] [<ffffffff811f05f6>] ? mntput+0x26/0x40 kernel: [ 255.139611] [<ffffffff811d2b94>] ? __fput+0x174/0x1e0 kernel: [ 255.139613] [<ffffffff814e2129>] __sys_sendmsg+0x49/0x90 kernel: [ 255.139615] [<ffffffff814e2182>] SyS_sendmsg+0x12/0x20 kernel: [ 255.139617] [<ffffffff815df092>] system_call_fastpath+0x12/0x17 kernel: [ 255.139619] ---[ end trace 5e6703e87d984f6b ]--- Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Cc: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02bridge: add brport flags to dflt bridge_getlinkScott Feldman
To allow brport device to return current brport flags set on port. Add returned flags to nested IFLA_PROTINFO netlink msg built in dflt getlink. With this change, netlink msg returned for bridge_getlink contains the port's offloaded flag settings (the port's SELF settings). Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02rtnl: expose physical switch id for particular deviceJiri Pirko
The netdevice represents a port in a switch, it will expose IFLA_PHYS_SWITCH_ID value via rtnl. Two netdevices with the same value belong to one physical switch. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02net: rename netdev_phys_port_id to more generic nameJiri Pirko
So this can be reused for identification of other "items" as well. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02net: make vid as a parameter for ndo_fdb_add/ndo_fdb_delJiri Pirko
Do the work of parsing NDA_VLAN directly in rtnetlink code, pass simple u16 vid to drivers from there. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-29rtnetlink: release net refcnt on error in do_setlink()Nicolas Dichtel
rtnl_link_get_net() holds a reference on the 'struct net', we need to release it in case of error. CC: Eric W. Biederman <ebiederm@xmission.com> Fixes: b51642f6d77b ("net: Enable a userns root rtnl calls that are safe for unprivilged users") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26bridge: Sanitize IFLA_EXT_MASK for AF_BRIDGE:RTM_GETLINKThomas Graf
Only search for IFLA_EXT_MASK if the message actually carries a ifinfomsg header and validate minimal length requirements for IFLA_EXT_MASK. Fixes: 6cbdceeb ("bridge: Dump vlan information from a bridge port") Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-26bridge: Validate IFLA_BRIDGE_FLAGS attribute lengthThomas Graf
Payload is currently accessed blindly and may exceed valid message boundaries. Fixes: 407af3299 ("bridge: Add netlink interface to configure vlans on bridge ports") Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-04netdev, sched/wait: Fix sleeping inside wait eventPeter Zijlstra
rtnl_lock_unregistering*() take rtnl_lock() -- a mutex -- inside a wait loop. The wait loop relies on current->state to function, but so does mutex_lock(), nesting them makes for the inner to destroy the outer state. Fix this using the new wait_woken() bits. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Cong Wang <cwang@twopensource.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jerry Chu <hkchu@google.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Cc: sfeldma@cumulusnetworks.com <sfeldma@cumulusnetworks.com> Cc: stephen hemminger <stephen@networkplumber.org> Cc: Tom Gundersen <teg@jklm.no> Cc: Tom Herbert <therbert@google.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20141029173110.GE15602@worktop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-02rtnl/do_setlink(): notify when a netdev is modifiedNicolas Dichtel
Depending on which parameters were updated, the changes were not propagated via the notifier chain and netlink. The new flag has been set only when the change did not cause a call to the notifier chain and/or to the netlink notification functions. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02rtnl/do_setlink(): last arg is now a set of flagsNicolas Dichtel
There is no functional changes with this commit, it only prepares the next one. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02rtnl/do_setlink(): set modified when IFLA_LINKMODE is updatedNicolas Dichtel
The only effect of this patch is to print a warning if IFLA_LINKMODE is updated and a following change fails. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-02rtnl/do_setlink(): set modified when IFLA_TXQLEN is updatedNicolas Dichtel
The only effect of this patch is to print a warning if IFLA_TXQLEN is updated and a following change fails. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-08rtnetlink: fix VF info sizeJiri Benc
Commit 1d8faf48c74b8 ("net/core: Add VF link state control") added new attribute to IFLA_VF_INFO group in rtnl_fill_ifinfo but did not adjust size of the allocated memory in if_nlmsg_size/rtnl_vfinfo_size. As the result, we may trigger warnings in rtnl_getlink and similar functions when many VF links are enabled, as the information does not fit into the allocated skb. Fixes: 1d8faf48c74b8 ("net/core: Add VF link state control") Reported-by: Yulong Pei <ypei@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16rtnetlink: Drop unnecessary return value from ndo_dflt_fdb_delAlexander Duyck
This change cleans up ndo_dflt_fdb_del to drop the ENOTSUPP return value since that isn't actually returned anywhere in the code. As a result we are able to drop a few lines by just defaulting this to -EINVAL. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15net: rtnetlink - make create_link take name_assign_typeTom Gundersen
This passes down NET_NAME_USER (or NET_NAME_ENUM) to alloc_netdev(), for any device created over rtnetlink. v9: restore reverse-christmas-tree order of local variables Signed-off-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-15net: set name_assign_type in alloc_netdev()Tom Gundersen
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert all users to pass NET_NAME_UNKNOWN. Coccinelle patch: @@ expression sizeof_priv, name, setup, txqs, rxqs, count; @@ ( -alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs) +alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs) | -alloc_netdev_mq(sizeof_priv, name, setup, count) +alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count) | -alloc_netdev(sizeof_priv, name, setup) +alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup) ) v9: move comments here from the wrong commit Signed-off-by: Tom Gundersen <teg@jklm.no> Reviewed-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-10bridge: netlink dump interface at par with brctlJamal Hadi Salim
Actually better than brctl showmacs because we can filter by bridge port in the kernel. The current bridge netlink interface doesnt scale when you have many bridges each with large fdbs or even bridges with many bridge ports And now for the science non-fiction novel you have all been waiting for.. //lets see what bridge ports we have root@moja-1:/configs/may30-iprt/bridge# ./bridge link show 8: eth1 state DOWN : <BROADCAST,MULTICAST> mtu 1500 master br0 state disabled priority 32 cost 19 17: sw1-p1 state DOWN : <BROADCAST,NOARP> mtu 1500 master br0 state disabled priority 32 cost 100 // show all.. root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show 33:33:00:00:00:01 dev bond0 self permanent 33:33:00:00:00:01 dev dummy0 self permanent 33:33:00:00:00:01 dev ifb0 self permanent 33:33:00:00:00:01 dev ifb1 self permanent 33:33:00:00:00:01 dev eth0 self permanent 01:00:5e:00:00:01 dev eth0 self permanent 33:33:ff:22:01:01 dev eth0 self permanent 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:07 dev eth1 self permanent 33:33:00:00:00:01 dev eth1 self permanent 33:33:00:00:00:01 dev gretap0 self permanent da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent 33:33:00:00:00:01 dev sw1-p1 self permanent //filter by bridge root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:07 dev eth1 self permanent 33:33:00:00:00:01 dev eth1 self permanent da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent 33:33:00:00:00:01 dev sw1-p1 self permanent // bridge sw1 has no ports attached.. root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br sw1 //filter by port root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show brport eth1 02:00:00:12:01:02 vlan 0 master br0 permanent 00:17:42:8a:b4:05 vlan 0 master br0 permanent 00:17:42:8a:b4:07 self permanent 33:33:00:00:00:01 self permanent // filter by port + bridge root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0 brport sw1-p1 da:ac:46:27:d9:53 vlan 0 master br0 permanent 33:33:00:00:00:01 self permanent // for shits and giggles (as they say in New Brunswick), lets // change the mac that br0 uses // Note: a magical fdb entry with no brport is added ... root@moja-1:/configs/may30-iprt/bridge# ip link set dev br0 address 02:00:00:12:01:04 // lets see if we can see the unicorn .. root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show 33:33:00:00:00:01 dev bond0 self permanent 33:33:00:00:00:01 dev dummy0 self permanent 33:33:00:00:00:01 dev ifb0 self permanent 33:33:00:00:00:01 dev ifb1 self permanent 33:33:00:00:00:01 dev eth0 self permanent 01:00:5e:00:00:01 dev eth0 self permanent 33:33:ff:22:01:01 dev eth0 self permanent 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:07 dev eth1 self permanent 33:33:00:00:00:01 dev eth1 self permanent 33:33:00:00:00:01 dev gretap0 self permanent 02:00:00:12:01:04 dev br0 vlan 0 master br0 permanent <=== there it is da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent 33:33:00:00:00:01 dev sw1-p1 self permanent //can we see it if we filter by bridge? root@moja-1:/configs/may30-iprt/bridge# ./bridge fdb show br br0 02:00:00:12:01:02 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:05 dev eth1 vlan 0 master br0 permanent 00:17:42:8a:b4:07 dev eth1 self permanent 33:33:00:00:00:01 dev eth1 self permanent 02:00:00:12:01:04 dev br0 vlan 0 master br0 permanent <=== there it is da:ac:46:27:d9:53 dev sw1-p1 vlan 0 master br0 permanent 33:33:00:00:00:01 dev sw1-p1 self permanent Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-10bridge: fdb dumping takes a filter deviceJamal Hadi Salim
Dumping a bridge fdb dumps every fdb entry held. With this change we are going to filter on selected bridge port. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-01rtnetlink: allow to register ops without ops->setup setJiri Pirko
So far, it is assumed that ops->setup is filled up. But there might be case that ops might make sense even without ->setup. In that case, forbid to newlink and dellink. This allows to register simple rtnl link ops containing only ->kind. That allows consistent way of passing device kind (either device-kind or slave-kind) to userspace. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12rtnetlink: fix userspace API breakage for iproute2 < v3.9.0Michal Schmidt
When running RHEL6 userspace on a current upstream kernel, "ip link" fails to show VF information. The reason is a kernel<->userspace API change introduced by commit 88c5b5ce5cb57 ("rtnetlink: Call nlmsg_parse() with correct header length"), after which the kernel does not see iproute2's IFLA_EXT_MASK attribute in the netlink request. iproute2 adjusted for the API change in its commit 63338dca4513 ("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter"). The problem has been noticed before: http://marc.info/?l=linux-netdev&m=136692296022182&w=2 (Subject: Re: getting VF link info seems to be broken in 3.9-rc8) We can do better than tell those with old userspace to upgrade. We can recognize the old iproute2 in the kernel by checking the netlink message length. Even when including the IFLA_EXT_MASK attribute, its netlink message is shorter than struct ifinfomsg. With this patch "ip link" shows VF information in both old and new iproute2 versions. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: net/core/rtnetlink.c net/core/skbuff.c Both conflicts were very simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net/core: Add VF link state control policyDoug Ledford
Commit 1d8faf48c7 (net/core: Add VF link state control) added VF link state control to the netlink VF nested structure, but failed to add a proper entry for the new structure into the VF policy table. Add the missing entry so the table and the actual data copied into the netlink nested struct are in sync. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-08net: force a list_del() in unregister_netdevice_many()Eric Dumazet
unregister_netdevice_many() API is error prone and we had too many bugs because of dangling LIST_HEAD on stacks. See commit f87e6f47933e3e ("net: dont leave active on stack LIST_HEAD") In fact, instead of making sure no caller leaves an active list_head, just force a list_del() in the callee. No one seems to need to access the list after unregister_netdevice_many() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: include/net/inetpeer.h net/ipv6/output_core.c Changes in net were fixing bugs in code removed in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-03rtnetlink: fix a memory leak when ->newlink failsCong Wang
It is possible that ->newlink() fails before registering the device, in this case we should just free it, it's safe to call free_netdev(). Fixes: commit 0e0eee2465df77bcec2 (net: correct error path in rtnl_newlink()) Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/bonding/bond_alb.c drivers/net/ethernet/altera/altera_msgdma.c drivers/net/ethernet/altera/altera_sgdma.c net/ipv6/xfrm6_output.c Several cases of overlapping changes. The xfrm6_output.c has a bug fix which overlaps the renaming of skb->local_df to skb->ignore_df. In the Altera TSE driver cases, the register access cleanups in net-next overlapped with bug fixes done in net. Similarly a bug fix to send ALB packets in the bonding driver using the right source address overlaps with cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-23net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate ↵Sucheta Chakraborty
through ip tool. o min_tx_rate puts lower limit on the VF bandwidth. VF is guaranteed to have a bandwidth of at least this value. max_tx_rate puts cap on the VF bandwidth. VF can have a bandwidth of up to this value. o A new handler set_vf_rate for attr IFLA_VF_RATE has been introduced which takes 4 arguments: netdev, VF number, min_tx_rate, max_tx_rate o ndo_set_vf_rate replaces ndo_set_vf_tx_rate handler. o Drivers that currently implement ndo_set_vf_tx_rate should now call ndo_set_vf_rate instead and reject attempt to set a minimum bandwidth greater than 0 for IFLA_VF_TX_RATE when IFLA_VF_RATE is not yet implemented by driver. o If user enters only one of either min_tx_rate or max_tx_rate, then, userland should read back the other value from driver and set both for IFLA_VF_RATE. Drivers that have not yet implemented IFLA_VF_RATE should always return min_tx_rate as 0 when read from ip tool. o If both IFLA_VF_TX_RATE and IFLA_VF_RATE options are specified, then IFLA_VF_RATE should override. o Idea is to have consistent display of rate values to user. o Usage example: - ./ip link set p4p1 vf 0 rate 900 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 900 (Mbps), max_tx_rate 900Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a ./ip link set p4p1 vf 0 max_tx_rate 300 min_tx_rate 200 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f0 brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5a, tx rate 300 (Mbps), max_tx_rate 300Mbps, min_tx_rate 200Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a ./ip link set p4p1 vf 0 max_tx_rate 600 rate 300 ./ip link show p4p1 32: p4p1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT qlen 1000 link/ether 00:0e:1e:08:b0:f brd ff:ff:ff:ff:ff:ff vf 0 MAC 3e:a0:ca:bd:ae:5, tx rate 600 (Mbps), max_tx_rate 600Mbps, min_tx_rate 200Mbps vf 1 MAC f6:c6:7c:3f:3d:6c vf 2 MAC 56:32:43:98:d7:71 vf 3 MAC d6:be:c3:b5:85:ff vf 4 MAC ee:a9:9a:1e:19:14 vf 5 MAC 4a:d0:4c:07:52:18 vf 6 MAC 3a:76:44:93:62:f9 vf 7 MAC 82:e9:e7:e3:15:1a Signed-off-by: Sucheta Chakraborty <sucheta.chakraborty@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-15rtnetlink: wait for unregistering devices in rtnl_link_unregister()Cong Wang
From: Cong Wang <cwang@twopensource.com> commit 50624c934db18ab90 (net: Delay default_device_exit_batch until no devices are unregistering) introduced rtnl_lock_unregistering() for default_device_exit_batch(). Same race could happen we when rmmod a driver which calls rtnl_link_unregister() as we call dev->destructor without rtnl lock. For long term, I think we should clean up the mess of netdev_run_todo() and net namespce exit code. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24rtnetlink: Only supply IFLA_VF_PORTS information when RTEXT_FILTER_VF is setDavid Gibson
Since 115c9b81928360d769a76c632bae62d15206a94a (rtnetlink: Fix problem with buffer allocation), RTM_NEWLINK messages only contain the IFLA_VFINFO_LIST attribute if they were solicited by a GETLINK message containing an IFLA_EXT_MASK attribute with the RTEXT_FILTER_VF flag. That was done because some user programs broke when they received more data than expected - because IFLA_VFINFO_LIST contains information for each VF it can become large if there are many VFs. However, the IFLA_VF_PORTS attribute, supplied for devices which implement ndo_get_vf_port (currently the 'enic' driver only), has the same problem. It supplies per-VF information and can therefore become large, but it is not currently conditional on the IFLA_EXT_MASK value. Worse, it interacts badly with the existing EXT_MASK handling. When IFLA_EXT_MASK is not supplied, the buffer for netlink replies is fixed at NLMSG_GOODSIZE. If the information for IFLA_VF_PORTS exceeds this, then rtnl_fill_ifinfo() returns -EMSGSIZE on the first message in a packet. netlink_dump() will misinterpret this as having finished the listing and omit data for this interface and all subsequent ones. That can cause getifaddrs(3) to enter an infinite loop. This patch addresses the problem by only supplying IFLA_VF_PORTS when IFLA_EXT_MASK is supplied with the RTEXT_FILTER_VF flag set. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24rtnetlink: Warn when interface's information won't fit in our packetDavid Gibson
Without IFLA_EXT_MASK specified, the information reported for a single interface in response to RTM_GETLINK is expected to fit within a netlink packet of NLMSG_GOODSIZE. If it doesn't, however, things will go badly wrong, When listing all interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first message in a packet as the end of the listing and omit information for that interface and all subsequent ones. This can cause getifaddrs(3) to enter an infinite loop. This patch won't fix the problem, but it will WARN_ON() making it easier to track down what's going wrong. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-24net: Use netlink_ns_capable to verify the permisions of netlink messagesEric W. Biederman
It is possible by passing a netlink socket to a more privileged executable and then to fool that executable into writing to the socket data that happens to be valid netlink message to do something that privileged executable did not intend to do. To keep this from happening replace bare capable and ns_capable calls with netlink_capable, netlink_net_calls and netlink_ns_capable calls. Which act the same as the previous calls except they verify that the opener of the socket had the desired permissions as well. Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-31net-sysfs: expose number of carrier on/off changesdavid decotigny
This allows to monitor carrier on/off transitions and detect link flapping issues: - new /sys/class/net/X/carrier_changes - new rtnetlink IFLA_CARRIER_CHANGES (getlink) Tested: - grep . /sys/class/net/*/carrier_changes + ip link set dev X down/up + plug/unplug cable - updated iproute2: prints IFLA_CARRIER_CHANGES - iproute2 20121211-2 (debian): unchanged behavior Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: Documentation/devicetree/bindings/net/micrel-ks8851.txt net/core/netpoll.c The net/core/netpoll.c conflict is a bug fix in 'net' happening to code which is completely removed in 'net-next'. In micrel-ks8851.txt we simply have overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20rtnetlink: fix fdb notification flagsNicolas Dichtel
Commit 3ff661c38c84 ("net: rtnetlink notify events for FDB NTF_SELF adds and deletes") reuses the function nlmsg_populate_fdb_fill() to notify fdb events. But this function was used only for dump and thus was always setting the flag NLM_F_MULTI, which is wrong in case of a single notification. Libraries like libnl will wait forever for NLMSG_DONE. CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/bonding/bond_3ad.h drivers/net/bonding/bond_main.c Two minor conflicts in bonding, both of which were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18rtnl: make ifla_policy staticJiri Pirko
The only place this is used outside rtnetlink.c is veth. So provide wrapper function for this usage. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13net: correct error path in rtnl_newlink()Cong Wang
I saw the following BUG when ->newlink() fails in rtnl_newlink(): [ 40.240058] kernel BUG at net/core/dev.c:6438! this is due to free_netdev() is not supposed to be called before netdev is completely unregistered, therefore it is not correct to call free_netdev() here, at least for ops->newlink!=NULL case, many drivers call it in ->destructor so that rtnl_unlock() will take care of it, we probably don't need to do anything here. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-04rtnetlink: fix oops in rtnl_link_get_slave_info_data_sizeFernando Luis Vazquez Cao
We should check whether rtnetlink link operations are defined before calling get_slave_size(). Without this, the following oops can occur when adding a tap device to OVS. [ 87.839553] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 87.839595] IP: [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220 [...] [ 87.840651] Call Trace: [ 87.840664] [<ffffffff813d694b>] ? rtmsg_ifinfo+0x2b/0x100 [ 87.840688] [<ffffffff813c8340>] ? __netdev_adjacent_dev_insert+0x150/0x1a0 [ 87.840718] [<ffffffff813d6a50>] ? rtnetlink_event+0x30/0x40 [ 87.840742] [<ffffffff814b4144>] ? notifier_call_chain+0x44/0x70 [ 87.840768] [<ffffffff813c8946>] ? __netdev_upper_dev_link+0x3c6/0x3f0 [ 87.840798] [<ffffffffa0678d6c>] ? netdev_create+0xcc/0x160 [openvswitch] [ 87.840828] [<ffffffffa06781ea>] ? ovs_vport_add+0x4a/0xd0 [openvswitch] [ 87.840857] [<ffffffffa0670139>] ? new_vport+0x9/0x50 [openvswitch] [ 87.840884] [<ffffffffa067279e>] ? ovs_vport_cmd_new+0x11e/0x210 [openvswitch] [ 87.840915] [<ffffffff813f3efa>] ? genl_family_rcv_msg+0x19a/0x360 [ 87.840941] [<ffffffff813f40c0>] ? genl_family_rcv_msg+0x360/0x360 [ 87.840967] [<ffffffff813f4139>] ? genl_rcv_msg+0x79/0xc0 [ 87.840991] [<ffffffff813b6cf9>] ? __kmalloc_reserve.isra.25+0x29/0x80 [ 87.841018] [<ffffffff813f2389>] ? netlink_rcv_skb+0xa9/0xc0 [ 87.841042] [<ffffffff813f27cf>] ? genl_rcv+0x1f/0x30 [ 87.841064] [<ffffffff813f1988>] ? netlink_unicast+0xe8/0x1e0 [ 87.841088] [<ffffffff813f1d9a>] ? netlink_sendmsg+0x31a/0x750 [ 87.841113] [<ffffffff813aee96>] ? sock_sendmsg+0x86/0xc0 [ 87.841136] [<ffffffff813c960d>] ? __netdev_update_features+0x4d/0x200 [ 87.841163] [<ffffffff813ca94e>] ? ethtool_get_value+0x2e/0x50 [ 87.841188] [<ffffffff813af269>] ? ___sys_sendmsg+0x359/0x370 [ 87.841212] [<ffffffff813da686>] ? dev_ioctl+0x1a6/0x5c0 [ 87.841236] [<ffffffff8109c210>] ? autoremove_wake_function+0x30/0x30 [ 87.841264] [<ffffffff813ac59d>] ? sock_do_ioctl+0x3d/0x50 [ 87.841288] [<ffffffff813aca68>] ? sock_ioctl+0x1e8/0x2c0 [ 87.841312] [<ffffffff811934bf>] ? do_vfs_ioctl+0x2cf/0x4b0 [ 87.841335] [<ffffffff813afeb9>] ? __sys_sendmsg+0x39/0x70 [ 87.841362] [<ffffffff814b86f9>] ? system_call_fastpath+0x16/0x1b [ 87.841386] Code: c0 74 10 48 89 ef ff d0 83 c0 07 83 e0 fc 48 98 49 01 c7 48 89 ef e8 d0 d6 fe ff 48 85 c0 0f 84 df 00 00 00 48 8b 90 08 07 00 00 <48> 8b 8a a8 00 00 00 31 d2 48 85 c9 74 0c 48 89 ee 48 89 c7 ff [ 87.841529] RIP [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220 [ 87.841555] RSP <ffff880221aa5950> [ 87.841569] CR2: 00000000000000a8 [ 87.851442] ---[ end trace e42ab217691b4fc2 ]--- Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-23rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_infoJiri Pirko
This check is not needed because the same check is done before fill_slave_info is used in rtnl_link_slave_info_fill. Also, by removing this check, kernel will fillup IFLA_INFO_SLAVE_KIND even for slaves of masters which does not implement fill_slave_info. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert netlink to use slave data info apiJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>