summaryrefslogtreecommitdiffstats
path: root/net/core/rtnetlink.c
AgeCommit message (Collapse)Author
2017-07-20rtnetlink: allocate more memory for dev_set_mac_address()WANG Cong
virtnet_set_mac_address() interprets mac address as struct sockaddr, but upper layer only allocates dev->addr_len which is ETH_ALEN + sizeof(sa_family_t) in this case. We lack a unified definition for mac address, so just fix the upper layer, this also allows drivers to interpret it to struct sockaddr freely. Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-19Revert "rtnetlink: Do not generate notifications for CHANGEADDR event"David Ahern
This reverts commit cd8966e75ed3c6b41a37047a904617bc44fa481f. The duplicate CHANGEADDR event message is sent regardless of link status whereas the setlink changes only generate a notification when the link is up. Not sending a notification when the link is down breaks dhcpcd which only processes hwaddr changes when the link is down. Fixes reported regression: https://bugzilla.kernel.org/show_bug.cgi?id=196355 Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-01net: convert sk_buff.users from atomic_t to refcount_tReshetova, Elena
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David Windsor <dwindsor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.slave_validateMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.slave_changelinkMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.validateMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.changelinkMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.newlinkMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23xdp: add reporting of offload modeJakub Kicinski
Extend the XDP_ATTACHED_* values to include offloaded mode. Let drivers report whether program is installed in the driver or the HW by changing the prog_attached field from bool to u8 (type of the netlink attribute). Exploit the fact that the value of XDP_ATTACHED_DRV is 1, therefore since all drivers currently assign the mode with double negation: mode = !!xdp_prog; no drivers have to be modified. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23xdp: add HW offload mode flag for installing programsJakub Kicinski
Add an installation-time flag for requesting that the program be installed only if it can be offloaded to HW. Internally new command for ndo_xdp is added, this way we avoid putting checks into drivers since they all return -EINVAL on an unknown command. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two entries being added at the same time to the IFLA policy table, whilst parallel bug fixes to decnet routing dst handling overlapping with the dst gc removal in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-21rtnetlink: add restricted rtnl groups for ipv4 and ipv6 mrouteJulien Gomes
Add RTNLGRP_{IPV4,IPV6}_MROUTE_R as two new restricted groups for the NETLINK_ROUTE family. Binding to these groups specifically requires CAP_NET_ADMIN to allow multicast of sensitive messages (e.g. mroute cache reports). Suggested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Julien Gomes <julien@arista.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-20rtnetlink: add IFLA_GROUP to ifla_policySerhey Popovych
Network interface groups support added while ago, however there is no IFLA_GROUP attribute description in policy and netlink message size calculations until now. Add IFLA_GROUP attribute to the policy. Fixes: cbda10fa97d7 ("net_device: add support for network device groups") Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-16net: Add IFLA_XDP_PROG_IDMartin KaFai Lau
Expose prog_id through IFLA_XDP_PROG_ID. This patch makes modification to generic_xdp. The later patches will modify other xdp-supported drivers. prog_id is added to struct net_dev_xdp. iproute2 patch will be followed. Here is how the 'ip link' will look like: > ip link show eth0 3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp(prog_id:1) qdisc fq_codel state UP mode DEFAULT group default qlen 1000 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Alexei Starovoitov <ast@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08net: Zero ifla_vf_info in rtnl_fill_vfinfo()Mintz, Yuval
Some of the structure's fields are not initialized by the rtnetlink. If driver doesn't set those in ndo_get_vf_config(), they'd leak memory to user. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> CC: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-31rtnetlink: use the new rtnl_get_event() interfaceVlad Yasevich
Small clean-up to rtmsg_ifinfo() to use the rtnl_get_event() interface instead of using 'internal' values directly. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-27rtnl: Add support for netdev event to link messagesVlad Yasevich
When netdev events happen, a rtnetlink_event() handler will send messages for every event in it's white list. These messages contain current information about a particular device, but they do not include the iformation about which event just happened. So, it is impossible to tell what just happend for these events. This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT that would have an encoding of event that triggered this message. This would allow the the message consumer to easily determine if it needs to perform certain actions. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Overlapping changes in drivers/net/phy/marvell.c, bug fix in 'net' restricting a HW workaround alongside cleanups in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-24net: rtnetlink: bail out from rtnl_fdb_dump() on parse errorAlexander Potapenko
rtnl_fdb_dump() failed to check the result of nlmsg_parse(), which led to contents of |ifm| being uninitialized because nlh->nlmsglen was too small to accommodate |ifm|. The uninitialized data may affect some branches and result in unwanted effects, although kernel data doesn't seem to leak to the userspace directly. The bug has been detected with KMSAN and syzkaller. For the record, here is the KMSAN report: ================================================================== BUG: KMSAN: use of unitialized memory in rtnl_fdb_dump+0x5dc/0x1000 CPU: 0 PID: 1039 Comm: probe Not tainted 4.11.0-rc5+ #2727 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007 __kmsan_warning_32+0x66/0xb0 mm/kmsan/kmsan_instr.c:491 rtnl_fdb_dump+0x5dc/0x1000 net/core/rtnetlink.c:3230 netlink_dump+0x84f/0x1190 net/netlink/af_netlink.c:2168 __netlink_dump_start+0xc97/0xe50 net/netlink/af_netlink.c:2258 netlink_dump_start ./include/linux/netlink.h:165 rtnetlink_rcv_msg+0xae9/0xb40 net/core/rtnetlink.c:4094 netlink_rcv_skb+0x339/0x5a0 net/netlink/af_netlink.c:2339 rtnetlink_rcv+0x83/0xa0 net/core/rtnetlink.c:4110 netlink_unicast_kernel net/netlink/af_netlink.c:1272 netlink_unicast+0x13b7/0x1480 net/netlink/af_netlink.c:1298 netlink_sendmsg+0x10b8/0x10f0 net/netlink/af_netlink.c:1844 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 ___sys_sendmsg+0xd4b/0x10f0 net/socket.c:1997 __sys_sendmsg net/socket.c:2031 SYSC_sendmsg+0x2c6/0x3f0 net/socket.c:2042 SyS_sendmsg+0x87/0xb0 net/socket.c:2038 do_syscall_64+0x102/0x150 arch/x86/entry/common.c:285 entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246 RIP: 0033:0x401300 RSP: 002b:00007ffc3b0e6d58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000401300 RDX: 0000000000000000 RSI: 00007ffc3b0e6d80 RDI: 0000000000000003 RBP: 00007ffc3b0e6e00 R08: 000000000000000b R09: 0000000000000004 R10: 000000000000000d R11: 0000000000000246 R12: 0000000000000000 R13: 00000000004065a0 R14: 0000000000406630 R15: 0000000000000000 origin: 000000008fe00056 save_stack_trace+0x59/0x60 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:352 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:247 kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:260 slab_alloc_node mm/slub.c:2743 __kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4349 __kmalloc_reserve net/core/skbuff.c:138 __alloc_skb+0x2cd/0x740 net/core/skbuff.c:231 alloc_skb ./include/linux/skbuff.h:933 netlink_alloc_large_skb net/netlink/af_netlink.c:1144 netlink_sendmsg+0x934/0x10f0 net/netlink/af_netlink.c:1819 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 ___sys_sendmsg+0xd4b/0x10f0 net/socket.c:1997 __sys_sendmsg net/socket.c:2031 SYSC_sendmsg+0x2c6/0x3f0 net/socket.c:2042 SyS_sendmsg+0x87/0xb0 net/socket.c:2038 do_syscall_64+0x102/0x150 arch/x86/entry/common.c:285 return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246 ================================================================== and the reproducer: ================================================================== #include <sys/socket.h> #include <net/if_arp.h> #include <linux/netlink.h> #include <stdint.h> int main() { int sock = socket(PF_NETLINK, SOCK_DGRAM | SOCK_NONBLOCK, 0); struct msghdr msg; memset(&msg, 0, sizeof(msg)); char nlmsg_buf[32]; memset(nlmsg_buf, 0, sizeof(nlmsg_buf)); struct nlmsghdr *nlmsg = nlmsg_buf; nlmsg->nlmsg_len = 0x11; nlmsg->nlmsg_type = 0x1e; // RTM_NEWROUTE = RTM_BASE + 0x0e // type = 0x0e = 1110b // kind = 2 nlmsg->nlmsg_flags = 0x101; // NLM_F_ROOT | NLM_F_REQUEST nlmsg->nlmsg_seq = 0; nlmsg->nlmsg_pid = 0; nlmsg_buf[16] = (char)7; struct iovec iov; iov.iov_base = nlmsg_buf; iov.iov_len = 17; msg.msg_iov = &iov; msg.msg_iovlen = 1; sendmsg(sock, &msg, 0); return 0; } ================================================================== Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2017-05-18net: make struct net_device::tx_queue_len unsigned intAlexey Dobriyan
4 billion packet queue is something unthinkable so use 32-bit value for now. Space savings on x86_64: add/remove: 0/0 grow/shrink: 3/70 up/down: 16/-131 (-115) function old new delta change_tx_queue_len 94 108 +14 qdisc_create 1176 1177 +1 alloc_netdev_mqs 1124 1125 +1 xenvif_alloc 533 532 -1 x25_asy_setup 167 166 -1 ... tun_queue_resize 945 940 -5 pfifo_fast_enqueue 167 162 -5 qfq_init_qdisc 168 158 -10 tap_queue_resize 810 799 -11 transmit 719 698 -21 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-16net: Improve handling of failures on link and route dumpsDavid Ahern
In general, rtnetlink dumps do not anticipate failure to dump a single object (e.g., link or route) on a single pass. As both route and link objects have grown via more attributes, that is no longer a given. netlink dumps can handle a failure if the dump function returns an error; specifically, netlink_dump adds the return code to the response if it is <= 0 so userspace is notified of the failure. The missing piece is the rtnetlink dump functions returning the error. Fix route and link dump functions to return the errors if no object is added to an skb (detected by skb->len != 0). IPv6 route dumps (rt6_dump_route) already return the error; this patch updates IPv4 and link dumps. Other dump functions may need to be ajusted as well. Reported-by: Jan Moskyto Matejka <mq@ucw.cz> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-11xdp: refine xdp api with regards to generic xdpDaniel Borkmann
While working on the iproute2 generic XDP frontend, I noticed that as of right now it's possible to have native *and* generic XDP programs loaded both at the same time for the case when a driver supports native XDP. The intended model for generic XDP from b5cdae3291f7 ("net: Generic XDP") is, however, that only one out of the two can be present at once which is also indicated as such in the XDP netlink dump part. The main rationale for generic XDP is to ease accessibility (in case a driver does not yet have XDP support) and to generically provide a semantical model as an example for driver developers wanting to add XDP support. The generic XDP option for an XDP aware driver can still be useful for comparing and testing both implementations. However, it is not intended to have a second XDP processing stage or layer with exactly the same functionality of the first native stage. Only reason could be to have a partial fallback for future XDP features that are not supported yet in the native implementation and we probably also shouldn't strive for such fallback and instead encourage native feature support in the first place. Given there's currently no such fallback issue or use case, lets not go there yet if we don't need to. Therefore, change semantics for loading XDP and bail out if the user tries to load a generic XDP program when a native one is present and vice versa. Another alternative to bailing out would be to handle the transition from one flavor to another gracefully, but that would require to bring the device down, exchange both types of programs, and bring it up again in order to avoid a tiny window where a packet could hit both hooks. Given this complicates the logic for just a debugging feature in the native case, I went with the simpler variant. For the dump, remove IFLA_XDP_FLAGS that was added with b5cdae3291f7 and reuse IFLA_XDP_ATTACHED for indicating the mode. Dumping all or just a subset of flags that were used for loading the XDP prog is suboptimal in the long run since not all flags are useful for dumping and if we start to reuse the same flag definitions for load and dump, then we'll waste bit space. What we really just want is to dump the mode for now. Current IFLA_XDP_ATTACHED semantics are: nothing was installed (0), a program is running at the native driver layer (1). Thus, add a mode that says that a program is running at generic XDP layer (2). Applications will handle this fine in that older binaries will just indicate that something is attached at XDP layer, effectively this is similar to IFLA_XDP_FLAGS attr that we would have had modulo the redundancy. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-11xdp: add flag to enforce driver modeDaniel Borkmann
After commit b5cdae3291f7 ("net: Generic XDP") we automatically fall back to a generic XDP variant if the driver does not support native XDP. Allow for an option where the user can specify that always the native XDP variant should be selected and in case it's not supported by a driver, just bail out. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-04rtnetlink: NUL-terminate IFLA_PHYS_PORT_NAME stringMichal Schmidt
IFLA_PHYS_PORT_NAME is a string attribute, so terminate it with \0. Otherwise libnl3 fails to validate netlink messages with this attribute. "ip -detail a" assumes too that the attribute is NUL-terminated when printing it. It often was, due to padding. I noticed this as libvirtd failing to start on a system with sfc driver after upgrading it to Linux 4.11, i.e. when sfc added support for phys_port_name. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01xdp: propagate extended ack to XDP setupJakub Kicinski
Drivers usually have a number of restrictions for running XDP - most common being buffer sizes, LRO and number of rings. Even though some drivers try to be helpful and print error messages experience shows that users don't often consult kernel logs on netlink errors. Try to use the new extended ack mechanism to carry the message back to user space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-25net: Generic XDPDavid S. Miller
This provides a generic SKB based non-optimized XDP path which is used if either the driver lacks a specific XDP implementation, or the user requests it via a new IFLA_XDP_FLAGS value named XDP_FLAGS_SKB_MODE. It is arguable that perhaps I should have required something like this as part of the initial XDP feature merge. I believe this is critical for two reasons: 1) Accessibility. More people can play with XDP with less dependencies. Yes I know we have XDP support in virtio_net, but that just creates another depedency for learning how to use this facility. I wrote this to make life easier for the XDP newbies. 2) As a model for what the expected semantics are. If there is a pure generic core implementation, it serves as a semantic example for driver folks adding XDP support. One thing I have not tried to address here is the issue of XDP_PACKET_HEADROOM, thanks to Daniel for spotting that. It seems incredibly expensive to do a skb_cow(skb, XDP_PACKET_HEADROOM) or whatever even if the XDP program doesn't try to push headers at all. I think we really need the verifier to somehow propagate whether certain XDP helpers are used or not. v5: - Handle both negative and positive offset after running prog - Fix mac length in XDP_TX case (Alexei) - Use rcu_dereference_protected() in free_netdev (kbuild test robot) v4: - Fix MAC header adjustmnet before calling prog (David Ahern) - Disable LRO when generic XDP is installed (Michael Chan) - Bypass qdisc et al. on XDP_TX and record the event (Alexei) - Do not perform generic XDP on reinjected packets (DaveM) v3: - Make sure XDP program sees packet at MAC header, push back MAC header if we do XDP_TX. (Alexei) - Elide GRO when generic XDP is in use. (Alexei) - Add XDP_FLAG_SKB_MODE flag which the user can use to request generic XDP even if the driver has an XDP implementation. (Alexei) - Report whether SKB mode is in use in rtnl_xdp_fill() via XDP_FLAGS attribute. (Daniel) v2: - Add some "fall through" comments in switch statements based upon feedback from Andrew Lunn - Use RCU for generic xdp_prog, thanks to Johannes Berg. Tested-by: Andy Gospodarek <andy@greyhouse.net> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17net: rtnetlink: plumb extended ack to doit functionDavid Ahern
Add netlink_ext_ack arg to rtnl_doit_func. Pass extack arg to nlmsg_parse for doit functions that call it directly. This is the first step to using extended error reporting in rtnetlink. >From here individual subsystems can be updated to set netlink_ext_ack as needed. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13netlink: pass extended ACK struct to parsing functionsJohannes Berg
Pass the new extended ACK reporting struct to all of the generic netlink parsing functions. For now, pass NULL in almost all callers (except for some in the core.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13netlink: extended ACK reportingJohannes Berg
Add the base infrastructure and UAPI for netlink extended ACK reporting. All "manual" calls to netlink_ack() pass NULL for now and thus don't get extended ACK reporting. Big thanks goes to Pablo Neira Ayuso for not only bringing up the whole topic at netconf (again) but also coming up with the nlattr passing trick and various other ideas. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notifications for NETDEV_CHANGE_TX_QUEUE_LEN eventDavid Ahern
Changing tx queue length generates identical messages: [LINK]22: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default link/ether 02:04:f4:b7:5c:d2 brd ff:ff:ff:ff:ff:ff promiscuity 0 dummy numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 [LINK]22: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default link/ether 02:04:f4:b7:5c:d2 brd ff:ff:ff:ff:ff:ff promiscuity 0 dummy numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 Remove NETDEV_CHANGE_TX_QUEUE_LEN from the list of notifiers that generate notifications. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notifications for NETDEV_CHANGEUPPER eventDavid Ahern
NETDEV_CHANGEUPPER is an internal event; do not generate userspace notifications. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notifications for CHANGELOWERSTATE eventDavid Ahern
CHANGELOWERSTATE is an internal event; do not generate userspace notifications. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notifications for PRECHANGEUPPER eventDavid Ahern
PRECHANGEUPPER is an internal event; do not generate userspace notifications. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notifications for POST_TYPE_CHANGE eventDavid Ahern
Changing the master device for a link generates many messages; the one generated for POST_TYPE_CHANGE is redundant: [LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UNKNOWN group default link/ether 02:02:02:02:02:03 brd ff:ff:ff:ff:ff:ff [LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue master br1 state UNKNOWN group default link/ether 02:02:02:02:02:03 brd ff:ff:ff:ff:ff:ff Remove POST_TYPE_CHANGE from the list of notifiers that generate notifications. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notifications for CHANGEADDR eventDavid Ahern
Changing hardware address generates redundant messages: [LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff [LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff Do not send a notification for the CHANGEADDR notifier. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notification for UDP_TUNNEL_PUSH_INFODavid Ahern
NETDEV_UDP_TUNNEL_PUSH_INFO is an internal notifier; nothing userspace can do so don't generate a netlink notification. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13rtnetlink: Do not generate notifications for MTU eventsDavid Ahern
Changing MTU on a link currently causes 3 messages to be sent to userspace: [LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1490 qdisc noqueue state UNKNOWN group default link/ether f2:52:5c:6d:21:f3 brd ff:ff:ff:ff:ff:ff [LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default link/ether f2:52:5c:6d:21:f3 brd ff:ff:ff:ff:ff:ff [LINK]11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default link/ether f2:52:5c:6d:21:f3 brd ff:ff:ff:ff:ff:ff Remove the messages sent for PRE_CHANGE_MTU and CHANGE_MTU netdev events. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-09Revert "rtnl: Add support for netdev event to link messages"David S. Miller
This reverts commit def12888c161e6fec0702e5ec9c3962846e3a21d. As per discussion between Roopa Prabhu and David Ahern, it is advisable that we instead have the code collect the setlink triggered events into a bitmask emitted in the IFLA_EVENT netlink attribute. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05rtnl: Add support for netdev event to link messagesVlad Yasevich
When netdev events happen, a rtnetlink_event() handler will send messages for every event in it's white list. These messages contain current information about a particular device, but they do not include the iformation about which event just happened. The consumer of the message has to try to infer this information. In some cases (ex: NETDEV_NOTIFY_PEERS), that is not possible. This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT that would have an encoding of the which event triggered this message. This would allow the the message consumer to easily determine if it is interested in a particular event or not. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05rtnetlink: Convert rtnetlink_event to white listVlad Yasevich
The rtnetlink_event currently functions as a blacklist where we block cerntain netdev events from being sent to user space. As a result, events have been added to the system that userspace probably doesn't care about. This patch converts the implementation to the white list so that newly events would have to be specifically added to the list to be sent to userspace. This would force new event implementers to consider whether a given event is usefull to user space or if it's just a kernel event. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-22rtnetlink: Add dump all for netconfDavid Ahern
Use the rtnl_dump_all to dump all netconf handlers that have been registered. Allows userspace to send a dump request for PF_UNSPEC and get all families. Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-21rtnl: simplify error return path in rtnl_create_link()Tobias Klauser
There is only one possible error path which reaches the err label, so return ERR_PTR(-ENOMEM) directly if alloc_netdev_mqs() fails. This also allows to omit the err variable. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-17rtnl: don't account unused struct ifla_port_vsi in rtnl_port_sizeDaniel Borkmann
When allocating rtnl dump messages, struct ifla_port_vsi is never dumped, so we can save header plus payload in rtnl_port_size(). Infact, attribute IFLA_PORT_VSI_TYPE and struct ifla_port_vsi are not used anywhere in the kernel. We only need to keep the nla policy should applications in user space be filling this out. Same NLA_BINARY issue exists as was fixed in 364d5716a7ad ("rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARY") and others, but then again IFLA_PORT_VSI_TYPE is not used anywhere, so just add a comment that it's unused. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-01rtnetlink: Handle IFLA_MASTER parameter when processing rtnl_newlinkTheuns Verwoerd
Allow a master interface to be specified as one of the parameters when creating a new interface via rtnl_newlink. Previously this would require invoking interface creation, waiting for it to complete, and then separately binding that new interface to a master. In particular, this is used when creating a macvlan child interface for VRRP in a VRF configuration, allowing the interface creator to specify directly what master interface should be inherited by the child, without having to deal with asynchronous complications and potential race conditions. Signed-off-by: Theuns Verwoerd <theuns.verwoerd@alliedtelesis.co.nz> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20device: Implement a bus agnostic dev_num_vf routinePhil Sutter
Now that pci_bus_type has num_vf callback set, dev_num_vf can be implemented in a bus type independent way and the check for whether a PCI device is being handled in rtnetlink can be dropped. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-17net: AF-specific RTM_GETSTATS attributesRobert Shearman
Add the functionality for including address-family-specific per-link stats in RTM_GETSTATS messages. This is done through adding a new IFLA_STATS_AF_SPEC attribute under which address family attributes are nested and then the AF-specific attributes can be further nested. This follows the model of IFLA_AF_SPEC on RTM_*LINK messages and it has the advantage of presenting an easily extended hierarchy. The rtnl_af_ops structure is extended to provide AFs with the opportunity to fill and provide the size of their stats attributes. One alternative would have been to provide AFs with the ability to add attributes directly into the RTM_GETSTATS message without a nested hierarchy. I discounted this approach as it increases the rate at which the 32 attribute number space is used up and it makes implementation a little more tricky for stats dump resuming (at the moment the order in which attributes are added to the message has to match the numeric order of the attributes). Another alternative would have been to register per-AF RTM_GETSTATS handlers. I discounted this approach as I perceived a common use-case to be getting all the stats for an interface and this approach would necessitate multiple requests/dumps to retrieve them all. Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-29rtnl: stats - add missing netlink message size checksMathias Krause
We miss to check if the netlink message is actually big enough to contain a struct if_stats_msg. Add a check to prevent userland from sending us short messages that would make us access memory beyond the end of the message. Fixes: 10c9ead9f3c6 ("rtnetlink: add new RTM_GETSTATS message to dump...") Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-24Replace <asm/uaccess.h> with <linux/uaccess.h> globallyLinus Torvalds
This was entirely automated, using the script by Al: PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>