summaryrefslogtreecommitdiffstats
path: root/drivers/net/macvlan.c
AgeCommit message (Collapse)Author
2018-05-30macvlan: fix use-after-free in macvlan_common_newlink()Alexey Kodanev
[ Upstream commit 4e14bf4236490306004782813b8b4494b18f5e60 ] The following use-after-free was reported by KASan when running LTP macvtap01 test on 4.16-rc2: [10642.528443] BUG: KASAN: use-after-free in macvlan_common_newlink+0x12ef/0x14a0 [macvlan] [10642.626607] Read of size 8 at addr ffff880ba49f2100 by task ip/18450 ... [10642.963873] Call Trace: [10642.994352] dump_stack+0x5c/0x7c [10643.035325] print_address_description+0x75/0x290 [10643.092938] kasan_report+0x28d/0x390 [10643.137971] ? macvlan_common_newlink+0x12ef/0x14a0 [macvlan] [10643.207963] macvlan_common_newlink+0x12ef/0x14a0 [macvlan] [10643.275978] macvtap_newlink+0x171/0x260 [macvtap] [10643.334532] rtnl_newlink+0xd4f/0x1300 ... [10646.256176] Allocated by task 18450: [10646.299964] kasan_kmalloc+0xa6/0xd0 [10646.343746] kmem_cache_alloc_trace+0xf1/0x210 [10646.397826] macvlan_common_newlink+0x6de/0x14a0 [macvlan] [10646.464386] macvtap_newlink+0x171/0x260 [macvtap] [10646.522728] rtnl_newlink+0xd4f/0x1300 ... [10647.022028] Freed by task 18450: [10647.061549] __kasan_slab_free+0x138/0x180 [10647.111468] kfree+0x9e/0x1c0 [10647.147869] macvlan_port_destroy+0x3db/0x650 [macvlan] [10647.211411] rollback_registered_many+0x5b9/0xb10 [10647.268715] rollback_registered+0xd9/0x190 [10647.319675] register_netdevice+0x8eb/0xc70 [10647.370635] macvlan_common_newlink+0xe58/0x14a0 [macvlan] [10647.437195] macvtap_newlink+0x171/0x260 [macvtap] Commit d02fd6e7d293 ("macvlan: Fix one possible double free") handles the case when register_netdevice() invokes ndo_uninit() on error and as a result free the port. But 'macvlan_port_get_rtnl(dev))' check (returns dev->rx_handler_data), which was added by this commit in order to prevent double free, is not quite correct: * for macvlan it always returns NULL because 'lowerdev' is the one that was used to register rx handler (port) in macvlan_port_create() as well as to unregister it in macvlan_port_destroy(). * for macvtap it always returns a valid pointer because macvtap registers its own rx handler before macvlan_common_newlink(). Fixes: d02fd6e7d293 ("macvlan: Fix one possible double free") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-31macvlan: filter out unsupported feature flagsShannon Nelson
[ Upstream commit 13fbcc8dc573482dd3f27568257fd7087f8935f4 ] Adding a macvlan device on top of a lowerdev that supports the xfrm offloads fails with a new regression: # ip link add link ens1f0 mv0 type macvlan RTNETLINK answers: Operation not permitted Tracing down the failure shows that the macvlan device inherits the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags from the lowerdev, but with no dev->xfrmdev_ops API filled in, it doesn't actually support xfrm. When the request is made to add the new macvlan device, the XFRM listener for NETDEV_REGISTER calls xfrm_api_check() which fails the new registration because dev->xfrmdev_ops is NULL. The macvlan creation succeeds when we filter out the ESP feature flags in macvlan_fix_features(), so let's filter them out like we're already filtering out ~NETIF_F_NETNS_LOCAL. When XFRM support is added in the future, we can add the flags into MACVLAN_FEATURES. This same problem could crop up in the future with any other new feature flags, so let's filter out any flags that aren't defined as supported in macvlan. Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-03macvlan: Fix one possible double freeGao Feng
[ Upstream commit d02fd6e7d2933ede6478a15f9e4ce8a93845824e ] Because the macvlan_uninit would free the macvlan port, so there is one double free case in macvlan_common_newlink. When the macvlan port is just created, then register_netdevice or netdev_upper_dev_link failed and they would invoke macvlan_uninit. Then it would reach the macvlan_port_destroy which triggers the double free. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20macvlan: Only deliver one copy of the frame to the macvlan interfaceAlexander Duyck
[ Upstream commit dd6b9c2c332b40f142740d1b11fb77c653ff98ea ] This patch intoduces a slight adjustment for macvlan to address the fact that in source mode I was seeing two copies of any packet addressed to the macvlan interface being delivered where there should have been only one. The issue appears to be that one copy was delivered based on the source MAC address and then the second copy was being delivered based on the destination MAC address. To fix it I am just treating a unicast address match as though it is not a match since source based macvlan isn't supposed to be matching based on the destination MAC anyway. Fixes: 79cf79abce71 ("macvlan: add source mode") Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-18macvlan: add offload features for encapsulationDimitris Michailidis
Currently macvlan devices do not set their hw_enc_features making encapsulated Tx packets resort to SW fallbacks. Add encapsulation GSO offloads to ->features as is done for the other GSOs and set ->hw_enc_features. Signed-off-by: Dimitris Michailidis <dmichail@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17macvlan/macvtap: Remove NETIF_F_UFO advertisement.David S. Miller
It is going away. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.validateMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.changelinkMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-26net: add netlink_ext_ack argument to rtnl_link_ops.newlinkMatthias Schiffer
Add support for extended error reporting. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22macvlan: Let passthru macvlan correctly restore lower mac addressVlad Yasevich
Passthru macvlans directly change the mac address of the lower level device. That's OK, but after the macvlan is deleted, the lower device is left with changed address and one needs to reboot to bring back the origina HW addresses. This scenario is actually quite common with passthru macvtap devices. This patch attempts to solve this, by storing the mac address of the lower device in macvlan_port structure and keeping track of it through the changes. After this patch, any changes to the lower device mac address done trough the macvlan device, will be reverted back. Any changs done directly to the lower device mac address will be kept. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22macvlan: convert port passthru to flags.Vlad Yasevich
Convert the port passthru boolean into flags with accesor functions. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22macvlan: Fix passthru macvlan mac address inheritanceVlad Yasevich
When a lower device of the passthru macvlan changes it's address, passthru macvlan is supposed to change it's own address as well. However, that doesn't happen correctly because the check in macvlan_addr_busy() will catch the fact that the lower level (port) mac address is the same as the address we are trying to assign to the macvlan, and return an error. As a reasult, the address of the passthru macvlan device is never changed. The same thing happens when the user attempts to change the mac address of the passthru macvlan. The simple solution appers to be to not check against the lower device in case of passthru macvlan device, since the 2 addresses are _supposed_ to be the same. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-22macvlan: Do not return error when setting the same mac addressVlad Yasevich
The user currently gets an EBUSY error when attempting to set the mac address on a macvlan device to the same value. This should really be a no-op as nothing changes. Catch the condition and return early. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The conflicts were two cases of overlapping changes in batman-adv and the qed driver. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-14macvlan: propagate the mac address change status for lowerdevZhang Shengju
The macvlan dev should propagate the return value of mac address change for lower device in the passthru mode, instead of always return 0. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-07net: Fix inconsistent teardown and release of private netdev state.David S. Miller
Network devices can allocate reasources and private memory using netdev_ops->ndo_init(). However, the release of these resources can occur in one of two different places. Either netdev_ops->ndo_uninit() or netdev->destructor(). The decision of which operation frees the resources depends upon whether it is necessary for all netdev refs to be released before it is safe to perform the freeing. netdev_ops->ndo_uninit() presumably can occur right after the NETDEV_UNREGISTER notifier completes and the unicast and multicast address lists are flushed. netdev->destructor(), on the other hand, does not run until the netdev references all go away. Further complicating the situation is that netdev->destructor() almost universally does also a free_netdev(). This creates a problem for the logic in register_netdevice(). Because all callers of register_netdevice() manage the freeing of the netdev, and invoke free_netdev(dev) if register_netdevice() fails. If netdev_ops->ndo_init() succeeds, but something else fails inside of register_netdevice(), it does call ndo_ops->ndo_uninit(). But it is not able to invoke netdev->destructor(). This is because netdev->destructor() will do a free_netdev() and then the caller of register_netdevice() will do the same. However, this means that the resources that would normally be released by netdev->destructor() will not be. Over the years drivers have added local hacks to deal with this, by invoking their destructor parts by hand when register_netdevice() fails. Many drivers do not try to deal with this, and instead we have leaks. Let's close this hole by formalizing the distinction between what private things need to be freed up by netdev->destructor() and whether the driver needs unregister_netdevice() to perform the free_netdev(). netdev->priv_destructor() performs all actions to free up the private resources that used to be freed by netdev->destructor(), except for free_netdev(). netdev->needs_free_netdev is a boolean that indicates whether free_netdev() should be done at the end of unregister_netdevice(). Now, register_netdevice() can sanely release all resources after ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit() and netdev->priv_destructor(). And at the end of unregister_netdevice(), we invoke netdev->priv_destructor() and optionally call free_netdev(). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-15macvlan: Fix performance issues with vlan tagged packetsVlad Yasevich
Macvlan always turns on offload features that have sofware fallback (NETIF_GSO_SOFTWARE). This allows much higher guest-guest communications over macvtap. However, macvtap does not turn on these features for vlan tagged traffic. As a result, depending on the HW that mactap is configured on, the performance of guest-guest communication over a vlan is very inconsistent. If the HW supports TSO/UFO over vlans, then the performance will be fine. If not, the the performance will suffer greatly since the VM may continue using TSO/UFO, and will force the host segment the traffic and possibly overlow the macvtap queue. This patch adds the always on offloads to vlan_features. This makes sure that any vlan tagged traffic between 2 guest will not be segmented needlessly. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-25macvlan: Fix device ref leak when purging bc_queueHerbert Xu
When a parent macvlan device is destroyed we end up purging its broadcast queue without dropping the device reference count on the packet source device. This causes the source device to linger. This patch drops that reference count. Fixes: 260916dfb48c ("macvlan: Fix potential use-after free for...") Reported-by: Joe Ghalam <Joe.Ghalam@dell.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-11tap: Abstract type of virtual interface from tap implementationSainath Grandhi
macvlan object is re-structured to hold tap related elements in a separate entity, tap_dev. Upon NETDEV_REGISTER device_event, tap_dev is registered with idr and fetched again on tap_open. Few of the tap functions are modified to accepted tap_dev as argument. tap_dev object includes callbacks to be used by underlying virtual interface to take care of tx and rx accounting. Signed-off-by: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20macvlan: use netdev_is_rx_handler_busy instead of checking specific typeMahesh Bandewar
netdev_is_rx_handler_busy() check is a superset of netif_is_ipvlan_port() check and hence should be preferred. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08net: make ndo_get_stats64 a void functionstephen hemminger
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-07driver: macvlan: Remove the rcu member of macvlan_portGao Feng
When free macvlan_port in macvlan_port_destroy, it is safe to free directly because netdev_rx_handler_unregister could enforce one grace period. So it is unnecessary to use kfree_rcu for macvlan_port. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
udplite conflict is resolved by taking what 'net-next' did which removed the backlog receive method assignment, since it is no longer necessary. Two entries were added to the non-priv ethtool operations switch statement, one in 'net' and one in 'net-next, so simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-23driver: macvlan: Check if need rollback multicast setting in macvlan_openGao Feng
When dev_set_promiscuity failed in macvlan_open, it always invokes dev_set_allmulti without checking if necessary. Now check the IFF_ALLMULTI flag firstly before rollback the multicast setting in the error handler. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-21driver: macvlan: Remove duplicated IFF_UP condition check in ↵Gao Feng
macvlan_forward_source The function macvlan_forward_source_one has already checked the flag IFF_UP, so needn't check it outside in macvlan_forward_source too. Signed-off-by: Gao Feng <gfree.wind@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Several cases of bug fixes in 'net' overlapping other changes in 'net-next-. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-14driver: macvlan: Replace integer number with bool valueGao Feng
The return value of function macvlan_addr_busy is used as bool value, so use bool value instead of integer number "1" and "0". Signed-off-by: Gao Feng <gfree.wind@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-09driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.Gao Feng
When there is no existing macvlan port in lowdev, one new macvlan port would be created. But it doesn't be destoried when something failed later. It casues some memleak. Now add one flag to indicate if new macvlan port is created. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20net: use core MTU range checking in core net infraJarod Wilson
geneve: - Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu - This one isn't quite as straight-forward as others, could use some closer inspection and testing macvlan: - set min/max_mtu tun: - set min/max_mtu, remove tun_net_change_mtu vxlan: - Merge __vxlan_change_mtu back into vxlan_change_mtu - Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function - This one is also not as straight-forward and could use closer inspection and testing from vxlan folks bridge: - set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in change_mtu function openvswitch: - set min/max_mtu, remove internal_dev_change_mtu - note: max_mtu wasn't checked previously, it's been set to 65535, which is the largest possible size supported sch_teql: - set min/max_mtu (note: max_mtu previously unchecked, used max of 65535) macsec: - min_mtu = 0, max_mtu = 65535 macvlan: - min_mtu = 0, max_mtu = 65535 ntb_netdev: - min_mtu = 0, max_mtu = 65535 veth: - min_mtu = 68, max_mtu = 65535 8021q: - min_mtu = 0, max_mtu = 65535 CC: netdev@vger.kernel.org CC: Nicolas Dichtel <nicolas.dichtel@6wind.com> CC: Hannes Frederic Sowa <hannes@stressinduktion.org> CC: Tom Herbert <tom@herbertland.com> CC: Daniel Borkmann <daniel@iogearbox.net> CC: Alexander Duyck <alexander.h.duyck@intel.com> CC: Paolo Abeni <pabeni@redhat.com> CC: Jiri Benc <jbenc@redhat.com> CC: WANG Cong <xiyou.wangcong@gmail.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> CC: Pravin B Shelar <pshelar@ovn.org> CC: Sabrina Dubroca <sd@queasysnail.net> CC: Patrick McHardy <kaber@trash.net> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Pravin Shelar <pshelar@nicira.com> CC: Maxim Krasnyansky <maxk@qti.qualcomm.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-13net: remove type_check from dev_get_nest_level()Sabrina Dubroca
The idea for type_check in dev_get_nest_level() was to count the number of nested devices of the same type (currently, only macvlan or vlan devices). This prevented the false positive lockdep warning on configurations such as: eth0 <--- macvlan0 <--- vlan0 <--- macvlan1 However, this doesn't prevent a warning on a configuration such as: eth0 <--- macvlan0 <--- vlan0 eth1 <--- vlan1 <--- macvlan1 In this case, all the locks end up with a nesting subclass of 1, so lockdep thinks that there is still a deadlock: - in the first case we have (macvlan_netdev_addr_lock_key, 1) and then take (vlan_netdev_xmit_lock_key, 1) - in the second case, we have (vlan_netdev_xmit_lock_key, 1) and then take (macvlan_netdev_addr_lock_key, 1) By removing the linktype check in dev_get_nest_level() and always incrementing the nesting depth, lockdep considers this configuration valid. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09net: macvlan: call netdev_lockdep_set_classes()Eric Dumazet
In case a qdisc is used on a macvlan device, we need to use different lockdep classes to avoid false positives. Use the new netdev_lockdep_set_classes() generic helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-01macvlan: Avoid unnecessary multicast cloningHerbert Xu
Currently we always queue a multicast packet for further processing, even if none of the macvlan devices are subscribed to the address. This patch optimises this by adding a global multicast filter for a macvlan_port. Note that this patch doesn't handle the broadcast addresses of the individual macvlan devices correctly, if they are not all identical to vlan->lowerdev. However, this is already broken because there is no mechanism in place to update the individual multicast filters when you change the broadcast address. If someone cares enough they should fix this by collecting all broadcast addresses for a macvlan as we do for multicast and unicast. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-01macvlan: Fix potential use-after free for broadcastsHerbert Xu
When we postpone a broadcast packet we save the source port in the skb if it is local. However, the source port can disappear before we get a chance to process the packet. This patch fixes this by holding a ref count on the netdev. It also delays the skb->cb modification until after we allocate the new skb as you should not modify shared skbs. Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26macvlan: fix failure during registration v3Francesco Ruggeri
If macvlan_common_newlink fails in register_netdevice after macvlan_init then it decrements port->count twice, first in macvlan_uninit (from register_netdevice or rollback_registered) and then again in macvlan_common_newlink. A similar problem may exist in the ipvlan driver. This patch consolidates modifications to port->count into macvlan_init and macvlan_uninit (thanks to Eric Biederman for suggesting this approach). v3: remove macvtap specific bits. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-03-17vlan: propagate gso_max_segsEric Dumazet
vlan drivers lack proper propagation of gso_max_segs from lower device. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-25net: macvlan: use __ethtool_get_ksettingsDavid Decotigny
Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-02-17macvlan: convert to use IFF_NO_QUEUEZhang Shengju
Use IFF_NO_QUEUE to indicate that a device can run without a qdisc. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-29macvlan: make operstate and carrier more accurateNikolay Aleksandrov
Currently when a macvlan is being initialized and the lower device is netif_carrier_ok(), the macvlan device doesn't run through rfc2863_policy() and is left with UNKNOWN operstate. Fix it by adding an unconditional linkwatch event for the new macvlan device. Similar fix is already used by the 8021q device (see register_vlan_dev()). Also fix the inconsistent state when the lower device has been down and its carrier was changed (when a device is down NETDEV_CHANGE doesn't get generated). The second issue can be seen f.e. when we have a macvlan on top of a 8021q device which has been down and its real device has been changing carrier states, after setting the 8021q device up, the macvlan device will have the same carrier state as it was before even though the 8021q can now have a different state. Example for case 1: 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000 $ ip l add l eth2 macvl0 type macvlan $ ip l set macvl0 up $ ip l sh macvl0 72: macvl0@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether f6:0b:54:0a:9d:a3 brd ff:ff:ff:ff:ff:ff Example for case 2 (order is important): Prestate: eth2 UP/CARRIER, vlan1 down, vlan1-macvlan down $ ip l set vlan1-macvlan up $ ip l sh vlan1-macvlan 71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff [ eth2 loses CARRIER before vlan1 has been UP-ed ] $ ip l sh eth2 4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff $ ip l sh vlan1-macvlan 71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff $ ip l set vlan1 up $ ip l sh vlan1 70: vlan1@eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000 link/ether 52:54:00:bf:57:16 brd ff:ff:ff:ff:ff:ff $ ip l sh vlan1-macvlan 71: vlan1-macvlan@vlan1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ether 4a:b8:44:56:b9:b9 brd ff:ff:ff:ff:ff:ff vlan1-macvlan is still UP, still has carrier and is still in the same operstate as before. After the patch in case 1 macvl0 has state UP as it should and in case 2 vlan1-macvlan has state LOWERLAYERDOWN again as it should. Note that while the lower macvlan device is down their carrier and thus operstate can go out of sync but that will be fixed once the lower device goes up again. This behaviour seems to have been present since beginning of git history. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15net: Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUMTom Herbert
These netif flags are unnecessary convolutions. It is more straightforward to just use NETIF_F_HW_CSUM, NETIF_F_IP_CSUM, and NETIF_F_IPV6_CSUM directly. This patch also: - Cleans up can_checksum_protocol - Simplifies netdev_intersect_features Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-15net: Rename NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASKTom Herbert
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the set of features for offloading all checksums. This is a mask of the checksum offload related features bits. It is incorrect to set both NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6 at the same time for features of a device. This patch: - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where NETIF_F_ALL_CSUM is being used as a mask). - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to use NEITF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-17macvlan: fix leak in macvlan_handle_frameSabrina Dubroca
Reset pskb in macvlan_handle_frame in case skb_share_check returned a clone. Fixes: 8a4eb5734e8d ("net: introduce rx_handler results and logic around that") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-10-12ipv4: Pass struct net into ip_defrag and ip_check_defragEric W. Biederman
The function ip_defrag is called on both the input and the output paths of the networking stack. In particular conntrack when it is tracking outbound packets from the local machine calls ip_defrag. So add a struct net parameter and stop making ip_defrag guess which network namespace it needs to defragment packets in. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-03macvlan: Don't segment multiple tagged packets on macvlan deviceToshiaki Makita
Macvlan/macvtap devices don't need to segment multiple tagged packets since the lower devices can segment them. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-04macvlan: Propagate promiscuity setting to lower devices.Vlad Yasevich
When a macvlan device is placed in promiscuous mode, it currently just sets it's multicast mask to permissive, but doesn't change the state of the lower device. As a result, not all multicast traffic can be received on such device. Additionally, none of a vlan traffic can be received on such device as well. This patch propagates the promiscuous mode setting to lower device so that lower device may receive all packets that macvlan may be interested in. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-02macvlan: implement ndo_get_iflinkNicolas Dichtel
Don't use dev->iflink anymore. CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02net: Kill dev_rebuild_headerEric W. Biederman
Now that there are no more users kill dev_rebuild_header and all of it's implementations. This is long overdue. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-23macvlan: advertise link netns via netlinkNicolas Dichtel
Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09macvlan: play well with ipvlan deviceMahesh Bandewar
If device is already used as an ipvlan port then refuse to use it as a macvlan port at early stage of port creation. thost1:~# ip link add link eth0 ipvl0 type ipvlan thost1:~# echo $? 0 thost1:~# ip link add link eth0 mvl0 type macvlan RTNETLINK answers: Device or resource busy thost1:~# echo $? 2 thost1:~# Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09macvlan: allow setting LRO independently of lower deviceMichal Kubeček
Since commit fbe168ba91f7 ("net: generic dev_disable_lro() stacked device handling"), dev_disable_lro() zeroes NETIF_F_LRO feature flag first for a macvlan device and then for its lower device. As an attempt to set NETIF_F_LRO to zero is ignored, dev_disable_lro() issues a warning and taints kernel. Allowing NETIF_F_LRO to be set independently of the lower device consists of three parts: - add the flag to hw_features to allow toggling it - allow setting it to 0 even if lower device has the flag set - add the flag to MACVLAN_FEATURES to restore copying from lower device on macvlan creation Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>