summaryrefslogtreecommitdiffstats
path: root/include/net
AgeCommit message (Collapse)Author
2021-09-15sch_htb: Fix inconsistency when leaf qdisc creation failsMaxim Mikityanskiy
[ Upstream commit ca49bfd90a9dde175d2929dc1544b54841e33804 ] In HTB offload mode, qdiscs of leaf classes are grafted to netdev queues. sch_htb expects the dev_queue field of these qdiscs to point to the corresponding queues. However, qdisc creation may fail, and in that case noop_qdisc is used instead. Its dev_queue doesn't point to the right queue, so sch_htb can lose track of used netdev queues, which will cause internal inconsistencies. This commit fixes this bug by keeping track of the netdev queue inside struct htb_class. All reads of cl->leaf.q->dev_queue are replaced by the new field, the two values are synced on writes, and WARNs are added to assert equality of the two values. The driver API has changed: when TC_HTB_LEAF_DEL needs to move a queue, the driver used to pass the old and new queue IDs to sch_htb. Now that there is a new field (offload_queue) in struct htb_class that needs to be updated on this operation, the driver will pass the old class ID to sch_htb instead (it already knows the new class ID). Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20210826115425.1744053-1-maximmi@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15net: dsa: don't disable multicast flooding to the CPU even without an IGMP ↵Vladimir Oltean
querier [ Upstream commit c73c57081b3d59aa99093fbedced32ea02620cd3 ] Commit 08cc83cc7fd8 ("net: dsa: add support for BRIDGE_MROUTER attribute") added an option for users to turn off multicast flooding towards the CPU if they turn off the IGMP querier on a bridge which already has enslaved ports (echo 0 > /sys/class/net/br0/bridge/multicast_router). And commit a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags") simply papered over that issue, because it moved the decision to flood the CPU with multicast (or not) from the DSA core down to individual drivers, instead of taking a more radical position then. The truth is that disabling multicast flooding to the CPU is simply something we are not prepared to do now, if at all. Some reasons: - ICMP6 neighbor solicitation messages are unregistered multicast packets as far as the bridge is concerned. So if we stop flooding multicast, the outside world cannot ping the bridge device's IPv6 link-local address. - There might be foreign interfaces bridged with our DSA switch ports (sending a packet towards the host does not necessarily equal termination, but maybe software forwarding). So if there is no one interested in that multicast traffic in the local network stack, that doesn't mean nobody is. - PTP over L4 (IPv4, IPv6) is multicast, but is unregistered as far as the bridge is concerned. This should reach the CPU port. - The switch driver might not do FDB partitioning. And since we don't even bother to do more fine-grained flood disabling (such as "disable flooding _from_port_N_ towards the CPU port" as opposed to "disable flooding _from_any_port_ towards the CPU port"), this breaks standalone ports, or even multiple bridges where one has an IGMP querier and one doesn't. Reverting the logic makes all of the above work. Fixes: a8b659e7ff75 ("net: dsa: act as passthrough for bridge port flags") Fixes: 08cc83cc7fd8 ("net: dsa: add support for BRIDGE_MROUTER attribute") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-08-24ipv6: correct comments about fib6_node sernumzhang kai
correct comments in set and get fn_sernum Signed-off-by: zhang kai <zhangkaiheb@126.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-19Revert "flow_offload: action should not be NULL when it is referenced"Ido Schimmel
This reverts commit 9ea3e52c5bc8bb4a084938dc1e3160643438927a. Cited commit added a check to make sure 'action' is not NULL, but 'action' is already dereferenced before the check, when calling flow_offload_has_one_action(). Therefore, the check does not make any sense and results in a smatch warning: include/net/flow_offload.h:322 flow_action_mixed_hw_stats_check() warn: variable dereferenced before check 'action' (see line 319) Fix by reverting this commit. Cc: gushengxian <gushengxian@yulong.com> Fixes: 9ea3e52c5bc8 ("flow_offload: action should not be NULL when it is referenced") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20210819105842.1315705-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-09psample: Add a fwd declaration for skbuffRoi Dayan
Without this there is a warning if source files include psample.h before skbuff.h or doesn't include it at all. Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling") Signed-off-by: Roi Dayan <roid@nvidia.com> Link: https://lore.kernel.org/r/20210808065242.1522535-1-roid@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Restrict range element expansion in ipset to avoid soft lockup, from Jozsef Kadlecsik. 2) Memleak in error path for nf_conntrack_bridge for IPv4 packets, from Yajun Deng. 3) Simplify conntrack garbage collection strategy to avoid frequent wake-ups, from Florian Westphal. 4) Fix NFNLA_HOOK_FUNCTION_NAME string, do not include module name. 5) Missing chain family netlink attribute in chain description in nfnetlink_hook. 6) Incorrect sequence number on nfnetlink_hook dumps. 7) Use netlink request family in reply message for consistency. 8) Remove offload_pickup sysctl, use conntrack for established state instead, from Florian Westphal. 9) Translate NFPROTO_INET/ingress to NFPROTO_NETDEV/ingress, since NFPROTO_INET is not exposed through nfnetlink_hook. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: nfnetlink_hook: translate inet ingress to netdev netfilter: conntrack: remove offload_pickup sysctl again netfilter: nfnetlink_hook: Use same family as request message netfilter: nfnetlink_hook: use the sequence number of the request message netfilter: nfnetlink_hook: missing chain family netfilter: nfnetlink_hook: strip off module name from hookfn netfilter: conntrack: collect all entries in one cycle netfilter: nf_conntrack_bridge: Fix memory leak when error netfilter: ipset: Limit the maximal range of consecutive elements to add/delete ==================== Link: https://lore.kernel.org/r/20210806151149.6356-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-06netfilter: conntrack: remove offload_pickup sysctl againFlorian Westphal
These two sysctls were added because the hardcoded defaults (2 minutes, tcp, 30 seconds, udp) turned out to be too low for some setups. They appeared in 5.14-rc1 so it should be fine to remove it again. Marcelo convinced me that there should be no difference between a flow that was offloaded vs. a flow that was not wrt. timeout handling. Thus the default is changed to those for TCP established and UDP stream, 5 days and 120 seconds, respectively. Marcelo also suggested to account for the timeout value used for the offloading, this avoids increase beyond the value in the conntrack-sysctl and will also instantly expire the conntrack entry with altered sysctls. Example: nf_conntrack_udp_timeout_stream=60 nf_flowtable_udp_timeout=60 This will remove offloaded udp flows after one minute, rather than two. An earlier version of this patch also cleared the ASSURED bit to allow nf_conntrack to evict the entry via early_drop (i.e., table full). However, it looks like we can safely assume that connection timed out via HW is still in established state, so this isn't needed. Quoting Oz: [..] the hardware sends all packets with a set FIN flags to sw. [..] Connections that are aged in hardware are expected to be in the established state. In case it turns out that back-to-sw-path transition can occur for 'dodgy' connections too (e.g., one side disappeared while software-path would have been in RETRANS timeout), we can adjust this later. Cc: Oz Shlomo <ozsh@nvidia.com> Cc: Paul Blakey <paulb@nvidia.com> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-05Merge tag 'net-5.14-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec. Current release - regressions: - sched: taprio: fix init procedure to avoid inf loop when dumping - sctp: move the active_key update after sh_keys is added Current release - new code bugs: - sparx5: fix build with old GCC & bitmask on 32-bit targets Previous releases - regressions: - xfrm: redo the PREEMPT_RT RCU vs hash_resize_mutex deadlock fix - xfrm: fixes for the compat netlink attribute translator - phy: micrel: Fix detection of ksz87xx switch Previous releases - always broken: - gro: set inner transport header offset in tcp/udp GRO hook to avoid crashes when such packets reach GSO - vsock: handle VIRTIO_VSOCK_OP_CREDIT_REQUEST, as required by spec - dsa: sja1105: fix static FDB entries on SJA1105P/Q/R/S and SJA1110 - bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry - usb: lan78xx: don't modify phy_device state concurrently - usb: pegasus: check for errors of IO routines" * tag 'net-5.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (48 commits) net: vxge: fix use-after-free in vxge_device_unregister net: fec: fix use-after-free in fec_drv_remove net: pegasus: fix uninit-value in get_interrupt_interval net: ethernet: ti: am65-cpsw: fix crash in am65_cpsw_port_offload_fwd_mark_update() bnx2x: fix an error code in bnx2x_nic_load() net: wwan: iosm: fix recursive lock acquire in unregister net: wwan: iosm: correct data protocol mask bit net: wwan: iosm: endianness type correction net: wwan: iosm: fix lkp buildbot warning net: usb: lan78xx: don't modify phy_device state concurrently docs: networking: netdevsim rules net: usb: pegasus: Remove the changelog and DRIVER_VERSION. net: usb: pegasus: Check the return value of get_geristers() and friends; net/prestera: Fix devlink groups leakage in error flow net: sched: fix lockdep_set_class() typo error for sch->seqlock net: dsa: qca: ar9331: reorder MDIO write sequence VSOCK: handle VIRTIO_VSOCK_OP_CREDIT_REQUEST mptcp: drop unused rcu member in mptcp_pm_addr_entry net: ipv6: fix returned variable type in ip6_skb_dst_mtu nfp: update ethtool reporting of pauseframe control ...
2021-08-05Bluetooth: defer cleanup of resources in hci_unregister_dev()Tetsuo Handa
syzbot is hitting might_sleep() warning at hci_sock_dev_event() due to calling lock_sock() with rw spinlock held [1]. It seems that history of this locking problem is a trial and error. Commit b40df5743ee8 ("[PATCH] bluetooth: fix socket locking in hci_sock_dev_event()") in 2.6.21-rc4 changed bh_lock_sock() to lock_sock() as an attempt to fix lockdep warning. Then, commit 4ce61d1c7a8e ("[BLUETOOTH]: Fix locking in hci_sock_dev_event().") in 2.6.22-rc2 changed lock_sock() to local_bh_disable() + bh_lock_sock_nested() as an attempt to fix the sleep in atomic context warning. Then, commit 4b5dd696f81b ("Bluetooth: Remove local_bh_disable() from hci_sock.c") in 3.3-rc1 removed local_bh_disable(). Then, commit e305509e678b ("Bluetooth: use correct lock to prevent UAF of hdev object") in 5.13-rc5 again changed bh_lock_sock_nested() to lock_sock() as an attempt to fix CVE-2021-3573. This difficulty comes from current implementation that hci_sock_dev_event(HCI_DEV_UNREG) is responsible for dropping all references from sockets because hci_unregister_dev() immediately reclaims resources as soon as returning from hci_sock_dev_event(HCI_DEV_UNREG). But the history suggests that hci_sock_dev_event(HCI_DEV_UNREG) was not doing what it should do. Therefore, instead of trying to detach sockets from device, let's accept not detaching sockets from device at hci_sock_dev_event(HCI_DEV_UNREG), by moving actual cleanup of resources from hci_unregister_dev() to hci_cleanup_dev() which is called by bt_host_release() when all references to this unregistered device (which is a kobject) are gone. Since hci_sock_dev_event(HCI_DEV_UNREG) no longer resets hci_pi(sk)->hdev, we need to check whether this device was unregistered and return an error based on HCI_UNREGISTER flag. There might be subtle behavioral difference in "monitor the hdev" functionality; please report if you found something went wrong due to this patch. Link: https://syzkaller.appspot.com/bug?extid=a5df189917e79d5e59c9 [1] Reported-by: syzbot <syzbot+a5df189917e79d5e59c9@syzkaller.appspotmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: e305509e678b ("Bluetooth: use correct lock to prevent UAF of hdev object") Acked-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-04Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2021-08-04 1) Fix a sysbot reported memory leak in xfrm_user_rcv_msg. From Pavel Skripkin. 2) Revert "xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype". This commit tried to fix a lockin bug, but only cured some of the symptoms. A proper fix is applied on top of this revert. 3) Fix a locking bug on xfrm state hash resize. A recent change on sequence counters accidentally repaced a spinlock by a mutex. Fix from Frederic Weisbecker. 4) Fix possible user-memory-access in xfrm_user_rcv_msg_compat(). From Dmitry Safonov. 5) Add initialiation sefltest fot xfrm_spdattr_type_t. From Dmitry Safonov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-03net: ipv6: fix returned variable type in ip6_skb_dst_mtuAntoine Tenart
The patch fixing the returned value of ip6_skb_dst_mtu (int -> unsigned int) was rebased between its initial review and the version applied. In the meantime fade56410c22 was applied, which added a new variable (int) used as the returned value. This lead to a mismatch between the function prototype and the variable used as the return value. Fixes: 40fc3054b458 ("net: ipv6: fix return value of ip6_skb_dst_mtu") Cc: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-03net: sched: provide missing kdoc for tcf_pkt_info and tcf_ematch_opsBijie Xu
Provide missing kdoc of fields of struct tcf_pkt_info and tcf_ematch_ops. Found using ./scripts/kernel-doc -none -Werror include/net/pkt_cls.h Signed-off-by: Bijie Xu <bijie.xu@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-03net: flow_offload: correct comments mismatch with codeBijie Xu
Correct mismatch between the name of flow_offload_has_one_action() and its kdoc entry. Found using ./scripts/kernel-doc -Werror -none include/net/flow_offload.h Signed-off-by: Bijie Xu <bijie.xu@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-27net: llc: fix skb_over_panicPavel Skripkin
Syzbot reported skb_over_panic() in llc_pdu_init_as_xid_cmd(). The problem was in wrong LCC header manipulations. Syzbot's reproducer tries to send XID packet. llc_ui_sendmsg() is doing following steps: 1. skb allocation with size = len + header size len is passed from userpace and header size is 3 since addr->sllc_xid is set. 2. skb_reserve() for header_len = 3 3. filling all other space with memcpy_from_msg() Ok, at this moment we have fully loaded skb, only headers needs to be filled. Then code comes to llc_sap_action_send_xid_c(). This function pushes 3 bytes for LLC PDU header and initializes it. Then comes llc_pdu_init_as_xid_cmd(). It initalizes next 3 bytes *AFTER* LLC PDU header and call skb_push(skb, 3). This looks wrong for 2 reasons: 1. Bytes rigth after LLC header are user data, so this function was overwriting payload. 2. skb_push(skb, 3) call can cause skb_over_panic() since all free space was filled in llc_ui_sendmsg(). (This can happen is user passed 686 len: 686 + 14 (eth header) + 3 (LLC header) = 703. SKB_DATA_ALIGN(703) = 704) So, in this patch I added 2 new private constansts: LLC_PDU_TYPE_U_XID and LLC_PDU_LEN_U_XID. LLC_PDU_LEN_U_XID is used to correctly reserve header size to handle LLC + XID case. LLC_PDU_TYPE_U_XID is used by llc_pdu_header_init() function to push 6 bytes instead of 3. And finally I removed skb_push() call from llc_pdu_init_as_xid_cmd(). This changes should not affect other parts of LLC, since after all steps we just transmit buffer. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+5e5a981ad7cc54c4b2b4@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25sctp: send pmtu probe only if packet loss in Search Complete stateXin Long
This patch is to introduce last_rtx_chunks into sctp_transport to detect if there's any packet retransmission/loss happened by checking against asoc's rtx_data_chunks in sctp_transport_pl_send(). If there is, namely, transport->last_rtx_chunks != asoc->rtx_data_chunks, the pmtu probe will be sent out. Otherwise, increment the pl.raise_count and return when it's in Search Complete state. With this patch, if in Search Complete state, which is a long period, it doesn't need to keep probing the current pmtu unless there's data packet loss. This will save quite some traffic. v1->v2: - add the missing Fixes tag. Fixes: 0dac127c0557 ("sctp: do black hole detection in search complete state") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-25sctp: improve the code for pmtu probe send and recv updateXin Long
This patch does 3 things: - make sctp_transport_pl_send() and sctp_transport_pl_recv() return bool type to decide if more probe is needed to send. - pr_debug() only when probe is really needed to send. - count pl.raise_count in sctp_transport_pl_send() instead of sctp_transport_pl_recv(), and it's only incremented for the 1st probe for the same size. These are preparations for the next patch to make probes happen only when there's packet loss in Search Complete state. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-20net/tcp_fastopen: remove obsolete externEric Dumazet
After cited commit, sysctl_tcp_fastopen_blackhole_timeout is no longer a global variable. Fixes: 3733be14a32b ("ipv4: Namespaceify tcp_fastopen_blackhole_timeout knob") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: Wei Wang <weiwan@google.com> Link: https://lore.kernel.org/r/20210719092028.3016745-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-14Merge tag 'net-5.14-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski. "Including fixes from bpf and netfilter. Current release - regressions: - sock: fix parameter order in sock_setsockopt() Current release - new code bugs: - netfilter: nft_last: - fix incorrect arithmetic when restoring last used - honor NFTA_LAST_SET on restoration Previous releases - regressions: - udp: properly flush normal packet at GRO time - sfc: ensure correct number of XDP queues; don't allow enabling the feature if there isn't sufficient resources to Tx from any CPU - dsa: sja1105: fix address learning getting disabled on the CPU port - mptcp: addresses a rmem accounting issue that could keep packets in subflow receive buffers longer than necessary, delaying MPTCP-level ACKs - ip_tunnel: fix mtu calculation for ETHER tunnel devices - do not reuse skbs allocated from skbuff_fclone_cache in the napi skb cache, we'd try to return them to the wrong slab cache - tcp: consistently disable header prediction for mptcp Previous releases - always broken: - bpf: fix subprog poke descriptor tracking use-after-free - ipv6: - allocate enough headroom in ip6_finish_output2() in case iptables TEE is used - tcp: drop silly ICMPv6 packet too big messages to avoid expensive and pointless lookups (which may serve as a DDOS vector) - make sure fwmark is copied in SYNACK packets - fix 'disable_policy' for forwarded packets (align with IPv4) - netfilter: conntrack: - do not renew entry stuck in tcp SYN_SENT state - do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry - mptcp: cleanly handle error conditions with MP_JOIN and syncookies - mptcp: fix double free when rejecting a join due to port mismatch - validate lwtstate->data before returning from skb_tunnel_info() - tcp: call sk_wmem_schedule before sk_mem_charge in zerocopy path - mt76: mt7921: continue to probe driver when fw already downloaded - bonding: fix multiple issues with offloading IPsec to (thru?) bond - stmmac: ptp: fix issues around Qbv support and setting time back - bcmgenet: always clear wake-up based on energy detection Misc: - sctp: move 198 addresses from unusable to private scope - ptp: support virtual clocks and timestamping - openvswitch: optimize operation for key comparison" * tag 'net-5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (158 commits) net: dsa: properly check for the bridge_leave methods in dsa_switch_bridge_leave() sfc: add logs explaining XDP_TX/REDIRECT is not available sfc: ensure correct number of XDP queues sfc: fix lack of XDP TX queues - error XDP TX failed (-22) net: fddi: fix UAF in fza_probe net: dsa: sja1105: fix address learning getting disabled on the CPU port net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload net: Use nlmsg_unicast() instead of netlink_unicast() octeontx2-pf: Fix uninitialized boolean variable pps ipv6: allocate enough headroom in ip6_finish_output2() net: hdlc: rename 'mod_init' & 'mod_exit' functions to be module-specific net: bridge: multicast: fix MRD advertisement router port marking race net: bridge: multicast: fix PIM hello router port marking race net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340 dsa: fix for_each_child.cocci warnings virtio_net: check virtqueue_add_sgs() return value mptcp: properly account bulk freed memory selftests: mptcp: fix case multiple subflows limited by server mptcp: avoid processing packet if a subflow reset mptcp: fix syncookie process if mptcp can not_accept new subflow ...
2021-07-09mptcp: avoid processing packet if a subflow resetJianguo Wu
If check_fully_established() causes a subflow reset, it should not continue to process the packet in tcp_data_queue(). Add a return value to mptcp_incoming_options(), and return false if a subflow has been reset, else return true. Then drop the packet in tcp_data_queue()/tcp_rcv_state_process() if mptcp_incoming_options() return false. Fixes: d582484726c4 ("mptcp: fix fallback for MP_JOIN subflows") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-09net: validate lwtstate->data before returning from skb_tunnel_info()Taehee Yoo
skb_tunnel_info() returns pointer of lwtstate->data as ip_tunnel_info type without validation. lwtstate->data can have various types such as mpls_iptunnel_encap, etc and these are not compatible. So skb_tunnel_info() should validate before returning that pointer. Splat looks like: BUG: KASAN: slab-out-of-bounds in vxlan_get_route+0x418/0x4b0 [vxlan] Read of size 2 at addr ffff888106ec2698 by task ping/811 CPU: 1 PID: 811 Comm: ping Not tainted 5.13.0+ #1195 Call Trace: dump_stack_lvl+0x56/0x7b print_address_description.constprop.8.cold.13+0x13/0x2ee ? vxlan_get_route+0x418/0x4b0 [vxlan] ? vxlan_get_route+0x418/0x4b0 [vxlan] kasan_report.cold.14+0x83/0xdf ? vxlan_get_route+0x418/0x4b0 [vxlan] vxlan_get_route+0x418/0x4b0 [vxlan] [ ... ] vxlan_xmit_one+0x148b/0x32b0 [vxlan] [ ... ] vxlan_xmit+0x25c5/0x4780 [vxlan] [ ... ] dev_hard_start_xmit+0x1ae/0x6e0 __dev_queue_xmit+0x1f39/0x31a0 [ ... ] neigh_xmit+0x2f9/0x940 mpls_xmit+0x911/0x1600 [mpls_iptunnel] lwtunnel_xmit+0x18f/0x450 ip_finish_output2+0x867/0x2040 [ ... ] Fixes: 61adedf3e3f1 ("route: move lwtunnel state to dst_entry") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Do not refresh timeout in SYN_SENT for syn retransmissions. Add selftest for unreplied TCP connection, from Florian Westphal. 2) Fix null dereference from error path with hardware offload in nftables. 3) Remove useless nf_ct_gre_keymap_flush() from netns exit path, from Vasily Averin. 4) Missing rcu read-lock side in ctnetlink helper info dump, also from Vasily. 5) Do not mark RST in the reply direction coming after SYN packet for an out-of-sync entry, from Ali Abdallah and Florian Westphal. 6) Add tcp_ignore_invalid_rst sysctl to allow to disable out of segment RSTs, from Ali. 7) KCSAN fix for nf_conntrack_all_lock(), from Manfred Spraul. 8) Honor NFTA_LAST_SET in nft_last. 9) Fix incorrect arithmetics when restore last_jiffies in nft_last. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-06bonding: Add struct bond_ipesc to manage SATaehee Yoo
bonding has been supporting ipsec offload. When SA is added, bonding just passes SA to its own active real interface. But it doesn't manage SA. So, when events(add/del real interface, active real interface change, etc) occur, bonding can't handle that well because It doesn't manage SA. So some problems(panic, UAF, refcnt leak)occur. In order to make it stable, it should manage SA. That's the reason why struct bond_ipsec is added. When a new SA is added to bonding interface, it is stored in the bond_ipsec list. And the SA is passed to a current active real interface. If events occur, it uses bond_ipsec data to handle these events. bond->ipsec_list is protected by bond->ipsec_lock. If a current active real interface is changed, the following logic works. 1. delete all SAs from old active real interface 2. Add all SAs to the new active real interface. 3. If a new active real interface doesn't support ipsec offload or SA's option, it sets real_dev to NULL. Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-06netfilter: conntrack: add new sysctl to disable RST checkAli Abdallah
This patch adds a new sysctl tcp_ignore_invalid_rst to disable marking out of segments RSTs as INVALID. Signed-off-by: Ali Abdallah <aabdallah@suse.de> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-07-03Merge branch 'work.iov_iter' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter updates from Al Viro: "iov_iter cleanups and fixes. There are followups, but this is what had sat in -next this cycle. IMO the macro forest in there became much thinner and easier to follow..." * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits) csum_and_copy_to_pipe_iter(): leave handling of csum_state to caller clean up copy_mc_pipe_to_iter() pipe_zero(): we don't need no stinkin' kmap_atomic()... iov_iter: clean csum_and_copy_...() primitives up a bit copy_page_from_iter(): don't need kmap_atomic() for kvec/bvec cases copy_page_to_iter(): don't bother with kmap_atomic() for bvec/kvec cases iterate_xarray(): only of the first iteration we might get offset != 0 pull handling of ->iov_offset into iterate_{iovec,bvec,xarray} iov_iter: make iterator callbacks use base and len instead of iovec iov_iter: make the amount already copied available to iterator callbacks iov_iter: get rid of separate bvec and xarray callbacks iov_iter: teach iterate_{bvec,xarray}() about possible short copies iterate_bvec(): expand bvec.h macro forest, massage a bit iov_iter: unify iterate_iovec and iterate_kvec iov_iter: massage iterate_iovec and iterate_kvec to logics similar to iterate_bvec iterate_and_advance(): get rid of magic in case when n is 0 csum_and_copy_to_iter(): massage into form closer to csum_and_copy_from_iter() iov_iter: replace iov_iter_copy_from_user_atomic() with iterator-advancing variant [xarray] iov_iter_npages(): just use DIV_ROUND_UP() iov_iter_npages(): don't bother with iterate_all_kinds() ...
2021-07-02net: ipv6: fix return value of ip6_skb_dst_mtuVadim Fedorenko
Commit 628a5c561890 ("[INET]: Add IP(V6)_PMTUDISC_RPOBE") introduced ip6_skb_dst_mtu with return value of signed int which is inconsistent with actually returned values. Also 2 users of this function actually assign its value to unsigned int variable and only __xfrm6_output assigns result of this function to signed variable but actually uses as unsigned in further comparisons and calls. Change this function to return unsigned int value. Fixes: 628a5c561890 ("[INET]: Add IP(V6)_PMTUDISC_RPOBE") Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-02xfrm: Fix RCU vs hash_resize_mutex lock inversionFrederic Weisbecker
xfrm_bydst_resize() calls synchronize_rcu() while holding hash_resize_mutex. But then on PREEMPT_RT configurations, xfrm_policy_lookup_bytype() may acquire that mutex while running in an RCU read side critical section. This results in a deadlock. In fact the scope of hash_resize_mutex is way beyond the purpose of xfrm_policy_lookup_bytype() to just fetch a coherent and stable policy for a given destination/direction, along with other details. The lower level net->xfrm.xfrm_policy_lock, which among other things protects per destination/direction references to policy entries, is enough to serialize and benefit from priority inheritance against the write side. As a bonus, it makes it officially a per network namespace synchronization business where a policy table resize on namespace A shouldn't block a policy lookup on namespace B. Fixes: 77cc278f7b20 (xfrm: policy: Use sequence counters with associated lock) Cc: stable@vger.kernel.org Cc: Ahmed S. Darwish <a.darwish@linutronix.de> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Varad Gautam <varad.gautam@suse.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-07-02netfilter: conntrack: nf_ct_gre_keymap_flush() removalVasily Averin
nf_ct_gre_keymap_flush() is useless. It is called from nf_conntrack_cleanup_net_list() only and tries to remove nf_ct_gre_keymap entries from pernet gre keymap list. Though: a) at this point the list should already be empty, all its entries were deleted during the conntracks cleanup, because nf_conntrack_cleanup_net_list() executes nf_ct_iterate_cleanup(kill_all) before nf_conntrack_proto_pernet_fini(): nf_conntrack_cleanup_net_list +- nf_ct_iterate_cleanup | nf_ct_put | nf_conntrack_put | nf_conntrack_destroy | destroy_conntrack | destroy_gre_conntrack | nf_ct_gre_keymap_destroy `- nf_conntrack_proto_pernet_fini nf_ct_gre_keymap_flush b) Let's say we find that the keymap list is not empty. This means netns still has a conntrack associated with gre, in which case we should not free its memory, because this will lead to a double free and related crashes. However I doubt it could have gone unnoticed for years, obviously this does not happen in real life. So I think we can remove both nf_ct_gre_keymap_flush() and nf_conntrack_proto_pernet_fini(). Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-07-01tcp: consistently disable header prediction for mptcpPaolo Abeni
The MPTCP receive path is hooked only into the TCP slow-path. The DSS presence allows plain MPTCP traffic to hit that consistently. Since commit e1ff9e82e2ea ("net: mptcp: improve fallback to TCP"), when an MPTCP socket falls back to TCP, it can hit the TCP receive fast-path, and delay or stop triggering the event notification. Address the issue explicitly disabling the header prediction for MPTCP sockets. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/200 Fixes: e1ff9e82e2ea ("net: mptcp: improve fallback to TCP") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01net: remove the caif_hsi driverChristoph Hellwig
The caif_hsi driver relies on a cfhsi_get_ops symbol using symbol_get, but this symbol is not provided anywhere in the kernel tree. Remove this driver given that it is dead code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01net: sock: extend SO_TIMESTAMPING for PHC bindingYangbo Lu
Since PTP virtual clock support is added, there can be several PTP virtual clocks based on one PTP physical clock for timestamping. This patch is to extend SO_TIMESTAMPING API to support PHC (PTP Hardware Clock) binding by adding a new flag SOF_TIMESTAMPING_BIND_PHC. When PTP virtual clocks are in use, user space can configure to bind one for timestamping, but PTP physical clock is not supported and not needed to bind. This patch is preparation for timestamp conversion from raw timestamp to a specific PTP virtual clock time in core net. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01sctp: move 198 addresses from unusable to private scopeXin Long
The doc draft-stewart-tsvwg-sctp-ipv4-00 that restricts 198 addresses was never published. These addresses as private addresses should be allowed to use in SCTP. As Michael Tuexen suggested, this patch is to move 198 addresses from unusable to private scope. Reported-by: Sérgio <surkamp@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-07-01net: annotate data race around sk_ll_usecEric Dumazet
sk_ll_usec is read locklessly from sk_can_busy_loop() while another thread can change its value in sock_setsockopt() This is correct but needs annotations. BUG: KCSAN: data-race in __skb_try_recv_datagram / sock_setsockopt write to 0xffff88814eb5f904 of 4 bytes by task 14011 on cpu 0: sock_setsockopt+0x1287/0x2090 net/core/sock.c:1175 __sys_setsockopt+0x14f/0x200 net/socket.c:2100 __do_sys_setsockopt net/socket.c:2115 [inline] __se_sys_setsockopt net/socket.c:2112 [inline] __x64_sys_setsockopt+0x62/0x70 net/socket.c:2112 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88814eb5f904 of 4 bytes by task 14001 on cpu 1: sk_can_busy_loop include/net/busy_poll.h:41 [inline] __skb_try_recv_datagram+0x14f/0x320 net/core/datagram.c:273 unix_dgram_recvmsg+0x14c/0x870 net/unix/af_unix.c:2101 unix_seqpacket_recvmsg+0x5a/0x70 net/unix/af_unix.c:2067 ____sys_recvmsg+0x15d/0x310 include/linux/uio.h:244 ___sys_recvmsg net/socket.c:2598 [inline] do_recvmmsg+0x35c/0x9f0 net/socket.c:2692 __sys_recvmmsg net/socket.c:2771 [inline] __do_sys_recvmmsg net/socket.c:2794 [inline] __se_sys_recvmmsg net/socket.c:2787 [inline] __x64_sys_recvmmsg+0xcf/0x150 net/socket.c:2787 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000000 -> 0x00000101 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 14001 Comm: syz-executor.3 Not tainted 5.13.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Trivial conflict in net/netfilter/nf_tables_api.c. Duplicate fix in tools/testing/selftests/net/devlink_port_split.py - take the net-next version. skmsg, and L4 bpf - keep the bpf code but remove the flags and err params. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29tcp: change ICSK_CA_PRIV_SIZE definitionEric Dumazet
Instead of a magic number (13 currently) and having to change it every other year, use sizeof_field() macro. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: sock: introduce sk_error_reportAlexander Aring
This patch introduces a function wrapper to call the sk_error_report callback. That will prepare to add additional handling whenever sk_error_report is called, for example to trace socket errors. Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: reference count the FDB addresses at the cross-chip notifier levelVladimir Oltean
The same concerns expressed for host MDB entries are valid for host FDBs just as well: - in the case of multiple bridges spanning the same switch chip, deleting a host FDB entry that belongs to one bridge will result in breakage to the other bridge - not deleting FDB entries across DSA links means that the switch's hardware tables will eventually run out, given enough wear&tear So do the same thing and introduce reference counting for CPU ports and DSA links using the same data structures as we have for MDB entries. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: reference count the MDB entries at the cross-chip notifier levelVladimir Oltean
Ever since the cross-chip notifiers were introduced, the design was meant to be simplistic and just get the job done without worrying too much about dangling resources left behind. For example, somebody installs an MDB entry on sw0p0 in this daisy chain topology. It gets installed using ds->ops->port_mdb_add() on sw0p0, sw1p4 and sw2p4. | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ x ] Then the same person deletes that MDB entry. The cross-chip notifier for deletion only matches sw0p0: | sw0p0 sw0p1 sw0p2 sw0p3 sw0p4 [ user ] [ user ] [ user ] [ dsa ] [ cpu ] [ x ] [ ] [ ] [ ] [ ] | +---------+ | sw1p0 sw1p1 sw1p2 sw1p3 sw1p4 [ user ] [ user ] [ user ] [ dsa ] [ dsa ] [ ] [ ] [ ] [ ] [ ] | +---------+ | sw2p0 sw2p1 sw2p2 sw2p3 sw2p4 [ user ] [ user ] [ user ] [ user ] [ dsa ] [ ] [ ] [ ] [ ] [ ] Why? Because the DSA links are 'trunk' ports, if we just go ahead and delete the MDB from sw1p4 and sw2p4 directly, we might delete those multicast entries when they are still needed. Just consider the fact that somebody does: - add a multicast MAC address towards sw0p0 [ via the cross-chip notifiers it gets installed on the DSA links too ] - add the same multicast MAC address towards sw0p1 (another port of that same switch) - delete the same multicast MAC address from sw0p0. At this point, if we deleted the MAC address from the DSA links, it would be flooded, even though there is still an entry on switch 0 which needs it not to. So that is why deletions only match the targeted source port and nothing on DSA links. Of course, dangling resources means that the hardware tables will eventually run out given enough additions/removals, but hey, at least it's simple. But there is a bigger concern which needs to be addressed, and that is our support for SWITCHDEV_OBJ_ID_HOST_MDB. DSA simply translates such an object into a dsa_port_host_mdb_add() which ends up as ds->ops->port_mdb_add() on the upstream port, and a similar thing happens on deletion: dsa_port_host_mdb_del() will trigger ds->ops->port_mdb_del() on the upstream port. When there are 2 VLAN-unaware bridges spanning the same switch (which is a use case DSA proudly supports), each bridge will install its own SWITCHDEV_OBJ_ID_HOST_MDB entries. But upon deletion, DSA goes ahead and emits a DSA_NOTIFIER_MDB_DEL for dp->cpu_dp, which is shared between the user ports enslaved to br0 and the user ports enslaved to br1. Not good. The host-trapped multicast addresses installed by br1 will be deleted when any state changes in br0 (IGMP timers expire, or ports leave, etc). To avoid this, we could of course go the route of the zero-sum game and delete the DSA_NOTIFIER_MDB_DEL call for dp->cpu_dp. But the better design is to just admit that on shared ports like DSA links and CPU ports, we should be reference counting calls, even if this consumes some dynamic memory which DSA has traditionally avoided. On the flip side, the hardware tables of switches are limited in size, so it would be good if the OS managed them properly instead of having them eventually overflow. To address the memory usage concern, we only apply the refcounting of MDB entries on ports that are really shared (CPU ports and DSA links) and not on user ports. In a typical single-switch setup, this means only the CPU port (and the host MDB entries are not that many, really). The name of the newly introduced data structures (dsa_mac_addr) is chosen in such a way that will be reusable for host FDB entries (next patch). With this change, we can finally have the same matching logic for the MDB additions and deletions, as well as for their host-trapped variants. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29net: dsa: introduce dsa_is_upstream_port and dsa_switch_is_upstream_ofVladimir Oltean
In preparation for the new cross-chip notifiers for host addresses, let's introduce some more topology helpers which we are going to use to discern switches that are in our path towards the dedicated CPU port from switches that aren't. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge tag 'for-net-next-2021-06-28' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Luiz Augusto von Dentz says: ==================== bluetooth-next pull request for net-next: - Add support for QCA_ROME device (0cf3:e500) and RTL8822CE - Update management interface revision to 21 - Use of incluse language - Proper handling of HCI_LE_Advertising_Set_Terminated event - Recovery handing of HCI ncmd=0 - Various memory fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28sctp: validate from_addr_param returnMarcelo Ricardo Leitner
Ilja reported that, simply putting it, nothing was validating that from_addr_param functions were operating on initialized memory. That is, the parameter itself was being validated by sctp_walk_params, but it doesn't check for types and their specific sizes and it could be a 0-length one, causing from_addr_param to potentially work over the next parameter or even uninitialized memory. The fix here is to, in all calls to from_addr_param, check if enough space is there for the wanted IP address type. Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your *net-next* tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messagesAndreas Roeseler
This patch builds off of commit 2b246b2569cd2ac6ff700d0dce56b8bae29b1842 and adds functionality to respond to ICMPV6 PROBE requests. Add icmp_build_probe function to construct PROBE requests for both ICMPV4 and ICMPV6. Modify icmpv6_rcv to detect ICMPV6 PROBE messages and call the icmpv6_echo_reply handler. Modify icmpv6_echo_reply to build a PROBE response message based on the queried interface. This patch has been tested using a branch of the iputils git repo which can be found here: https://github.com/Juniper-Clinic-2020/iputils/tree/probe-request Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28flow_offload: action should not be NULL when it is referencedgushengxian
"action" should not be NULL when it is referenced. Signed-off-by: gushengxian <13145886936@163.com> Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: switchdev: add a context void pointer to struct switchdev_notifier_infoVladimir Oltean
In the case where the driver asks for a replay of a certain type of event (port object or attribute) for a bridge port that is a LAG, it may do so because this port has just joined the LAG. But there might already be other switchdev ports in that LAG, and it is preferable that those preexisting switchdev ports do not act upon the replayed event. The solution is to add a context to switchdev events, which is NULL most of the time (when the bridge layer initiates the call) but which can be set to a value controlled by the switchdev driver when a replay is requested. The driver can then check the context to figure out if all ports within the LAG should act upon the switchdev event, or just the ones that match the context. We have to modify all switchdev_handle_* helper functions as well as the prototypes in the drivers that use these helpers too, because these helpers hide the underlying struct switchdev_notifier_info from us and there is no way to retrieve the context otherwise. The context structure will be populated and used in later patches. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/gitDavid S. Miller
/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2021-06-28 1) Remove an unneeded error assignment in esp4_gro_receive(). From Yang Li. 2) Add a new byseq state hashtable to find acquire states faster. From Sabrina Dubroca. 3) Remove some unnecessary variables in pfkey_create(). From zuoqilin. 4) Remove the unused description from xfrm_type struct. From Florian Westphal. 5) Fix a spelling mistake in the comment of xfrm_state_ok(). From gushengxian. 6) Replace hdr_off indirections by a small helper function. From Florian Westphal. 7) Remove xfrm4_output_finish and xfrm6_output_finish declarations, they are not used anymore.From Antony Antony. 8) Remove xfrm replay indirections. From Florian Westphal. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28Merge tag 'mac80211-next-for-net-next-2021-06-25' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes berg says: ==================== Lots of changes: * aggregation handling improvements for some drivers * hidden AP discovery on 6 GHz and other HE 6 GHz improvements * minstrel improvements for no-ack frames * deferred rate control for TXQs to improve reaction times * virtual time-based airtime scheduler * along with various little cleanups/fixups ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-28net: lwtunnel: handle MTU calculation in forwadingVadim Fedorenko
Commit 14972cbd34ff ("net: lwtunnel: Handle fragmentation") moved fragmentation logic away from lwtunnel by carry encap headroom and use it in output MTU calculation. But the forwarding part was not covered and created difference in MTU for output and forwarding and further to silent drops on ipv4 forwarding path. Fix it by taking into account lwtunnel encap headroom. The same commit also introduced difference in how to treat RTAX_MTU in IPv4 and IPv6 where latter explicitly removes lwtunnel encap headroom from route MTU. Make IPv4 version do the same. Fixes: 14972cbd34ff ("net: lwtunnel: Handle fragmentation") Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-26Bluetooth: Fix Set Extended (Scan Response) DataLuiz Augusto von Dentz
These command do have variable length and the length can go up to 251, so this changes the struct to not use a fixed size and then when creating the PDU only the actual length of the data send to the controller. Fixes: a0fb3726ba551 ("Bluetooth: Use Set ext adv/scan rsp data if controller supports") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-06-26Bluetooth: use inclusive language when filtering devicesArchie Pusaka
This patch replaces some non-inclusive terms based on the appropriate language mapping table compiled by the Bluetooth SIG: https://specificationrefs.bluetooth.com/language-mapping/Appropriate_Language_Mapping_Table.pdf Specifically, these terms are replaced: blacklist -> reject list whitelist -> accept list Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-06-26Bluetooth: use inclusive language when tracking connectionsArchie Pusaka
This patch replaces some non-inclusive terms based on the appropriate language mapping table compiled by the Bluetooth SIG: https://specificationrefs.bluetooth.com/language-mapping/Appropriate_Language_Mapping_Table.pdf Specifically, these terms are replaced: master -> central slave -> peripheral Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>