summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2020-01-31net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookupSabrina Dubroca
commit 6c8991f41546c3c472503dff1ea9daaddf9331c2 upstream. ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net: ipv6: add net argument to ip6_dst_lookup_flowSabrina Dubroca
commit c4e85f73afb6384123e5ef1bba3315b2e3ad031e upstream. This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow, as some modules currently pass a net argument without a socket to ip6_dst_lookup. This is equivalent to commit 343d60aada5a ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument"). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31tipc: fix ordering of tipc module init and exit routineTaehee Yoo
commit 9cf1cd8ee3ee09ef2859017df2058e2f53c5347f upstream. In order to set/get/dump, the tipc uses the generic netlink infrastructure. So, when tipc module is inserted, init function calls genl_register_family(). After genl_register_family(), set/get/dump commands are immediately allowed and these callbacks internally use the net_generic. net_generic is allocated by register_pernet_device() but this is called after genl_register_family() in the __init function. So, these callbacks would use un-initialized net_generic. Test commands: #SHELL1 while : do modprobe tipc modprobe -rv tipc done #SHELL2 while : do tipc link list done Splat looks like: [ 59.616322][ T2788] kasan: CONFIG_KASAN_INLINE enabled [ 59.617234][ T2788] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 59.618398][ T2788] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 59.619389][ T2788] CPU: 3 PID: 2788 Comm: tipc Not tainted 5.4.0+ #194 [ 59.620231][ T2788] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 59.621428][ T2788] RIP: 0010:tipc_bcast_get_broadcast_mode+0x131/0x310 [tipc] [ 59.622379][ T2788] Code: c7 c6 ef 8b 38 c0 65 ff 0d 84 83 c9 3f e8 d7 a5 f2 e3 48 8d bb 38 11 00 00 48 b8 00 00 00 00 [ 59.622550][ T2780] NET: Registered protocol family 30 [ 59.624627][ T2788] RSP: 0018:ffff88804b09f578 EFLAGS: 00010202 [ 59.624630][ T2788] RAX: dffffc0000000000 RBX: 0000000000000011 RCX: 000000008bc66907 [ 59.624631][ T2788] RDX: 0000000000000229 RSI: 000000004b3cf4cc RDI: 0000000000001149 [ 59.624633][ T2788] RBP: ffff88804b09f588 R08: 0000000000000003 R09: fffffbfff4fb3df1 [ 59.624635][ T2788] R10: fffffbfff50318f8 R11: ffff888066cadc18 R12: ffffffffa6cc2f40 [ 59.624637][ T2788] R13: 1ffff11009613eba R14: ffff8880662e9328 R15: ffff8880662e9328 [ 59.624639][ T2788] FS: 00007f57d8f7b740(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000 [ 59.624645][ T2788] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 59.625875][ T2780] tipc: Started in single node mode [ 59.626128][ T2788] CR2: 00007f57d887a8c0 CR3: 000000004b140002 CR4: 00000000000606e0 [ 59.633991][ T2788] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 59.635195][ T2788] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 59.636478][ T2788] Call Trace: [ 59.637025][ T2788] tipc_nl_add_bc_link+0x179/0x1470 [tipc] [ 59.638219][ T2788] ? lock_downgrade+0x6e0/0x6e0 [ 59.638923][ T2788] ? __tipc_nl_add_link+0xf90/0xf90 [tipc] [ 59.639533][ T2788] ? tipc_nl_node_dump_link+0x318/0xa50 [tipc] [ 59.640160][ T2788] ? mutex_lock_io_nested+0x1380/0x1380 [ 59.640746][ T2788] tipc_nl_node_dump_link+0x4fd/0xa50 [tipc] [ 59.641356][ T2788] ? tipc_nl_node_reset_link_stats+0x340/0x340 [tipc] [ 59.642088][ T2788] ? __skb_ext_del+0x270/0x270 [ 59.642594][ T2788] genl_lock_dumpit+0x85/0xb0 [ 59.643050][ T2788] netlink_dump+0x49c/0xed0 [ 59.643529][ T2788] ? __netlink_sendskb+0xc0/0xc0 [ 59.644044][ T2788] ? __netlink_dump_start+0x190/0x800 [ 59.644617][ T2788] ? __mutex_unlock_slowpath+0xd0/0x670 [ 59.645177][ T2788] __netlink_dump_start+0x5a0/0x800 [ 59.645692][ T2788] genl_rcv_msg+0xa75/0xe90 [ 59.646144][ T2788] ? __lock_acquire+0xdfe/0x3de0 [ 59.646692][ T2788] ? genl_family_rcv_msg_attrs_parse+0x320/0x320 [ 59.647340][ T2788] ? genl_lock_dumpit+0xb0/0xb0 [ 59.647821][ T2788] ? genl_unlock+0x20/0x20 [ 59.648290][ T2788] ? genl_parallel_done+0xe0/0xe0 [ 59.648787][ T2788] ? find_held_lock+0x39/0x1d0 [ 59.649276][ T2788] ? genl_rcv+0x15/0x40 [ 59.649722][ T2788] ? lock_contended+0xcd0/0xcd0 [ 59.650296][ T2788] netlink_rcv_skb+0x121/0x350 [ 59.650828][ T2788] ? genl_family_rcv_msg_attrs_parse+0x320/0x320 [ 59.651491][ T2788] ? netlink_ack+0x940/0x940 [ 59.651953][ T2788] ? lock_acquire+0x164/0x3b0 [ 59.652449][ T2788] genl_rcv+0x24/0x40 [ 59.652841][ T2788] netlink_unicast+0x421/0x600 [ ... ] Fixes: 7e4369057806 ("tipc: fix a slab object leak") Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31tcp: md5: fix potential overestimation of TCP option spaceEric Dumazet
commit 9424e2e7ad93ffffa88f882c9bc5023570904b55 upstream. Back in 2008, Adam Langley fixed the corner case of packets for flows having all of the following options : MD5 TS SACK Since MD5 needs 20 bytes, and TS needs 12 bytes, no sack block can be cooked from the remaining 8 bytes. tcp_established_options() correctly sets opts->num_sack_blocks to zero, but returns 36 instead of 32. This means TCP cooks packets with 4 extra bytes at the end of options, containing unitialized bytes. Fixes: 33ad798c924b ("tcp: options clean up") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31openvswitch: support asymmetric conntrackAaron Conole
commit 5d50aa83e2c8e91ced2cca77c198b468ca9210f4 upstream. The openvswitch module shares a common conntrack and NAT infrastructure exposed via netfilter. It's possible that a packet needs both SNAT and DNAT manipulation, due to e.g. tuple collision. Netfilter can support this because it runs through the NAT table twice - once on ingress and again after egress. The openvswitch module doesn't have such capability. Like netfilter hook infrastructure, we should run through NAT twice to keep the symmetry. Fixes: 05752523e565 ("openvswitch: Interface with NAT.") Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net/tls: Fix return values to avoid ENOTSUPPValentin Vidic
commit 4a5cdc604b9cf645e6fa24d8d9f055955c3c8516 upstream. ENOTSUPP is not available in userspace, for example: setsockopt failed, 524, Unknown error 524 Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> [PG: drop tools/selftest part; doesn't use ENOTSUPP in v5.2.x] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net: sched: fix dump qlen for sch_mq/sch_mqprio with NOLOCK subqueuesDust Li
commit 2f23cd42e19c22c24ff0e221089b7b6123b117c5 upstream. sch->q.len hasn't been set if the subqueue is a NOLOCK qdisc in mq_dump() and mqprio_dump(). Fixes: ce679e8df7ed ("net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mqprio") Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net: dsa: fix flow dissection on Tx pathAlexander Lobakin
commit 8bef0af09a5415df761b04fa487a6c34acae74bc upstream. Commit 43e665287f93 ("net-next: dsa: fix flow dissection") added an ability to override protocol and network offset during flow dissection for DSA-enabled devices (i.e. controllers shipped as switch CPU ports) in order to fix skb hashing for RPS on Rx path. However, skb_hash() and added part of code can be invoked not only on Rx, but also on Tx path if we have a multi-queued device and: - kernel is running on UP system or - XPS is not configured. The call stack in this two cases will be like: dev_queue_xmit() -> __dev_queue_xmit() -> netdev_core_pick_tx() -> netdev_pick_tx() -> skb_tx_hash() -> skb_get_hash(). The problem is that skbs queued for Tx have both network offset and correct protocol already set up even after inserting a CPU tag by DSA tagger, so calling tag_ops->flow_dissect() on this path actually only breaks flow dissection and hashing. This can be observed by adding debug prints just before and right after tag_ops->flow_dissect() call to the related block of code: Before the patch: Rx path (RPS): [ 19.240001] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */ [ 19.244271] tag_ops->flow_dissect() [ 19.247811] Rx: proto: 0x0800, nhoff: 8 /* ETH_P_IP */ [ 19.215435] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */ [ 19.219746] tag_ops->flow_dissect() [ 19.223241] Rx: proto: 0x0806, nhoff: 8 /* ETH_P_ARP */ [ 18.654057] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */ [ 18.658332] tag_ops->flow_dissect() [ 18.661826] Rx: proto: 0x8100, nhoff: 8 /* ETH_P_8021Q */ Tx path (UP system): [ 18.759560] Tx: proto: 0x0800, nhoff: 26 /* ETH_P_IP */ [ 18.763933] tag_ops->flow_dissect() [ 18.767485] Tx: proto: 0x920b, nhoff: 34 /* junk */ [ 22.800020] Tx: proto: 0x0806, nhoff: 26 /* ETH_P_ARP */ [ 22.804392] tag_ops->flow_dissect() [ 22.807921] Tx: proto: 0x920b, nhoff: 34 /* junk */ [ 16.898342] Tx: proto: 0x86dd, nhoff: 26 /* ETH_P_IPV6 */ [ 16.902705] tag_ops->flow_dissect() [ 16.906227] Tx: proto: 0x920b, nhoff: 34 /* junk */ After: Rx path (RPS): [ 16.520993] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */ [ 16.525260] tag_ops->flow_dissect() [ 16.528808] Rx: proto: 0x0800, nhoff: 8 /* ETH_P_IP */ [ 15.484807] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */ [ 15.490417] tag_ops->flow_dissect() [ 15.495223] Rx: proto: 0x0806, nhoff: 8 /* ETH_P_ARP */ [ 17.134621] Rx: proto: 0x00f8, nhoff: 0 /* ETH_P_XDSA */ [ 17.138895] tag_ops->flow_dissect() [ 17.142388] Rx: proto: 0x8100, nhoff: 8 /* ETH_P_8021Q */ Tx path (UP system): [ 15.499558] Tx: proto: 0x0800, nhoff: 26 /* ETH_P_IP */ [ 20.664689] Tx: proto: 0x0806, nhoff: 26 /* ETH_P_ARP */ [ 18.565782] Tx: proto: 0x86dd, nhoff: 26 /* ETH_P_IPV6 */ In order to fix that we can add the check 'proto == htons(ETH_P_XDSA)' to prevent code from calling tag_ops->flow_dissect() on Tx. I also decided to initialize 'offset' variable so tagger callbacks can now safely leave it untouched without provoking a chaos. Fixes: 43e665287f93 ("net-next: dsa: fix flow dissection") Signed-off-by: Alexander Lobakin <alobakin@dlink.ru> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net: bridge: deny dev_set_mac_address() when unregisteringNikolay Aleksandrov
commit c4b4c421857dc7b1cf0dccbd738472360ff2cd70 upstream. We have an interesting memory leak in the bridge when it is being unregistered and is a slave to a master device which would change the mac of its slaves on unregister (e.g. bond, team). This is a very unusual setup but we do end up leaking 1 fdb entry because dev_set_mac_address() would cause the bridge to insert the new mac address into its table after all fdbs are flushed, i.e. after dellink() on the bridge has finished and we call NETDEV_UNREGISTER the bond/team would release it and will call dev_set_mac_address() to restore its original address and that in turn will add an fdb in the bridge. One fix is to check for the bridge dev's reg_state in its ndo_set_mac_address callback and return an error if the bridge is not in NETREG_REGISTERED. Easy steps to reproduce: 1. add bond in mode != A/B 2. add any slave to the bond 3. add bridge dev as a slave to the bond 4. destroy the bridge device Trace: unreferenced object 0xffff888035c4d080 (size 128): comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s) hex dump (first 32 bytes): 41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00 A..6............ d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00 ...^?........... backtrace: [<00000000ddb525dc>] kmem_cache_alloc+0x155/0x26f [<00000000633ff1e0>] fdb_create+0x21/0x486 [bridge] [<0000000092b17e9c>] fdb_insert+0x91/0xdc [bridge] [<00000000f2a0f0ff>] br_fdb_change_mac_address+0xb3/0x175 [bridge] [<000000001de02dbd>] br_stp_change_bridge_id+0xf/0xff [bridge] [<00000000ac0e32b1>] br_set_mac_address+0x76/0x99 [bridge] [<000000006846a77f>] dev_set_mac_address+0x63/0x9b [<00000000d30738fc>] __bond_release_one+0x3f6/0x455 [bonding] [<00000000fc7ec01d>] bond_netdev_event+0x2f2/0x400 [bonding] [<00000000305d7795>] notifier_call_chain+0x38/0x56 [<0000000028885d4a>] call_netdevice_notifiers+0x1e/0x23 [<000000008279477b>] rollback_registered_many+0x353/0x6a4 [<0000000018ef753a>] unregister_netdevice_many+0x17/0x6f [<00000000ba854b7a>] rtnl_delete_link+0x3c/0x43 [<00000000adf8618d>] rtnl_dellink+0x1dc/0x20a [<000000009b6395fd>] rtnetlink_rcv_msg+0x23d/0x268 Fixes: 43598813386f ("bridge: add local MAC address to forwarding table (v2)") Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31mqprio: Fix out-of-bounds access in mqprio_dumpVladyslav Tarasiuk
commit 9f104c7736904ac72385bbb48669e0c923ca879b upstream. When user runs a command like tc qdisc add dev eth1 root mqprio KASAN stack-out-of-bounds warning is emitted. Currently, NLA_ALIGN macro used in mqprio_dump provides too large buffer size as argument for nla_put and memcpy down the call stack. The flow looks like this: 1. nla_put expects exact object size as an argument; 2. Later it provides this size to memcpy; 3. To calculate correct padding for SKB, nla_put applies NLA_ALIGN macro itself. Therefore, NLA_ALIGN should not be applied to the nla_put parameter. Otherwise it will lead to out-of-bounds memory access in memcpy. Fixes: 4e8b86c06269 ("mqprio: Introduce new hardware offload mode and shaper in mqprio") Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31inet: protect against too small mtu values.Eric Dumazet
commit 501a90c945103e8627406763dac418f20f3837b2 upstream. syzbot was once again able to crash a host by setting a very small mtu on loopback device. Let's make inetdev_valid_mtu() available in include/net/ip.h, and use it in ip_setup_cork(), so that we protect both ip_append_page() and __ip_append_data() Also add a READ_ONCE() when the device mtu is read. Pairs this lockless read with one WRITE_ONCE() in __dev_set_mtu(), even if other code paths might write over this field. Add a big comment in include/linux/netdevice.h about dev->mtu needing READ_ONCE()/WRITE_ONCE() annotations. Hopefully we will add the missing ones in followup patches. [1] refcount_t: saturated; leaking memory. WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 panic+0x2e3/0x75c kernel/panic.c:221 __warn.cold+0x2f/0x3e kernel/panic.c:582 report_bug+0x289/0x300 lib/bug.c:195 fixup_bug arch/x86/kernel/traps.c:174 [inline] fixup_bug arch/x86/kernel/traps.c:169 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027 RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22 Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89 RSP: 0018:ffff88809689f550 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1 R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001 R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40 refcount_add include/linux/refcount.h:193 [inline] skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999 sock_wmalloc+0xf1/0x120 net/core/sock.c:2096 ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383 udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276 inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821 kernel_sendpage+0x92/0xf0 net/socket.c:3794 sock_sendpage+0x8b/0xc0 net/socket.c:936 pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458 splice_from_pipe_feed fs/splice.c:512 [inline] __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636 splice_from_pipe+0x108/0x170 fs/splice.c:671 generic_splice_sendpage+0x3c/0x50 fs/splice.c:842 do_splice_from fs/splice.c:861 [inline] direct_splice_actor+0x123/0x190 fs/splice.c:1035 splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990 do_splice_direct+0x1da/0x2a0 fs/splice.c:1078 do_sendfile+0x597/0xd00 fs/read_write.c:1464 __do_sys_sendfile64 fs/read_write.c:1525 [inline] __se_sys_sendfile64 fs/read_write.c:1511 [inline] __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x441409 Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409 RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005 RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010 R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180 R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000 Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: 1470ddf7f8ce ("inet: Remove explicit write references to sk/inet in ip_append_data") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()Eric Dumazet
commit 2dd5616ecdcebdf5a8d007af64e040d4e9214efe upstream. Use the new tcf_proto_check_kind() helper to make sure user provided value is well formed. BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:606 [inline] BUG: KMSAN: uninit-value in string+0x4be/0x600 lib/vsprintf.c:668 CPU: 0 PID: 12358 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108 __msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245 string_nocheck lib/vsprintf.c:606 [inline] string+0x4be/0x600 lib/vsprintf.c:668 vsnprintf+0x218f/0x3210 lib/vsprintf.c:2510 __request_module+0x2b1/0x11c0 kernel/kmod.c:143 tcf_proto_lookup_ops+0x171/0x700 net/sched/cls_api.c:139 tc_chain_tmplt_add net/sched/cls_api.c:2730 [inline] tc_ctl_chain+0x1904/0x38a0 net/sched/cls_api.c:2850 rtnetlink_rcv_msg+0x115a/0x1580 net/core/rtnetlink.c:5224 netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5242 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x110f/0x1330 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] ___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311 __sys_sendmsg net/socket.c:2356 [inline] __do_sys_sendmsg net/socket.c:2365 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2363 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45a649 Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f0790795c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 000000000045a649 RDX: 0000000000000000 RSI: 0000000020000300 RDI: 0000000000000006 RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f07907966d4 R13: 00000000004c8db5 R14: 00000000004df630 R15: 00000000ffffffff Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline] kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132 kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86 slab_alloc_node mm/slub.c:2773 [inline] __kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381 __kmalloc_reserve net/core/skbuff.c:141 [inline] __alloc_skb+0x306/0xa10 net/core/skbuff.c:209 alloc_skb include/linux/skbuff.h:1049 [inline] netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline] netlink_sendmsg+0x783/0x1330 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] ___sys_sendmsg+0x14ff/0x1590 net/socket.c:2311 __sys_sendmsg net/socket.c:2356 [inline] __do_sys_sendmsg net/socket.c:2365 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2363 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2363 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 6f96c3c6904c ("net_sched: fix backward compatibility for TCA_KIND") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31SUNRPC: Fix another issue with MIC buffer spaceChuck Lever
commit e8d70b321ecc9b23d09b8df63e38a2f73160c209 upstream. xdr_shrink_pagelen() BUG's when @len is larger than buf->page_len. This can happen when xdr_buf_read_mic() is given an xdr_buf with a small page array (like, only a few bytes). Instead, just cap the number of bytes that xdr_shrink_pagelen() will move. Fixes: 5f1bc39979d ("SUNRPC: Fix buffer handling of GSS MIC ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-31mac80211: Do not send Layer 2 Update frame before authorizationJouni Malinen
commit 3e493173b7841259a08c5c8e5cbe90adb349da7e upstream. The Layer 2 Update frame is used to update bridges when a station roams to another AP even if that STA does not transmit any frames after the reassociation. This behavior was described in IEEE Std 802.11F-2003 as something that would happen based on MLME-ASSOCIATE.indication, i.e., before completing 4-way handshake. However, this IEEE trial-use recommended practice document was published before RSN (IEEE Std 802.11i-2004) and as such, did not consider RSN use cases. Furthermore, IEEE Std 802.11F-2003 was withdrawn in 2006 and as such, has not been maintained amd should not be used anymore. Sending out the Layer 2 Update frame immediately after association is fine for open networks (and also when using SAE, FT protocol, or FILS authentication when the station is actually authenticated by the time association completes). However, it is not appropriate for cases where RSN is used with PSK or EAP authentication since the station is actually fully authenticated only once the 4-way handshake completes after authentication and attackers might be able to use the unauthenticated triggering of Layer 2 Update frame transmission to disrupt bridge behavior. Fix this by postponing transmission of the Layer 2 Update frame from station entry addition to the point when the station entry is marked authorized. Similarly, send out the VLAN binding update only if the STA entry has already been authorized. Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19xfrm: release device reference for invalid stateXiaodong Xu
commit 4944a4b1077f74d89073624bd286219d2fcbfce3 upstream. An ESP packet could be decrypted in async mode if the input handler for this packet returns -EINPROGRESS in xfrm_input(). At this moment the device reference in skb is held. Later xfrm_input() will be invoked again to resume the processing. If the transform state is still valid it would continue to release the device reference and there won't be a problem; however if the transform state is not valid when async resumption happens, the packet will be dropped while the device reference is still being held. When the device is deleted for some reason and the reference to this device is not properly released, the kernel will keep logging like: unregister_netdevice: waiting for ppp2 to become free. Usage count = 1 The issue is observed when running IPsec traffic over a PPPoE device based on a bridge interface. By terminating the PPPoE connection on the server end for multiple times, the PPPoE device on the client side will eventually get stuck on the above warning message. This patch will check the async mode first and continue to release device reference in async resumption, before it is dropped due to invalid state. v2: Do not assign address family from outer_mode in the transform if the state is invalid v3: Release device reference in the error path instead of jumping to resume Fixes: 4ce3dbe397d7b ("xfrm: Fix xfrm_input() to verify state is valid when (encap_type < 0)") Signed-off-by: Xiaodong Xu <stid.smth@gmail.com> Reported-by: Bo Chen <chenborfc@163.com> Tested-by: Bo Chen <chenborfc@163.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19tipc: fix link name length checkJohn Rutherford
commit fd567ac20cb0377ff466d3337e6e9ac5d0cb15e4 upstream. In commit 4f07b80c9733 ("tipc: check msg->req data len in tipc_nl_compat_bearer_disable") the same patch code was copied into routines: tipc_nl_compat_bearer_disable(), tipc_nl_compat_link_stat_dump() and tipc_nl_compat_link_reset_stats(). The two link routine occurrences should have been modified to check the maximum link name length and not bearer name length. Fixes: 4f07b80c9733 ("tipc: check msg->reg data len in tipc_nl_compat_bearer_disable") Signed-off-by: John Rutherford <john.rutherford@dektech.com.au> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net/tls: use sg_next() to walk sg entriesJakub Kicinski
commit c5daa6cccdc2f94aca2c9b3fa5f94e4469997293 upstream. Partially sent record cleanup path increments an SG entry directly instead of using sg_next(). This should not be a problem today, as encrypted messages should be always allocated as arrays. But given this is a cleanup path it's easy to miss was this ever to change. Use sg_next(), and simplify the code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net/tls: remove the dead inplace_crypto codeJakub Kicinski
commit 9e5ffed37df68d0ccfb2fdc528609e23a1e70ebe upstream. Looks like when BPF support was added by commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling") and commit d829e9c4112b ("tls: convert to generic sk_msg interface") it broke/removed the support for in-place crypto as added by commit 4e6d47206c32 ("tls: Add support for inplace records encryption"). The inplace_crypto member of struct tls_rec is dead, inited to zero, and sometimes set to zero again. It used to be set to 1 when record was allocated, but the skmsg code doesn't seem to have been written with the idea of in-place crypto in mind. Since non trivial effort is required to bring the feature back and we don't really have the HW to measure the benefit just remove the left over support for now to avoid confusing readers. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net: skmsg: fix TLS 1.3 crash with full sk_msgJakub Kicinski
commit 031097d9e079e40dce401031d1012e83d80eaf01 upstream. TLS 1.3 started using the entry at the end of the SG array for chaining-in the single byte content type entry. This mostly works: [ E E E E E E . . ] ^ ^ start end E < content type / [ E E E E E E C . ] ^ ^ start end (Where E denotes a populated SG entry; C denotes a chaining entry.) If the array is full, however, the end will point to the start: [ E E E E E E E E ] ^ start end And we end up overwriting the start: E < content type / [ C E E E E E E E ] ^ start end The sg array is supposed to be a circular buffer with start and end markers pointing anywhere. In case where start > end (i.e. the circular buffer has "wrapped") there is an extra entry reserved at the end to chain the two halves together. [ E E E E E E . . l ] (Where l is the reserved entry for "looping" back to front. As suggested by John, let's reserve another entry for chaining SG entries after the main circular buffer. Note that this entry has to be pointed to by the end entry so its position is not fixed. Examples of full messages: [ E E E E E E E E . l ] ^ ^ start end <---------------. [ E E . E E E E E E l ] ^ ^ end start Now the end will always point to an unused entry, so TLS 1.3 can always use it. Fixes: 130b392c6cd6 ("net: tls: Add tls 1.3 support") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net: sockmap: use bitmap for copy infoJakub Kicinski
commit 163ab96b52ae2bb2d8f188cd29f0b570610f9007 upstream. Don't use bool array in struct sk_msg_sg, save 12 bytes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net/tls: free the record on encryption errorJakub Kicinski
commit d10523d0b3d78153ee58d19853ced26c9004c8c4 upstream. When tls_do_encryption() fails the SG lists are left with the SG_END and SG_CHAIN marks in place. One could hope that once encryption fails we will never see the record again, but that is in fact not true. Commit d3b18ad31f93 ("tls: add bpf support to sk_msg handling") added special handling to ENOMEM and ENOSPC errors which mean we may see the same record re-submitted. As suggested by John free the record, the BPF code is already doing just that. Reported-by: syzbot+df0d4ec12332661dd1f9@syzkaller.appspotmail.com Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net/tls: take into account that bpf_exec_tx_verdict() may free the recordJakub Kicinski
commit c329ef9684de9517d82af5b4758c9e1b64a8a11a upstream. bpf_exec_tx_verdict() may free the record if tls_push_record() fails, or if the entire record got consumed by BPF. Re-check ctx->open_rec before touching the data. Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19openvswitch: remove another BUG_ON()Paolo Abeni
commit 8a574f86652a4540a2433946ba826ccb87f398cc upstream. If we can't build the flow del notification, we can simply delete the flow, no need to crash the kernel. Still keep a WARN_ON to preserve debuggability. Note: the BUG_ON() predates the Fixes tag, but this change can be applied only after the mentioned commit. v1 -> v2: - do not leak an skb on error Fixes: aed067783e50 ("openvswitch: Minimize ovs_flow_cmd_del critical section.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19openvswitch: drop unneeded BUG_ON() in ovs_flow_cmd_build_info()Paolo Abeni
commit 8ffeb03fbba3b599690b361467bfd2373e8c450f upstream. All the callers of ovs_flow_cmd_build_info() already deal with error return code correctly, so we can handle the error condition in a more gracefull way. Still dump a warning to preserve debuggability. v1 -> v2: - clarify the commit message - clean the skb and report the error (DaveM) Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19sctp: cache netns in sctp_ep_commonXin Long
commit 312434617cb16be5166316cf9d08ba760b1042a1 upstream. This patch is to fix a data-race reported by syzbot: BUG: KCSAN: data-race in sctp_assoc_migrate / sctp_hash_obj write to 0xffff8880b67c0020 of 8 bytes by task 18908 on cpu 1: sctp_assoc_migrate+0x1a6/0x290 net/sctp/associola.c:1091 sctp_sock_migrate+0x8aa/0x9b0 net/sctp/socket.c:9465 sctp_accept+0x3c8/0x470 net/sctp/socket.c:4916 inet_accept+0x7f/0x360 net/ipv4/af_inet.c:734 __sys_accept4+0x224/0x430 net/socket.c:1754 __do_sys_accept net/socket.c:1795 [inline] __se_sys_accept net/socket.c:1792 [inline] __x64_sys_accept+0x4e/0x60 net/socket.c:1792 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffff8880b67c0020 of 8 bytes by task 12003 on cpu 0: sctp_hash_obj+0x4f/0x2d0 net/sctp/input.c:894 rht_key_get_hash include/linux/rhashtable.h:133 [inline] rht_key_hashfn include/linux/rhashtable.h:159 [inline] rht_head_hashfn include/linux/rhashtable.h:174 [inline] head_hashfn lib/rhashtable.c:41 [inline] rhashtable_rehash_one lib/rhashtable.c:245 [inline] rhashtable_rehash_chain lib/rhashtable.c:276 [inline] rhashtable_rehash_table lib/rhashtable.c:316 [inline] rht_deferred_worker+0x468/0xab0 lib/rhashtable.c:420 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 It was caused by rhashtable access asoc->base.sk when sctp_assoc_migrate is changing its value. However, what rhashtable wants is netns from asoc base.sk, and for an asoc, its netns won't change once set. So we can simply fix it by caching netns since created. Fixes: d6c0256a60e6 ("sctp: add the rhashtable apis for sctp global transport hashtable") Reported-by: syzbot+e3b35fe7918ff0ee474e@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19sctp: Fix memory leak in sctp_sf_do_5_2_4_dupcookNavid Emamdoost
commit b6631c6031c746ed004c4221ec0616d7a520f441 upstream. In the implementation of sctp_sf_do_5_2_4_dupcook() the allocated new_asoc is leaked if security_sctp_assoc_request() fails. Release it via sctp_association_free(). Fixes: 2277c7cd75e3 ("sctp: Add LSM hooks") Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19openvswitch: fix flow command message sizePaolo Abeni
commit 4e81c0b3fa93d07653e2415fa71656b080a112fd upstream. When user-space sets the OVS_UFID_F_OMIT_* flags, and the relevant flow has no UFID, we can exceed the computed size, as ovs_nla_put_identifier() will always dump an OVS_FLOW_ATTR_KEY attribute. Take the above in account when computing the flow command message size. Fixes: 74ed7ab9264c ("openvswitch: Add support for unique flow IDs.") Reported-by: Qi Jun Ding <qding@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net: sched: fix `tc -s class show` no bstats on class with nolock subqueuesDust Li
commit 14e54ab9143fa60794d13ea0a66c792a2046a8f3 upstream. When a classful qdisc's child qdisc has set the flag TCQ_F_CPUSTATS (pfifo_fast for example), the child qdisc's cpu_bstats should be passed to gnet_stats_copy_basic(), but many classful qdisc didn't do that. As a result, `tc -s class show dev DEV` always return 0 for bytes and packets in this case. Pass the child qdisc's cpu_bstats to gnet_stats_copy_basic() to fix this issue. The qstats also has this problem, but it has been fixed in 5dd431b6b9 ("net: sched: introduce and use qstats read...") and bstats still remains buggy. Fixes: 22e0f8b9322c ("net: sched: make bstats per cpu and estimator RCU safe") Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19net: psample: fix skb_over_panicNikolay Aleksandrov
commit 7eb9d7675c08937cd11d32b0b40442d4d731c5ee upstream. We need to calculate the skb size correctly otherwise we risk triggering skb_over_panic[1]. The issue is that data_len is added to the skb in a nl attribute, but we don't account for its header size (nlattr 4 bytes) and alignment. We account for it when calculating the total size in the > PSAMPLE_MAX_PACKET_SIZE comparison correctly, but not when allocating after that. The fix is simple - use nla_total_size() for data_len when allocating. To reproduce: $ tc qdisc add dev eth1 clsact $ tc filter add dev eth1 egress matchall action sample rate 1 group 1 trunc 129 $ mausezahn eth1 -b bcast -a rand -c 1 -p 129 < skb_over_panic BUG(), tail is 4 bytes past skb->end > [1] Trace: [ 50.459526][ T3480] skbuff: skb_over_panic: text:(____ptrval____) len:196 put:136 head:(____ptrval____) data:(____ptrval____) tail:0xc4 end:0xc0 dev:<NULL> [ 50.474339][ T3480] ------------[ cut here ]------------ [ 50.481132][ T3480] kernel BUG at net/core/skbuff.c:108! [ 50.486059][ T3480] invalid opcode: 0000 [#1] PREEMPT SMP [ 50.489463][ T3480] CPU: 3 PID: 3480 Comm: mausezahn Not tainted 5.4.0-rc7 #108 [ 50.492844][ T3480] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 [ 50.496551][ T3480] RIP: 0010:skb_panic+0x79/0x7b [ 50.498261][ T3480] Code: bc 00 00 00 41 57 4c 89 e6 48 c7 c7 90 29 9a 83 4c 8b 8b c0 00 00 00 50 8b 83 b8 00 00 00 50 ff b3 c8 00 00 00 e8 ae ef c0 fe <0f> 0b e8 2f df c8 fe 48 8b 55 08 44 89 f6 4c 89 e7 48 c7 c1 a0 22 [ 50.504111][ T3480] RSP: 0018:ffffc90000447a10 EFLAGS: 00010282 [ 50.505835][ T3480] RAX: 0000000000000087 RBX: ffff888039317d00 RCX: 0000000000000000 [ 50.507900][ T3480] RDX: 0000000000000000 RSI: ffffffff812716e1 RDI: 00000000ffffffff [ 50.509820][ T3480] RBP: ffffc90000447a60 R08: 0000000000000001 R09: 0000000000000000 [ 50.511735][ T3480] R10: ffffffff81d4f940 R11: 0000000000000000 R12: ffffffff834a22b0 [ 50.513494][ T3480] R13: ffffffff82c10433 R14: 0000000000000088 R15: ffffffff838a8084 [ 50.515222][ T3480] FS: 00007f3536462700(0000) GS:ffff88803eac0000(0000) knlGS:0000000000000000 [ 50.517135][ T3480] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 50.518583][ T3480] CR2: 0000000000442008 CR3: 000000003b222000 CR4: 00000000000006e0 [ 50.520723][ T3480] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 50.522709][ T3480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 50.524450][ T3480] Call Trace: [ 50.525214][ T3480] skb_put.cold+0x1b/0x1b [ 50.526171][ T3480] psample_sample_packet+0x1d3/0x340 [ 50.527307][ T3480] tcf_sample_act+0x178/0x250 [ 50.528339][ T3480] tcf_action_exec+0xb1/0x190 [ 50.529354][ T3480] mall_classify+0x67/0x90 [ 50.530332][ T3480] tcf_classify+0x72/0x160 [ 50.531286][ T3480] __dev_queue_xmit+0x3db/0xd50 [ 50.532327][ T3480] dev_queue_xmit+0x18/0x20 [ 50.533299][ T3480] packet_sendmsg+0xee7/0x2090 [ 50.534331][ T3480] sock_sendmsg+0x54/0x70 [ 50.535271][ T3480] __sys_sendto+0x148/0x1f0 [ 50.536252][ T3480] ? tomoyo_file_ioctl+0x23/0x30 [ 50.537334][ T3480] ? ksys_ioctl+0x5e/0xb0 [ 50.540068][ T3480] __x64_sys_sendto+0x2a/0x30 [ 50.542810][ T3480] do_syscall_64+0x73/0x1f0 [ 50.545383][ T3480] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 50.548477][ T3480] RIP: 0033:0x7f35357d6fb3 [ 50.551020][ T3480] Code: 48 8b 0d 18 90 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d f9 d3 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 eb f6 ff ff 48 89 04 24 [ 50.558547][ T3480] RSP: 002b:00007ffe0c7212c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 50.561870][ T3480] RAX: ffffffffffffffda RBX: 0000000001dac010 RCX: 00007f35357d6fb3 [ 50.565142][ T3480] RDX: 0000000000000082 RSI: 0000000001dac2a2 RDI: 0000000000000003 [ 50.568469][ T3480] RBP: 00007ffe0c7212f0 R08: 00007ffe0c7212d0 R09: 0000000000000014 [ 50.571731][ T3480] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000082 [ 50.574961][ T3480] R13: 0000000001dac2a2 R14: 0000000000000001 R15: 0000000000000003 [ 50.578170][ T3480] Modules linked in: sch_ingress virtio_net [ 50.580976][ T3480] ---[ end trace 61a515626a595af6 ]--- CC: Yotam Gigi <yotamg@mellanox.com> CC: Jiri Pirko <jiri@mellanox.com> CC: Jamal Hadi Salim <jhs@mojatatu.com> CC: Simon Horman <simon.horman@netronome.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 6ae0a6286171 ("net: Introduce psample, a new genetlink channel for packet sampling") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19xfrm: Fix memleak on xfrm state destroySteffen Klassert
commit 86c6739eda7d2a03f2db30cbee67a5fb81afa8ba upstream. We leak the page that we use to create skb page fragments when destroying the xfrm_state. Fix this by dropping a page reference if a page was assigned to the xfrm_state. Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible") Reported-by: JD <jdtxs00@gmail.com> Reported-by: Paul Wouters <paul@nohats.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19mac80211: fix station inactive_time shortly after bootAhmed Zaki
commit 285531f9e6774e3be71da6673d475ff1a088d675 upstream. In the first 5 minutes after boot (time of INITIAL_JIFFIES), ieee80211_sta_last_active() returns zero if last_ack is zero. This leads to "inactive time" showing jiffies_to_msecs(jiffies). # iw wlan0 station get fc:ec:da:64:a6:dd Station fc:ec:da:64:a6:dd (on wlan0) inactive time: 4294894049 ms . . connected time: 70 seconds Fix by returning last_rx if last_ack == 0. Signed-off-by: Ahmed Zaki <anzaki@gmail.com> Link: https://lore.kernel.org/r/20191031121243.27694-1-anzaki@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19bridge: ebtables: don't crash when using dnat target in output chainsFlorian Westphal
commit b23c0742c2ce7e33ed79d10e451f70fdb5ca85d1 upstream. xt_in() returns NULL in the output hook, skip the pkt_type change for that case, redirection only makes sense in broute/prerouting hooks. Reported-by: Tom Yan <tom.ty89@gmail.com> Cc: Linus Lüssing <linus.luessing@c0d3.blue> Fixes: cf3cb246e277d ("bridge: ebtables: fix reception of frames DNAT-ed to bridge device/port") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19netfilter: ipset: Fix nla_policies to fully support NL_VALIDATE_STRICTJozsef Kadlecsik
commit 1289975643f4cdecb071dc641059a47679fd170f upstream. Since v5.2 (commit "netlink: re-add parse/validate functions in strict mode") NL_VALIDATE_STRICT is enabled. Fix the ipset nla_policies which did not support strict mode and convert from deprecated parsings to verified ones. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-19netfilter: ipset: Fix the last missing check of nla_parse_deprecated()Jozsef Kadlecsik
commit 13c6ba1f855415cf3b9c58ea926ae8858050ec1c upstream. In dump_init() the outdated comment was incorrect and we had a missing validation check of nla_parse_deprecated(). Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09vhost/vsock: split packets to send using multiple buffersStefano Garzarella
commit 6dbd3e66e7785a2f055bf84d98de9b8fd31ff3f5 upstream. If the packets to sent to the guest are bigger than the buffer available, we can split them, using multiple buffers and fixing the length in the packet header. This is safe since virtio-vsock supports only stream sockets. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09taprio: don't reject same mqprio settingsIvan Khoronzhuk
commit b5a0faa3572ac70bd374bd66190ac3ad4fddab20 upstream. The taprio qdisc allows to set mqprio setting but only once. In case if mqprio settings are provided next time the error is returned as it's not allowed to change traffic class mapping in-flignt and that is normal. But if configuration is absolutely the same - no need to return error. It allows to provide same command couple times, changing only base time for instance, or changing only scheds maps, but leaving mqprio setting w/o modification. It more corresponds the message: "Changing the traffic mapping of a running schedule is not supported", so reject mqprio if it's really changed. Also corrected TC_BITMASK + 1 for consistency, as proposed. Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09ipv6/route: return if there is no fib_nh_gw_familyHangbin Liu
commit 004b39427f945696db30abb2c4e1a3856ffff819 upstream. Previously we will return directly if (!rt || !rt->fib6_nh.fib_nh_gw_family) in function rt6_probe(), but after commit cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh"), the logic changed to return if there is fib_nh_gw_family. Fixes: cc3a86c802f0 ("ipv6: Change rt6_probe to take a fib6_nh") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09net/tls: enable sk_msg redirect to tls socket egressWillem de Bruijn
commit d4ffb02dee2fcb20e0c8086a8d1305bf885820bb upstream. Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket with TLS_TX takes the following path: tcp_bpf_sendmsg_redir tcp_bpf_push_locked tcp_bpf_push kernel_sendpage_locked sock->ops->sendpage_locked Also update the flags test in tls_sw_sendpage_locked to allow flag MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this. Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t Link: https://github.com/wdebruij/kerneltools/commits/icept.2 Fixes: 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP") Fixes: f3de19af0f5b ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09net: sched: ensure opts_len <= IP_TUNNEL_OPTS_MAX in act_tunnel_keyXin Long
commit 4f0e97d070984d487df027f163e52bb72d1713d8 upstream. info->options_len is 'u8' type, and when opts_len with a value > IP_TUNNEL_OPTS_MAX, 'info->options_len = opts_len' will cast int to u8 and set a wrong value to info->options_len. Kernel crashed in my test when doing: # opts="0102:80:00800022" # for i in {1..99}; do opts="$opts,0102:80:00800022"; done # ip link add name geneve0 type geneve dstport 0 external # tc qdisc add dev eth0 ingress # tc filter add dev eth0 protocol ip parent ffff: \ flower indev eth0 ip_proto udp action tunnel_key \ set src_ip 10.0.99.192 dst_ip 10.0.99.193 \ dst_port 6081 id 11 geneve_opts $opts \ action mirred egress redirect dev geneve0 So we should do the similar check as cls_flower does, return error when opts_len > IP_TUNNEL_OPTS_MAX in tunnel_key_copy_opts(). Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09net/sched: act_pedit: fix WARN() in the traffic pathDavide Caratti
commit f67169fef8dbcc1ac6a6a109ecaad0d3b259002c upstream. when configuring act_pedit rules, the number of keys is validated only on addition of a new entry. This is not sufficient to avoid hitting a WARN() in the traffic path: for example, it is possible to replace a valid entry with a new one having 0 extended keys, thus causing splats in dmesg like: pedit BUG: index 42 WARNING: CPU: 2 PID: 4054 at net/sched/act_pedit.c:410 tcf_pedit_act+0xc84/0x1200 [act_pedit] [...] RIP: 0010:tcf_pedit_act+0xc84/0x1200 [act_pedit] Code: 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ac 00 00 00 48 8b 44 24 10 48 c7 c7 a0 c4 e4 c0 8b 70 18 e8 1c 30 95 ea <0f> 0b e9 a0 fa ff ff e8 00 03 f5 ea e9 14 f4 ff ff 48 89 58 40 e9 RSP: 0018:ffff888077c9f320 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffac2983a2 RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888053927bec RBP: dffffc0000000000 R08: ffffed100a726209 R09: ffffed100a726209 R10: 0000000000000001 R11: ffffed100a726208 R12: ffff88804beea780 R13: ffff888079a77400 R14: ffff88804beea780 R15: ffff888027ab2000 FS: 00007fdeec9bd740(0000) GS:ffff888053900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffdb3dfd000 CR3: 000000004adb4006 CR4: 00000000001606e0 Call Trace: tcf_action_exec+0x105/0x3f0 tcf_classify+0xf2/0x410 __dev_queue_xmit+0xcbf/0x2ae0 ip_finish_output2+0x711/0x1fb0 ip_output+0x1bf/0x4b0 ip_send_skb+0x37/0xa0 raw_sendmsg+0x180c/0x2430 sock_sendmsg+0xdb/0x110 __sys_sendto+0x257/0x2b0 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0xa5/0x4e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fdeeb72e993 Code: 48 8b 0d e0 74 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 0d d6 2c 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 4b cc 00 00 48 89 04 24 RSP: 002b:00007ffdb3de8a18 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055c81972b700 RCX: 00007fdeeb72e993 RDX: 0000000000000040 RSI: 000055c81972b700 RDI: 0000000000000003 RBP: 00007ffdb3dea130 R08: 000055c819728510 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 R13: 000055c81972b6c0 R14: 000055c81972969c R15: 0000000000000080 Fix this moving the check on 'nkeys' earlier in tcf_pedit_init(), so that attempts to install rules having 0 keys are always rejected with -EINVAL. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09net: rtnetlink: prevent underflows in do_setvfinfo()Dan Carpenter
commit ff08ddba3a55caadd0ae531975b06b407d008ae7 upstream. The "ivm->vf" variable is a u32, but the problem is that a number of drivers cast it to an int and then forget to check for negatives. An example of this is in the cxgb4 driver. drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 2890 static int cxgb4_mgmt_get_vf_config(struct net_device *dev, 2891 int vf, struct ifla_vf_info *ivi) ^^^^^^ 2892 { 2893 struct port_info *pi = netdev_priv(dev); 2894 struct adapter *adap = pi->adapter; 2895 struct vf_info *vfinfo; 2896 2897 if (vf >= adap->num_vfs) ^^^^^^^^^^^^^^^^^^^ 2898 return -EINVAL; 2899 vfinfo = &adap->vfinfo[vf]; ^^^^^^^^^^^^^^^^^^^^^^^^^^ There are 48 functions affected. drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8435 hclge_set_vf_vlan_filter() warn: can 'vfid' underflow 's32min-2147483646' drivers/net/ethernet/freescale/enetc/enetc_pf.c:377 enetc_pf_set_vf_mac() warn: can 'vf' underflow 's32min-2147483646' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2899 cxgb4_mgmt_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:2960 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3019 cxgb4_mgmt_set_vf_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3038 cxgb4_mgmt_set_vf_vlan() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:3086 cxgb4_mgmt_set_vf_link_state() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/chelsio/cxgb/cxgb2.c:791 get_eeprom() warn: can 'i' underflow 's32min-(-4),0,4-s32max' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:82 bnxt_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:164 bnxt_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:186 bnxt_get_vf_config() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:228 bnxt_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:264 bnxt_set_vf_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:293 bnxt_set_vf_bw() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c:333 bnxt_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:2595 bnx2x_vf_op_prep() warn: can 'vfidx' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2281 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2285 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2286 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2292 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/broadcom/bnx2x/bnx2x_vfpf.c:2297 bnx2x_post_vf_bulletin() warn: can 'vf' underflow 's32min-63' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1832 qlcnic_sriov_set_vf_mac() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1864 qlcnic_sriov_set_vf_tx_rate() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:1937 qlcnic_sriov_set_vf_vlan() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2005 qlcnic_sriov_get_vf_config() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c:2036 qlcnic_sriov_set_vf_spoofchk() warn: can 'vf' underflow 's32min-254' drivers/net/ethernet/emulex/benet/be_main.c:1914 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1915 be_get_vf_config() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1922 be_set_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:1951 be_clear_vf_tvt() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2063 be_set_vf_tx_rate() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/emulex/benet/be_main.c:2091 be_set_vf_link_state() warn: can 'vf' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:2609 ice_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3050 ice_get_vf_cfg() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3103 ice_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3181 ice_set_vf_mac() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3237 ice_set_vf_trust() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c:3286 ice_set_vf_link_state() warn: can 'vf_id' underflow 's32min-65534' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3919 i40e_validate_vf() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:3957 i40e_ndo_set_vf_mac() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4104 i40e_ndo_set_vf_port_vlan() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4263 i40e_ndo_set_vf_bw() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4309 i40e_ndo_get_vf_config() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4371 i40e_ndo_set_vf_link_state() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4441 i40e_ndo_set_vf_spoofchk() warn: can 'vf_id' underflow 's32min-2147483646' drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4504 i40e_ndo_set_vf_trust() warn: can 'vf_id' underflow 's32min-2147483646' Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09net/smc: fix fastopen for non-blocking connect()Ursula Braun
commit 8204df72bea1a7d83d0777add6da98a41dfbdc34 upstream. FASTOPEN does not work with SMC-sockets. Since SMC allows fallback to TCP native during connection start, the FASTOPEN setsockopts trigger this fallback, if the SMC-socket is still in state SMC_INIT. But if a FASTOPEN setsockopt is called after a non-blocking connect(), this is broken, and fallback does not make sense. This change complements commit cd2063604ea6 ("net/smc: avoid fallback in case of non-blocking connect") and fixes the syzbot reported problem "WARNING in smc_unhash_sk". Reported-by: syzbot+8488cc4cf1c9e09b8b86@syzkaller.appspotmail.com Fixes: e1bbdd570474 ("net/smc: reduce sock_put() for fallback sockets") Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2020-01-09ipmr: Fix skb headroom in ipmr_get_route().Guillaume Nault
commit 7901cd97963d6cbde88fa25a4a446db3554c16c6 upstream. In route.c, inet_rtm_getroute_build_skb() creates an skb with no headroom. This skb is then used by inet_rtm_getroute() which may pass it to rt_fill_info() and, from there, to ipmr_get_route(). The later might try to reuse this skb by cloning it and prepending an IPv4 header. But since the original skb has no headroom, skb_push() triggers skb_under_panic(): skbuff: skb_under_panic: text:00000000ca46ad8a len:80 put:20 head:00000000cd28494e data:000000009366fd6b tail:0x3c end:0xec0 dev:veth0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:108! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 6 PID: 587 Comm: ip Not tainted 5.4.0-rc6+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014 RIP: 0010:skb_panic+0xbf/0xd0 Code: 41 a2 ff 8b 4b 70 4c 8b 4d d0 48 c7 c7 20 76 f5 8b 44 8b 45 bc 48 8b 55 c0 48 8b 75 c8 41 54 41 57 41 56 41 55 e8 75 dc 7a ff <0f> 0b 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 RSP: 0018:ffff888059ddf0b0 EFLAGS: 00010286 RAX: 0000000000000086 RBX: ffff888060a315c0 RCX: ffffffff8abe4822 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88806c9a79cc RBP: ffff888059ddf118 R08: ffffed100d9361b1 R09: ffffed100d9361b0 R10: ffff88805c68aee3 R11: ffffed100d9361b1 R12: ffff88805d218000 R13: ffff88805c689fec R14: 000000000000003c R15: 0000000000000ec0 FS: 00007f6af184b700(0000) GS:ffff88806c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffc8204a000 CR3: 0000000057b40006 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_push+0x7e/0x80 ipmr_get_route+0x459/0x6fa rt_fill_info+0x692/0x9f0 inet_rtm_getroute+0xd26/0xf20 rtnetlink_rcv_msg+0x45d/0x630 netlink_rcv_skb+0x1a5/0x220 rtnetlink_rcv+0x15/0x20 netlink_unicast+0x305/0x3a0 netlink_sendmsg+0x575/0x730 sock_sendmsg+0xb5/0xc0 ___sys_sendmsg+0x497/0x4f0 __sys_sendmsg+0xcb/0x150 __x64_sys_sendmsg+0x48/0x50 do_syscall_64+0xd2/0xac0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Actually the original skb used to have enough headroom, but the reserve_skb() call was lost with the introduction of inet_rtm_getroute_build_skb() by commit 404eb77ea766 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE"). We could reserve some headroom again in inet_rtm_getroute_build_skb(), but this function shouldn't be responsible for handling the special case of ipmr_get_route(). Let's handle that directly in ipmr_get_route() by calling skb_realloc_headroom() instead of skb_clone(). Fixes: 404eb77ea766 ("ipv4: support sport, dport and ip_proto in RTM_GETROUTE") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-12-29netfilter: ipset: Copy the right MAC address in hash:ip,mac IPv6 setsStefano Brivio
commit 97664bc2c77e2b65cdedddcae2643fc93291d958 upstream. Same as commit 1b4a75108d5b ("netfilter: ipset: Copy the right MAC address in bitmap:ip,mac and hash:ip,mac sets"), another copy and paste went wrong in commit 8cc4ccf58379 ("netfilter: ipset: Allow matching on destination MAC address for mac and ipmac sets"). When I fixed this for IPv4 in 1b4a75108d5b, I didn't realise that hash:ip,mac sets also support IPv6 as family, and this is covered by a separate function, hash_ipmac6_kadt(). In hash:ip,mac sets, the first dimension is the IP address, and the second dimension is the MAC address: check the IPSET_DIM_TWO_SRC flag in flags while deciding which MAC address to copy, destination or source. This way, mixing source and destination matches for the two dimensions of ip,mac hash type works as expected, also for IPv6. With this setup: ip netns add A ip link add veth1 type veth peer name veth2 netns A ip addr add 2001:db8::1/64 dev veth1 ip -net A addr add 2001:db8::2/64 dev veth2 ip link set veth1 up ip -net A link set veth2 up dst=$(ip netns exec A cat /sys/class/net/veth2/address) ip netns exec A ipset create test_hash hash:ip,mac family inet6 ip netns exec A ipset add test_hash 2001:db8::1,${dst} ip netns exec A ip6tables -A INPUT -p icmpv6 --icmpv6-type 135 -j ACCEPT ip netns exec A ip6tables -A INPUT -m set ! --match-set test_hash src,dst -j DROP ipset now correctly matches a test packet: # ping -c1 2001:db8::2 >/dev/null # echo $? 0 Reported-by: Chen, Yi <yiche@redhat.com> Fixes: 8cc4ccf58379 ("netfilter: ipset: Allow matching on destination MAC address for mac and ipmac sets") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-12-29SUNRPC: The RDMA back channel mustn't disappear while requests are outstandingTrond Myklebust
commit 9edb455e6797bb50aa38ef71e62668966065ede8 upstream. If there are RDMA back channel requests being processed by the server threads, then we should hold a reference to the transport to ensure it doesn't get freed from underneath us. Reported-by: Neil Brown <neilb@suse.de> Fixes: 63cae47005af ("xprtrdma: Handle incoming backward direction RPC calls") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-12-29SUNRPC: The TCP back channel mustn't disappear while requests are outstandingTrond Myklebust
commit 875f0706accd6501c3209bb99df8573171fb5d75 upstream. If there are TCP back channel requests being processed by the server threads, then we should hold a reference to the transport to ensure it doesn't get freed from underneath us. Reported-by: Neil Brown <neilb@suse.de> Fixes: 2ea24497a1b3 ("SUNRPC: RPC callbacks may be split across several..") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-12-29ipvs: move old_secure_tcp into struct netns_ipvsEric Dumazet
commit c24b75e0f9239e78105f81c5f03a751641eb07ef upstream. syzbot reported the following issue : BUG: KCSAN: data-race in update_defense_level / update_defense_level read to 0xffffffff861a6260 of 4 bytes by task 3006 on cpu 1: update_defense_level+0x621/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:177 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 write to 0xffffffff861a6260 of 4 bytes by task 7333 on cpu 0: update_defense_level+0xa62/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:205 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 7333 Comm: kworker/0:5 Not tainted 5.4.0-rc3+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events defense_work_handler Indeed, old_secure_tcp is currently a static variable, while it needs to be a per netns variable. Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-12-29ipvs: don't ignore errors in case refcounting ip_vs module failsDavide Caratti
commit 62931f59ce9cbabb934a431f48f2f1f441c605ac upstream. if the IPVS module is removed while the sync daemon is starting, there is a small gap where try_module_get() might fail getting the refcount inside ip_vs_use_count_inc(). Then, the refcounts of IPVS module are unbalanced, and the subsequent call to stop_sync_thread() causes the following splat: WARNING: CPU: 0 PID: 4013 at kernel/module.c:1146 module_put.part.44+0x15b/0x290 Modules linked in: ip_vs(-) nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth ip6table_filter ip6_tables iptable_filter binfmt_misc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ext4 mbcache jbd2 ghash_clmulni_intel snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev pcspkr snd_timer virtio_balloon snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk failover virtio_console qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ata_piix ttm crc32c_intel serio_raw drm virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nf_defrag_ipv6] CPU: 0 PID: 4013 Comm: modprobe Tainted: G W 5.4.0-rc1.upstream+ #741 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:module_put.part.44+0x15b/0x290 Code: 04 25 28 00 00 00 0f 85 18 01 00 00 48 83 c4 68 5b 5d 41 5c 41 5d 41 5e 41 5f c3 89 44 24 28 83 e8 01 89 c5 0f 89 57 ff ff ff <0f> 0b e9 78 ff ff ff 65 8b 1d 67 83 26 4a 89 db be 08 00 00 00 48 RSP: 0018:ffff888050607c78 EFLAGS: 00010297 RAX: 0000000000000003 RBX: ffffffffc1420590 RCX: ffffffffb5db0ef9 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffffc1420590 RBP: 00000000ffffffff R08: fffffbfff82840b3 R09: fffffbfff82840b3 R10: 0000000000000001 R11: fffffbfff82840b2 R12: 1ffff1100a0c0f90 R13: ffffffffc1420200 R14: ffff88804f533300 R15: ffff88804f533ca0 FS: 00007f8ea9720740(0000) GS:ffff888053800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3245abe000 CR3: 000000004c28a006 CR4: 00000000001606f0 Call Trace: stop_sync_thread+0x3a3/0x7c0 [ip_vs] ip_vs_sync_net_cleanup+0x13/0x50 [ip_vs] ops_exit_list.isra.5+0x94/0x140 unregister_pernet_operations+0x29d/0x460 unregister_pernet_device+0x26/0x60 ip_vs_cleanup+0x11/0x38 [ip_vs] __x64_sys_delete_module+0x2d5/0x400 do_syscall_64+0xa5/0x4e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f8ea8bf0db7 Code: 73 01 c3 48 8b 0d b9 80 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 80 2c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffcd38d2fe8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000000002436240 RCX: 00007f8ea8bf0db7 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000024362a8 RBP: 0000000000000000 R08: 00007f8ea8eba060 R09: 00007f8ea8c658a0 R10: 00007ffcd38d2a60 R11: 0000000000000206 R12: 0000000000000000 R13: 0000000000000001 R14: 00000000024362a8 R15: 0000000000000000 irq event stamp: 4538 hardirqs last enabled at (4537): [<ffffffffb6193dde>] quarantine_put+0x9e/0x170 hardirqs last disabled at (4538): [<ffffffffb5a0556a>] trace_hardirqs_off_thunk+0x1a/0x20 softirqs last enabled at (4522): [<ffffffffb6f8ebe9>] sk_common_release+0x169/0x2d0 softirqs last disabled at (4520): [<ffffffffb6f8eb3e>] sk_common_release+0xbe/0x2d0 Check the return value of ip_vs_use_count_inc() and let its caller return proper error. Inside do_ip_vs_set_ctl() the module is already refcounted, we don't need refcount/derefcount there. Finally, in register_ip_vs_app() and start_sync_thread(), take the module refcount earlier and ensure it's released in the error path. Change since v1: - better return values in case of failure of ip_vs_use_count_inc(), thanks to Julian Anastasov - no need to increase/decrease the module refcount in ip_vs_set_ctl(), thanks to Julian Anastasov Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-12-29xsk: Fix registration of Rx-only socketsMagnus Karlsson
commit 2afd23f78f39da84937006ecd24aa664a4ab052b upstream. Having Rx-only AF_XDP sockets can potentially lead to a crash in the system by a NULL pointer dereference in xsk_umem_consume_tx(). This function iterates through a list of all sockets tied to a umem and checks if there are any packets to send on the Tx ring. Rx-only sockets do not have a Tx ring, so this will cause a NULL pointer dereference. This will happen if you have registered one or more Rx-only sockets to a umem and the driver is checking the Tx ring even on Rx, or if the XDP_SHARED_UMEM mode is used and there is a mix of Rx-only and other sockets tied to the same umem. Fixed by only putting sockets with a Tx component on the list that xsk_umem_consume_tx() iterates over. Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions") Reported-by: Kal Cutter Conley <kal.conley@dectris.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1571645818-16244-1-git-send-email-magnus.karlsson@intel.com Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-12-29netfilter: nf_flow_table: set timeout before insertion into hashesPablo Neira Ayuso
commit daf61b026f4686250e6afa619e6d7b49edc61df7 upstream. Other garbage collector might remove an entry not fully set up yet. [570953.958293] RIP: 0010:memcmp+0x9/0x50 [...] [570953.958567] flow_offload_hash_cmp+0x1e/0x30 [nf_flow_table] [570953.958585] flow_offload_lookup+0x8c/0x110 [nf_flow_table] [570953.958606] nf_flow_offload_ip_hook+0x135/0xb30 [nf_flow_table] [570953.958624] nf_flow_offload_inet_hook+0x35/0x37 [nf_flow_table_inet] [570953.958646] nf_hook_slow+0x3c/0xb0 [570953.958664] __netif_receive_skb_core+0x90f/0xb10 [570953.958678] ? ip_rcv_finish+0x82/0xa0 [570953.958692] __netif_receive_skb_one_core+0x3b/0x80 [570953.958711] __netif_receive_skb+0x18/0x60 [570953.958727] netif_receive_skb_internal+0x45/0xf0 [570953.958741] napi_gro_receive+0xcd/0xf0 [570953.958764] ixgbe_clean_rx_irq+0x432/0xe00 [ixgbe] [570953.958782] ixgbe_poll+0x27b/0x700 [ixgbe] [570953.958796] net_rx_action+0x284/0x3c0 [570953.958817] __do_softirq+0xcc/0x27c [570953.959464] irq_exit+0xe8/0x100 [570953.960097] do_IRQ+0x59/0xe0 [570953.960734] common_interrupt+0xf/0xf Fixes: 43c8f131184f ("netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>