aboutsummaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2019-08-02xfrm: clean up xfrm protocol checksCong Wang
commit dbb2483b2a46fbaf833cfb5deb5ed9cace9c7399 upstream. In commit 6a53b7593233 ("xfrm: check id proto in validate_tmpl()") I introduced a check for xfrm protocol, but according to Herbert IPSEC_PROTO_ANY should only be used as a wildcard for lookup, so it should be removed from validate_tmpl(). And, IPSEC_PROTO_ANY is expected to only match 3 IPSec-specific protocols, this is why xfrm_state_flush() could still miss IPPROTO_ROUTING, which leads that those entries are left in net->xfrm.state_all before exit net. Fix this by replacing IPSEC_PROTO_ANY with zero. This patch also extracts the check from validate_tmpl() to xfrm_id_proto_valid() and uses it in parse_ipsecrequest(). With this, no other protocols should be added into xfrm. Fixes: 6a53b7593233 ("xfrm: check id proto in validate_tmpl()") Reported-by: syzbot+0bf0519d6e0de15914fe@syzkaller.appspotmail.com Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02vti4: ipip tunnel deregistration fixes.Jeremy Sowden
commit 5483844c3fc18474de29f5d6733003526e0a9f78 upstream. If tunnel registration failed during module initialization, the module would fail to deregister the IPPROTO_COMP protocol and would attempt to deregister the tunnel. The tunnel was not deregistered during module-exit. Fixes: dd9ee3444014e ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel") Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02xfrm6_tunnel: Fix potential panic when unloading xfrm6_tunnel moduleSu Yanjun
commit 6ee02a54ef990a71bf542b6f0a4e3321de9d9c66 upstream. When unloading xfrm6_tunnel module, xfrm6_tunnel_fini directly frees the xfrm6_tunnel_spi_kmem. Maybe someone has gotten the xfrm6_tunnel_spi, so need to wait it. Fixes: 91cc3bb0b04ff("xfrm6_tunnel: RCU conversion") Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02xfrm: policy: Fix out-of-bound array accesses in __xfrm_policy_unlinkYueHaibing
commit b805d78d300bcf2c83d6df7da0c818b0fee41427 upstream. UBSAN report this: UBSAN: Undefined behaviour in net/xfrm/xfrm_policy.c:1289:24 index 6 is out of range for type 'unsigned int [6]' CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.4.162-514.55.6.9.x86_64+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 0000000000000000 1466cf39b41b23c9 ffff8801f6b07a58 ffffffff81cb35f4 0000000041b58ab3 ffffffff83230f9c ffffffff81cb34e0 ffff8801f6b07a80 ffff8801f6b07a20 1466cf39b41b23c9 ffffffff851706e0 ffff8801f6b07ae8 Call Trace: <IRQ> [<ffffffff81cb35f4>] __dump_stack lib/dump_stack.c:15 [inline] <IRQ> [<ffffffff81cb35f4>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51 [<ffffffff81d94225>] ubsan_epilogue+0x12/0x8f lib/ubsan.c:164 [<ffffffff81d954db>] __ubsan_handle_out_of_bounds+0x16e/0x1b2 lib/ubsan.c:382 [<ffffffff82a25acd>] __xfrm_policy_unlink+0x3dd/0x5b0 net/xfrm/xfrm_policy.c:1289 [<ffffffff82a2e572>] xfrm_policy_delete+0x52/0xb0 net/xfrm/xfrm_policy.c:1309 [<ffffffff82a3319b>] xfrm_policy_timer+0x30b/0x590 net/xfrm/xfrm_policy.c:243 [<ffffffff813d3927>] call_timer_fn+0x237/0x990 kernel/time/timer.c:1144 [<ffffffff813d8e7e>] __run_timers kernel/time/timer.c:1218 [inline] [<ffffffff813d8e7e>] run_timer_softirq+0x6ce/0xb80 kernel/time/timer.c:1401 [<ffffffff8120d6f9>] __do_softirq+0x299/0xe10 kernel/softirq.c:273 [<ffffffff8120e676>] invoke_softirq kernel/softirq.c:350 [inline] [<ffffffff8120e676>] irq_exit+0x216/0x2c0 kernel/softirq.c:391 [<ffffffff82c5edab>] exiting_irq arch/x86/include/asm/apic.h:652 [inline] [<ffffffff82c5edab>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926 [<ffffffff82c5c985>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:735 <EOI> [<ffffffff81188096>] ? native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:52 [<ffffffff810834d7>] arch_safe_halt arch/x86/include/asm/paravirt.h:111 [inline] [<ffffffff810834d7>] default_idle+0x27/0x430 arch/x86/kernel/process.c:446 [<ffffffff81085f05>] arch_cpu_idle+0x15/0x20 arch/x86/kernel/process.c:437 [<ffffffff8132abc3>] default_idle_call+0x53/0x90 kernel/sched/idle.c:92 [<ffffffff8132b32d>] cpuidle_idle_call kernel/sched/idle.c:156 [inline] [<ffffffff8132b32d>] cpu_idle_loop kernel/sched/idle.c:251 [inline] [<ffffffff8132b32d>] cpu_startup_entry+0x60d/0x9a0 kernel/sched/idle.c:299 [<ffffffff8113e119>] start_secondary+0x3c9/0x560 arch/x86/kernel/smpboot.c:245 The issue is triggered as this: xfrm_add_policy -->verify_newpolicy_info //check the index provided by user with XFRM_POLICY_MAX //In my case, the index is 0x6E6BB6, so it pass the check. -->xfrm_policy_construct //copy the user's policy and set xfrm_policy_timer -->xfrm_policy_insert --> __xfrm_policy_link //use the orgin dir, in my case is 2 --> xfrm_gen_index //generate policy index, there is 0x6E6BB6 then xfrm_policy_timer be fired xfrm_policy_timer --> xfrm_policy_id2dir //get dir from (policy index & 7), in my case is 6 --> xfrm_policy_delete --> __xfrm_policy_unlink //access policy_count[dir], trigger out of range access Add xfrm_policy_id2dir check in verify_newpolicy_info, make sure the computed dir is valid, to fix the issue. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: e682adf021be ("xfrm: Try to honor policy index if it's supplied by user") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02vsock/virtio: Initialize core virtio vsock before registering the driverJorge E. Moreira
commit ba95e5dfd36647622d8897a2a0470dde60e59ffd upstream. Avoid a race in which static variables in net/vmw_vsock/af_vsock.c are accessed (while handling interrupts) before they are initialized. [ 4.201410] BUG: unable to handle kernel paging request at ffffffffffffffe8 [ 4.207829] IP: vsock_addr_equals_addr+0x3/0x20 [ 4.211379] PGD 28210067 P4D 28210067 PUD 28212067 PMD 0 [ 4.211379] Oops: 0000 [#1] PREEMPT SMP PTI [ 4.211379] Modules linked in: [ 4.211379] CPU: 1 PID: 30 Comm: kworker/1:1 Not tainted 4.14.106-419297-gd7e28cc1f241 #1 [ 4.211379] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 4.211379] Workqueue: virtio_vsock virtio_transport_rx_work [ 4.211379] task: ffffa3273d175280 task.stack: ffffaea1800e8000 [ 4.211379] RIP: 0010:vsock_addr_equals_addr+0x3/0x20 [ 4.211379] RSP: 0000:ffffaea1800ebd28 EFLAGS: 00010286 [ 4.211379] RAX: 0000000000000002 RBX: 0000000000000000 RCX: ffffffffb94e42f0 [ 4.211379] RDX: 0000000000000400 RSI: ffffffffffffffe0 RDI: ffffaea1800ebdd0 [ 4.211379] RBP: ffffaea1800ebd58 R08: 0000000000000001 R09: 0000000000000001 [ 4.211379] R10: 0000000000000000 R11: ffffffffb89d5d60 R12: ffffaea1800ebdd0 [ 4.211379] R13: 00000000828cbfbf R14: 0000000000000000 R15: ffffaea1800ebdc0 [ 4.211379] FS: 0000000000000000(0000) GS:ffffa3273fd00000(0000) knlGS:0000000000000000 [ 4.211379] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4.211379] CR2: ffffffffffffffe8 CR3: 000000002820e001 CR4: 00000000001606e0 [ 4.211379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4.211379] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4.211379] Call Trace: [ 4.211379] ? vsock_find_connected_socket+0x6c/0xe0 [ 4.211379] virtio_transport_recv_pkt+0x15f/0x740 [ 4.211379] ? detach_buf+0x1b5/0x210 [ 4.211379] virtio_transport_rx_work+0xb7/0x140 [ 4.211379] process_one_work+0x1ef/0x480 [ 4.211379] worker_thread+0x312/0x460 [ 4.211379] kthread+0x132/0x140 [ 4.211379] ? process_one_work+0x480/0x480 [ 4.211379] ? kthread_destroy_worker+0xd0/0xd0 [ 4.211379] ret_from_fork+0x35/0x40 [ 4.211379] Code: c7 47 08 00 00 00 00 66 c7 07 28 00 c7 47 08 ff ff ff ff c7 47 04 ff ff ff ff c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 8b 47 08 <3b> 46 08 75 0a 8b 47 04 3b 46 04 0f 94 c0 c3 31 c0 c3 90 66 2e [ 4.211379] RIP: vsock_addr_equals_addr+0x3/0x20 RSP: ffffaea1800ebd28 [ 4.211379] CR2: ffffffffffffffe8 [ 4.211379] ---[ end trace f31cc4a2e6df3689 ]--- [ 4.211379] Kernel panic - not syncing: Fatal exception in interrupt [ 4.211379] Kernel Offset: 0x37000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 4.211379] Rebooting in 5 seconds.. Fixes: 22b5c0b63f32 ("vsock/virtio: fix kernel panic after device hot-unplug") Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: kvm@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: netdev@vger.kernel.org Cc: kernel-team@android.com Cc: stable@vger.kernel.org [4.9+] Signed-off-by: Jorge E. Moreira <jemoreira@google.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02tipc: fix modprobe tipc failed after switch order of device registrationJunwei Hu
commit 532b0f7ece4cb2ffd24dc723ddf55242d1188e5e upstream. Error message printed: modprobe: ERROR: could not insert 'tipc': Address family not supported by protocol. when modprobe tipc after the following patch: switch order of device registration, commit 7e27e8d6130c ("tipc: switch order of device registration to fix a crash") Because sock_create_kern(net, AF_TIPC, ...) is called by tipc_topsrv_create_listener() in the initialization process of tipc_net_ops, tipc_socket_init() must be execute before that. I move tipc_socket_init() into function tipc_init_net(). Fixes: 7e27e8d6130c ("tipc: switch order of device registration to fix a crash") Signed-off-by: Junwei Hu <hujunwei4@huawei.com> Reported-by: Wang Wang <wangwang2@huawei.com> Reviewed-by: Kang Zhou <zhoukang7@huawei.com> Reviewed-by: Suanming Mou <mousuanming@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02vsock/virtio: free packets during the socket releaseStefano Garzarella
commit ac03046ece2b158ebd204dfc4896fd9f39f0e6c8 upstream. When the socket is released, we should free all packets queued in the per-socket list in order to avoid a memory leak. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02tipc: switch order of device registration to fix a crashJunwei Hu
commit 7e27e8d6130c5e88fac9ddec4249f7f2337fe7f8 upstream. When tipc is loaded while many processes try to create a TIPC socket, a crash occurs: PANIC: Unable to handle kernel paging request at virtual address "dfff20000000021d" pc : tipc_sk_create+0x374/0x1180 [tipc] lr : tipc_sk_create+0x374/0x1180 [tipc] Exception class = DABT (current EL), IL = 32 bits Call trace: tipc_sk_create+0x374/0x1180 [tipc] __sock_create+0x1cc/0x408 __sys_socket+0xec/0x1f0 __arm64_sys_socket+0x74/0xa8 ... This is due to race between sock_create and unfinished register_pernet_device. tipc_sk_insert tries to do "net_generic(net, tipc_net_id)". but tipc_net_id is not initialized yet. So switch the order of the two to close the race. This can be reproduced with multiple processes doing socket(AF_TIPC, ...) and one process doing module removal. Fixes: a62fbccecd62 ("tipc: make subscriber server support net namespace") Signed-off-by: Junwei Hu <hujunwei4@huawei.com> Reported-by: Wang Wang <wangwang2@huawei.com> Reviewed-by: Xiaogang Wang <wangxiaogang3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02rtnetlink: always put IFLA_LINK for links with a link-netnsidSabrina Dubroca
commit feadc4b6cf42a53a8a93c918a569a0b7e62bd350 upstream. Currently, nla_put_iflink() doesn't put the IFLA_LINK attribute when iflink == ifindex. In some cases, a device can be created in a different netns with the same ifindex as its parent. That device will not dump its IFLA_LINK attribute, which can confuse some userspace software that expects it. For example, if the last ifindex created in init_net and foo are both 8, these commands will trigger the issue: ip link add parent type dummy # ifindex 9 ip link add link parent netns foo type macvlan # ifindex 9 in ns foo So, in case a device puts the IFLA_LINK_NETNSID attribute in a dump, always put the IFLA_LINK attribute as well. Thanks to Dan Winship for analyzing the original OpenShift bug down to the missing netlink attribute. v2: change Fixes tag, it's been here forever, as Nicolas Dichtel said add Nicolas' ack v3: change Fixes tag fix subject typo, spotted by Edward Cree Analyzed-by: Dan Winship <danw@redhat.com> Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02net: avoid weird emergency messageEric Dumazet
commit d7c04b05c9ca14c55309eb139430283a45c4c25f upstream. When host is under high stress, it is very possible thread running netdev_wait_allrefs() returns from msleep(250) 10 seconds late. This leads to these messages in the syslog : [...] unregister_netdevice: waiting for syz_tun to become free. Usage count = 0 If the device refcount is zero, the wait is over. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02ipv6: prevent possible fib6 leaksEric Dumazet
commit 61fb0d01680771f72cc9d39783fb2c122aaad51e upstream. At ipv6 route dismantle, fib6_drop_pcpu_from() is responsible for finding all percpu routes and set their ->from pointer to NULL, so that fib6_ref can reach its expected value (1). The problem right now is that other cpus can still catch the route being deleted, since there is no rcu grace period between the route deletion and call to fib6_drop_pcpu_from() This can leak the fib6 and associated resources, since no notifier will take care of removing the last reference(s). I decided to add another boolean (fib6_destroying) instead of reusing/renaming exception_bucket_flushed to ease stable backports, and properly document the memory barriers used to implement this fix. This patch has been co-developped with Wei Wang. Fixes: 93531c674315 ("net/ipv6: separate handling of FIB entries from dst based routes") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Wei Wang <weiwan@google.com> Cc: David Ahern <dsahern@gmail.com> Cc: Martin Lau <kafai@fb.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-08-02ipv6: fix src addr routing with the exception tableWei Wang
[ Upstream commit 510e2ceda031eed97a7a0f9aad65d271a58b460d ] When inserting route cache into the exception table, the key is generated with both src_addr and dest_addr with src addr routing. However, current logic always assumes the src_addr used to generate the key is a /128 host address. This is not true in the following scenarios: 1. When the route is a gateway route or does not have next hop. (rt6_is_gw_or_nonexthop() == false) 2. When calling ip6_rt_cache_alloc(), saddr is passed in as NULL. This means, when looking for a route cache in the exception table, we have to do the lookup twice: first time with the passed in /128 host address, second time with the src_addr stored in fib6_info. This solves the pmtu discovery issue reported by Mikael Magnusson where a route cache with a lower mtu info is created for a gateway route with src addr. However, the lookup code is not able to find this route cache. Fixes: 2b760fcf5cfb ("ipv6: hook up exception table to store dst cache") Reported-by: Mikael Magnusson <mikael.kernel@lists.m7n.se> Bisected-by: David Ahern <dsahern@gmail.com> Signed-off-by: Wei Wang <weiwan@google.com> Cc: Martin Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [PG: use 4.19.46 stable version for the 4.18.x baseline here.] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30tipc: fix hanging clients using poll with EPOLLOUT flagParthasarathy Bhuvaragan
commit ff946833b70e0c7f93de9a3f5b329b5ae2287b38 upstream. commit 517d7c79bdb398 ("tipc: fix hanging poll() for stream sockets") introduced a regression for clients using non-blocking sockets. After the commit, we send EPOLLOUT event to the client even in TIPC_CONNECTING state. This causes the subsequent send() to fail with ENOTCONN, as the socket is still not in TIPC_ESTABLISHED state. In this commit, we: - improve the fix for hanging poll() by replacing sk_data_ready() with sk_state_change() to wake up all clients. - revert the faulty updates introduced by commit 517d7c79bdb398 ("tipc: fix hanging poll() for stream sockets"). Fixes: 517d7c79bdb398 ("tipc: fix hanging poll() for stream sockets") Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.se> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30vrf: sit mtu should not be updated when vrf netdev is the linkStephen Suryaputra
commit ff6ab32bd4e073976e4d8797b4d514a172cfe6cb upstream. VRF netdev mtu isn't typically set and have an mtu of 65536. When the link of a tunnel is set, the tunnel mtu is changed from 1480 to the link mtu minus tunnel header. In the case of VRF netdev is the link, then the tunnel mtu becomes 65516. So, fix it by not setting the tunnel mtu in this case. Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30vlan: disable SIOCSHWTSTAMP in containerHangbin Liu
commit 873017af778439f2f8e3d87f28ddb1fcaf244a76 upstream. With NET_ADMIN enabled in container, a normal user could be mapped to root and is able to change the real device's rx filter via ioctl on vlan, which would affect the other ptp process on host. Fix it by disabling SIOCSHWTSTAMP in container. Fixes: a6111d3c93d0 ("vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30packet: Fix error path in packet_initYueHaibing
commit 36096f2f4fa05f7678bc87397665491700bae757 upstream. kernel BUG at lib/list_debug.c:47! invalid opcode: 0000 [#1 CPU: 0 PID: 12914 Comm: rmmod Tainted: G W 5.1.0+ #47 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__list_del_entry_valid+0x53/0x90 Code: 48 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 00 5d c3 48 89 fe 48 89 c2 48 c7 c7 18 75 fe 82 e8 cb 34 78 ff <0f> 0b 48 89 fe 48 c7 c7 50 75 fe 82 e8 ba 34 78 ff 0f 0b 48 89 f2 RSP: 0018:ffffc90001c2fe40 EFLAGS: 00010286 RAX: 000000000000004e RBX: ffffffffa0184000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff888237a17788 RDI: 00000000ffffffff RBP: ffffc90001c2fe40 R08: 0000000000000000 R09: 0000000000000000 R10: ffffc90001c2fe10 R11: 0000000000000000 R12: 0000000000000000 R13: ffffc90001c2fe50 R14: ffffffffa0184000 R15: 0000000000000000 FS: 00007f3d83634540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555c350ea818 CR3: 0000000231677000 CR4: 00000000000006f0 Call Trace: unregister_pernet_operations+0x34/0x120 unregister_pernet_subsys+0x1c/0x30 packet_exit+0x1c/0x369 [af_packet __x64_sys_delete_module+0x156/0x260 ? lockdep_hardirqs_on+0x133/0x1b0 ? do_syscall_64+0x12/0x1f0 do_syscall_64+0x6e/0x1f0 entry_SYSCALL_64_after_hwframe+0x49/0xbe When modprobe af_packet, register_pernet_subsys fails and does a cleanup, ops->list is set to LIST_POISON1, but the module init is considered to success, then while rmmod it, BUG() is triggered in __list_del_entry_valid which is called from unregister_pernet_subsys. This patch fix error handing path in packet_init to avoid possilbe issue if some error occur. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30net: dsa: Fix error cleanup path in dsa_init_moduleYueHaibing
commit 68be930249d051fd54d3d99156b3dcadcb2a1f9b upstream. BUG: unable to handle kernel paging request at ffffffffa01c5430 PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0 Oops: 0000 [#1 CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:raw_notifier_chain_register+0x16/0x40 Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08 RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282 RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068 RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700 R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78 FS: 00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0 Call Trace: register_netdevice_notifier+0x43/0x250 ? 0xffffffffa01e0000 dsa_slave_register_notifier+0x13/0x70 [dsa_core ? 0xffffffffa01e0000 dsa_init_module+0x2e/0x1000 [dsa_core do_one_initcall+0x6c/0x3cc ? do_init_module+0x22/0x1f1 ? rcu_read_lock_sched_held+0x97/0xb0 ? kmem_cache_alloc_trace+0x325/0x3b0 do_init_module+0x5b/0x1f1 load_module+0x1db1/0x2690 ? m_show+0x1d0/0x1d0 __do_sys_finit_module+0xc5/0xd0 __x64_sys_finit_module+0x15/0x20 do_syscall_64+0x6b/0x1d0 entry_SYSCALL_64_after_hwframe+0x49/0xbe Cleanup allocated resourses if there are errors, otherwise it will trgger memleak. Fixes: c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30ipv4: Fix raw socket lookup for local trafficDavid Ahern
commit 19e4e768064a87b073a4b4c138b55db70e0cfb9f upstream. inet_iif should be used for the raw socket lookup. inet_iif considers rt_iif which handles the case of local traffic. As it stands, ping to a local address with the '-I <dev>' option fails ever since ping was changed to use SO_BINDTODEVICE instead of cmsg + IP_PKTINFO. IPv6 works fine. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL ↵Hangbin Liu
not supplied commit e9919a24d3022f72bcadc407e73a6ef17093a849 upstream. With commit 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") we now able to check if a rule already exists. But this only works with iproute2. For other tools like libnl, NetworkManager, it still could add duplicate rules with only NLM_F_CREATE flag, like [localhost ~ ]# ip rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default 100000: from 192.168.7.5 lookup 5 100000: from 192.168.7.5 lookup 5 As it doesn't make sense to create two duplicate rules, let's just return 0 if the rule exists. Fixes: 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") Reported-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30bridge: Fix error path for kobject_init_and_add()Tobin C. Harding
commit bdfad5aec1392b93495b77b864d58d7f101dc1c1 upstream. Currently error return from kobject_init_and_add() is not followed by a call to kobject_put(). This means there is a memory leak. We currently set p to NULL so that kfree() may be called on it as a noop, the code is arguably clearer if we move the kfree() up closer to where it is called (instead of after goto jump). Remove a goto label 'err1' and jump to call to kobject_put() in error return from kobject_init_and_add() fixing the memory leak. Re-name goto label 'put_back' to 'err1' now that we don't use err1, following current nomenclature (err1, err2 ...). Move call to kfree out of the error code at bottom of function up to closer to where memory was allocated. Add comment to clarify call to kfree(). Signed-off-by: Tobin C. Harding <tobin@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30NFC: nci: Add some bounds checking in nci_hci_cmd_received()Dan Carpenter
commit d7ee81ad09f072eab1681877fc71ec05f9c1ae92 upstream. This is similar to commit 674d9de02aa7 ("NFC: Fix possible memory corruption when handling SHDLC I-Frame commands"). I'm not totally sure, but I think that commit description may have overstated the danger. I was under the impression that this data came from the firmware? If you can't trust your networking firmware, then you're already in trouble. Anyway, these days we add bounds checking where ever we can and we call it kernel hardening. Better safe than sorry. Fixes: 11f54f228643 ("NFC: nci: Add HCI over NCI protocol support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30netfilter: nf_tables: use-after-free in dynamic operationsPablo Neira Ayuso
commit 3f3a390dbd59d236f62cff8e8b20355ef7069e3d upstream. Smatch reports: net/netfilter/nf_tables_api.c:2167 nf_tables_expr_destroy() error: dereferencing freed memory 'expr->ops' net/netfilter/nf_tables_api.c 2162 static void nf_tables_expr_destroy(const struct nft_ctx *ctx, 2163 struct nft_expr *expr) 2164 { 2165 if (expr->ops->destroy) 2166 expr->ops->destroy(ctx, expr); ^^^^ --> 2167 module_put(expr->ops->type->owner); ^^^^^^^^^ 2168 } Smatch says there are three functions which free expr->ops. Fixes: b8e204006340 ("netfilter: nft_compat: use .release_ops and remove list of extension") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30netfilter: fix nf_l4proto_log_invalid to log invalid packetsAndrei Vagin
commit d48668052b2603b6262459625c86108c493588dd upstream. It doesn't log a packet if sysctl_log_invalid isn't equal to protonum OR sysctl_log_invalid isn't equal to IPPROTO_RAW. This sentence is always true. I believe we need to replace OR to AND. Cc: Florian Westphal <fw@strlen.de> Fixes: c4f3db1595827 ("netfilter: conntrack: add and use nf_l4proto_log_invalid") Signed-off-by: Andrei Vagin <avagin@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30netfilter: nf_tables: prevent shift wrap in nft_chain_parse_hook()Dan Carpenter
commit 33d1c018179d0a30c39cc5f1682b77867282694b upstream. I believe that "hook->num" can be up to UINT_MAX. Shifting more than 31 bits would is undefined in C but in practice it would lead to shift wrapping. That would lead to an array overflow in nf_tables_addchain(): ops->hook = hook.type->hooks[ops->hooknum]; Fixes: fe19c04ca137 ("netfilter: nf_tables: remove nhooks field from struct nft_af_info") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30netfilter: ctnetlink: don't use conntrack/expect object addresses as idFlorian Westphal
commit 3c79107631db1f7fd32cf3f7368e4672004a3010 upstream. else, we leak the addresses to userspace via ctnetlink events and dumps. Compute an ID on demand based on the immutable parts of nf_conn struct. Another advantage compared to using an address is that there is no immediate re-use of the same ID in case the conntrack entry is freed and reallocated again immediately. Fixes: 3583240249ef ("[NETFILTER]: nf_conntrack_expect: kill unique ID") Fixes: 7f85f914721f ("[NETFILTER]: nf_conntrack: kill unique ID") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30ipvs: do not schedule icmp errors from tunnelsJulian Anastasov
commit 0261ea1bd1eb0da5c0792a9119b8655cf33c80a3 upstream. We can receive ICMP errors from client or from tunneling real server. While the former can be scheduled to real server, the latter should not be scheduled, they are decapsulated only when existing connection is found. Fixes: 6044eeffafbe ("ipvs: attempt to schedule icmp packets") Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30nl80211: Add NL80211_FLAG_CLEAR_SKB flag for other NL commandsSunil Dutt
commit d6db02a88a4aaa1cd7105137c67ddec7f3bdbc05 upstream. This commit adds NL80211_FLAG_CLEAR_SKB flag to other NL commands that carry key data to ensure they do not stick around on heap after the SKB is freed. Also introduced this flag for NL80211_CMD_VENDOR as there are sub commands which configure the keys. Signed-off-by: Sunil Dutt <usdutt@codeaurora.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30mac80211: fix memory accounting with A-MSDU aggregationFelix Fietkau
commit eb9b64e3a9f8483e6e54f4e03b2ae14ae5db2690 upstream. skb->truesize can change due to memory reallocation or when adding extra fragments. Adjust fq->memory_usage accordingly Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30cfg80211: Handle WMM rules in regulatory domain intersectionIlan Peer
commit 08a75a887ee46828b54600f4bb7068d872a5edd5 upstream. The support added for regulatory WMM rules did not handle the case of regulatory domain intersections. Fix it. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Fixes: 230ebaa189af ("cfg80211: read wmm rules from regulatory database") Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30mac80211: Increase MAX_MSG_LENAndrei Otcheretianski
commit 78be2d21cc1cd3069c6138dcfecec62583130171 upstream. Looks that 100 chars isn't enough for messages, as we keep getting warnings popping from different places due to message shortening. Instead of trying to shorten the prints, just increase the buffer size. Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30mac80211: fix unaligned access in mesh table hash functionFelix Fietkau
commit 40586e3fc400c00c11151804dcdc93f8c831c808 upstream. The pointer to the last four bytes of the address is not guaranteed to be aligned, so we need to use __get_unaligned_cpu32 here Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30Bluetooth: Align minimum encryption key size for LE and BR/EDR connectionsMarcel Holtmann
commit d5bb334a8e171b262e48f378bd2096c0ea458265 upstream. The minimum encryption key size for LE connections is 56 bits and to align LE with BR/EDR, enforce 56 bits of minimum encryption key size for BR/EDR connections as well. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-30Bluetooth: hidp: fix buffer overflowYoung Xiao
commit a1616a5ac99ede5d605047a9012481ce7ff18b16 upstream. Struct ca is copied from userspace. It is not checked whether the "name" field is NULL terminated, which allows local users to obtain potentially sensitive information from kernel stack memory, via a HIDPCONNADD command. This vulnerability is similar to CVE-2011-1079. Signed-off-by: Young Xiao <YangX92@hotmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23mac80211: Honor SW_CRYPTO_CONTROL for unicast keys in AP VLAN modeAlexander Wetzel
commit 78ad2341521d5ea96cb936244ed4c4c4ef9ec13b upstream. Restore SW_CRYPTO_CONTROL operation on AP_VLAN interfaces for unicast keys, the original override was intended to be done for group keys as those are treated specially by mac80211 and would always have been rejected. Now the situation is that AP_VLAN support must be enabled by the driver if it can support it (meaning it can support software crypto GTK TX). Thus, also simplify the code - if we get here with AP_VLAN and non- pairwise key, software crypto must be used (driver doesn't know about the interface) and can be used (driver must've advertised AP_VLAN if it also uses SW_CRYPTO_CONTROL). Fixes: db3bdcb9c3ff ("mac80211: allow AP_VLAN operation on crypto controlled devices") Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> [rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23batman-adv: fix warning in function batadv_v_elp_get_throughputAnders Roxell
commit ca8c3b922e7032aff6cc3fd05548f4df1f3df90e upstream. When CONFIG_CFG80211 isn't enabled the compiler correcly warns about 'sinfo.pertid' may be unused. It can also happen for other error conditions that it not warn about. net/batman-adv/bat_v_elp.c: In function ‘batadv_v_elp_get_throughput.isra.0’: include/net/cfg80211.h:6370:13: warning: ‘sinfo.pertid’ may be used uninitialized in this function [-Wmaybe-uninitialized] kfree(sinfo->pertid); ~~~~~^~~~~~~~ Rework so that we only release '&sinfo' if cfg80211_get_station returns zero. Fixes: 7d652669b61d ("batman-adv: release station info tidstats") Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23batman-adv: Reduce tt_global hash refcnt only for removed entrySven Eckelmann
commit f131a56880d10932931e74773fb8702894a94a75 upstream. The batadv_hash_remove is a function which searches the hashtable for an entry using a needle, a hashtable bucket selection function and a compare function. It will lock the bucket list and delete an entry when the compare function matches it with the needle. It returns the pointer to the hlist_node which matches or NULL when no entry matches the needle. The batadv_tt_global_free is not itself protected in anyway to avoid that any other function is modifying the hashtable between the search for the entry and the call to batadv_hash_remove. It can therefore happen that the entry either doesn't exist anymore or an entry was deleted which is not the same object as the needle. In such an situation, the reference counter (for the reference stored in the hashtable) must not be reduced for the needle. Instead the reference counter of the actually removed entry has to be reduced. Otherwise the reference counter will underflow and the object might be freed before all its references were dropped. The kref helpers reported this problem as: refcount_t: underflow; use-after-free. Fixes: 7683fdc1e886 ("batman-adv: protect the local and the global trans-tables with rcu") Reported-by: Martin Weinelt <martin@linuxlounge.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23batman-adv: Reduce tt_local hash refcnt only for removed entrySven Eckelmann
commit 3d65b9accab4a7ed5038f6df403fbd5e298398c7 upstream. The batadv_hash_remove is a function which searches the hashtable for an entry using a needle, a hashtable bucket selection function and a compare function. It will lock the bucket list and delete an entry when the compare function matches it with the needle. It returns the pointer to the hlist_node which matches or NULL when no entry matches the needle. The batadv_tt_local_remove is not itself protected in anyway to avoid that any other function is modifying the hashtable between the search for the entry and the call to batadv_hash_remove. It can therefore happen that the entry either doesn't exist anymore or an entry was deleted which is not the same object as the needle. In such an situation, the reference counter (for the reference stored in the hashtable) must not be reduced for the needle. Instead the reference counter of the actually removed entry has to be reduced. Otherwise the reference counter will underflow and the object might be freed before all its references were dropped. The kref helpers reported this problem as: refcount_t: underflow; use-after-free. Fixes: ef72706a0543 ("batman-adv: protect tt_local_entry from concurrent delete events") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23batman-adv: Reduce claim hash refcnt only for removed entrySven Eckelmann
commit 4ba104f468bbfc27362c393815d03aa18fb7a20f upstream. The batadv_hash_remove is a function which searches the hashtable for an entry using a needle, a hashtable bucket selection function and a compare function. It will lock the bucket list and delete an entry when the compare function matches it with the needle. It returns the pointer to the hlist_node which matches or NULL when no entry matches the needle. The batadv_bla_del_claim is not itself protected in anyway to avoid that any other function is modifying the hashtable between the search for the entry and the call to batadv_hash_remove. It can therefore happen that the entry either doesn't exist anymore or an entry was deleted which is not the same object as the needle. In such an situation, the reference counter (for the reference stored in the hashtable) must not be reduced for the needle. Instead the reference counter of the actually removed entry has to be reduced. Otherwise the reference counter will underflow and the object might be freed before all its references were dropped. The kref helpers reported this problem as: refcount_t: underflow; use-after-free. Fixes: 23721387c409 ("batman-adv: add basic bridge loop avoidance code") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23mac80211: don't attempt to rename ERR_PTR() debugfs dirsJohannes Berg
commit 517879147493a5e1df6b89a50f708f1133fcaddb upstream. We need to dereference the directory to get its parent to be able to rename it, so it's clearly not safe to try to do this with ERR_PTR() pointers. Skip in this case. It seems that this is most likely what was causing the report by syzbot, but I'm not entirely sure as it didn't come with a reproducer this time. Cc: stable@vger.kernel.org Reported-by: syzbot+4ece1a28b8f4730547c9@syzkaller.appspotmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23packet: validate msg_namelen in send directlyWillem de Bruijn
commit 486efdc8f6ce802b27e15921d2353cc740c55451 upstream. Packet sockets in datagram mode take a destination address. Verify its length before passing to dev_hard_header. Prior to 2.6.14-rc3, the send code ignored sll_halen. This is established behavior. Directly compare msg_namelen to dev->addr_len. Change v1->v2: initialize addr in all paths Fixes: 6b8d95f1795c4 ("packet: validate address length if non-zero") Suggested-by: David Laight <David.Laight@aculab.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23af_packet: fix raw sockets over 6in4 tunnelNicolas Dichtel
commit 88a8121dc1d3d0dbddd411b79ed236b6b6ea415c upstream. Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel: Here is a example of the setup: $ ip link set ntfp2 up $ ip addr add 10.125.0.1/24 dev ntfp2 $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2 $ ip addr add fd00:cafe:cafe::1/128 dev tun1 $ ip link set dev tun1 up $ ip route add fd00:200::/64 dev tun1 $ scapy >>> p = [] >>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest() >>> send(p, count=1, inter=0.1) >>> quit() $ ip -s link ls dev tun1 | grep -A1 "TX.*errors" TX: bytes packets errors dropped carrier collsns 0 0 1 0 0 0 The problem is that the network offset is set to the hard_header_len of the output device (tun1, ie 14 + 20) and in our case, because the packet is small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes (ipv6 header) starting from the network offset). This problem is more generally related to device with variable hard header length. To avoid a too intrusive patch in the current release, a (ugly) workaround is proposed in this patch. It has to be cleaned up in net-next. Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=993675a3100b1 Link: http://patchwork.ozlabs.org/patch/1024489/ Fixes: cb9f1b783850 ("ip: validate header length on virtual device xmit") CC: Willem de Bruijn <willemb@google.com> CC: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23packet: Do not leak dev refcounts on error exitJason Gunthorpe
commit d972f3dce8d161e2142da0ab1ef25df00e2f21a9 upstream. 'dev' is non NULL when the addr_len check triggers so it must goto a label that does the dev_put otherwise dev will have a leaked refcount. This bug causes the ib_ipoib module to become unloadable when using systemd-network as it triggers this check on InfiniBand links. Fixes: 99137b7888f4 ("packet: validate address length") Reported-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23packet: validate address length if non-zeroWillem de Bruijn
commit 6b8d95f1795c42161dc0984b6863e95d6acf24ed upstream. Validate packet socket address length if a length is given. Zero length is equivalent to not setting an address. Fixes: 99137b7888f4 ("packet: validate address length") Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23packet: validate address lengthWillem de Bruijn
commit 99137b7888f4058087895d035d81c6b2d31015c5 upstream. Packet sockets with SOCK_DGRAM may pass an address for use in dev_hard_header. Ensure that it is of sufficient length. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23sctp: avoid running the sctp state machine recursivelyXin Long
commit fbd019737d71e405f86549fd738f81e2ff3dd073 upstream. Ying triggered a call trace when doing an asconf testing: BUG: scheduling while atomic: swapper/12/0/0x10000100 Call Trace: <IRQ> [<ffffffffa4375904>] dump_stack+0x19/0x1b [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72 [<ffffffffa437b93a>] __schedule+0x9ba/0xa00 [<ffffffffa3cd5326>] __cond_resched+0x26/0x30 [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50 [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200 [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0 [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp] [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp] [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp] [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp] [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp] [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp] [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp] [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp] [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp] [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp] [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp] As it shows, the first sctp_do_sm() running under atomic context (NET_RX softirq) invoked sctp_primitive_ASCONF() that uses GFP_KERNEL flag later, and this flag is supposed to be used in non-atomic context only. Besides, sctp_do_sm() was called recursively, which is not expected. Vlad tried to fix this recursive call in Commit c0786693404c ("sctp: Fix oops when sending queued ASCONF chunks") by introducing a new command SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work as this command is still used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() will be called in this command again. To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF not by sctp_primitive_ASCONF(), but by sctp_sf_do_prm_asconf() in the 1st sctp_do_sm() directly. Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23rxrpc: Fix net namespace cleanupDavid Howells
commit b13023421b5179413421333f602850914f6a7ad8 upstream. In rxrpc_destroy_all_calls(), there are two phases: (1) make sure the ->calls list is empty, emitting error messages if not, and (2) wait for the RCU cleanup to happen on outstanding calls (ie. ->nr_calls becomes 0). To avoid taking the call_lock, the function prechecks ->calls and if empty, it returns to avoid taking the lock - this is wrong, however: it still needs to go and do the second phase and wait for ->nr_calls to become 0. Without this, the rxrpc_net struct may get deallocated before we get to the RCU cleanup for the last calls. This can lead to: Slab corruption (Not tainted): kmalloc-16k start=ffff88802b178000, len=16384 050: 6b 6b 6b 6b 6b 6b 6b 6b 61 6b 6b 6b 6b 6b 6b 6b kkkkkkkkakkkkkkk Note the "61" at offset 0x58. This corresponds to the ->nr_calls member of struct rxrpc_net (which is >9k in size, and thus allocated out of the 16k slab). Fix this by flipping the condition on the if-statement, putting the locked section inside the if-body and dropping the return from there. The function will then always go on to wait for the RCU cleanup on outstanding calls. Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23net/tls: avoid NULL pointer deref on nskb->sk in fallbackJakub Kicinski
commit 2dcb003314032c6efb13a065ffae60d164b2dd35 upstream. update_chksum() accesses nskb->sk before it has been set by complete_skb(), move the init up. Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23l2tp: use rcu_dereference_sk_user_data() in l2tp_udp_encap_recv()Eric Dumazet
commit c1c477217882c610a2ba0268f5faf36c9c092528 upstream. Canonical way to fetch sk_user_data from an encap_rcv() handler called from UDP stack in rcu protected section is to use rcu_dereference_sk_user_data(), otherwise compiler might read it multiple times. Fixes: d00fa9adc528 ("il2tp: fix races with tunnel socket close") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23l2ip: fix possible use-after-freeEric Dumazet
commit a622b40035d16196bf19b2b33b854862595245fc upstream. Before taking a refcount on a rcu protected structure, we need to make sure the refcount is not zero. syzbot reported : refcount_t: increment on 0; use-after-free. WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked lib/refcount.c:156 [inline] WARNING: CPU: 1 PID: 23533 at lib/refcount.c:156 refcount_inc_checked+0x61/0x70 lib/refcount.c:154 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 23533 Comm: syz-executor.2 Not tainted 5.1.0-rc7+ #93 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2cb/0x65c kernel/panic.c:214 __warn.cold+0x20/0x45 kernel/panic.c:571 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:179 [inline] fixup_bug arch/x86/kernel/traps.c:174 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973 RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline] RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154 Code: 1d 98 2b 2a 06 31 ff 89 de e8 db 2c 40 fe 84 db 75 dd e8 92 2b 40 fe 48 c7 c7 20 7a a1 87 c6 05 78 2b 2a 06 01 e8 7d d9 12 fe <0f> 0b eb c1 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 RSP: 0018:ffff888069f0fba8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 000000000000f353 RSI: ffffffff815afcb6 RDI: ffffed100d3e1f67 RBP: ffff888069f0fbb8 R08: ffff88809b1845c0 R09: ffffed1015d23ef1 R10: ffffed1015d23ef0 R11: ffff8880ae91f787 R12: ffff8880a8f26968 R13: 0000000000000004 R14: dffffc0000000000 R15: ffff8880a49a6440 l2tp_tunnel_inc_refcount net/l2tp/l2tp_core.h:240 [inline] l2tp_tunnel_get+0x250/0x580 net/l2tp/l2tp_core.c:173 pppol2tp_connect+0xc00/0x1c70 net/l2tp/l2tp_ppp.c:702 __sys_connect+0x266/0x330 net/socket.c:1808 __do_sys_connect net/socket.c:1819 [inline] __se_sys_connect net/socket.c:1816 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1816 Fixes: 54652eb12c1b ("l2tp: hold tunnel while looking up sessions in l2tp_netlink") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2019-05-23ipv6: invert flowlabel sharing check in process and user modeWillem de Bruijn
commit 95c169251bf734aa555a1e8043e4d88ec97a04ec upstream. A request for a flowlabel fails in process or user exclusive mode must fail if the caller pid or uid does not match. Invert the test. Previously, the test was unsafe wrt PID recycling, but indeed tested for inequality: fl1->owner != fl->owner Fixes: 4f82f45730c68 ("net ip6 flowlabel: Make owner a union of struct pid* and kuid_t") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>