summaryrefslogtreecommitdiffstats
path: root/net
AgeCommit message (Collapse)Author
2019-01-16sunrpc: use-after-free in svc_process_common()Vasily Averin
commit d4b09acf924b84bae77cad090a9d108e70b43643 upstream. if node have NFSv41+ mounts inside several net namespaces it can lead to use-after-free in svc_process_common() svc_process_common() /* Setup reply header */ rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE svc_process_common() can use incorrect rqstp->rq_xprt, its caller function bc_svc_process() takes it from serv->sv_bc_xprt. The problem is that serv is global structure but sv_bc_xprt is assigned per-netnamespace. According to Trond, the whole "let's set up rqstp->rq_xprt for the back channel" is nothing but a giant hack in order to work around the fact that svc_process_common() uses it to find the xpt_ops, and perform a couple of (meaningless for the back channel) tests of xpt_flags. All we really need in svc_process_common() is to be able to run rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr() Bruce J Fields points that this xpo_prep_reply_hdr() call is an awfully roundabout way just to do "svc_putnl(resv, 0);" in the tcp case. This patch does not initialiuze rqstp->rq_xprt in bc_svc_process(), now it calls svc_process_common() with rqstp->rq_xprt = NULL. To adjust reply header svc_process_common() just check rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for tcp case. To handle rqstp->rq_xprt = NULL case in functions called from svc_process_common() patch intruduces net namespace pointer svc_rqst->rq_bc_net and adjust SVC_NET() definition. Some other function was also adopted to properly handle described case. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Cc: stable@vger.kernel.org Fixes: 23c20ecd4475 ("NFS: callback up - users counting cleanup") Signed-off-by: J. Bruce Fields <bfields@redhat.com> v2: added lost extern svc_tcp_prep_reply_hdr() Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-139p/net: put a lower bound on msizeDominique Martinet
commit 574d356b7a02c7e1b01a1d9cba8a26b3c2888f45 upstream. If the requested msize is too small (either from command line argument or from the server version reply), we won't get any work done. If it's *really* too small, nothing will work, and this got caught by syzbot recently (on a new kmem_cache_create_usercopy() call) Just set a minimum msize to 4k in both code paths, until someone complains they have a use-case for a smaller msize. We need to check in both mount option and server reply individually because the msize for the first version request would be unchecked with just a global check on clnt->msize. Link: http://lkml.kernel.org/r/1541407968-31350-1-git-send-email-asmadeus@codewreck.org Reported-by: syzbot+0c1d61e4db7db94102ca@syzkaller.appspotmail.com Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13sunrpc: use SVC_NET() in svcauth_gss_* functionsVasily Averin
commit b8be5674fa9a6f3677865ea93f7803c4212f3e10 upstream. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13sunrpc: fix cache_head leak due to queued requestVasily Averin
commit 4ecd55ea074217473f94cfee21bb72864d39f8d7 upstream. After commit d202cce8963d, an expired cache_head can be removed from the cache_detail's hash. However, the expired cache_head may be waiting for a reply from a previously submitted request. Such a cache_head has an increased refcounter and therefore it won't be freed after cache_put(freeme). Because the cache_head was removed from the hash it cannot be found during cache_clean() and can be leaked forever, together with stalled cache_request and other taken resources. In our case we noticed it because an entry in the export cache was holding a reference on a filesystem. Fixes d202cce8963d ("sunrpc: never return expired entries in sunrpc_cache_lookup") Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: stable@kernel.org # 2.6.35 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-13mac80211: free skb fraglist before freeing the skbSara Sharon
[ Upstream commit 34b1e0e9efe101822e83cc62d22443ed3867ae7a ] mac80211 uses the frag list to build AMSDU. When freeing the skb, it may not be really freed, since someone is still holding a reference to it. In that case, when TCP skb is being retransmitted, the pointer to the frag list is being reused, while the data in there is no longer valid. Since we will never get frag list from the network stack, as mac80211 doesn't advertise the capability, we can safely free and nullify it before releasing the SKB. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13nl80211: fix memory leak if validate_pae_over_nl80211() failsJohannes Berg
[ Upstream commit d350a0f431189517b1af0dbbb605c273231a8966 ] If validate_pae_over_nl80211() were to fail in nl80211_crypto_settings(), we might leak the 'connkeys' allocation. Fix this. Fixes: 64bf3d4bc2b0 ("nl80211: Add CONTROL_PORT_OVER_NL80211 attribute") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13SUNRPC: Fix a race with XPRT_CONNECTINGTrond Myklebust
[ Upstream commit cf76785d30712d90185455e752337acdb53d2a5d ] Ensure that we clear XPRT_CONNECTING before releasing the XPRT_LOCK so that we don't have races between the (asynchronous) socket setup code and tasks in xprt_connect(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13mac80211: fix a kernel panic when TXing after TXQ teardownSara Sharon
[ Upstream commit a50e5fb8db83c5b57392204c21ea6c5c4ccefde6 ] Recently TXQ teardown was moved earlier in ieee80211_unregister_hw(), to avoid a use-after-free of the netdev data. However, interfaces aren't fully removed at the point, and cfg80211_shutdown_all_interfaces can for example, TX a deauth frame. Move the TXQ teardown to the point between cfg80211_shutdown_all_interfaces and the free of netdev queues, so we can be sure they are torn down before netdev is freed, but after there is no ongoing TX. Fixes: 77cfaf52eca5 ("mac80211: Run TXQ teardown code before de-registering interfaces") Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13net/tls: Init routines in create_ctxAtul Gupta
[ Upstream commit 6c0563e442528733219afe15c749eb2cc365da3f ] create_ctx is called from tls_init and tls_hw_prot hence initialize function pointers in common routine. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()Taehee Yoo
[ Upstream commit d4e7df16567b80836a78d31b42f1a9355a636d67 ] rbnode in insert_tree() is rcu protected pointer. So, in order to handle this pointer, _rcu function should be used. rb_link_node_rcu() is a rcu version of rb_link_node(). Fixes: 34848d5c896e ("netfilter: nf_conncount: Split insert and traversal") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13netfilter: nat: can't use dst_hold on noref dstFlorian Westphal
[ Upstream commit 542fbda0f08f1cbbc250f9e59f7537649651d0c8 ] The dst entry might already have a zero refcount, waiting on rcu list to be free'd. Using dst_hold() transitions its reference count to 1, and next dst release will try to free it again -- resulting in a double free: WARNING: CPU: 1 PID: 0 at include/net/dst.h:239 nf_xfrm_me_harder+0xe7/0x130 [nf_nat] RIP: 0010:nf_xfrm_me_harder+0xe7/0x130 [nf_nat] Code: 48 8b 5c 24 60 65 48 33 1c 25 28 00 00 00 75 53 48 83 c4 68 5b 5d 41 5c c3 85 c0 74 0d 8d 48 01 f0 0f b1 0a 74 86 85 c0 75 f3 <0f> 0b e9 7b ff ff ff 29 c6 31 d2 b9 20 00 48 00 4c 89 e7 e8 31 27 Call Trace: nf_nat_ipv4_out+0x78/0x90 [nf_nat_ipv4] nf_hook_slow+0x36/0xd0 ip_output+0x9f/0xd0 ip_forward+0x328/0x440 ip_rcv+0x8a/0xb0 Use dst_hold_safe instead and bail out if we cannot take a reference. Fixes: a4c2fd7f7891 ("net: remove DST_NOCACHE flag") Reported-by: Martin Zaharinov <micron10@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13netfilter: ipset: do not call ipset_nest_end after nla_nest_cancelPan Bian
[ Upstream commit 708abf74dd87f8640871b814faa195fb5970b0e3 ] In the error handling block, nla_nest_cancel(skb, atd) is called to cancel the nest operation. But then, ipset_nest_end(skb, atd) is unexpected called to end the nest operation. This patch calls the ipset_nest_end only on the branch that nla_nest_cancel is not called. Fixes: 45040978c899 ("netfilter: ipset: Fix set:list type crash when flush/dump set in parallel") Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13netfilter: seqadj: re-load tcp header pointer after possible head reallocationFlorian Westphal
[ Upstream commit 530aad77010b81526586dfc09130ec875cd084e4 ] When adjusting sack block sequence numbers, skb_make_writable() gets called to make sure tcp options are all in the linear area, and buffer is not shared. This can cause tcp header pointer to get reallocated, so we must reaload it to avoid memory corruption. This bug pre-dates git history. Reported-by: Neel Mehta <nmehta@google.com> Reported-by: Shane Huntley <shuntley@google.com> Reported-by: Heather Adkins <argv@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13netfilter: nf_tables: fix suspicious RCU usage in nft_chain_stats_replace()Taehee Yoo
[ Upstream commit 4c05ec47384ab3627b62814e8f886e90cc38ce15 ] basechain->stats is rcu protected data which is updated from nft_chain_stats_replace(). This function is executed from the commit phase which holds the pernet nf_tables commit mutex - not the global nfnetlink subsystem mutex. Test commands to reproduce the problem are: %iptables-nft -I INPUT %iptables-nft -Z %iptables-nft -Z This patch uses RCU calls to handle basechain->stats updates to fix a splat that looks like: [89279.358755] ============================= [89279.363656] WARNING: suspicious RCU usage [89279.368458] 4.20.0-rc2+ #44 Tainted: G W L [89279.374661] ----------------------------- [89279.379542] net/netfilter/nf_tables_api.c:1404 suspicious rcu_dereference_protected() usage! [...] [89279.406556] 1 lock held by iptables-nft/5225: [89279.411728] #0: 00000000bf45a000 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1f/0x70 [nf_tables] [89279.424022] stack backtrace: [89279.429236] CPU: 0 PID: 5225 Comm: iptables-nft Tainted: G W L 4.20.0-rc2+ #44 [89279.430135] Call Trace: [89279.430135] dump_stack+0xc9/0x16b [89279.430135] ? show_regs_print_info+0x5/0x5 [89279.430135] ? lockdep_rcu_suspicious+0x117/0x160 [89279.430135] nft_chain_commit_update+0x4ea/0x640 [nf_tables] [89279.430135] ? sched_clock_local+0xd4/0x140 [89279.430135] ? check_flags.part.35+0x440/0x440 [89279.430135] ? __rhashtable_remove_fast.constprop.67+0xec0/0xec0 [nf_tables] [89279.430135] ? sched_clock_cpu+0x126/0x170 [89279.430135] ? find_held_lock+0x39/0x1c0 [89279.430135] ? hlock_class+0x140/0x140 [89279.430135] ? is_bpf_text_address+0x5/0xf0 [89279.430135] ? check_flags.part.35+0x440/0x440 [89279.430135] ? __lock_is_held+0xb4/0x140 [89279.430135] nf_tables_commit+0x2555/0x39c0 [nf_tables] Fixes: f102d66b335a4 ("netfilter: nf_tables: use dedicated mutex to guard transactions") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13xfrm: Fix NULL pointer dereference in xfrm_input when skb_dst_force clears ↵Steffen Klassert
the dst_entry. [ Upstream commit 0152eee6fc3b84298bb6a79961961734e8afa5b8 ] Since commit 222d7dbd258d ("net: prevent dst uses after free") skb_dst_force() might clear the dst_entry attached to the skb. The xfrm code doesn't expect this to happen, so we crash with a NULL pointer dereference in this case. Fix it by checking skb_dst(skb) for NULL after skb_dst_force() and drop the packet in case the dst_entry was cleared. We also move the skb_dst_force() to a codepath that is not used when the transformation was offloaded, because in this case we don't have a dst_entry attached to the skb. The output and forwarding path was already fixed by commit 9e1437937807 ("xfrm: Fix NULL pointer dereference when skb_dst_force clears the dst_entry.") Fixes: 222d7dbd258d ("net: prevent dst uses after free") Reported-by: Jean-Philippe Menil <jpmenil@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13xfrm: Fix bucket count reported to userspaceBenjamin Poirier
[ Upstream commit ca92e173ab34a4f7fc4128bd372bd96f1af6f507 ] sadhcnt is reported by `ip -s xfrm state count` as "buckets count", not the hash mask. Fixes: 28d8909bc790 ("[XFRM]: Export SAD info.") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-13xfrm: Fix error return code in xfrm_output_one()Wei Yongjun
[ Upstream commit 533555e5cbb6aa2d77598917871ae5b579fe724b ] xfrm_output_one() does not return a error code when there is no dst_entry attached to the skb, it is still possible crash with a NULL pointer dereference in xfrm_output_resume(). Fix it by return error code -EHOSTUNREACH. Fixes: 9e1437937807 ("xfrm: Fix NULL pointer dereference when skb_dst_force clears the dst_entry.") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-01-09ipv6: route: Fix return value of ip6_neigh_lookup() on neigh_create() errorStefano Brivio
[ Upstream commit 7adf3246092f5e87ed0fa610e8088fae416c581f ] In ip6_neigh_lookup(), we must not return errors coming from neigh_create(): if creation of a neighbour entry fails, the lookup should return NULL, in the same way as it's done in __neigh_lookup(). Otherwise, callers legitimately checking for a non-NULL return value of the lookup function might dereference an invalid pointer. For instance, on neighbour table overflow, ndisc_router_discovery() crashes ndisc_update() by passing ERR_PTR(-ENOBUFS) as 'neigh' argument. Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: f8a1b43b709d ("net/ipv6: Create a neigh_lookup for FIB entries") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09net/ipv6: Fix a test against 'ipv6_find_idev()' return valueChristophe JAILLET
[ Upstream commit 178fe94405bffbd1acd83b6ff3b40211185ae9c9 ] 'ipv6_find_idev()' returns NULL on error, not an error pointer. Update the test accordingly and return -ENOBUFS, as already done in 'addrconf_add_dev()', if NULL is returned. Fixes: ("ipv6: allow userspace to add IFA_F_OPTIMISTIC addresses") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ipv6: frags: Fix bogus skb->sk in reassembled packetsHerbert Xu
[ Upstream commit d15f5ac8deea936d3adf629421a66a88b42b8a2f ] It was reported that IPsec would crash when it encounters an IPv6 reassembled packet because skb->sk is non-zero and not a valid pointer. This is because skb->sk is now a union with ip_defrag_offset. This patch fixes this by resetting skb->sk when exiting from the reassembly code. Reported-by: Xiumei Mu <xmu@redhat.com> Fixes: 219badfaade9 ("ipv6: frags: get rid of ip6frag_skb_cb/...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09tipc: check group dests after tipc_wait_for_cond()Cong Wang
[ Upstream commit 3c6306d44082ef007a258ae1b86ea58e6974ee3f ] Similar to commit 143ece654f9f ("tipc: check tsk->group in tipc_wait_for_cond()") we have to reload grp->dests too after we re-take the sock lock. This means we need to move the dsts check after tipc_wait_for_cond() too. Fixes: 75da2163dbb6 ("tipc: introduce communication groups") Reported-and-tested-by: syzbot+99f20222fc5018d2b97a@syzkaller.appspotmail.com Cc: Ying Xue <ying.xue@windriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09VSOCK: Send reset control packet when socket is partially boundJorgen Hansen
[ Upstream commit a915b982d8f5e4295f64b8dd37ce753874867e88 ] If a server side socket is bound to an address, but not in the listening state yet, incoming connection requests should receive a reset control packet in response. However, the function used to send the reset silently drops the reset packet if the sending socket isn't bound to a remote address (as is the case for a bound socket not yet in the listening state). This change fixes this by using the src of the incoming packet as destination for the reset packet in this case. Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Reviewed-by: Adit Ranadive <aditr@vmware.com> Reviewed-by: Vishnu Dasa <vdasa@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09tipc: use lock_sock() in tipc_sk_reinit()Cong Wang
[ Upstream commit 15ef70e286176165d28b0b8a969b422561a68dfc ] lock_sock() must be used in process context to be race-free with other lock_sock() callers, for example, tipc_release(). Otherwise using the spinlock directly can't serialize a parallel tipc_release(). As it is blocking, we have to hold the sock refcnt before rhashtable_walk_stop() and release it after rhashtable_walk_start(). Fixes: 07f6c4bc048a ("tipc: convert tipc reference table to use generic rhashtable") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Ying Xue <ying.xue@windriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09tipc: fix a double kfree_skb()Cong Wang
[ Upstream commit acb4a33e9856d5fa3384b87d3d8369229be06d31 ] tipc_udp_xmit() drops the packet on error, there is no need to drop it again. Fixes: ef20cd4dd163 ("tipc: introduce UDP replicast") Reported-and-tested-by: syzbot+eae585ba2cc2752d3704@syzkaller.appspotmail.com Cc: Ying Xue <ying.xue@windriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09tipc: fix a double free in tipc_enable_bearer()Cong Wang
[ Upstream commit dc4501ff287547dea7ca10f1c580c741291a8760 ] bearer_disable() already calls kfree_rcu() to free struct tipc_bearer, we don't need to call kfree() again. Fixes: cb30a63384bc ("tipc: refactor function tipc_enable_bearer()") Reported-by: syzbot+b981acf1fb240c0c128b@syzkaller.appspotmail.com Cc: Ying Xue <ying.xue@windriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09tipc: compare remote and local protocols in tipc_udp_enable()Cong Wang
[ Upstream commit fb83ed496b9a654f60cd1d58a0e1e79ec5694808 ] When TIPC_NLA_UDP_REMOTE is an IPv6 mcast address but TIPC_NLA_UDP_LOCAL is an IPv4 address, a NULL-ptr deref is triggered as the UDP tunnel sock is initialized to IPv4 or IPv6 sock merely based on the protocol in local address. We should just error out when the remote address and local address have different protocols. Reported-by: syzbot+eb4da3a20fad2e52555d@syzkaller.appspotmail.com Cc: Ying Xue <ying.xue@windriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09tipc: check tsk->group in tipc_wait_for_cond()Cong Wang
[ Upstream commit 143ece654f9f5b37bedea252a990be37e48ae3a5 ] tipc_wait_for_cond() drops socket lock before going to sleep, but tsk->group could be freed right after that release_sock(). So we have to re-check and reload tsk->group after it wakes up. After this patch, tipc_wait_for_cond() returns -ERESTARTSYS when tsk->group is NULL, instead of continuing with the assumption of a non-NULL tsk->group. (It looks like 'dsts' should be re-checked and reloaded too, but it is a different bug.) Similar for tipc_send_group_unicast() and tipc_send_group_anycast(). Reported-by: syzbot+10a9db47c3a0e13eb31c@syzkaller.appspotmail.com Fixes: b7d42635517f ("tipc: introduce flow control for group broadcast messages") Fixes: ee106d7f942d ("tipc: introduce group anycast messaging") Fixes: 27bd9ec027f3 ("tipc: introduce group unicast messaging") Cc: Ying Xue <ying.xue@windriver.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09tcp: fix a race in inet_diag_dump_icsk()Eric Dumazet
[ Upstream commit f0c928d878e7d01b613c9ae5c971a6b1e473a938 ] Alexei reported use after frees in inet_diag_dump_icsk() [1] Because we use refcount_set() when various sockets are setup and inserted into ehash, we also need to make sure inet_diag_dump_icsk() wont race with the refcount_set() operations. Jonathan Lemon sent a patch changing net_twsk_hashdance() but other spots would need risky changes. Instead, fix inet_diag_dump_icsk() as this bug came with linux-4.10 only. [1] Quoting Alexei : First something iterating over sockets finds already freed tw socket: refcount_t: increment on 0; use-after-free. WARNING: CPU: 2 PID: 2738 at lib/refcount.c:153 refcount_inc+0x26/0x30 RIP: 0010:refcount_inc+0x26/0x30 RSP: 0018:ffffc90004c8fbc0 EFLAGS: 00010282 RAX: 000000000000002b RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88085ee9d680 RSI: ffff88085ee954c8 RDI: ffff88085ee954c8 RBP: ffff88010ecbd2c0 R08: 0000000000000000 R09: 000000000000174c R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8806ba9bf210 R14: ffffffff82304600 R15: ffff88010ecbd328 FS: 00007f81f5a7d700(0000) GS:ffff88085ee80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f81e2a95000 CR3: 000000069b2eb006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: inet_diag_dump_icsk+0x2b3/0x4e0 [inet_diag] // sock_hold(sk); in net/ipv4/inet_diag.c:1002 ? kmalloc_large_node+0x37/0x70 ? __kmalloc_node_track_caller+0x1cb/0x260 ? __alloc_skb+0x72/0x1b0 ? __kmalloc_reserve.isra.40+0x2e/0x80 __inet_diag_dump+0x3b/0x80 [inet_diag] netlink_dump+0x116/0x2a0 netlink_recvmsg+0x205/0x3c0 sock_read_iter+0x89/0xd0 __vfs_read+0xf7/0x140 vfs_read+0x8a/0x140 SyS_read+0x3f/0xa0 do_syscall_64+0x5a/0x100 then a minute later twsk timer fires and hits two bad refcnts for this freed socket: refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 31 PID: 0 at lib/refcount.c:228 refcount_dec+0x2e/0x40 Modules linked in: RIP: 0010:refcount_dec+0x2e/0x40 RSP: 0018:ffff88085f5c3ea8 EFLAGS: 00010296 RAX: 000000000000002c RBX: ffff88010ecbd2c0 RCX: 000000000000083f RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f RBP: ffffc90003c77280 R08: 0000000000000000 R09: 00000000000017d3 R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffffffff82ad2d80 R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> inet_twsk_kill+0x9d/0xc0 // inet_twsk_bind_unhash(tw, hashinfo); call_timer_fn+0x29/0x110 run_timer_softirq+0x36b/0x3a0 refcount_t: underflow; use-after-free. WARNING: CPU: 31 PID: 0 at lib/refcount.c:187 refcount_sub_and_test+0x46/0x50 RIP: 0010:refcount_sub_and_test+0x46/0x50 RSP: 0018:ffff88085f5c3eb8 EFLAGS: 00010296 RAX: 0000000000000026 RBX: ffff88010ecbd2c0 RCX: 000000000000083f RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f RBP: ffff88010ecbd358 R08: 0000000000000000 R09: 000000000000185b R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffff88010ecbd358 R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> inet_twsk_put+0x12/0x20 // inet_twsk_put(tw); call_timer_fn+0x29/0x110 run_timer_softirq+0x36b/0x3a0 Fixes: 67db3e4bfbc9 ("tcp: no longer hold ehash lock while calling tcp_get_info()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexei Starovoitov <ast@kernel.org> Cc: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09sock: Make sock->sk_stamp thread-safeDeepa Dinamani
[ Upstream commit 3a0ed3e9619738067214871e9cb826fa23b2ddb9 ] Al Viro mentioned (Message-ID <20170626041334.GZ10672@ZenIV.linux.org.uk>) that there is probably a race condition lurking in accesses of sk_stamp on 32-bit machines. sock->sk_stamp is of type ktime_t which is always an s64. On a 32 bit architecture, we might run into situations of unsafe access as the access to the field becomes non atomic. Use seqlocks for synchronization. This allows us to avoid using spinlocks for readers as readers do not need mutual exclusion. Another approach to solve this is to require sk_lock for all modifications of the timestamps. The current approach allows for timestamps to have their own lock: sk_stamp_lock. This allows for the patch to not compete with already existing critical sections, and side effects are limited to the paths in the patch. The addition of the new field maintains the data locality optimizations from commit 9115e8cd2a0c ("net: reorganize struct sock for better data locality") Note that all the instances of the sk_stamp accesses are either through the ioctl or the syscall recvmsg. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09sctp: initialize sin6_flowinfo for ipv6 addrs in sctp_inet6addr_eventXin Long
[ Upstream commit 4a2eb0c37b4759416996fbb4c45b932500cf06d3 ] syzbot reported a kernel-infoleak, which is caused by an uninitialized field(sin6_flowinfo) of addr->a.v6 in sctp_inet6addr_event(). The call trace is as below: BUG: KMSAN: kernel-infoleak in _copy_to_user+0x19a/0x230 lib/usercopy.c:33 CPU: 1 PID: 8164 Comm: syz-executor2 Not tainted 4.20.0-rc3+ #95 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x32d/0x480 lib/dump_stack.c:113 kmsan_report+0x12c/0x290 mm/kmsan/kmsan.c:683 kmsan_internal_check_memory+0x32a/0xa50 mm/kmsan/kmsan.c:743 kmsan_copy_to_user+0x78/0xd0 mm/kmsan/kmsan_hooks.c:634 _copy_to_user+0x19a/0x230 lib/usercopy.c:33 copy_to_user include/linux/uaccess.h:183 [inline] sctp_getsockopt_local_addrs net/sctp/socket.c:5998 [inline] sctp_getsockopt+0x15248/0x186f0 net/sctp/socket.c:7477 sock_common_getsockopt+0x13f/0x180 net/core/sock.c:2937 __sys_getsockopt+0x489/0x550 net/socket.c:1939 __do_sys_getsockopt net/socket.c:1950 [inline] __se_sys_getsockopt+0xe1/0x100 net/socket.c:1947 __x64_sys_getsockopt+0x62/0x80 net/socket.c:1947 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 sin6_flowinfo is not really used by SCTP, so it will be fixed by simply setting it to 0. The issue exists since very beginning. Thanks Alexander for the reproducer provided. Reported-by: syzbot+ad5d327e6936a2e284be@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09packet: validate address length if non-zeroWillem de Bruijn
[ Upstream commit 6b8d95f1795c42161dc0984b6863e95d6acf24ed ] Validate packet socket address length if a length is given. Zero length is equivalent to not setting an address. Fixes: 99137b7888f4 ("packet: validate address length") Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09packet: validate address lengthWillem de Bruijn
[ Upstream commit 99137b7888f4058087895d035d81c6b2d31015c5 ] Packet sockets with SOCK_DGRAM may pass an address for use in dev_hard_header. Ensure that it is of sufficient length. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09net/tls: allocate tls context using GFP_ATOMICGanesh Goudar
[ Upstream commit c6ec179a0082e2e76e3a72050c2b99d3d0f3da3f ] create_ctx can be called from atomic context, hence use GFP_ATOMIC instead of GFP_KERNEL. [ 395.962599] BUG: sleeping function called from invalid context at mm/slab.h:421 [ 395.979896] in_atomic(): 1, irqs_disabled(): 0, pid: 16254, name: openssl [ 395.996564] 2 locks held by openssl/16254: [ 396.010492] #0: 00000000347acb52 (sk_lock-AF_INET){+.+.}, at: do_tcp_setsockopt.isra.44+0x13b/0x9a0 [ 396.029838] #1: 000000006c9552b5 (device_spinlock){+...}, at: tls_init+0x1d/0x280 [ 396.047675] CPU: 5 PID: 16254 Comm: openssl Tainted: G O 4.20.0-rc6+ #25 [ 396.066019] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0c 09/25/2017 [ 396.083537] Call Trace: [ 396.096265] dump_stack+0x5e/0x8b [ 396.109876] ___might_sleep+0x216/0x250 [ 396.123940] kmem_cache_alloc_trace+0x1b0/0x240 [ 396.138800] create_ctx+0x1f/0x60 [ 396.152504] tls_init+0xbd/0x280 [ 396.166135] tcp_set_ulp+0x191/0x2d0 [ 396.180035] ? tcp_set_ulp+0x2c/0x2d0 [ 396.193960] do_tcp_setsockopt.isra.44+0x148/0x9a0 [ 396.209013] __sys_setsockopt+0x7c/0xe0 [ 396.223054] __x64_sys_setsockopt+0x20/0x30 [ 396.237378] do_syscall_64+0x4a/0x180 [ 396.251200] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: df9d4a178022 ("net/tls: sleeping function from invalid context") Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09net/smc: fix TCP fallback socket releaseMyungho Jung
[ Upstream commit 78abe3d0dfad196959b1246003366e2610775ea6 ] clcsock can be released while kernel_accept() references it in TCP listen worker. Also, clcsock needs to wake up before released if TCP fallback is used and the clcsock is blocked by accept. Add a lock to safely release clcsock and call kernel_sock_shutdown() to wake up clcsock from accept in smc_release(). Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com Signed-off-by: Myungho Jung <mhjungk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09netrom: fix locking in nr_find_socket()Cong Wang
[ Upstream commit 7314f5480f3e37e570104dc5e0f28823ef849e72 ] nr_find_socket(), nr_find_peer() and nr_find_listener() lock the sock after finding it in the global list. However, the call path requires BH disabled for the sock lock consistently. Actually the locking is unnecessary at this point, we can just hold the sock refcnt to make sure it is not gone after we unlock the global list, and lock it later only when needed. Reported-and-tested-by: syzbot+f621cda8b7e598908efa@syzkaller.appspotmail.com Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09net: ipv4: do not handle duplicate fragments as overlappingMichal Kubecek
[ Upstream commit ade446403bfb79d3528d56071a84b15351a139ad ] Since commit 7969e5c40dfd ("ip: discard IPv4 datagrams with overlapping segments.") IPv4 reassembly code drops the whole queue whenever an overlapping fragment is received. However, the test is written in a way which detects duplicate fragments as overlapping so that in environments with many duplicate packets, fragmented packets may be undeliverable. Add an extra test and for (potentially) duplicate fragment, only drop the new fragment rather than the whole queue. Only starting offset and length are checked, not the contents of the fragments as that would be too expensive. For similar reason, linear list ("run") of a rbtree node is not iterated, we only check if the new fragment is a subset of the interval covered by existing consecutive fragments. v2: instead of an exact check iterating through linear list of an rbtree node, only check if the new fragment is subset of the "run" (suggested by Eric Dumazet) Fixes: 7969e5c40dfd ("ip: discard IPv4 datagrams with overlapping segments.") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09net: clear skb->tstamp in forwarding pathsEric Dumazet
[ Upstream commit 8203e2d844d34af247a151d8ebd68553a6e91785 ] Sergey reported that forwarding was no longer working if fq packet scheduler was used. This is caused by the recent switch to EDT model, since incoming packets might have been timestamped by __net_timestamp() __net_timestamp() uses ktime_get_real(), while fq expects packets using CLOCK_MONOTONIC base. The fix is to clear skb->tstamp in forwarding paths. Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.") Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sergey Matyukevich <geomatsi@gmail.com> Tested-by: Sergey Matyukevich <geomatsi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ip: validate header length on virtual device xmitWillem de Bruijn
[ Upstream commit cb9f1b783850b14cbd7f87d061d784a666dfba1f ] KMSAN detected read beyond end of buffer in vti and sit devices when passing truncated packets with PF_PACKET. The issue affects additional ip tunnel devices. Extend commit 76c0ddd8c3a6 ("ip6_tunnel: be careful when accessing the inner header") and commit ccfec9e5cb2d ("ip_tunnel: be careful when accessing the inner header"). Move the check to a separate helper and call at the start of each ndo_start_xmit function in net/ipv4 and net/ipv6. Minor changes: - convert dev_kfree_skb to kfree_skb on error path, as dev_kfree_skb calls consume_skb which is not for error paths. - use pskb_network_may_pull even though that is pedantic here, as the same as pskb_may_pull for devices without llheaders. - do not cache ipv6 hdrs if used only once (unsafe across pskb_may_pull, was more relevant to earlier patch) Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ipv6: tunnels: fix two use-after-freeEric Dumazet
[ Upstream commit cbb49697d5512ce9e61b45ce75d3ee43d7ea5524 ] xfrm6_policy_check() might have re-allocated skb->head, we need to reload ipv6 header pointer. sysbot reported : BUG: KASAN: use-after-free in __ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40 Read of size 4 at addr ffff888191b8cb70 by task syz-executor2/1304 CPU: 0 PID: 1304 Comm: syz-executor2 Not tainted 4.20.0-rc7+ #356 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432 __ipv6_addr_type+0x302/0x32f net/ipv6/addrconf_core.c:40 ipv6_addr_type include/net/ipv6.h:403 [inline] ip6_tnl_get_cap+0x27/0x190 net/ipv6/ip6_tunnel.c:727 ip6_tnl_rcv_ctl+0xdb/0x2a0 net/ipv6/ip6_tunnel.c:757 vti6_rcv+0x336/0x8f3 net/ipv6/ip6_vti.c:321 xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132 ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394 ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434 NF_HOOK include/linux/netfilter.h:289 [inline] ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443 IPVS: ftp: loaded support on port[0] = 21 ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537 dst_input include/net/dst.h:450 [inline] ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76 NF_HOOK include/linux/netfilter.h:289 [inline] ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272 __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083 process_backlog+0x24e/0x7a0 net/core/dev.c:5923 napi_poll net/core/dev.c:6346 [inline] net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412 __do_softirq+0x308/0xb7e kernel/softirq.c:292 do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1027 </IRQ> do_softirq.part.14+0x126/0x160 kernel/softirq.c:337 do_softirq+0x19/0x20 kernel/softirq.c:340 netif_rx_ni+0x521/0x860 net/core/dev.c:4569 dev_loopback_xmit+0x287/0x8c0 net/core/dev.c:3576 NF_HOOK include/linux/netfilter.h:289 [inline] ip6_finish_output2+0x193a/0x2930 net/ipv6/ip6_output.c:84 ip6_fragment+0x2b06/0x3850 net/ipv6/ip6_output.c:727 ip6_finish_output+0x6b7/0xc50 net/ipv6/ip6_output.c:152 NF_HOOK_COND include/linux/netfilter.h:278 [inline] ip6_output+0x232/0x9d0 net/ipv6/ip6_output.c:171 dst_output include/net/dst.h:444 [inline] ip6_local_out+0xc5/0x1b0 net/ipv6/output_core.c:176 ip6_send_skb+0xbc/0x340 net/ipv6/ip6_output.c:1727 ip6_push_pending_frames+0xc5/0xf0 net/ipv6/ip6_output.c:1747 rawv6_push_pending_frames net/ipv6/raw.c:615 [inline] rawv6_sendmsg+0x3a3e/0x4b40 net/ipv6/raw.c:945 kobject: 'queues' (0000000089e6eea2): kobject_add_internal: parent: 'tunl0', set: '<NULL>' kobject: 'queues' (0000000089e6eea2): kobject_uevent_env inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 kobject: 'queues' (0000000089e6eea2): kobject_uevent_env: filter function caused the event to drop! sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 sock_write_iter+0x35e/0x5c0 net/socket.c:900 call_write_iter include/linux/fs.h:1857 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x6b8/0x9f0 fs/read_write.c:487 kobject: 'rx-0' (00000000e2d902d9): kobject_add_internal: parent: 'queues', set: 'queues' kobject: 'rx-0' (00000000e2d902d9): kobject_uevent_env vfs_write+0x1fc/0x560 fs/read_write.c:549 ksys_write+0x101/0x260 fs/read_write.c:598 kobject: 'rx-0' (00000000e2d902d9): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/rx-0' __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 kobject: 'tx-0' (00000000443b70ac): kobject_add_internal: parent: 'queues', set: 'queues' entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457669 Code: fd b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f9bd200bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457669 RDX: 000000000000058f RSI: 00000000200033c0 RDI: 0000000000000003 kobject: 'tx-0' (00000000443b70ac): kobject_uevent_env RBP: 000000000072bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9bd200c6d4 R13: 00000000004c2dcc R14: 00000000004da398 R15: 00000000ffffffff Allocated by task 1304: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 __do_kmalloc_node mm/slab.c:3684 [inline] __kmalloc_node_track_caller+0x50/0x70 mm/slab.c:3698 __kmalloc_reserve.isra.41+0x41/0xe0 net/core/skbuff.c:140 __alloc_skb+0x155/0x760 net/core/skbuff.c:208 kobject: 'tx-0' (00000000443b70ac): fill_kobj_path: path = '/devices/virtual/net/tunl0/queues/tx-0' alloc_skb include/linux/skbuff.h:1011 [inline] __ip6_append_data.isra.49+0x2f1a/0x3f50 net/ipv6/ip6_output.c:1450 ip6_append_data+0x1bc/0x2d0 net/ipv6/ip6_output.c:1619 rawv6_sendmsg+0x15ab/0x4b40 net/ipv6/raw.c:938 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2116 __sys_sendmsg+0x11d/0x280 net/socket.c:2154 __do_sys_sendmsg net/socket.c:2163 [inline] __se_sys_sendmsg net/socket.c:2161 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2161 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe kobject: 'gre0' (00000000cb1b2d7b): kobject_add_internal: parent: 'net', set: 'devices' Freed by task 1304: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kfree+0xcf/0x230 mm/slab.c:3817 skb_free_head+0x93/0xb0 net/core/skbuff.c:553 pskb_expand_head+0x3b2/0x10d0 net/core/skbuff.c:1498 __pskb_pull_tail+0x156/0x18a0 net/core/skbuff.c:1896 pskb_may_pull include/linux/skbuff.h:2188 [inline] _decode_session6+0xd11/0x14d0 net/ipv6/xfrm6_policy.c:150 __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:3272 kobject: 'gre0' (00000000cb1b2d7b): kobject_uevent_env __xfrm_policy_check+0x380/0x2c40 net/xfrm/xfrm_policy.c:3322 __xfrm_policy_check2 include/net/xfrm.h:1170 [inline] xfrm_policy_check include/net/xfrm.h:1175 [inline] xfrm6_policy_check include/net/xfrm.h:1185 [inline] vti6_rcv+0x4bd/0x8f3 net/ipv6/ip6_vti.c:316 xfrm6_ipcomp_rcv+0x1a5/0x3a0 net/ipv6/xfrm6_protocol.c:132 ip6_protocol_deliver_rcu+0x372/0x1940 net/ipv6/ip6_input.c:394 ip6_input_finish+0x84/0x170 net/ipv6/ip6_input.c:434 NF_HOOK include/linux/netfilter.h:289 [inline] ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:443 ip6_mc_input+0x514/0x11c0 net/ipv6/ip6_input.c:537 dst_input include/net/dst.h:450 [inline] ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76 NF_HOOK include/linux/netfilter.h:289 [inline] ipv6_rcv+0x115/0x640 net/ipv6/ip6_input.c:272 __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4973 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5083 process_backlog+0x24e/0x7a0 net/core/dev.c:5923 kobject: 'gre0' (00000000cb1b2d7b): fill_kobj_path: path = '/devices/virtual/net/gre0' napi_poll net/core/dev.c:6346 [inline] net_rx_action+0x7fa/0x19b0 net/core/dev.c:6412 __do_softirq+0x308/0xb7e kernel/softirq.c:292 The buggy address belongs to the object at ffff888191b8cac0 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 176 bytes inside of 512-byte region [ffff888191b8cac0, ffff888191b8ccc0) The buggy address belongs to the page: page:ffffea000646e300 count:1 mapcount:0 mapping:ffff8881da800940 index:0x0 flags: 0x2fffc0000000200(slab) raw: 02fffc0000000200 ffffea0006eaaa48 ffffea00065356c8 ffff8881da800940 raw: 0000000000000000 ffff888191b8c0c0 0000000100000006 0000000000000000 page dumped because: kasan: bad access detected kobject: 'queues' (000000005fd6226e): kobject_add_internal: parent: 'gre0', set: '<NULL>' Memory state around the buggy address: ffff888191b8ca00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888191b8ca80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb >ffff888191b8cb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888191b8cb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888191b8cc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: 0d3c703a9d17 ("ipv6: Cleanup IPv6 tunnel receive path") Fixes: ed1efb2aefbb ("ipv6: Add support for IPsec virtual tunnel interfaces") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ipv6: explicitly initialize udp6_addr in udp_sock_create6()Cong Wang
[ Upstream commit fb24274546310872eeeaf3d1d53799d8414aa0f2 ] syzbot reported the use of uninitialized udp6_addr::sin6_scope_id. We can just set ::sin6_scope_id to zero, as tunnels are unlikely to use an IPv6 address that needs a scope id and there is no interface to bind in this context. For net-next, it looks different as we have cfg->bind_ifindex there so we can probably call ipv6_iface_scope_id(). Same for ::sin6_flowinfo, tunnels don't use it. Fixes: 8024e02879dd ("udp: Add udp_sock_create for UDP tunnels to open listener socket") Reported-by: syzbot+c56449ed3652e6720f30@syzkaller.appspotmail.com Cc: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ipv4: Fix potential Spectre v1 vulnerabilityGustavo A. R. Silva
[ Upstream commit 5648451e30a0d13d11796574919a359025d52cce ] vr.vifi is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/ipv4/ipmr.c:1616 ipmr_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap) net/ipv4/ipmr.c:1690 ipmr_compat_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap) Fix this by sanitizing vr.vifi before using it to index mrt->vif_table' Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ip6mr: Fix potential Spectre v1 vulnerabilityGustavo A. R. Silva
[ Upstream commit 69d2c86766da2ded2b70281f1bf242cb0d58a778 ] vr.mifi is indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/ipv6/ip6mr.c:1845 ip6mr_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap) net/ipv6/ip6mr.c:1919 ip6mr_compat_ioctl() warn: potential spectre issue 'mrt->vif_table' [r] (local cap) Fix this by sanitizing vr.mifi before using it to index mrt->vif_table' Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ieee802154: lowpan_header_create check must check daddrWillem de Bruijn
[ Upstream commit 40c3ff6d5e0809505a067dd423c110c5658c478c ] Packet sockets may call dev_header_parse with NULL daddr. Make lowpan_header_ops.create fail. Fixes: 87a93e4eceb4 ("ieee802154: change needed headroom/tailroom") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09gro_cell: add napi_disable in gro_cells_destroyLorenzo Bianconi
[ Upstream commit 8e1da73acded4751a93d4166458a7e640f37d26c ] Add napi_disable routine in gro_cells_destroy since starting from commit c42858eaf492 ("gro_cells: remove spinlock protecting receive queues") gro_cell_poll and gro_cells_destroy can run concurrently on napi_skbs list producing a kernel Oops if the tunnel interface is removed while gro_cell_poll is running. The following Oops has been triggered removing a vxlan device while the interface is receiving traffic [ 5628.948853] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 5628.949981] PGD 0 P4D 0 [ 5628.950308] Oops: 0002 [#1] SMP PTI [ 5628.950748] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.20.0-rc6+ #41 [ 5628.952940] RIP: 0010:gro_cell_poll+0x49/0x80 [ 5628.955615] RSP: 0018:ffffc9000004fdd8 EFLAGS: 00010202 [ 5628.956250] RAX: 0000000000000000 RBX: ffffe8ffffc08150 RCX: 0000000000000000 [ 5628.957102] RDX: 0000000000000000 RSI: ffff88802356bf00 RDI: ffffe8ffffc08150 [ 5628.957940] RBP: 0000000000000026 R08: 0000000000000000 R09: 0000000000000000 [ 5628.958803] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000040 [ 5628.959661] R13: ffffe8ffffc08100 R14: 0000000000000000 R15: 0000000000000040 [ 5628.960682] FS: 0000000000000000(0000) GS:ffff88803ea00000(0000) knlGS:0000000000000000 [ 5628.961616] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5628.962359] CR2: 0000000000000008 CR3: 000000000221c000 CR4: 00000000000006b0 [ 5628.963188] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5628.964034] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5628.964871] Call Trace: [ 5628.965179] net_rx_action+0xf0/0x380 [ 5628.965637] __do_softirq+0xc7/0x431 [ 5628.966510] run_ksoftirqd+0x24/0x30 [ 5628.966957] smpboot_thread_fn+0xc5/0x160 [ 5628.967436] kthread+0x113/0x130 [ 5628.968283] ret_from_fork+0x3a/0x50 [ 5628.968721] Modules linked in: [ 5628.969099] CR2: 0000000000000008 [ 5628.969510] ---[ end trace 9d9dedc7181661fe ]--- [ 5628.970073] RIP: 0010:gro_cell_poll+0x49/0x80 [ 5628.972965] RSP: 0018:ffffc9000004fdd8 EFLAGS: 00010202 [ 5628.973611] RAX: 0000000000000000 RBX: ffffe8ffffc08150 RCX: 0000000000000000 [ 5628.974504] RDX: 0000000000000000 RSI: ffff88802356bf00 RDI: ffffe8ffffc08150 [ 5628.975462] RBP: 0000000000000026 R08: 0000000000000000 R09: 0000000000000000 [ 5628.976413] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000040 [ 5628.977375] R13: ffffe8ffffc08100 R14: 0000000000000000 R15: 0000000000000040 [ 5628.978296] FS: 0000000000000000(0000) GS:ffff88803ea00000(0000) knlGS:0000000000000000 [ 5628.979327] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5628.980044] CR2: 0000000000000008 CR3: 000000000221c000 CR4: 00000000000006b0 [ 5628.980929] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5628.981736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5628.982409] Kernel panic - not syncing: Fatal exception in interrupt [ 5628.983307] Kernel Offset: disabled Fixes: c42858eaf492 ("gro_cells: remove spinlock protecting receive queues") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-09ax25: fix a use-after-free in ax25_fillin_cb()Cong Wang
[ Upstream commit c433570458e49bccea5c551df628d058b3526289 ] There are multiple issues here: 1. After freeing dev->ax25_ptr, we need to set it to NULL otherwise we may use a dangling pointer. 2. There is a race between ax25_setsockopt() and device notifier as reported by syzbot. Close it by holding RTNL lock. 3. We need to test if dev->ax25_ptr is NULL before using it. Reported-and-tested-by: syzbot+ae6bb869cbed29b29040@syzkaller.appspotmail.com Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-29xfrm_user: fix freeing of xfrm states on acquireMathias Krause
commit 4a135e538962cb00a9667c82e7d2b9e4d7cd7177 upstream. Commit 565f0fa902b6 ("xfrm: use a dedicated slab cache for struct xfrm_state") moved xfrm state objects to use their own slab cache. However, it missed to adapt xfrm_user to use this new cache when freeing xfrm states. Fix this by introducing and make use of a new helper for freeing xfrm_state objects. Fixes: 565f0fa902b6 ("xfrm: use a dedicated slab cache for struct xfrm_state") Reported-by: Pan Bian <bianpan2016@163.com> Cc: <stable@vger.kernel.org> # v4.18+ Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-21SUNRPC: Fix a potential race in xprt_connect()Trond Myklebust
[ Upstream commit 0a9a4304f3614e25d9de9b63502ca633c01c0d70 ] If an asynchronous connection attempt completes while another task is in xprt_connect(), then the call to rpc_sleep_on() could end up racing with the call to xprt_wake_pending_tasks(). So add a second test of the connection state after we've put the task to sleep and set the XPRT_CONNECTING flag, when we know that there can be no asynchronous connection attempts still in progress. Fixes: 0b9e79431377d ("SUNRPC: Move the test for XPRT_CONNECTING into...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17tcp: lack of available data can also cause TSO deferEric Dumazet
commit f9bfe4e6a9d08d405fe7b081ee9a13e649c97ecf upstream. tcp_tso_should_defer() can return true in three different cases : 1) We are cwnd-limited 2) We are rwnd-limited 3) We are application limited. Neal pointed out that my recent fix went too far, since it assumed that if we were not in 1) case, we must be rwnd-limited Fix this by properly populating the is_cwnd_limited and is_rwnd_limited booleans. After this change, we can finally move the silly check for FIN flag only for the application-limited case. The same move for EOR bit will be handled in net-next, since commit 1c09f7d073b1 ("tcp: do not try to defer skbs with eor mark (MSG_EOR)") is scheduled for linux-4.21 Tested by running 200 concurrent netperf -t TCP_RR -- -r 60000,100 and checking none of them was rwnd_limited in the chrono_stat output from "ss -ti" command. Fixes: 41727549de3e ("tcp: Do not underestimate rwnd_limited") Signed-off-by: Eric Dumazet <edumazet@google.com> Suggested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-17netfilter: nf_tables: deactivate expressions in rule replecement routineTaehee Yoo
[ Upstream commit ca08987885a147643817d02bf260bc4756ce8cd4 ] There is no expression deactivation call from the rule replacement path, hence, chain counter is not decremented. A few steps to reproduce the problem: %nft add table ip filter %nft add chain ip filter c1 %nft add chain ip filter c1 %nft add rule ip filter c1 jump c2 %nft replace rule ip filter c1 handle 3 accept %nft flush ruleset <jump c2> expression means immediate NFT_JUMP to chain c2. Reference count of chain c2 is increased when the rule is added. When rule is deleted or replaced, the reference counter of c2 should be decreased via nft_rule_expr_deactivate() which calls nft_immediate_deactivate(). Splat looks like: [ 214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables] [ 214.398983] Modules linked in: nf_tables nfnetlink [ 214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44 [ 214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables] [ 214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables] [ 214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8 [ 214.398983] RSP: 0018:ffff8881152874e8 EFLAGS: 00010202 [ 214.398983] RAX: 0000000000000001 RBX: ffff88810ef9fc28 RCX: ffff8881152876f0 [ 214.398983] RDX: dffffc0000000000 RSI: 1ffff11022a50ede RDI: ffff88810ef9fc78 [ 214.398983] RBP: 1ffff11022a50e9d R08: 0000000080000000 R09: 0000000000000000 [ 214.398983] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff11022a50eba [ 214.398983] R13: ffff888114446e08 R14: ffff8881152876f0 R15: ffffed1022a50ed6 [ 214.398983] FS: 0000000000000000(0000) GS:ffff888116400000(0000) knlGS:0000000000000000 [ 214.398983] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 214.398983] CR2: 00007fab9bb5f868 CR3: 000000012aa16000 CR4: 00000000001006e0 [ 214.398983] Call Trace: [ 214.398983] ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables] [ 214.398983] ? __kasan_slab_free+0x145/0x180 [ 214.398983] ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables] [ 214.398983] ? kfree+0xdb/0x280 [ 214.398983] nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables] [ ... ] Fixes: bb7b40aecbf7 ("netfilter: nf_tables: bogus EBUSY in chain deletions") Reported by: Christoph Anton Mitterer <calestyo@scientia.net> Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505 Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791 Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-17netfilter: nf_conncount: remove wrong condition check routineTaehee Yoo
[ Upstream commit 53ca0f2fec39c80ccd19e6e3f30cc8daef174b70 ] All lists that reach the tree_nodes_free() function have both zero counter and true dead flag. The reason for this is that lists to be release are selected by nf_conncount_gc_list() which already decrements the list counter and sets on the dead flag. Therefore, this if statement in tree_nodes_free() is unnecessary and wrong. Fixes: 31568ec09ea0 ("netfilter: nf_conncount: fix list_del corruption in conn_free") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>