aboutsummaryrefslogtreecommitdiffstats
path: root/net/rxrpc
AgeCommit message (Collapse)Author
2024-02-23rxrpc: Fix response to PING RESPONSE ACKs to a dead callDavid Howells
[ Upstream commit 6f769f22822aa4124b556339781b04d810f0e038 ] Stop rxrpc from sending a DUP ACK in response to a PING RESPONSE ACK on a dead call. We may have initiated the ping but the call may have beaten the response to completion. Fixes: 18bfeba50dfd ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-23rxrpc_find_service_conn_rcu: fix the usage of read_seqbegin_or_lock()Oleg Nesterov
[ Upstream commit bad1a11c0f061aa073bab785389fe04f19ba02e1 ] rxrpc_find_service_conn_rcu() should make the "seq" counter odd on the second pass, otherwise read_seqbegin_or_lock() never takes the lock. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20231117164846.GA10410@redhat.com/ Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17rxrpc: Fix hard call timeout unitsDavid Howells
[ Upstream commit 0d098d83c5d9e107b2df7f5e11f81492f56d2fe7 ] The hard call timeout is specified in the RXRPC_SET_CALL_TIMEOUT cmsg in seconds, so fix the point at which sendmsg() applies it to the call to convert to jiffies from seconds, not milliseconds. Fixes: a158bdd3247b ("rxrpc: Fix timeout of a call that hasn't yet been granted a channel") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18rxrpc: Fix missing unlock in rxrpc_do_sendmsg()David Howells
[ Upstream commit 4feb2c44629e6f9b459b41a5a60491069d346a95 ] One of the error paths in rxrpc_do_sendmsg() doesn't unlock the call mutex before returning. Fix it to do this. Note that this still doesn't get rid of the checker warning: ../net/rxrpc/sendmsg.c:617:5: warning: context imbalance in 'rxrpc_do_sendmsg' - wrong count at exit I think the interplay between the socket lock and the call's user_mutex may be too complicated for checker to analyse, especially as rxrpc_new_client_call_for_sendmsg(), which it calls, returns with the call's user_mutex if successful but unconditionally drops the socket lock. Fixes: e754eba685aa ("rxrpc: Provide a cmsg to specify the amount of Tx data for a call") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18rxrpc: Fix ack.bufferSize to be 0 when generating an ackDavid Howells
[ Upstream commit 8889a711f9b4dcf4dd1330fa493081beebd118c9 ] ack.bufferSize should be set to 0 when generating an ack. Fixes: 8d94aa381dab ("rxrpc: Calls shouldn't hold socket refs") Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28rxrpc: Fix calc of resend ageDavid Howells
[ Upstream commit 214a9dc7d852216e83acac7b75bc18f01ce184c2 ] Fix the calculation of the resend age to add a microsecond value as microseconds, not nanoseconds. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28rxrpc: Fix local destruction being repeatedDavid Howells
[ Upstream commit d3d863036d688313f8d566b87acd7d99daf82749 ] If the local processor work item for the rxrpc local endpoint gets requeued by an event (such as an incoming packet) between it getting scheduled for destruction and the UDP socket being closed, the rxrpc_local_destroyer() function can get run twice. The second time it can hang because it can end up waiting for cleanup events that will never happen. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15rxrpc: Fix an insufficiently large sglist in rxkad_verify_packet_2()David Howells
[ Upstream commit 0d40f728e28393a8817d1fcae923dfa3409e488c ] rxkad_verify_packet_2() has a small stack-allocated sglist of 4 elements, but if that isn't sufficient for the number of fragments in the socket buffer, we try to allocate an sglist large enough to hold all the fragments. However, for large packets with a lot of fragments, this isn't sufficient and we need at least one additional fragment. The problem manifests as skb_to_sgvec() returning -EMSGSIZE and this then getting returned by userspace. Most of the time, this isn't a problem as rxrpc sets a limit of 5692, big enough for 4 jumbo subpackets to be glued together; occasionally, however, the server will ignore the reported limit and give a packet that's a lot bigger - say 19852 bytes with ->nr_frags being 7. skb_to_sgvec() then tries to return a "zeroth" fragment that seems to occur before the fragments counted by ->nr_frags and we hit the end of the sglist too early. Note that __skb_to_sgvec() also has an skb_walk_frags() loop that is recursive up to 24 deep. I'm not sure if I need to take account of that too - or if there's an easy way of counting those frags too. Fix this by counting an extra frag and allocating a larger sglist based on that. Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14rxrpc: Fix decision on when to generate an IDLE ACKDavid Howells
[ Upstream commit 9a3dedcf18096e8f7f22b8777d78c4acfdea1651 ] Fix the decision on when to generate an IDLE ACK by keeping a count of the number of packets we've received, but not yet soft-ACK'd, and the number of packets we've processed, but not yet hard-ACK'd, rather than trying to keep track of which DATA sequence numbers correspond to those points. We then generate an ACK when either counter exceeds 2. The counters are both cleared when we transcribe the information into any sort of ACK packet for transmission. IDLE and DELAY ACKs are skipped if both counters are 0 (ie. no change). Fixes: 805b21b929e2 ("rxrpc: Send an ACK after every few DATA packets we receive") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14rxrpc: Don't let ack.previousPacket regressDavid Howells
[ Upstream commit 81524b6312535897707f2942695da1d359a5e56b ] The previousPacket field in the rx ACK packet should never go backwards - it's now the highest DATA sequence number received, not the last on received (it used to be used for out of sequence detection). Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14rxrpc: Fix overlapping ACK accountingDavid Howells
[ Upstream commit 8940ba3cfe4841928777fd45eaa92051522c7f0c ] Fix accidental overlapping of Rx-phase ACK accounting with Tx-phase ACK accounting through variables shared between the two. call->acks_* members refer to ACKs received in the Tx phase and call->ackr_* members to ACKs sent/to be sent during the Rx phase. Fixes: 1a2391c30c0b ("rxrpc: Fix detection of out of order acks") Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeffrey Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14rxrpc: Don't try to resend the request if we're receiving the replyDavid Howells
[ Upstream commit 114af61f88fbe34d641b13922d098ffec4c1be1b ] rxrpc has a timer to trigger resending of unacked data packets in a call. This is not cancelled when a client call switches to the receive phase on the basis that most calls don't last long enough for it to ever expire. However, if it *does* expire after we've started to receive the reply, we shouldn't then go into trying to retransmit or pinging the server to find out if an ack got lost. Fix this by skipping the resend code if we're into receiving the reply to a client call. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14rxrpc: Fix listen() setting the bar too high for the prealloc ringsDavid Howells
[ Upstream commit 88e22159750b0d55793302eeed8ee603f5c1a95c ] AF_RXRPC's listen() handler lets you set the backlog up to 32 (if you bump up the sysctl), but whilst the preallocation circular buffers have 32 slots in them, one of them has to be a dead slot because we're using CIRC_CNT(). This means that listen(rxrpc_sock, 32) will cause an oops when the socket is closed because rxrpc_service_prealloc_one() allocated one too many calls and rxrpc_discard_prealloc() won't then be able to get rid of them because it'll think the ring is empty. rxrpc_release_calls_on_socket() then tries to abort them, but oopses because call->peer isn't yet set. Fix this by setting the maximum backlog to RXRPC_BACKLOG_MAX - 1 to match the ring capacity. BUG: kernel NULL pointer dereference, address: 0000000000000086 ... RIP: 0010:rxrpc_send_abort_packet+0x73/0x240 [rxrpc] Call Trace: <TASK> ? __wake_up_common_lock+0x7a/0x90 ? rxrpc_notify_socket+0x8e/0x140 [rxrpc] ? rxrpc_abort_call+0x4c/0x60 [rxrpc] rxrpc_release_calls_on_socket+0x107/0x1a0 [rxrpc] rxrpc_release+0xc9/0x1c0 [rxrpc] __sock_release+0x37/0xa0 sock_close+0x11/0x20 __fput+0x89/0x240 task_work_run+0x59/0x90 do_exit+0x319/0xaa0 Fixes: 00e907127e6f ("rxrpc: Preallocate peers, conns and calls for incoming service requests") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: https://lists.infradead.org/pipermail/linux-afs/2022-March/005079.html Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14rxrpc: Return an error to sendmsg if call failedDavid Howells
[ Upstream commit 4ba68c5192554876bd8c3afd904e3064d2915341 ] If at the end of rxrpc sendmsg() or rxrpc_kernel_send_data() the call that was being given data was aborted remotely or otherwise failed, return an error rather than returning the amount of data buffered for transmission. The call (presumably) did not complete, so there's not much point continuing with it. AF_RXRPC considers it "complete" and so will be unwilling to do anything else with it - and won't send a notification for it, deeming the return from sendmsg sufficient. Not returning an error causes afs to incorrectly handle a StoreData operation that gets interrupted by a change of address due to NAT reconfiguration. This doesn't normally affect most operations since their request parameters tend to fit into a single UDP packet and afs_make_call() returns before the server responds; StoreData is different as it involves transmission of a lot of data. This can be triggered on a client by doing something like: dd if=/dev/zero of=/afs/example.com/foo bs=1M count=512 at one prompt, and then changing the network address at another prompt, e.g.: ifconfig enp6s0 inet 192.168.6.2 && route add 192.168.6.1 dev enp6s0 Tracing packets on an Auristor fileserver looks something like: 192.168.6.1 -> 192.168.6.3 RX 107 ACK Idle Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001 192.168.6.3 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(64538) (64538) 192.168.6.3 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(64538) (64538) 192.168.6.1 -> 192.168.6.3 RX 107 ACK Idle Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001 <ARP exchange for 192.168.6.2> 192.168.6.2 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(0) (0) 192.168.6.2 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(0) (0) 192.168.6.1 -> 192.168.6.2 RX 107 ACK Exceeds Window Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001 192.168.6.1 -> 192.168.6.2 RX 74 ABORT Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001 192.168.6.1 -> 192.168.6.2 RX 74 ABORT Seq: 29321 Call: 4 Source Port: 7000 Destination Port: 7001 The Auristor fileserver logs code -453 (RXGEN_SS_UNMARSHAL), but the abort code received by kafs is -5 (RX_PROTOCOL_ERROR) as the rx layer sees the condition and generates an abort first and the unmarshal error is a consequence of that at the application layer. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2021-December/004810.html # v1 Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-27rxrpc: Restore removed timer deletionDavid Howells
[ Upstream commit ee3b0826b4764f6c13ad6db67495c5a1c38e9025 ] A recent patch[1] from Eric Dumazet flipped the order in which the keepalive timer and the keepalive worker were cancelled in order to fix a syzbot reported issue[2]. Unfortunately, this enables the mirror image bug whereby the timer races with rxrpc_exit_net(), restarting the worker after it has been cancelled: CPU 1 CPU 2 =============== ===================== if (rxnet->live) <INTERRUPT> rxnet->live = false; cancel_work_sync(&rxnet->peer_keepalive_work); rxrpc_queue_work(&rxnet->peer_keepalive_work); del_timer_sync(&rxnet->peer_keepalive_timer); Fix this by restoring the removed del_timer_sync() so that we try to remove the timer twice. If the timer runs again, it should see ->live == false and not restart the worker. Fixes: 1946014ca3b1 ("rxrpc: fix a race in rxrpc_exit_net()") Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Dumazet <edumazet@google.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20220404183439.3537837-1-eric.dumazet@gmail.com/ [1] Link: https://syzkaller.appspot.com/bug?extid=724378c4bb58f703b09a [2] Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-15rxrpc: fix a race in rxrpc_exit_net()Eric Dumazet
[ Upstream commit 1946014ca3b19be9e485e780e862c375c6f98bad ] Current code can lead to the following race: CPU0 CPU1 rxrpc_exit_net() rxrpc_peer_keepalive_worker() if (rxnet->live) rxnet->live = false; del_timer_sync(&rxnet->peer_keepalive_timer); timer_reduce(&rxnet->peer_keepalive_timer, jiffies + delay); cancel_work_sync(&rxnet->peer_keepalive_work); rxrpc_exit_net() exits while peer_keepalive_timer is still armed, leading to use-after-free. syzbot report was: ODEBUG: free active (active state 0) object type: timer_list hint: rxrpc_peer_keepalive_timeout+0x0/0xb0 WARNING: CPU: 0 PID: 3660 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Modules linked in: CPU: 0 PID: 3660 Comm: kworker/u4:6 Not tainted 5.17.0-syzkaller-13993-g88e6c0207623 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505 Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd 00 1c 26 8a 4c 89 ee 48 c7 c7 00 10 26 8a e8 b1 e7 28 05 <0f> 0b 83 05 15 eb c5 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3 RSP: 0018:ffffc9000353fb00 EFLAGS: 00010082 RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 RDX: ffff888029196140 RSI: ffffffff815efad8 RDI: fffff520006a7f52 RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815ea4ae R11: 0000000000000000 R12: ffffffff89ce23e0 R13: ffffffff8a2614e0 R14: ffffffff816628c0 R15: dffffc0000000000 FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe1f2908924 CR3: 0000000043720000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __debug_check_no_obj_freed lib/debugobjects.c:992 [inline] debug_check_no_obj_freed+0x301/0x420 lib/debugobjects.c:1023 kfree+0xd6/0x310 mm/slab.c:3809 ops_free_list.part.0+0x119/0x370 net/core/net_namespace.c:176 ops_free_list net/core/net_namespace.c:174 [inline] cleanup_net+0x591/0xb00 net/core/net_namespace.c:598 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298 </TASK> Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Howells <dhowells@redhat.com> Cc: Marc Dionne <marc.dionne@auristor.com> Cc: linux-afs@lists.infradead.org Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-01rxrpc: Adjust retransmission backoffDavid Howells
[ Upstream commit 2c13c05c5ff4b9fc907b07f7311821910ebaaf8a ] Improve retransmission backoff by only backing off when we retransmit data packets rather than when we set the lost ack timer. To this end: (1) In rxrpc_resend(), use rxrpc_get_rto_backoff() when setting the retransmission timer and only tell it that we are retransmitting if we actually have things to retransmit. Note that it's possible for the retransmission algorithm to race with the processing of a received ACK, so we may see no packets needing retransmission. (2) In rxrpc_send_data_packet(), don't bump the backoff when setting the ack_lost_at timer, as it may then get bumped twice. With this, when looking at one particular packet, the retransmission intervals were seen to be 1.5ms, 2ms, 3ms, 5ms, 9ms, 17ms, 33ms, 71ms, 136ms, 264ms, 544ms, 1.088s, 2.1s, 4.2s and 8.3s. Fixes: c410bf01933e ("rxrpc: Fix the excessive initial retransmission timeout") Suggested-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/164138117069.2023386.17446904856843997127.stgit@warthog.procyon.org.uk/ Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-08rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer()Eiichi Tsukata
commit beacff50edbd6c9659a6f15fc7f6126909fade29 upstream. Need to call rxrpc_put_local() for peer candidate before kfree() as it holds a ref to rxrpc_local. [DH: v2: Changed to abstract the peer freeing code out into a function] Fixes: 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record") Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/all/20211121041608.133740-2-eiichi.tsukata@nutanix.com/ # v1 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17rxrpc: Fix _usecs_to_jiffies() by using usecs_to_jiffies()Jiasheng Jiang
[ Upstream commit acde891c243c1ed85b19d4d5042bdf00914f5739 ] Directly using _usecs_to_jiffies() might be unsafe, so it's better to use usecs_to_jiffies() instead. Because we can see that the result of _usecs_to_jiffies() could be larger than MAX_JIFFY_OFFSET values without the check of the input. Fixes: c410bf01933e ("Fix the excessive initial retransmission timeout") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-17rxrpc: Fix clearance of Tx/Rx ring when releasing a callDavid Howells
commit 7b5eab57cac45e270a0ad624ba157c5b30b3d44d upstream. At the end of rxrpc_release_call(), rxrpc_cleanup_ring() is called to clear the Rx/Tx skbuff ring, but this doesn't lock the ring whilst it's accessing it. Unfortunately, rxrpc_resend() might be trying to retransmit a packet concurrently with this - and whilst it does lock the ring, this isn't protection against rxrpc_cleanup_call(). Fix this by removing the call to rxrpc_cleanup_ring() from rxrpc_release_call(). rxrpc_cleanup_ring() will be called again anyway from rxrpc_cleanup_call(). The earlier call is just an optimisation to recycle skbuffs more quickly. Alternative solutions include rxrpc_release_call() could try to cancel the work item or wait for it to complete or rxrpc_cleanup_ring() could lock when accessing the ring (which would require a bh lock). This can produce a report like the following: BUG: KASAN: use-after-free in rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372 Read of size 4 at addr ffff888011606e04 by task kworker/0:0/5 ... Workqueue: krxrpcd rxrpc_process_call Call Trace: ... kasan_report.cold+0x79/0xd5 mm/kasan/report.c:413 rxrpc_send_data_packet+0x19b4/0x1e70 net/rxrpc/output.c:372 rxrpc_resend net/rxrpc/call_event.c:266 [inline] rxrpc_process_call+0x1634/0x1f60 net/rxrpc/call_event.c:412 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275 ... Allocated by task 2318: ... sock_alloc_send_pskb+0x793/0x920 net/core/sock.c:2348 rxrpc_send_data+0xb51/0x2bf0 net/rxrpc/sendmsg.c:358 rxrpc_do_sendmsg+0xc03/0x1350 net/rxrpc/sendmsg.c:744 rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:560 ... Freed by task 2318: ... kfree_skb+0x140/0x3f0 net/core/skbuff.c:704 rxrpc_free_skb+0x11d/0x150 net/rxrpc/skbuff.c:78 rxrpc_cleanup_ring net/rxrpc/call_object.c:485 [inline] rxrpc_release_call+0x5dd/0x860 net/rxrpc/call_object.c:552 rxrpc_release_calls_on_socket+0x21c/0x300 net/rxrpc/call_object.c:579 rxrpc_release_sock net/rxrpc/af_rxrpc.c:885 [inline] rxrpc_release+0x263/0x5a0 net/rxrpc/af_rxrpc.c:916 __sock_release+0xcd/0x280 net/socket.c:597 ... The buggy address belongs to the object at ffff888011606dc0 which belongs to the cache skbuff_head_cache of size 232 Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: syzbot+174de899852504e4a74a@syzkaller.appspotmail.com Reported-by: syzbot+3d1c772efafd3c38d007@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Hillf Danton <hdanton@sina.com> Link: https://lore.kernel.org/r/161234207610.653119.5287360098400436976.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-02-10rxrpc: Fix deadlock around release of dst cached on udp tunnelDavid Howells
[ Upstream commit 5399d52233c47905bbf97dcbaa2d7a9cc31670ba ] AF_RXRPC sockets use UDP ports in encap mode. This causes socket and dst from an incoming packet to get stolen and attached to the UDP socket from whence it is leaked when that socket is closed. When a network namespace is removed, the wait for dst records to be cleaned up happens before the cleanup of the rxrpc and UDP socket, meaning that the wait never finishes. Fix this by moving the rxrpc (and, by dependence, the afs) private per-network namespace registrations to the device group rather than subsys group. This allows cached rxrpc local endpoints to be cleared and their UDP sockets closed before we try waiting for the dst records. The symptom is that lines looking like the following: unregister_netdevice: waiting for lo to become free get emitted at regular intervals after running something like the referenced syzbot test. Thanks to Vadim for tracking this down and work out the fix. Reported-by: syzbot+df400f2f24a1677cd7e0@syzkaller.appspotmail.com Reported-by: Vadim Fedorenko <vfedorenko@novek.ru> Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook") Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Vadim Fedorenko <vfedorenko@novek.ru> Link: https://lore.kernel.org/r/161196443016.3868642.5577440140646403533.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-03rxrpc: Fix memory leak in rxrpc_lookup_localTakeshi Misawa
commit b8323f7288abd71794cd7b11a4c0a38b8637c8b5 upstream. Commit 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record") Then release ref in __rxrpc_put_peer and rxrpc_put_peer_locked. struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp) - peer->local = local; + peer->local = rxrpc_get_local(local); rxrpc_discard_prealloc also need ref release in discarding. syzbot report: BUG: memory leak unreferenced object 0xffff8881080ddc00 (size 256): comm "syz-executor339", pid 8462, jiffies 4294942238 (age 12.350s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 0a 00 00 00 00 c0 00 08 81 88 ff ff ................ backtrace: [<000000002b6e495f>] kmalloc include/linux/slab.h:552 [inline] [<000000002b6e495f>] kzalloc include/linux/slab.h:682 [inline] [<000000002b6e495f>] rxrpc_alloc_local net/rxrpc/local_object.c:79 [inline] [<000000002b6e495f>] rxrpc_lookup_local+0x1c1/0x760 net/rxrpc/local_object.c:244 [<000000006b43a77b>] rxrpc_bind+0x174/0x240 net/rxrpc/af_rxrpc.c:149 [<00000000fd447a55>] afs_open_socket+0xdb/0x200 fs/afs/rxrpc.c:64 [<000000007fd8867c>] afs_net_init+0x2b4/0x340 fs/afs/main.c:126 [<0000000063d80ec1>] ops_init+0x4e/0x190 net/core/net_namespace.c:152 [<00000000073c5efa>] setup_net+0xde/0x2d0 net/core/net_namespace.c:342 [<00000000a6744d5b>] copy_net_ns+0x19f/0x3e0 net/core/net_namespace.c:483 [<0000000017d3aec3>] create_new_namespaces+0x199/0x4f0 kernel/nsproxy.c:110 [<00000000186271ef>] unshare_nsproxy_namespaces+0x9b/0x120 kernel/nsproxy.c:226 [<000000002de7bac4>] ksys_unshare+0x2fe/0x5c0 kernel/fork.c:2957 [<00000000349b12ba>] __do_sys_unshare kernel/fork.c:3025 [inline] [<00000000349b12ba>] __se_sys_unshare kernel/fork.c:3023 [inline] [<00000000349b12ba>] __x64_sys_unshare+0x12/0x20 kernel/fork.c:3023 [<000000006d178ef7>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 [<00000000637076d4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record") Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com> Reported-and-tested-by: syzbot+305326672fed51b205f7@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/161183091692.3506637.3206605651502458810.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-23rxrpc: Fix handling of an unsupported token type in rxrpc_read()David Howells
[ Upstream commit d52e419ac8b50c8bef41b398ed13528e75d7ad48 ] Clang static analysis reports the following: net/rxrpc/key.c:657:11: warning: Assigned value is garbage or undefined toksize = toksizes[tok++]; ^ ~~~~~~~~~~~~~~~ rxrpc_read() contains two consecutive loops. The first loop calculates the token sizes and stores the results in toksizes[] and the second one uses the array. When there is an error in identifying the token in the first loop, the token is skipped, no change is made to the toksizes[] array. When the same error happens in the second loop, the token is not skipped. This will cause the toksizes[] array to be out of step and will overrun past the calculated sizes. Fix this by making both loops log a message and return an error in this case. This should only happen if a new token type is incompletely implemented, so it should normally be impossible to trigger this. Fixes: 9a059cd5ca7d ("rxrpc: Downgrade the BUG() for unsupported token type in rxrpc_read()") Reported-by: Tom Rix <trix@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/161046503122.2445787.16714129930607546635.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-23rxrpc: Call state should be read with READ_ONCE() under some circumstancesBaptiste Lepers
[ Upstream commit a95d25dd7b94a5ba18246da09b4218f132fed60e ] The call state may be changed at any time by the data-ready routine in response to received packets, so if the call state is to be read and acted upon several times in a function, READ_ONCE() must be used unless the call state lock is held. As it happens, we used READ_ONCE() to read the state a few lines above the unmarked read in rxrpc_input_data(), so use that value rather than re-reading it. Fixes: a158bdd3247b ("rxrpc: Fix call timeouts") Signed-off-by: Baptiste Lepers <baptiste.lepers@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/161046715522.2450566.488819910256264150.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-14rxrpc: Fix server keyring leakDavid Howells
[ Upstream commit 38b1dc47a35ba14c3f4472138ea56d014c2d609b ] If someone calls setsockopt() twice to set a server key keyring, the first keyring is leaked. Fix it to return an error instead if the server key keyring is already set. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-14rxrpc: The server keyring isn't network-namespacedDavid Howells
[ Upstream commit fea99111244bae44e7d82a973744d27ea1567814 ] The keyring containing the server's tokens isn't network-namespaced, so it shouldn't be looked up with a network namespace. It is expected to be owned specifically by the server, so namespacing is unnecessary. Fixes: a58946c158a0 ("keys: Pass the network namespace into request_key mechanism") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-14rxrpc: Fix some missing _bh annotations on locking conn->state_lockDavid Howells
[ Upstream commit fa1d113a0f96f9ab7e4fe4f8825753ba1e34a9d3 ] conn->state_lock may be taken in softirq mode, but a previous patch replaced an outer lock in the response-packet event handling code, and lost the _bh from that when doing so. Fix this by applying the _bh annotation to the state_lock locking. Fixes: a1399f8bb033 ("rxrpc: Call channels should have separate call number spaces") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-14rxrpc: Downgrade the BUG() for unsupported token type in rxrpc_read()David Howells
[ Upstream commit 9a059cd5ca7d9c5c4ca5a6e755cf72f230176b6a ] If rxrpc_read() (which allows KEYCTL_READ to read a key), sees a token of a type it doesn't recognise, it can BUG in a couple of places, which is unnecessary as it can easily get back to userspace. Fix this to print an error message instead. Fixes: 99455153d067 ("RxRPC: Parse security index 5 keys (Kerberos 5)") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-14rxrpc: Fix rxkad token xdr encodingMarc Dionne
[ Upstream commit 56305118e05b2db8d0395bba640ac9a3aee92624 ] The session key should be encoded with just the 8 data bytes and no length; ENCODE_DATA precedes it with a 4 byte length, which confuses some existing tools that try to parse this format. Add an ENCODE_BYTES macro that does not include a length, and use it for the key. Also adjust the expected length. Note that commit 774521f353e1d ("rxrpc: Fix an assertion in rxrpc_read()") had fixed a BUG by changing the length rather than fixing the encoding. The original length was correct. Fixes: 99455153d067 ("RxRPC: Parse security index 5 keys (Kerberos 5)") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09rxrpc: Make rxrpc_kernel_get_srtt() indicate validityDavid Howells
[ Upstream commit 1d4adfaf65746203861c72d9d78de349eb97d528 ] Fix rxrpc_kernel_get_srtt() to indicate the validity of the returned smoothed RTT. If we haven't had any valid samples yet, the SRTT isn't useful. Fixes: c410bf01933e ("rxrpc: Fix the excessive initial retransmission timeout") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-09rxrpc: Keep the ACK serial in a var in rxrpc_input_ack()David Howells
[ Upstream commit 68528d937dcd675e79973061c1a314db598162d1 ] Keep the ACK serial number in a variable in rxrpc_input_ack() as it's used frequently. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-11rxrpc: Fix race between recvmsg and sendmsg on immediate call failureDavid Howells
[ Upstream commit 65550098c1c4db528400c73acf3e46bfa78d9264 ] There's a race between rxrpc_sendmsg setting up a call, but then failing to send anything on it due to an error, and recvmsg() seeing the call completion occur and trying to return the state to the user. An assertion fails in rxrpc_recvmsg() because the call has already been released from the socket and is about to be released again as recvmsg deals with it. (The recvmsg_q queue on the socket holds a ref, so there's no problem with use-after-free.) We also have to be careful not to end up reporting an error twice, in such a way that both returns indicate to userspace that the user ID supplied with the call is no longer in use - which could cause the client to malfunction if it recycles the user ID fast enough. Fix this by the following means: (1) When sendmsg() creates a call after the point that the call has been successfully added to the socket, don't return any errors through sendmsg(), but rather complete the call and let recvmsg() retrieve them. Make sendmsg() return 0 at this point. Further calls to sendmsg() for that call will fail with ESHUTDOWN. Note that at this point, we haven't send any packets yet, so the server doesn't yet know about the call. (2) If sendmsg() returns an error when it was expected to create a new call, it means that the user ID wasn't used. (3) Mark the call disconnected before marking it completed to prevent an oops in rxrpc_release_call(). (4) recvmsg() will then retrieve the error and set MSG_EOR to indicate that the user ID is no longer known by the kernel. An oops like the following is produced: kernel BUG at net/rxrpc/recvmsg.c:605! ... RIP: 0010:rxrpc_recvmsg+0x256/0x5ae ... Call Trace: ? __init_waitqueue_head+0x2f/0x2f ____sys_recvmsg+0x8a/0x148 ? import_iovec+0x69/0x9c ? copy_msghdr_from_user+0x5c/0x86 ___sys_recvmsg+0x72/0xaa ? __fget_files+0x22/0x57 ? __fget_light+0x46/0x51 ? fdget+0x9/0x1b do_recvmmsg+0x15e/0x232 ? _raw_spin_unlock+0xa/0xb ? vtime_delta+0xf/0x25 __x64_sys_recvmmsg+0x2c/0x2f do_syscall_64+0x4c/0x78 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 357f5ef64628 ("rxrpc: Call rxrpc_release_call() on error in rxrpc_new_client_call()") Reported-by: syzbot+b54969381df354936d96@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31rxrpc: Fix sendmsg() returning EPIPE due to recvmsg() returning ENODATADavid Howells
[ Upstream commit 639f181f0ee20d3249dbc55f740f0167267180f0 ] rxrpc_sendmsg() returns EPIPE if there's an outstanding error, such as if rxrpc_recvmsg() indicating ENODATA if there's nothing for it to read. Change rxrpc_recvmsg() to return EAGAIN instead if there's nothing to read as this particular error doesn't get stored in ->sk_err by the networking core. Also change rxrpc_sendmsg() so that it doesn't fail with delayed receive errors (there's no way for it to report which call, if any, the error was caused by). Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-09rxrpc: Fix afs large storage transmission performance dropDavid Howells
[ Upstream commit 02c28dffb13abbaaedece1e4a6493b48ad3f913a ] Commit 2ad6691d988c, which moved the modification of the status annotation for a packet in the Tx buffer prior to the retransmission moved the state clearance, but managed to lose the bit that set it to UNACK. Consequently, if a retransmission occurs, the packet is accidentally changed to the ACK state (ie. 0) by masking it off, which means that the packet isn't counted towards the tally of newly-ACK'd packets if it gets hard-ACK'd. This then prevents the congestion control algorithm from recovering properly. Fix by reinstating the change of state to UNACK. Spotted by the generic/460 xfstest. Fixes: 2ad6691d988c ("rxrpc: Fix race between incoming ACK parser and retransmitter") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-09rxrpc: Fix race between incoming ACK parser and retransmitterDavid Howells
[ Upstream commit 2ad6691d988c0c611362ddc2aad89e0fb50e3261 ] There's a race between the retransmission code and the received ACK parser. The problem is that the retransmission loop has to drop the lock under which it is iterating through the transmission buffer in order to transmit a packet, but whilst the lock is dropped, the ACK parser can crank the Tx window round and discard the packets from the buffer. The retransmission code then updated the annotations for the wrong packet and a later retransmission thought it had to retransmit a packet that wasn't there, leading to a NULL pointer dereference. Fix this by: (1) Moving the annotation change to before we drop the lock prior to transmission. This means we can't vary the annotation depending on the outcome of the transmission, but that's fine - we'll retransmit again later if it failed now. (2) Skipping the packet if the skb pointer is NULL. The following oops was seen: BUG: kernel NULL pointer dereference, address: 000000000000002d Workqueue: krxrpcd rxrpc_process_call RIP: 0010:rxrpc_get_skb+0x14/0x8a ... Call Trace: rxrpc_resend+0x331/0x41e ? get_vtime_delta+0x13/0x20 rxrpc_process_call+0x3c0/0x4ac process_one_work+0x18f/0x27f worker_thread+0x1a3/0x247 ? create_worker+0x17d/0x17d kthread+0xe6/0xeb ? kthread_delayed_work_timer_fn+0x83/0x83 ret_from_fork+0x1f/0x30 Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30rxrpc: Fix handling of rwind from an ACK packetDavid Howells
[ Upstream commit a2ad7c21ad8cf1ce4ad65e13df1c2a1c29b38ac5 ] The handling of the receive window size (rwind) from a received ACK packet is not correct. The rxrpc_input_ackinfo() function currently checks the current Tx window size against the rwind from the ACK to see if it has changed, but then limits the rwind size before storing it in the tx_winsize member and, if it increased, wake up the transmitting process. This means that if rwind > RXRPC_RXTX_BUFF_SIZE - 1, this path will always be followed. Fix this by limiting rwind before we compare it to tx_winsize. The effect of this can be seen by enabling the rxrpc_rx_rwind_change tracepoint. Fixes: 702f2ac87a9a ("rxrpc: Wake up the transmitter if Rx window size increases on the peer") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30rxrpc: Fix notification call on completion of discarded callsDavid Howells
[ Upstream commit 0041cd5a50442db6e456b145892a0eaf2dff061f ] When preallocated service calls are being discarded, they're passed to ->discard_new_call() to have the caller clean up any attached higher-layer preallocated pieces before being marked completed. However, the act of marking them completed now invokes the call's notification function - which causes a problem because that function might assume that the previously freed pieces of memory are still there. Fix this by setting a dummy notification function on the socket after calling ->discard_new_call(). This results in the following kasan message when the kafs module is removed. ================================================================== BUG: KASAN: use-after-free in afs_wake_up_async_call+0x6aa/0x770 fs/afs/rxrpc.c:707 Write of size 1 at addr ffff8880946c39e4 by task kworker/u4:1/21 CPU: 0 PID: 21 Comm: kworker/u4:1 Not tainted 5.8.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x18f/0x20d lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 afs_wake_up_async_call+0x6aa/0x770 fs/afs/rxrpc.c:707 rxrpc_notify_socket+0x1db/0x5d0 net/rxrpc/recvmsg.c:40 __rxrpc_set_call_completion.part.0+0x172/0x410 net/rxrpc/recvmsg.c:76 __rxrpc_call_completed net/rxrpc/recvmsg.c:112 [inline] rxrpc_call_completed+0xca/0xf0 net/rxrpc/recvmsg.c:111 rxrpc_discard_prealloc+0x781/0xab0 net/rxrpc/call_accept.c:233 rxrpc_listen+0x147/0x360 net/rxrpc/af_rxrpc.c:245 afs_close_socket+0x95/0x320 fs/afs/rxrpc.c:110 afs_net_exit+0x1bc/0x310 fs/afs/main.c:155 ops_exit_list.isra.0+0xa8/0x150 net/core/net_namespace.c:186 cleanup_net+0x511/0xa50 net/core/net_namespace.c:603 process_one_work+0x965/0x1690 kernel/workqueue.c:2269 worker_thread+0x96/0xe10 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 Allocated by task 6820: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc mm/kasan/common.c:494 [inline] __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:467 kmem_cache_alloc_trace+0x153/0x7d0 mm/slab.c:3551 kmalloc include/linux/slab.h:555 [inline] kzalloc include/linux/slab.h:669 [inline] afs_alloc_call+0x55/0x630 fs/afs/rxrpc.c:141 afs_charge_preallocation+0xe9/0x2d0 fs/afs/rxrpc.c:757 afs_open_socket+0x292/0x360 fs/afs/rxrpc.c:92 afs_net_init+0xa6c/0xe30 fs/afs/main.c:125 ops_init+0xaf/0x420 net/core/net_namespace.c:151 setup_net+0x2de/0x860 net/core/net_namespace.c:341 copy_net_ns+0x293/0x590 net/core/net_namespace.c:482 create_new_namespaces+0x3fb/0xb30 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0xbd/0x1f0 kernel/nsproxy.c:231 ksys_unshare+0x43d/0x8e0 kernel/fork.c:2983 __do_sys_unshare kernel/fork.c:3051 [inline] __se_sys_unshare kernel/fork.c:3049 [inline] __x64_sys_unshare+0x2d/0x40 kernel/fork.c:3049 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 21: save_stack+0x1b/0x40 mm/kasan/common.c:48 set_track mm/kasan/common.c:56 [inline] kasan_set_free_info mm/kasan/common.c:316 [inline] __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:455 __cache_free mm/slab.c:3426 [inline] kfree+0x109/0x2b0 mm/slab.c:3757 afs_put_call+0x585/0xa40 fs/afs/rxrpc.c:190 rxrpc_discard_prealloc+0x764/0xab0 net/rxrpc/call_accept.c:230 rxrpc_listen+0x147/0x360 net/rxrpc/af_rxrpc.c:245 afs_close_socket+0x95/0x320 fs/afs/rxrpc.c:110 afs_net_exit+0x1bc/0x310 fs/afs/main.c:155 ops_exit_list.isra.0+0xa8/0x150 net/core/net_namespace.c:186 cleanup_net+0x511/0xa50 net/core/net_namespace.c:603 process_one_work+0x965/0x1690 kernel/workqueue.c:2269 worker_thread+0x96/0xe10 kernel/workqueue.c:2415 kthread+0x3b5/0x4a0 kernel/kthread.c:291 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 The buggy address belongs to the object at ffff8880946c3800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 484 bytes inside of 1024-byte region [ffff8880946c3800, ffff8880946c3c00) The buggy address belongs to the page: page:ffffea000251b0c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0xfffe0000000200(slab) raw: 00fffe0000000200 ffffea0002546508 ffffea00024fa248 ffff8880aa000c40 raw: 0000000000000000 ffff8880946c3000 0000000100000002 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880946c3880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880946c3900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880946c3980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880946c3a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880946c3a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Reported-by: syzbot+d3eccef36ddbd02713e9@syzkaller.appspotmail.com Fixes: 5ac0d62226a0 ("rxrpc: Fix missing notification") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-24rxrpc: Adjust /proc/net/rxrpc/calls to display call->debug_id not user_IDDavid Howells
[ Upstream commit 32f71aa497cfb23d37149c2ef16ad71fce2e45e2 ] The user ID value isn't actually much use - and leaks a kernel pointer or a userspace value - so replace it with the call debug ID, which appears in trace points. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27rxrpc: Fix ack discardDavid Howells
[ Upstream commit 441fdee1eaf050ef0040bde0d7af075c1c6a6d8b ] The Rx protocol has a "previousPacket" field in it that is not handled in the same way by all protocol implementations. Sometimes it contains the serial number of the last DATA packet received, sometimes the sequence number of the last DATA packet received and sometimes the highest sequence number so far received. AF_RXRPC is using this to weed out ACKs that are out of date (it's possible for ACK packets to get reordered on the wire), but this does not work with OpenAFS which will just stick the sequence number of the last packet seen into previousPacket. The issue being seen is that big AFS FS.StoreData RPC (eg. of ~256MiB) are timing out when partly sent. A trace was captured, with an additional tracepoint to show ACKs being discarded in rxrpc_input_ack(). Here's an excerpt showing the problem. 52873.203230: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 0002449c q=00024499 fl=09 A DATA packet with sequence number 00024499 has been transmitted (the "q=" field). ... 52873.243296: rxrpc_rx_ack: c=000004ae 00012a2b DLY r=00024499 f=00024497 p=00024496 n=0 52873.243376: rxrpc_rx_ack: c=000004ae 00012a2c IDL r=0002449b f=00024499 p=00024498 n=0 52873.243383: rxrpc_rx_ack: c=000004ae 00012a2d OOS r=0002449d f=00024499 p=0002449a n=2 The Out-Of-Sequence ACK indicates that the server didn't see DATA sequence number 00024499, but did see seq 0002449a (previousPacket, shown as "p=", skipped the number, but firstPacket, "f=", which shows the bottom of the window is set at that point). 52873.252663: rxrpc_retransmit: c=000004ae q=24499 a=02 xp=14581537 52873.252664: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244bc q=00024499 fl=0b *RETRANS* The packet has been retransmitted. Retransmission recurs until the peer says it got the packet. 52873.271013: rxrpc_rx_ack: c=000004ae 00012a31 OOS r=000244a1 f=00024499 p=0002449e n=6 More OOS ACKs indicate that the other packets that are already in the transmission pipeline are being received. The specific-ACK list is up to 6 ACKs and NAKs. ... 52873.284792: rxrpc_rx_ack: c=000004ae 00012a49 OOS r=000244b9 f=00024499 p=000244b6 n=30 52873.284802: rxrpc_retransmit: c=000004ae q=24499 a=0a xp=63505500 52873.284804: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244c2 q=00024499 fl=0b *RETRANS* 52873.287468: rxrpc_rx_ack: c=000004ae 00012a4a OOS r=000244ba f=00024499 p=000244b7 n=31 52873.287478: rxrpc_rx_ack: c=000004ae 00012a4b OOS r=000244bb f=00024499 p=000244b8 n=32 At this point, the server's receive window is full (n=32) with presumably 1 NAK'd packet and 31 ACK'd packets. We can't transmit any more packets. 52873.287488: rxrpc_retransmit: c=000004ae q=24499 a=0a xp=61327980 52873.287489: rxrpc_tx_data: c=000004ae DATA ed1a3584:00000002 000244c3 q=00024499 fl=0b *RETRANS* 52873.293850: rxrpc_rx_ack: c=000004ae 00012a4c DLY r=000244bc f=000244a0 p=00024499 n=25 And now we've received an ACK indicating that a DATA retransmission was received. 7 packets have been processed (the occupied part of the window moved, as indicated by f= and n=). 52873.293853: rxrpc_rx_discard_ack: c=000004ae r=00012a4c 000244a0<00024499 00024499<000244b8 However, the DLY ACK gets discarded because its previousPacket has gone backwards (from p=000244b8, in the ACK at 52873.287478 to p=00024499 in the ACK at 52873.293850). We then end up in a continuous cycle of retransmit/discard. kafs fails to update its window because it's discarding the ACKs and can't transmit an extra packet that would clear the issue because the window is full. OpenAFS doesn't change the previousPacket value in the ACKs because no new DATA packets are received with a different previousPacket number. Fix this by altering the discard check to only discard an ACK based on previousPacket if there was no advance in the firstPacket. This allows us to transmit a new packet which will cause previousPacket to advance in the next ACK. The check, however, needs to allow for the possibility that previousPacket may actually have had the serial number placed in it instead - in which case it will go outside the window and we should ignore it. Fixes: 1a2391c30c0b ("rxrpc: Fix detection of out of order acks") Reported-by: Dave Botsch <botsch@cnf.cornell.edu> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27rxrpc: Trace discarded ACKsDavid Howells
[ Upstream commit d1f129470e6cb79b8b97fecd12689f6eb49e27fe ] Add a tracepoint to track received ACKs that are discarded due to being outside of the Tx window. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27rxrpc: Fix a memory leak in rxkad_verify_response()Qiushi Wu
commit f45d01f4f30b53c3a0a1c6c1c154acb7ff74ab9f upstream. A ticket was not released after a call of the function "rxkad_decrypt_ticket" failed. Thus replace the jump target "temporary_error_free_resp" by "temporary_error_free_ticket". Fixes: 8c2f826dc3631 ("rxrpc: Don't put crypto buffers on the stack") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: David Howells <dhowells@redhat.com> cc: Markus Elfring <Markus.Elfring@web.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27rxrpc: Fix the excessive initial retransmission timeoutDavid Howells
commit c410bf01933e5e09d142c66c3df9ad470a7eec13 upstream. rxrpc currently uses a fixed 4s retransmission timeout until the RTT is sufficiently sampled. This can cause problems with some fileservers with calls to the cache manager in the afs filesystem being dropped from the fileserver because a packet goes missing and the retransmission timeout is greater than the call expiry timeout. Fix this by: (1) Copying the RTT/RTO calculation code from Linux's TCP implementation and altering it to fit rxrpc. (2) Altering the various users of the RTT to make use of the new SRTT value. (3) Replacing the use of rxrpc_resend_timeout to use the calculated RTO value instead (which is needed in jiffies), along with a backoff. Notes: (1) rxrpc provides RTT samples by matching the serial numbers on outgoing DATA packets that have the RXRPC_REQUEST_ACK set and PING ACK packets against the reference serial number in incoming REQUESTED ACK and PING-RESPONSE ACK packets. (2) Each packet that is transmitted on an rxrpc connection gets a new per-connection serial number, even for retransmissions, so an ACK can be cross-referenced to a specific trigger packet. This allows RTT information to be drawn from retransmitted DATA packets also. (3) rxrpc maintains the RTT/RTO state on the rxrpc_peer record rather than on an rxrpc_call because many RPC calls won't live long enough to generate more than one sample. (4) The calculated SRTT value is in units of 8ths of a microsecond rather than nanoseconds. The (S)RTT and RTO values are displayed in /proc/net/rxrpc/peers. Fixes: 17926a79320a ([AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both"") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-02rxrpc: Fix DATA Tx to disable nofrag for UDP on AF_INET6 socketDavid Howells
commit 0e631eee17dcea576ab922fa70e4fdbd596ee452 upstream. Fix the DATA packet transmission to disable nofrag for UDPv4 on an AF_INET6 socket as well as UDPv6 when trying to transmit fragmentably. Without this, packets filled to the normal size used by the kernel AFS client of 1412 bytes be rejected by udp_sendmsg() with EMSGSIZE immediately. The ->sk_error_report() notification hook is called, but rxrpc doesn't generate a trace for it. This is a temporary fix; a more permanent solution needs to involve changing the size of the packets being filled in accordance with the MTU, which isn't currently done in AF_RXRPC. The reason for not doing so was that, barring the last packet in an rx jumbo packet, jumbos can only be assembled out of 1412-byte packets - and the plan was to construct jumbos on the fly at transmission time. Also, there's no point turning on IPV6_MTU_DISCOVER, since IPv6 has to engage in this anyway since fragmentation is only done by the sender. We can then condense the switch-statement in rxrpc_send_data_packet(). Fixes: 75b54cb57ca3 ("rxrpc: Add IPv6 support") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-23KEYS: Don't write out to userspace while holding key semaphoreWaiman Long
commit d3ec10aa95819bff18a0d936b18884c7816d0914 upstream. A lockdep circular locking dependency report was seen when running a keyutils test: [12537.027242] ====================================================== [12537.059309] WARNING: possible circular locking dependency detected [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE --------- - - [12537.125253] ------------------------------------------------------ [12537.153189] keyctl/25598 is trying to acquire lock: [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0 [12537.208365] [12537.208365] but task is already holding lock: [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12537.270476] [12537.270476] which lock already depends on the new lock. [12537.270476] [12537.307209] [12537.307209] the existing dependency chain (in reverse order) is: [12537.340754] [12537.340754] -> #3 (&type->lock_class){++++}: [12537.367434] down_write+0x4d/0x110 [12537.385202] __key_link_begin+0x87/0x280 [12537.405232] request_key_and_link+0x483/0xf70 [12537.427221] request_key+0x3c/0x80 [12537.444839] dns_query+0x1db/0x5a5 [dns_resolver] [12537.468445] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.496731] cifs_reconnect+0xe04/0x2500 [cifs] [12537.519418] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.546263] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.573551] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.601045] kthread+0x30c/0x3d0 [12537.617906] ret_from_fork+0x3a/0x50 [12537.636225] [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}: [12537.664525] __mutex_lock+0x105/0x11f0 [12537.683734] request_key_and_link+0x35a/0xf70 [12537.705640] request_key+0x3c/0x80 [12537.723304] dns_query+0x1db/0x5a5 [dns_resolver] [12537.746773] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.775607] cifs_reconnect+0xe04/0x2500 [cifs] [12537.798322] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.823369] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.847262] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.873477] kthread+0x30c/0x3d0 [12537.890281] ret_from_fork+0x3a/0x50 [12537.908649] [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}: [12537.935225] __mutex_lock+0x105/0x11f0 [12537.954450] cifs_call_async+0x102/0x7f0 [cifs] [12537.977250] smb2_async_readv+0x6c3/0xc90 [cifs] [12538.000659] cifs_readpages+0x120a/0x1e50 [cifs] [12538.023920] read_pages+0xf5/0x560 [12538.041583] __do_page_cache_readahead+0x41d/0x4b0 [12538.067047] ondemand_readahead+0x44c/0xc10 [12538.092069] filemap_fault+0xec1/0x1830 [12538.111637] __do_fault+0x82/0x260 [12538.129216] do_fault+0x419/0xfb0 [12538.146390] __handle_mm_fault+0x862/0xdf0 [12538.167408] handle_mm_fault+0x154/0x550 [12538.187401] __do_page_fault+0x42f/0xa60 [12538.207395] do_page_fault+0x38/0x5e0 [12538.225777] page_fault+0x1e/0x30 [12538.243010] [12538.243010] -> #0 (&mm->mmap_sem){++++}: [12538.267875] lock_acquire+0x14c/0x420 [12538.286848] __might_fault+0x119/0x1b0 [12538.306006] keyring_read_iterator+0x7e/0x170 [12538.327936] assoc_array_subtree_iterate+0x97/0x280 [12538.352154] keyring_read+0xe9/0x110 [12538.370558] keyctl_read_key+0x1b9/0x220 [12538.391470] do_syscall_64+0xa5/0x4b0 [12538.410511] entry_SYSCALL_64_after_hwframe+0x6a/0xdf [12538.435535] [12538.435535] other info that might help us debug this: [12538.435535] [12538.472829] Chain exists of: [12538.472829] &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class [12538.472829] [12538.524820] Possible unsafe locking scenario: [12538.524820] [12538.551431] CPU0 CPU1 [12538.572654] ---- ---- [12538.595865] lock(&type->lock_class); [12538.613737] lock(root_key_user.cons_lock); [12538.644234] lock(&type->lock_class); [12538.672410] lock(&mm->mmap_sem); [12538.687758] [12538.687758] *** DEADLOCK *** [12538.687758] [12538.714455] 1 lock held by keyctl/25598: [12538.732097] #0: 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12538.770573] [12538.770573] stack backtrace: [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015 [12538.881963] Call Trace: [12538.892897] dump_stack+0x9a/0xf0 [12538.907908] print_circular_bug.isra.25.cold.50+0x1bc/0x279 [12538.932891] ? save_trace+0xd6/0x250 [12538.948979] check_prev_add.constprop.32+0xc36/0x14f0 [12538.971643] ? keyring_compare_object+0x104/0x190 [12538.992738] ? check_usage+0x550/0x550 [12539.009845] ? sched_clock+0x5/0x10 [12539.025484] ? sched_clock_cpu+0x18/0x1e0 [12539.043555] __lock_acquire+0x1f12/0x38d0 [12539.061551] ? trace_hardirqs_on+0x10/0x10 [12539.080554] lock_acquire+0x14c/0x420 [12539.100330] ? __might_fault+0xc4/0x1b0 [12539.119079] __might_fault+0x119/0x1b0 [12539.135869] ? __might_fault+0xc4/0x1b0 [12539.153234] keyring_read_iterator+0x7e/0x170 [12539.172787] ? keyring_read+0x110/0x110 [12539.190059] assoc_array_subtree_iterate+0x97/0x280 [12539.211526] keyring_read+0xe9/0x110 [12539.227561] ? keyring_gc_check_iterator+0xc0/0xc0 [12539.249076] keyctl_read_key+0x1b9/0x220 [12539.266660] do_syscall_64+0xa5/0x4b0 [12539.283091] entry_SYSCALL_64_after_hwframe+0x6a/0xdf One way to prevent this deadlock scenario from happening is to not allow writing to userspace while holding the key semaphore. Instead, an internal buffer is allocated for getting the keys out from the read method first before copying them out to userspace without holding the lock. That requires taking out the __user modifier from all the relevant read methods as well as additional changes to not use any userspace write helpers. That is, 1) The put_user() call is replaced by a direct copy. 2) The copy_to_user() call is replaced by memcpy(). 3) All the fault handling code is removed. Compiling on a x86-64 system, the size of the rxrpc_read() function is reduced from 3795 bytes to 2384 bytes with this patch. Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17rxrpc: Fix call interruptibility handlingDavid Howells
[ Upstream commit e138aa7d3271ac1b0690ae2c9b04d51468dce1d6 ] Fix the interruptibility of kernel-initiated client calls so that they're either only interruptible when they're waiting for a call slot to come available or they're not interruptible at all. Either way, they're not interruptible during transmission. This should help prevent StoreData calls from being interrupted when writeback is in progress. It doesn't, however, handle interruption during the receive phase. Userspace-initiated calls are still interruptable. After the signal has been handled, sendmsg() will return the amount of data copied out of the buffer and userspace can perform another sendmsg() call to continue transmission. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17rxrpc: Abstract out the calculation of whether there's Tx spaceDavid Howells
[ Upstream commit 158fe6665389964a1de212818b4a5c52b7f7aff4 ] Abstract out the calculation of there being sufficient Tx buffer space. This is reproduced several times in the rxrpc sendmsg code. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-08rxrpc: Fix sendmsg(MSG_WAITALL) handlingDavid Howells
commit 498b577660f08cef5d9e78e0ed6dcd4c0939e98c upstream. Fix the handling of sendmsg() with MSG_WAITALL for userspace to round the timeout for when a signal occurs up to at least two jiffies as a 1 jiffy timeout may end up being effectively 0 if jiffies wraps at the wrong time. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-01afs: Fix client call Rx-phase signal handlingDavid Howells
commit 7d7587db0d7fd1138f2afcffdc46a8e15630b944 upstream. Fix the handling of signals in client rxrpc calls made by the afs filesystem. Ignore signals completely, leaving call abandonment or connection loss to be detected by timeouts inside AF_RXRPC. Allowing a filesystem call to be interrupted after the entire request has been transmitted and an abort sent means that the server may or may not have done the action - and we don't know. It may even be worse than that for older servers. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-28rxrpc: Fix call RCU cleanup using non-bh-safe locksDavid Howells
commit 963485d436ccc2810177a7b08af22336ec2af67b upstream. rxrpc_rcu_destroy_call(), which is called as an RCU callback to clean up a put call, calls rxrpc_put_connection() which, deep in its bowels, takes a number of spinlocks in a non-BH-safe way, including rxrpc_conn_id_lock and local->client_conns_lock. RCU callbacks, however, are normally called from softirq context, which can cause lockdep to notice the locking inconsistency. To get lockdep to detect this, it's necessary to have the connection cleaned up on the put at the end of the last of its calls, though normally the clean up is deferred. This can be induced, however, by starting a call on an AF_RXRPC socket and then closing the socket without reading the reply. Fix this by having rxrpc_rcu_destroy_call() punt the destruction to a workqueue if in softirq-mode and defer the destruction to process context. Note that another way to fix this could be to add a bunch of bh-disable annotations to the spinlocks concerned - and there might be more than just those two - but that means spending more time with BHs disabled. Note also that some of these places were covered by bh-disable spinlocks belonging to the rxrpc_transport object, but these got removed without the _bh annotation being retained on the next lock in. Fixes: 999b69f89241 ("rxrpc: Kill the client connection bundle concept") Reported-by: syzbot+d82f3ac8d87e7ccbb2c9@syzkaller.appspotmail.com Reported-by: syzbot+3f1fd6b8cbf8702d134e@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Hillf Danton <hdanton@sina.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-11rxrpc: Fix service call disconnectionDavid Howells
[ Upstream commit b39a934ec72fa2b5a74123891f25273a38378b90 ] The recent patch that substituted a flag on an rxrpc_call for the connection pointer being NULL as an indication that a call was disconnected puts the set_bit in the wrong place for service calls. This is only a problem if a call is implicitly terminated by a new call coming in on the same connection channel instead of a terminating ACK packet. In such a case, rxrpc_input_implicit_end_call() calls __rxrpc_disconnect_call(), which is now (incorrectly) setting the disconnection bit, meaning that when rxrpc_release_call() is later called, it doesn't call rxrpc_disconnect_call() and so the call isn't removed from the peer's error distribution list and the list gets corrupted. KASAN finds the issue as an access after release on a call, but the position at which it occurs is confusing as it appears to be related to a different call (the call site is where the latter call is being removed from the error distribution list and either the next or pprev pointer points to a previously released call). Fix this by moving the setting of the flag from __rxrpc_disconnect_call() to rxrpc_disconnect_call() in the same place that the connection pointer was being cleared. Fixes: 5273a191dca6 ("rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>