summaryrefslogtreecommitdiffstats
path: root/net/sunrpc/xprtsock.c
AgeCommit message (Collapse)Author
2020-04-17SUNRPC: Fix backchannel RPC soft lockupsChuck Lever
Currently, after the forward channel connection goes away, backchannel operations are causing soft lockups on the server because call_transmit_status's SOFTCONN logic ignores ENOTCONN. Such backchannel Calls are aggressively retried until the client reconnects. Backchannel Calls should use RPC_TASK_NOCONNECT rather than RPC_TASK_SOFTCONN. If there is no forward connection, the server is not capable of establishing a connection back to the client, thus that backchannel request should fail before the server attempts to send it. Commit 58255a4e3ce5 ("NFSD: NFSv4 callback client should use RPC_TASK_SOFTCONN") was merged several years before RPC_TASK_NOCONNECT was available. Because setup_callback_client() explicitly sets NOPING, the NFSv4.0 callback connection depends on the first callback RPC to initiate a connection to the client. Thus NFSv4.0 needs to continue to use RPC_TASK_SOFTCONN. Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@vger.kernel.org> # v4.20+
2020-04-07Merge tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page leak in nfs_destroy_unlinked_subrequests() - Fix use-after-free issues in nfs_pageio_add_request() - Fix new mount code constant_table array definitions - finish_automount() requires us to hold 2 refs to the mount record Features: - Improve the accuracy of telldir/seekdir by using 64-bit cookies when possible. - Allow one RDMA active connection and several zombie connections to prevent blocking if the remote server is unresponsive. - Limit the size of the NFS access cache by default - Reduce the number of references to credentials that are taken by NFS - pNFS files and flexfiles drivers now support per-layout segment COMMIT lists. - Enable partial-file layout segments in the pNFS/flexfiles driver. - Add support for CB_RECALL_ANY to the pNFS flexfiles layout type - pNFS/flexfiles Report NFS4ERR_DELAY and NFS4ERR_GRACE errors from the DS using the layouterror mechanism. Bugfixes and cleanups: - SUNRPC: Fix krb5p regressions - Don't specify NFS version in "UDP not supported" error - nfsroot: set tcp as the default transport protocol - pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid() - alloc_nfs_open_context() must use the file cred when available - Fix locking when dereferencing the delegation cred - Fix memory leaks in O_DIRECT when nfs_get_lock_context() fails - Various clean ups of the NFS O_DIRECT commit code - Clean up RDMA connect/disconnect - Replace zero-length arrays with C99-style flexible arrays" * tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (86 commits) NFS: Clean up process of marking inode stale. SUNRPC: Don't start a timer on an already queued rpc task NFS/pnfs: Reference the layout cred in pnfs_prepare_layoutreturn() NFS/pnfs: Fix dereference of layout cred in pnfs_layoutcommit_inode() NFS: Beware when dereferencing the delegation cred NFS: Add a module parameter to set nfs_mountpoint_expiry_timeout NFS: finish_automount() requires us to hold 2 refs to the mount record NFS: Fix a few constant_table array definitions NFS: Try to join page groups before an O_DIRECT retransmission NFS: Refactor nfs_lock_and_join_requests() NFS: Reverse the submission order of requests in __nfs_pageio_add_request() NFS: Clean up nfs_lock_and_join_requests() NFS: Remove the redundant function nfs_pgio_has_mirroring() NFS: Fix memory leaks in nfs_pageio_stop_mirroring() NFS: Fix a request reference leak in nfs_direct_write_clear_reqs() NFS: Fix use-after-free issues in nfs_pageio_add_request() NFS: Fix races nfs_page_group_destroy() vs nfs_destroy_unlinked_subrequests() NFS: Fix a page leak in nfs_destroy_unlinked_subrequests() NFS: Remove unused FLUSH_SYNC support in nfs_initiate_pgio() pNFS/flexfiles: Specify the layout segment range in LAYOUTGET ...
2020-03-16SUNRPC: Teach server to use xprt_sock_sendmsg for socket sendsChuck Lever
xprt_sock_sendmsg uses the more efficient iov_iter-enabled kernel socket API, and is a pre-requisite for server send-side support for TLS. Note that svc_process no longer needs to reserve a word for the stream record marker, since the TCP transport now provides the record marker automatically in a separate buffer. The dprintk() in svc_send_common is also removed. It didn't seem crucial for field troubleshooting. If more is needed there, a trace point could be added in xprt_sock_sendmsg(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16SUNRPC: Refactor xs_sendpages()Chuck Lever
Re-locate xs_sendpages() so that it can be shared with server code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-03-16SUNRPC: remove redundant assignments to variable statusColin Ian King
The variable status is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-12-07Merge tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "This is a relatively quiet cycle for nfsd, mainly various bugfixes. Possibly most interesting is Trond's fixes for some callback races that were due to my incomplete understanding of rpc client shutdown. Unfortunately at the last minute I've started noticing a new intermittent failure to send callbacks. As the logic seems basically correct, I'm leaving Trond's patches in for now, and hope to find a fix in the next week so I don't have to revert those patches" * tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux: (24 commits) nfsd: depend on CRYPTO_MD5 for legacy client tracking NFSD fixing possible null pointer derefering in copy offload nfsd: check for EBUSY from vfs_rmdir/vfs_unink. nfsd: Ensure CLONE persists data and metadata changes to the target file SUNRPC: Fix backchannel latency metrics nfsd: restore NFSv3 ACL support nfsd: v4 support requires CRYPTO_SHA256 nfsd: Fix cld_net->cn_tfm initialization lockd: remove __KERNEL__ ifdefs sunrpc: remove __KERNEL__ ifdefs race in exportfs_decode_fh() nfsd: Drop LIST_HEAD where the variable it declares is never used. nfsd: document callback_wq serialization of callback code nfsd: mark cb path down on unknown errors nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback() nfsd: minor 4.1 callback cleanup SUNRPC: Fix svcauth_gss_proxy_init() SUNRPC: Trace gssproxy upcall results sunrpc: fix crash when cache_head become valid before update nfsd: remove private bin2hex implementation ...
2019-11-21SUNRPC: Fix backchannel latency metricsChuck Lever
I noticed that for callback requests, the reported backlog latency is always zero, and the rtt value is crazy big. The problem was that rqst->rq_xtime is never set for backchannel requests. Fixes: 78215759e20d ("SUNRPC: Make RTT measurement more ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-11-03NFSv4.1: Don't rebind to the same source port when reconnecting to the serverTrond Myklebust
NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look at the source port when deciding whether or not an RPC call is a replay of a previous call. This requires clients to perform strange TCP gymnastics in order to ensure that when they reconnect to the server, they bind to the same source port. NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics, that do not look at the source port of the connection. This patch therefore ensures they can ignore the rebind requirement. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-10-10SUNRPC: fix race to sk_err after xs_error_reportBenjamin Coddington
Since commit 4f8943f80883 ("SUNRPC: Replace direct task wakeups from softirq context") there has been a race to the value of the sk_err if both XPRT_SOCK_WAKE_ERROR and XPRT_SOCK_WAKE_DISCONNECT are set. In that case, we may end up losing the sk_err value that existed when xs_error_report was called. Fix this by reverting to the previous behavior: instead of using SO_ERROR to retrieve the value at a later time (which might also return sk_err_soft), copy the sk_err value onto struct sock_xprt, and use that value to wake pending tasks. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 4f8943f80883 ("SUNRPC: Replace direct task wakeups from softirq context") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-09-17SUNRPC: Don't receive TCP data into a request buffer that has been resetTrond Myklebust
If we've removed the request from the receive list, and have added it back after resetting the request receive buffer, then we should only receive message data if it is a new reply (i.e. if transport->recv.copied is zero). Fixes: 277e4ab7d530b ("SUNRPC: Simplify TCP receive code by switching...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-07-18SUNRPC: Ensure the bvecs are reset when we re-encode the RPC requestTrond Myklebust
The bvec tracks the list of pages, so if the number of pages changes due to a re-encode, we need to reset the bvec as well. Fixes: 277e4ab7d530 ("SUNRPC: Simplify TCP receive code by switching...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.20+
2019-07-18SUNRPC: Fix up backchannel slot table accountingTrond Myklebust
Add a per-transport maximum limit in the socket case, and add helpers to allow the NFSv4 code to discover that limit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-12Merge tag 'nfs-rdma-for-5.3-1' of ↵Trond Myklebust
git://git.linux-nfs.org/projects/anna/linux-nfs NFSoRDMA client updates for 5.3 New features: - Add a way to place MRs back on the free list - Reduce context switching - Add new trace events Bugfixes and cleanups: - Fix a BUG when tracing is enabled with NFSv4.1 - Fix a use-after-free in rpcrdma_post_recvs - Replace use of xdr_stream_pos in rpcrdma_marshal_req - Fix occasional transport deadlock - Fix show_nfs_errors macros, other tracing improvements - Remove RPCRDMA_REQ_F_PENDING and fr_state - Various simplifications and refactors
2019-07-09xprtrdma: Modernize ops->connectChuck Lever
Adapt and apply changes that were made to the TCP socket connect code. See the following commits for details on the purpose of these changes: Commit 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly") Commit 3851f1cdb2b8 ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout") Commit 02910177aede ("SUNRPC: Fix reconnection timeouts") Some common transport code is moved to xprt.c to satisfy the code duplication police. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-07-06SUNRPC: Remove the bh-safe lock requirement on xprt->transport_lockTrond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-06SUNRPC: Replace direct task wakeups from softirq contextTrond Myklebust
Replace the direct task wakeups from inside a softirq context with wakeups from a process context. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-06-28SUNRPC: Fix up calculation of client message lengthTrond Myklebust
In the case where a record marker was used, xs_sendpages() needs to return the length of the payload + record marker so that we operate correctly in the case of a partial transmission. When the callers check return value, they therefore need to take into account the record marker length. Fixes: 06b5fc3ad94e ("Merge tag 'nfs-rdma-for-5.1-1'...") Cc: stable@vger.kernel.org # 5.1+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-04-25SUNRPC: Add tracking of RPC level errorsTrond Myklebust
Add variables to track RPC level errors so that we can distinguish between issue that arose in the RPC transport layer as opposed to those arising from the reply message. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-04-25SUNRPC: Refactor xprt_request_wait_receive()Trond Myklebust
Convert the transport callback to actually put the request to sleep instead of just setting a timeout. This is in preparation for rpc_sleep_on_timeout(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-03-26SUNRPC: fix uninitialized variable warningAlakesh Haloi
Avoid following compiler warning on uninitialized variable net/sunrpc/xprtsock.c: In function ‘xs_read_stream_request.constprop’: net/sunrpc/xprtsock.c:525:10: warning: ‘read’ may be used uninitialized in this function [-Wmaybe-uninitialized] return read; ^~~~ net/sunrpc/xprtsock.c:529:23: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized] return ret < 0 ? ret : read; ~~~~~~~~~~~~~~^~~~~~ Signed-off-by: Alakesh Haloi <alakesh.haloi@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-03-15SUNRPC: Fix a client regression when handling oversized repliesTrond Myklebust
If the server sends a reply that is larger than the pre-allocated buffer, then the current code may fail to register how much of the stream that it has finished reading. This again can lead to hangs. Fixes: e92053a52e68 ("SUNRPC: Handle zero length fragments correctly") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-03-02SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpcTrond Myklebust
Convert the remaining gfp_flags arguments in sunrpc to standard reclaiming allocations, now that we set memalloc_nofs_save() as appropriate. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-26SUNRPC: Fix an Oops in udp_poll()Trond Myklebust
udp_poll() checks the struct file for the O_NONBLOCK flag, so we must not call it with a NULL file pointer. Fixes: 0ffe86f48026 ("SUNRPC: Use poll() to fix up the socket requeue races") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-25Merge tag 'nfs-rdma-for-5.1-1' of ↵Trond Myklebust
git://git.linux-nfs.org/projects/anna/linux-nfs NFSoRDMA client updates for 5.1 New features: - Convert rpc auth layer to use xdr_streams - Config option to disable insecure enctypes - Reduce size of RPC receive buffers Bugfixes and cleanups: - Fix sparse warnings - Check inline size before providing a write chunk - Reduce the receive doorbell rate - Various tracepoint improvements [Trond: Fix up merge conflicts] Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Remove the redundant 'zerocopy' argument to xs_sendpages()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Further cleanups of xs_sendpages()Trond Myklebust
Now that we send the pages using a struct msghdr, instead of using sendpage(), we no longer need to 'prime the socket' with an address for unconnected UDP messages. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Convert socket page send code to use iov_iter()Trond Myklebust
Simplify the page send code using iov_iter and bvecs. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Convert xs_send_kvec() to use iov_iter_kvec()Trond Myklebust
Prepare to the socket transmission code to use iov_iter. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Initiate a connection close on an ESHUTDOWN error in stream receiveTrond Myklebust
If the client stream receive code receives an ESHUTDOWN error either because the server closed the connection, or because it sent a callback which cannot be processed, then we should shut down the connection. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Don't suppress socket errors when a message read completesTrond Myklebust
If the message read completes, but the socket returned an error condition, we should ensure to propagate that error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Handle zero length fragments correctlyTrond Myklebust
A zero length fragment is really a bug, but let's ensure we don't go nuts when one turns up. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Don't reset the stream record info when the receive worker is runningTrond Myklebust
To ensure that the receive worker has exclusive access to the stream record info, we must not reset the contents other than when holding the transport->recv_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Ensure rq_bytes_sent is reset before request transmissionTrond Myklebust
When we resend a request, ensure that the 'rq_bytes_sent' is reset to zero. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Use poll() to fix up the socket requeue racesTrond Myklebust
Because we clear XPRT_SOCK_DATA_READY before reading, we can end up with a situation where new data arrives, causing xs_data_ready() to queue up a second receive worker job for the same socket, which then immediately gets stuck waiting on the transport receive mutex. The fix is to only clear XPRT_SOCK_DATA_READY once we're done reading, and then to use poll() to check if we might need to queue up a new job in order to deal with any new data. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-20SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobsTrond Myklebust
Set memalloc_nofs_save() on all the rpciod/xprtiod jobs so that we ensure memory allocations for asynchronous rpc calls don't ever end up recursing back to the NFS layer for memory reclaim. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-02-13SUNRPC: Remove rpc_xprt::tsh_sizeChuck Lever
tsh_size was added to accommodate transports that send a pre-amble before each RPC message. However, this assumes the pre-amble is fixed in size, which isn't true for some transports. That makes tsh_size not very generic. Also I'd like to make the estimation of RPC send and receive buffer sizes more precise. tsh_size doesn't currently appear to be accounted for at all by call_allocate. Therefore let's just remove the tsh_size concept, and make the only transports that have a non-zero tsh_size employ a direct approach. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-01-08SUNRPC: Fix TCP receive code on archs with flush_dcache_page()Trond Myklebust
After receiving data into the page cache, we need to call flush_dcache_page() for the architectures that define it. Fixes: 277e4ab7d530b ("SUNRPC: Simplify TCP receive code by switching...") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.20 Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-01-02Merge tag 'nfs-for-4.21-1' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds
Pull NFS client updates from Anna Schumaker: "Stable bugfixes: - xprtrdma: Yet another double DMA-unmap # v4.20 Features: - Allow some /proc/sys/sunrpc entries without CONFIG_SUNRPC_DEBUG - Per-xprt rdma receive workqueues - Drop support for FMR memory registration - Make port= mount option optional for RDMA mounts Other bugfixes and cleanups: - Remove unused nfs4_xdev_fs_type declaration - Fix comments for behavior that has changed - Remove generic RPC credentials by switching to 'struct cred' - Fix crossing mountpoints with different auth flavors - Various xprtrdma fixes from testing and auditing the close code - Fixes for disconnect issues when using xprtrdma with krb5 - Clean up and improve xprtrdma trace points - Fix NFS v4.2 async copy reboot recovery" * tag 'nfs-for-4.21-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (63 commits) sunrpc: convert to DEFINE_SHOW_ATTRIBUTE sunrpc: Add xprt after nfs4_test_session_trunk() sunrpc: convert unnecessary GFP_ATOMIC to GFP_NOFS sunrpc: handle ENOMEM in rpcb_getport_async NFS: remove unnecessary test for IS_ERR(cred) xprtrdma: Prevent leak of rpcrdma_rep objects NFSv4.2 fix async copy reboot recovery xprtrdma: Don't leak freed MRs xprtrdma: Add documenting comment for rpcrdma_buffer_destroy xprtrdma: Replace outdated comment for rpcrdma_ep_post xprtrdma: Update comments in frwr_op_send SUNRPC: Fix some kernel doc complaints SUNRPC: Simplify defining common RPC trace events NFS: Fix NFSv4 symbolic trace point output xprtrdma: Trace mapping, alloc, and dereg failures xprtrdma: Add trace points for calls to transport switch methods xprtrdma: Relocate the xprtrdma_mr_map trace points xprtrdma: Clean up of xprtrdma chunk trace points xprtrdma: Remove unused fields from rpcrdma_ia xprtrdma: Cull dprintk() call sites ...
2019-01-02Merge tag 'nfsd-4.21' of git://linux-nfs.org/~bfields/linuxLinus Torvalds
Pull nfsd updates from Bruce Fields: "Thanks to Vasily Averin for fixing a use-after-free in the containerized NFSv4.2 client, and cleaning up some convoluted backchannel server code in the process. Otherwise, miscellaneous smaller bugfixes and cleanup" * tag 'nfsd-4.21' of git://linux-nfs.org/~bfields/linux: (25 commits) nfs: fixed broken compilation in nfs_callback_up_net() nfs: minor typo in nfs4_callback_up_net() sunrpc: fix debug message in svc_create_xprt() sunrpc: make visible processing error in bc_svc_process() sunrpc: remove unused xpo_prep_reply_hdr callback sunrpc: remove svc_rdma_bc_class sunrpc: remove svc_tcp_bc_class sunrpc: remove unused bc_up operation from rpc_xprt_ops sunrpc: replace svc_serv->sv_bc_xprt by boolean flag sunrpc: use-after-free in svc_process_common() sunrpc: use SVC_NET() in svcauth_gss_* functions nfsd: drop useless LIST_HEAD lockd: Show pid of lockd for remote locks NFSD remove OP_CACHEME from 4.2 op_flags nfsd: Return EPERM, not EACCES, in some SETATTR cases sunrpc: fix cache_head leak due to queued request nfsd: clean up indentation, increase indentation in switch statement svcrdma: Optimize the logic that selects the R_key to invalidate nfsd: fix a warning in __cld_pipe_upcall() nfsd4: fix crash on writing v4_end_grace before nfsd startup ...
2019-01-02SUNRPC: Fix some kernel doc complaintsChuck Lever
Clean up some warnings observed when building with "make W=1". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-27sunrpc: remove unused bc_up operation from rpc_xprt_opsVasily Averin
Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-12-19SUNRPC: allow /proc entries without CONFIG_SUNRPC_DEBUGBen Dooks
If we want /proc/sys/sunrpc the current kernel also drags in other debug features which we don't really want. Instead, we should always show the following entries: /proc/sys/sunrpc/udp_slot_table_entries /proc/sys/sunrpc/tcp_slot_table_entries /proc/sys/sunrpc/tcp_max_slot_table_entries /proc/sys/sunrpc/min_resvport /proc/sys/sunrpc/max_resvport /proc/sys/sunrpc/tcp_fin_timeout Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Thomas Preston <thomas.preston@codethink.co.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2018-12-18SUNRPC: Fix a race with XPRT_CONNECTINGTrond Myklebust
Ensure that we clear XPRT_CONNECTING before releasing the XPRT_LOCK so that we don't have races between the (asynchronous) socket setup code and tasks in xprt_connect(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
2018-12-18SUNRPC: Fix disconnection racesTrond Myklebust
When the socket is closed, we need to call xprt_disconnect_done() in order to clean up the XPRT_WRITE_SPACE flag, and wake up the sleeping tasks. However, we also want to ensure that we don't wake them up before the socket is closed, since that would cause thundering herd issues with everyone piling up to retransmit before the TCP shutdown dance has completed. Only the task that holds XPRT_LOCKED needs to wake up early in order to allow the close to complete. Reported-by: Dave Wysochanski <dwysocha@redhat.com> Reported-by: Scott Mayhew <smayhew@redhat.com> Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
2018-12-05SUNRPC: Don't force a redundant disconnection in xs_read_stream()Trond Myklebust
If the connection is broken, then xs_tcp_state_change() will take care of scheduling the socket close as soon as appropriate. xs_read_stream() just needs to report the error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-05SUNRPC: Fix up socket pollingTrond Myklebust
Ensure that we do not exit the socket read callback without clearing XPRT_SOCK_DATA_READY. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-05SUNRPC: Use the discard iterator rather than MSG_TRUNCTrond Myklebust
When discarding message data from the stream, we're better off using the discard iterator, since that will work with non-TCP streams. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-05SUNRPC: Treat EFAULT as a truncated message in xs_read_stream_request()Trond Myklebust
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-05SUNRPC: Fix up handling of the XDRBUF_SPARSE_PAGES flagTrond Myklebust
If the allocator fails before it has reached the target number of pages, then we need to recheck that we're not seeking past the page buffer. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2018-12-05SUNRPC: Fix RPC receive hangsTrond Myklebust
The RPC code is occasionally hanging when the receive code fails to empty the socket buffer due to a partial read of the data. When we convert that to an EAGAIN, it appears we occasionally leave data in the socket. The fix is to just keep reading until the socket returns EAGAIN/EWOULDBLOCK. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Cristian Marussi <cristian.marussi@arm.com> Reported-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com>