2021-03-31  svcrdma: Remove svc_rdma_recv_ctxt::rc_pages and ::rc_arg  (Chuck Lever)
These fields are no longer used. The size of struct svc_rdma_recv_ctxt is now less than 300 bytes on x86_64, down from 2440 bytes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-31  svcrdma: Remove sc_read_complete_q  (Chuck Lever)
Now that svc_rdma_recvfrom() waits for Read completion, sc_read_complete_q is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-31  svcrdma: Single-stage RDMA Read  (Chuck Lever)
Currently the generic RPC server layer calls svc_rdma_recvfrom() twice to retrieve an RPC message that uses Read chunks. I'm not exactly sure why this design was chosen originally. Instead, let's wait for the Read chunk completion inline in the first call to svc_rdma_recvfrom(). The goal is to eliminate some page allocator churn. rdma_read_complete() replaces pages in the second svc_rqst by calling put_page() repeatedly while the upper layer waits for the request to be constructed, which adds unnecessary NFS WRITE round-trip latency. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Tom Talpey <tom@talpey.com>
2021-03-22  SUNRPC: Move svc_xprt_received() call sites  (Chuck Lever)
Currently, XPT_BUSY is not cleared until xpo_recvfrom returns. That effectively blocks the receipt and handling of the next RPC message until the current one has been taken off the transport. This strict ordering is a requirement for socket transports. For our kernel RPC/RDMA transport implementation, however, dequeuing an ingress message is nothing more than a list_del(). The transport can safely be marked un-busy as soon as that is done. To keep the changes simpler, this patch just moves the svc_xprt_received() call site from svc_handle_xprt() into the transports, so that the actual optimization can be done in a subsequent patch. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
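A minimal sketch of the optimization this move enables on the RDMA side, assuming the existing sc_rq_dto_lock/sc_rq_dto_q fields and the svc_rdma_next_recv_ctxt() helper; this is illustrative, not the patch's literal diff:

	struct svc_rdma_recv_ctxt *ctxt;

	spin_lock(&rdma->sc_rq_dto_lock);
	ctxt = svc_rdma_next_recv_ctxt(&rdma->sc_rq_dto_q);
	if (ctxt)
		list_del(&ctxt->rc_list);
	spin_unlock(&rdma->sc_rq_dto_lock);

	/* The ingress message is now off the transport's queue, so it is
	 * safe to clear XPT_BUSY here instead of waiting for
	 * xpo_recvfrom to return to the generic server code. */
	if (ctxt)
		svc_xprt_received(&rdma->sc_xprt);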
2021-03-22  SUNRPC: Export svc_xprt_received()  (Chuck Lever)
Prepare svc_xprt_received() to be called from transport code instead of from generic RPC server code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  svcrdma: Retain the page backing rq_res.head[0].iov_base  (Chuck Lever)
svc_rdma_sendto() now waits for the NIC hardware to finish with the pages backing rq_res. We still have to release the page array in some cases, but now it's always safe to immediately re-use the page backing rq_res's head buffer. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  svcrdma: Remove unused sc_pages field  (Chuck Lever)
Clean up. This significantly reduces the size of struct svc_rdma_send_ctxt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  svcrdma: Normalize Send page handling  (Chuck Lever)
Currently svc_rdma_sendto() migrates xdr_buf pages into a separate page list and NULLs out a bunch of entries in rq_pages while the pages are under I/O. The Send completion handler then frees those pages later. Instead, let's wait for the Send completion, then handle page releasing in the nfsd thread. I'd like to avoid the cost of 250+ put_page() calls in the Send completion handler, which is single-threaded. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22svcrdma: Add a "deferred close" helperChuck Lever
Refactor a bit of commonly used logic so that every site that wants a close deferred to an nfsd thread does all the right things (set_bit(XPT_CLOSE) then enqueue). Also, once XPT_CLOSE is set on a transport, it is never cleared. If XPT_CLOSE is already set, then the close is already being handled and the enqueue can be skipped. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
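A plausible shape for such a helper, assuming the name svc_xprt_deferred_close() and the existing XPT_CLOSE/xpt_flags interface; the actual helper may differ in detail:

	static inline void svc_xprt_deferred_close(struct svc_xprt *xprt)
	{
		/* XPT_CLOSE is never cleared once set; if it was already
		 * set, a close is in flight and the enqueue is skipped. */
		if (!test_and_set_bit(XPT_CLOSE, &xprt->xpt_flags))
			svc_xprt_enqueue(xprt);
	}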
2021-03-22  svcrdma: Maintain a Receive water mark  (Chuck Lever)
Post more Receives when the number of pending Receives drops below a water mark. The batch mechanism is disabled if the underlying device cannot support a reasonably-sized Receive Queue. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  svcrdma: Use svc_rdma_refresh_recvs() in wc_receive  (Chuck Lever)
Replace svc_rdma_post_recv() with the new batch receive mechanism. For the moment it is posting just a single Receive WR at a time, so no change in behavior is expected. Since svc_rdma_wc_receive() was the last call site for svc_rdma_post_recv(), it is removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  svcrdma: Add a batch Receive posting mechanism  (Chuck Lever)
Introduce a server-side mechanism similar to commit e340c2d6ef2a ("xprtrdma: Reduce the doorbell rate (Receive)") to post Receive WRs in batch. Its first consumer is svc_rdma_post_recvs(), which posts the initial set of Receive WRs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
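A sketch of the batching idea, assuming the svc_rdma_refresh_recvs() name, an rc_recv_wr field in svc_rdma_recv_ctxt, and a svc_rdma_recv_ctxt_get() allocator; the point is to chain the Receive WRs and ring the doorbell once with a single ib_post_recv():

	static bool svc_rdma_refresh_recvs(struct svcxprt_rdma *rdma,
					   unsigned int wanted)
	{
		const struct ib_recv_wr *bad_wr;
		struct ib_recv_wr *first = NULL, *prev = NULL;
		unsigned int i;

		for (i = 0; i < wanted; i++) {
			struct svc_rdma_recv_ctxt *ctxt;

			ctxt = svc_rdma_recv_ctxt_get(rdma);
			if (!ctxt)
				break;
			ctxt->rc_recv_wr.next = NULL;
			if (prev)
				prev->next = &ctxt->rc_recv_wr;
			else
				first = &ctxt->rc_recv_wr;
			prev = &ctxt->rc_recv_wr;
		}
		if (!first)
			return false;

		/* One ib_post_recv() call, and thus one doorbell, for the
		 * whole chain of Receive WRs. */
		return ib_post_recv(rdma->sc_qp, first, &bad_wr) == 0;
	}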
2021-03-22  svcrdma: Remove stale comment for svc_rdma_wc_receive()  (Chuck Lever)
xprt pinning was removed in commit 365e9992b90f ("svcrdma: Remove transport reference counting"), but this comment was not updated to reflect that change. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  svcrdma: Provide an explanatory comment in CMA event handler  (Chuck Lever)
Clean up: explain why svc_xprt_enqueue() is invoked in the event handler even though no xpt_flags bits are toggled here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  svcrdma: RPCDBG_FACILITY is no longer used  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  nfsd: report client confirmation status in "info" file  (NeilBrown)
mountd can now monitor clients appearing and disappearing in /proc/fs/nfsd/clients, and will log these events, in lieu of the logging of mount/unmount events for NFSv3. Currently it cannot distinguish between unconfirmed clients (which might be transient and totally uninteresting) and confirmed clients. So add a "status: " line which reports either "confirmed" or "unconfirmed", and use fsnotify to report that the info file has been modified. This requires a bit of infrastructure to keep the dentry for the "info" file. There is no need to take a counted reference as the dentry must remain around until the client is removed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
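A hedged sketch of the new line in the info file's show method; the NFSD4_CLIENT_CONFIRMED test and the variable names are assumptions, not necessarily what the patch uses:

	seq_printf(m, "status: %s\n",
		   test_bit(NFSD4_CLIENT_CONFIRMED, &clp->cl_flags) ?
		   "confirmed" : "unconfirmed");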
2021-03-22  nfsd: don't ignore high bits of copy count  (J. Bruce Fields)
Note that size_t is 32-bit on a 32-bit architecture, but cp_count is defined by the protocol to be 64-bit, so we could be turning a large copy into a 0-length copy here. Reported-by: <radchenkoy@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  nfsd: COPY with length 0 should copy to end of file  (J. Bruce Fields)
>From https://tools.ietf.org/html/rfc7862#page-65 A count of 0 (zero) requests that all bytes from ca_src_offset through EOF be copied to the destination. Reported-by: <radchenkoy@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  nfsd: Fix typo "accesible"  (Ricardo Ribalda)
Trivial fix. Cc: linux-nfs@vger.kernel.org Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  nfsd: Ensure knfsd shuts down when the "nfsd" pseudofs is unmounted  (Trond Myklebust)
In order to ensure that knfsd threads don't linger once the nfsd pseudofs is unmounted (e.g. when the container is killed) we let nfsd_umount() shut down those threads and wait for them to exit. This also should ensure that we don't need to do a kernel mount of the pseudofs, since the thread lifetime is now limited by the lifetime of the filesystem. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  nfsd: Log client tracking type log message as info instead of warning  (Paul Menzel)
`printk()`, by default, uses the log level warning, which leaves the user reading NFSD: Using UMH upcall client tracking operations. wondering what to do about it (`dmesg --level=warn`). Several client tracking methods are tried, and expected to fail. That’s why a message is printed only on success. It might be interesting for users to know the chosen method, so use info-level instead of debug-level. Cc: linux-nfs@vger.kernel.org Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
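A sketch of the change: emit the message at info level, for example with pr_info() instead of a bare printk(), so it no longer appears under `dmesg --level=warn`:

	/* was: printk("NFSD: Using UMH upcall client tracking operations.\n"); */
	pr_info("NFSD: Using UMH upcall client tracking operations.\n");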
2021-03-22  nfsd: helper for laundromat expiry calculations  (J. Bruce Fields)
We do this same logic repeatedly, and it's easy to get the sense of the comparison wrong. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
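One plausible shape for such a helper; the laundry_time struct and the state_expired() name are illustrative, not necessarily the patch's exact code. It answers "has this item expired?" and, when it has not, tracks how soon the laundromat must run again:

	struct laundry_time {
		time64_t cutoff;	/* items refreshed before this have expired */
		time64_t new_timeo;	/* shortest remaining lifetime seen so far  */
	};

	static bool state_expired(struct laundry_time *lt, time64_t last_refresh)
	{
		time64_t time_remaining;

		if (last_refresh < lt->cutoff)
			return true;
		time_remaining = last_refresh - lt->cutoff;
		lt->new_timeo = min(lt->new_timeo, time_remaining);
		return false;
	}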
2021-03-22  NFSD: Clean up NFSDDBG_FACILITY macro  (Chuck Lever)
These are no longer needed because there are no dprintk() call sites in these files. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Add a tracepoint to record directory entry encoding  (Chuck Lever)
Enable watching the progress of directory encoding to capture the timing of any issues with reading or encoding a directory. The new tracepoint captures dirent encoding for all NFS versions. For example, here's what a few NFSv4 directory entries might look like:

	nfsd-989 [002] 468.596265: nfsd_dirent: fh_hash=0x5d162594 ino=2 name=.
	nfsd-989 [002] 468.596267: nfsd_dirent: fh_hash=0x5d162594 ino=1 name=..
	nfsd-989 [002] 468.596299: nfsd_dirent: fh_hash=0x5d162594 ino=3827 name=zlib.c
	nfsd-989 [002] 468.596325: nfsd_dirent: fh_hash=0x5d162594 ino=3811 name=xdiff
	nfsd-989 [002] 468.596351: nfsd_dirent: fh_hash=0x5d162594 ino=3810 name=xdiff-interface.h
	nfsd-989 [002] 468.596377: nfsd_dirent: fh_hash=0x5d162594 ino=3809 name=xdiff-interface.c

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Clean up after updating NFSv3 ACL encoders  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv3 SETACL result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Clean up after updating NFSv2 ACL encoders  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 ACL ACCESS result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 ACL GETATTR result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 SETACL result encoder to use struct xdr_stream  (Chuck Lever)
The SETACL result encoder is exactly the same as the NFSv2 attrstatres encoder. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Remove unused NFSv2 directory entry encoders  (Chuck Lever)
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 READDIR result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Count bytes instead of pages in the NFSv2 READDIR encoder  (Chuck Lever)
Clean up: Counting the bytes used by each returned directory entry seems less brittle to me than trying to measure consumed pages after the fact. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Add a helper that encodes NFSv2 directory offset cookies  (Chuck Lever)
Refactor: Add helper function similar to nfs3svc_encode_cookie3(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 STATFS result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 READ result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 diropres encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update the NFSv2 stat encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Reduce svc_rqst::rq_pages churn during READDIR operations  (Chuck Lever)
During NFSv2 and NFSv3 READDIR/PLUS operations, NFSD advances rq_next_page to the full size of the client-requested buffer, then releases all those pages at the end of the request. The next request to use that nfsd thread has to refill the pages. NFSD does this even when the dirlist in the reply is small. With NFSv3 clients that send READDIR operations with large buffer sizes, that can be 256 put_page/alloc_page pairs per READDIR request, even though those pages often remain unused. We can save some work by not releasing dirlist buffer pages that were not used to form the READDIR Reply. I've left the NFSv2 code alone since there are never more than three pages involved in an NFSv2 READDIR Reply. Eventually we should nail down why these pages need to be released at all in order to avoid allocating and releasing pages unnecessarily. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Remove unused NFSv3 directory entry encoders  (Chuck Lever)
Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream  (Chuck Lever)
The benefit of the xdr_stream helpers is that they transparently handle encoding an XDR data item that crosses page boundaries. Most of the open-coded logic to do that here can be eliminated. A sub-buffer and sub-stream are set up as a sink buffer for the directory entry encoder. As an entry is encoded, it is added to the end of the content in this buffer/stream. The total length of the directory list is tracked in the buffer's @len field. When it comes time to encode the Reply, the sub-buffer is merged into rq_res's page array at the correct place using xdr_write_pages(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
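A rough sketch of the scheme described above; the dirlist variable names and the remaining-space calculation are assumptions, and the real patch plumbs this through the READDIR argument/result structures rather than using locals:

	struct xdr_buf dirlist = {
		.pages	= rqstp->rq_next_page,
		.buflen	= count,	/* reply space available for entries, assumed computed earlier */
	};
	struct xdr_stream entries;

	xdr_init_encode(&entries, &dirlist, NULL, NULL);

	/* Each directory entry is appended with xdr_stream_encode_*()
	 * helpers, which transparently handle entries that cross a page
	 * boundary; dirlist.len grows as entries are added. */

	/* Later, while encoding the Reply itself, splice the accumulated
	 * entries into rq_res's page array at the correct place: */
	xdr_write_pages(xdr, dirlist.pages, 0, dirlist.len);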
2021-03-22  NFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream  (Chuck Lever)
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Count bytes instead of pages in the NFSv3 READDIR encoder  (Chuck Lever)
Clean up: Counting the bytes used by each returned directory entry seems less brittle to me than trying to measure consumed pages after the fact. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2021-03-22  NFSD: Add a helper that encodes NFSv3 directory offset cookies  (Chuck Lever)
Refactor: De-duplicate identical code that handles encoding of directory offset cookies across page boundaries. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>