netfs: Combine prepare and issue ops and grab the buffers on request
To simplify how subrequests are generated in netfslib, given the move to
bvecq for buffer handling, change netfslib in the following ways:
(1) ->prepare_xxx(), buffer selection and ->issue_xxx() are now collapsed
together such that one ->issue_xxx() call is made with the subrequest
defined to the maximum extent; the filesystem then reduces the length
of the subrequest and calls back to netfslib to grab a slice of the
buffer, which may reduce the subrequest further if a maximum segment
limit is set. The filesystem can then dispatch the operation.
(2) To allow buffer slicing to be done upon request by the filesystem, a
dispatch context is now maintained by netfslib and this is passed to
->issue_xxx() which then calls netfs_prepare_xxx_buffer(). This also
permits the context for retry to be kept separate from that of initial
dispatch.
(3) The use of iov_iter is pushed down to the filesystem. Netfslib now
provides the filesystem with a bvecq holding the buffer rather than an
iov_iter. The bvecq can be duplicated and headers/trailers attached
to hold protocol data, and several bvecqs can be linked together to
create a compound operation.
(4) The ->issue_xxx() functions now return an error code that allows them
to return an error without having to terminate the subrequest.
Netfslib will handle the error immediately if it can but may request
termination and punt responsibility to the result collector.
->issue_xxx() returns 0 if the operation completed synchronously and
-EIOCBQUEUED if the operation will complete (or already has completed)
asynchronously.
(5) During writeback, the code now builds up an accumulation of buffered
data before issuing writes on each stream (one server, one cache). It
asks each stream for an estimate of how much data to accumulate before
it starts generating subrequests on the stream. It is not required to
use up all the data accumulated on a stream at that time unless we hit
the end of the pagecache.
(6) During read-gaps, in which there is a gap at each end of a dirty
streaming write page that needs to be filled, a buffer is constructed
consisting of the two ends plus a sink page repeated to cover the
middle portion. This is passed to the server as a single read. For
something like Ceph, this should probably be done either as a
vectored/sparse read or as two separate reads (if different Ceph
objects are involved).
(7) During unbuffered/DIO read/write, there is a single contiguous file
region to be written or read as a single stream. The dispatching
function just creates subrequests and calls ->issue_xxx() repeatedly
to consume the buffer.
(8) During buffered read, there is a single contiguous file region to
read as a single stream; however, this stream may be stitched
together from subrequests to multiple sources. Which sources are used
where is now determined by querying the cache to find the next couple
of extents in which it has data; netfslib uses this to direct the
subrequests towards the appropriate sources.
Each subrequest is given the maximum length in the current extent and
then ->issue_read() is called. The filesystem then limits the size
and slices off a piece of the buffer for that extent.
(9) The cache now uses fiemap internally to find out the occupied regions
of a cachefile rather than SEEK_DATA/SEEK_HOLE. In future, it should
keep track of the regions itself - including regions of zeros.
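
As a rough illustration of the flow described in (1), (2), (4) and (7), the
following is a minimal userspace sketch, not the kernel code. Every name in
it (sketch_dispatch_ctx, sketch_subreq, sketch_prepare_buffer, sketch_issue,
sketch_dispatch) is hypothetical and merely stands in for the dispatch
context, netfs_prepare_xxx_buffer() and ->issue_xxx(). The buffer is modelled
as a byte count rather than a bvecq, and EIOCBQUEUED is defined locally
because it is a kernel-internal error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529	/* kernel-internal code, absent from userspace errno.h */
#endif

/* Hypothetical stand-in for the dispatch context of item (2). */
struct sketch_dispatch_ctx {
	size_t buf_avail;	/* bytes of buffer not yet sliced off */
	size_t max_segs;	/* maximum segments per slice, 0 = no limit */
	size_t seg_size;	/* size of one buffer segment in bytes */
};

/* Hypothetical stand-in for a subrequest. */
struct sketch_subreq {
	size_t start;		/* file position */
	size_t len;		/* length, initially the maximum extent */
};

/*
 * Stand-in for netfs_prepare_xxx_buffer(): slice a piece off the buffer,
 * possibly shortening the subrequest further if a segment limit is set.
 */
static int sketch_prepare_buffer(struct sketch_dispatch_ctx *ctx,
				 struct sketch_subreq *sub)
{
	size_t len;

	assert(ctx && sub);
	len = sub->len;
	if (len > ctx->buf_avail)
		len = ctx->buf_avail;
	if (ctx->max_segs && len > ctx->max_segs * ctx->seg_size)
		len = ctx->max_segs * ctx->seg_size;
	if (len == 0)
		return -ENOBUFS;	/* nothing left to slice */
	ctx->buf_avail -= len;
	sub->len = len;
	return 0;
}

/*
 * Stand-in for ->issue_xxx(): the filesystem first reduces the length to
 * its own limit, then asks for a buffer slice, then dispatches.  An error
 * is simply returned; the subrequest is not terminated here.
 */
static int sketch_issue(struct sketch_dispatch_ctx *ctx,
			struct sketch_subreq *sub,
			size_t fs_max_io, int async)
{
	int ret;

	if (sub->len > fs_max_io)
		sub->len = fs_max_io;
	ret = sketch_prepare_buffer(ctx, sub);
	if (ret < 0)
		return ret;
	return async ? -EIOCBQUEUED : 0;
}

/*
 * Stand-in for the unbuffered/DIO dispatcher of item (7): keep creating
 * subrequests at the maximum extent until the region is consumed.
 * Returns the number of subrequests issued or a negative error.
 */
static int sketch_dispatch(struct sketch_dispatch_ctx *ctx,
			   size_t start, size_t len, size_t fs_max_io)
{
	int count = 0;

	while (len > 0) {
		struct sketch_subreq sub = { .start = start, .len = len };
		int ret = sketch_issue(ctx, &sub, fs_max_io, 0);

		if (ret < 0 && ret != -EIOCBQUEUED)
			return ret;
		start += sub.len;
		len -= sub.len;
		count++;
	}
	return count;
}
```

In this sketch the dispatcher never computes the subrequest size itself; it
only learns how much was consumed from sub.len after the issue call returns,
which is the point of collapsing the prepare and issue steps.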
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: Christoph Hellwig <hch@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
35 files changed