[sheepdog] [PATCH v3 12/17] block/block-backend: convert blk io path to use int64_t parameters
Eric Blake
eblake at redhat.com
Wed Jun 24 00:11:06 CEST 2020
On 4/30/20 6:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> We are generally moving to int64_t for both offset and bytes parameters
> on all io paths.
>
> Main motivation is realization of 64-bit write_zeroes operation for
> fast zeroing large disk chunks, up to the whole disk.
>
> We chose signed type, to be consistent with off_t (which is signed) and
> with possibility for signed return type (where negative value means
> error).
>
> Now bdrv layer is converted, convert blk layer too.
In fact, thanks to https://bugs.launchpad.net/qemu/+bug/1884831 I just
discovered that NBD is one caller that can currently pass values larger
than 2G into blk_co_pdiscard(), where the value shows up as negative and
fails instantly with EIO. So this patch is a bug fix visible to NBD
clients.
$ gdb --args ./qemu-nbd --trace=nbd_\* -f raw f --port 10810
...
(gdb) b blk_co_pdiscard
(gdb) r
...
$ nbdsh -u nbd://localhost:10810 -c 'h.trim(3*1024*1024*1024,0)'
...
Thread 1 "qemu-nbd" hit Breakpoint 3, blk_co_pdiscard (blk=0x555555832dc0,
offset=0, bytes=-1073741824)
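For reference, the negative value in that breakpoint is what a 3 GiB length
looks like after being narrowed to a 32-bit signed int. The following is only
a minimal sketch of that arithmetic; narrow_to_int32() is a hypothetical
helper written for illustration and is not part of QEMU.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): show the value a 64-bit length
 * takes when only its low 32 bits survive and are read as a signed
 * two's complement number, as with a 32-bit 'int bytes' parameter.
 */
static int32_t narrow_to_int32(uint64_t len)
{
    uint32_t low = (uint32_t)len;   /* keep only the low 32 bits */

    if (low > (uint32_t)INT32_MAX) {
        /* Top bit set: subtract 2^32 to get the signed interpretation. */
        return (int32_t)((int64_t)low - 4294967296LL);
    }
    return (int32_t)low;
}
```

Calling narrow_to_int32(3ULL * 1024 * 1024 * 1024) yields -1073741824, the
same number gdb reports for bytes above.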
Looks like I now have even more reason to accelerate my review of the
remainder of this series, and to take some (if not all) of it through
the NBD tree.
> +++ b/include/sysemu/block-backend.h
> @@ -119,14 +119,14 @@ BlockBackend *blk_by_dev(void *dev);
> BlockBackend *blk_by_qdev_id(const char *id, Error **errp);
> void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops, void *opaque);
> int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
> - unsigned int bytes, QEMUIOVector *qiov,
> + int64_t bytes, QEMUIOVector *qiov,
> BdrvRequestFlags flags);
> int coroutine_fn blk_co_pwritev_part(BlockBackend *blk, int64_t offset,
> - unsigned int bytes,
> + int64_t bytes,
> QEMUIOVector *qiov, size_t qiov_offset,
> BdrvRequestFlags flags);
> int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
> - unsigned int bytes, QEMUIOVector *qiov,
> + int64_t bytes, QEMUIOVector *qiov,
> BdrvRequestFlags flags);
pread and pwrite weren't necessarily problems for NBD (since our NBD
implementation caps things to 32M per packet).
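To illustrate why the 32M cap keeps reads and writes out of trouble while a
3G trim does not, here is a small sketch. fits_in_int() is a hypothetical
helper, not a QEMU function, and it assumes the old 'int bytes' parameters
are the limiting factor.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): true if a request length can be
 * passed through an 'int bytes' parameter without changing value.
 */
static bool fits_in_int(uint64_t len)
{
    return len <= (uint64_t)INT_MAX;
}
```

A 32 MiB length (32ULL << 20) passes this check, while a 3 GiB length
(3ULL << 30) does not.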
>
> static inline int coroutine_fn blk_co_pread(BlockBackend *blk, int64_t offset,
> @@ -148,13 +148,13 @@ static inline int coroutine_fn blk_co_pwrite(BlockBackend *blk, int64_t offset,
> }
>
> int blk_pwrite_zeroes(BlockBackend *blk, int64_t offset,
> - int bytes, BdrvRequestFlags flags);
> + int64_t bytes, BdrvRequestFlags flags);
> BlockAIOCB *blk_aio_pwrite_zeroes(BlockBackend *blk, int64_t offset,
> - int bytes, BdrvRequestFlags flags,
> + int64_t bytes, BdrvRequestFlags flags,
> BlockCompletionFunc *cb, void *opaque);
But this change to writing zeroes,
> int blk_make_zero(BlockBackend *blk, BdrvRequestFlags flags);
> -int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int bytes);
> -int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int bytes,
> +int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int64_t bytes);
> +int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int64_t bytes,
> BdrvRequestFlags flags);
> int64_t blk_getlength(BlockBackend *blk);
> void blk_get_geometry(BlockBackend *blk, uint64_t *nb_sectors_ptr);
> @@ -167,14 +167,14 @@ BlockAIOCB *blk_aio_pwritev(BlockBackend *blk, int64_t offset,
> BlockCompletionFunc *cb, void *opaque);
> BlockAIOCB *blk_aio_flush(BlockBackend *blk,
> BlockCompletionFunc *cb, void *opaque);
> -BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int bytes,
> +BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int64_t bytes,
> BlockCompletionFunc *cb, void *opaque);
and this change to discard are definitely both bug fixes for NBD
clients, especially now that we have a real-world client (namely the
blkdiscard app, whose ioctl(BLKDISCARD) goes through nbd.ko as the NBD
client) that actually triggers a >2G trim request.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org