> On 06.03.2026 at 14:41, Hanna Czenczek <[email protected]> wrote:
>
> On 03.03.26 20:24, Peter Lieven wrote:
>>
>>> On 10.11.2025 at 16:48, Hanna Czenczek <[email protected]> wrote:
>>>
>>> Like in “rbd: Run co BH CB in the coroutine’s AioContext”, drop the
>>> completion flag, yield exactly once, and run the BH in the coroutine’s
>>> AioContext.
>>>
>>> (Can be reproduced with multiqueue by adding a usleep(100000) before the
>>> `while (!task.complete)` loops.)
>>>
>>> Like in “iscsi: Run co BH CB in the coroutine’s AioContext”, this makes
>>> nfs_co_generic_bh_cb() trivial, so we can drop it in favor of just
>>> calling aio_co_wake() directly.
>>>
>>> Cc: [email protected]
>>> Signed-off-by: Hanna Czenczek <[email protected]>
>>> ---
>>>  block/nfs.c | 41 ++++++++++++++++-------------------------
>>>  1 file changed, 16 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/block/nfs.c b/block/nfs.c
>>> index 0a7d38db09..1d3a34a30c 100644
>>> --- a/block/nfs.c
>>> +++ b/block/nfs.c
>>> @@ -69,7 +69,6 @@ typedef struct NFSClient {
>>>  typedef struct NFSRPC {
>>>      BlockDriverState *bs;
>>>      int ret;
>>> -    int complete;
>>>      QEMUIOVector *iov;
>>>      struct stat *st;
>>>      Coroutine *co;
>>> @@ -230,14 +229,6 @@ static void coroutine_fn nfs_co_init_task(BlockDriverState *bs, NFSRPC *task)
>>>      };
>>>  }
>>>
>>> -static void nfs_co_generic_bh_cb(void *opaque)
>>> -{
>>> -    NFSRPC *task = opaque;
>>> -
>>> -    task->complete = 1;
>>> -    aio_co_wake(task->co);
>>> -}
>>> -
>>>  /* Called (via nfs_service) with QemuMutex held. */
>>>  static void
>>>  nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
>>> @@ -256,8 +247,16 @@ nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
>>>      if (task->ret < 0) {
>>>          error_report("NFS Error: %s", nfs_get_error(nfs));
>>>      }
>>> -    replay_bh_schedule_oneshot_event(task->client->aio_context,
>>> -                                     nfs_co_generic_bh_cb, task);
>>> +
>>> +    /*
>>> +     * Safe to call: nfs_service(), which called us, is only run from the FD
>>> +     * handlers, never from the request coroutine.  The request coroutine in
>>> +     * turn will yield unconditionally.
>>> +     * No need to release the lock, even if we directly enter the coroutine, as
>>> +     * the lock is never re-taken after yielding.  (Note: If we do enter the
>>> +     * coroutine, @task will probably be dangling once aio_co_wake() returns.)
>>> +     */
>>> +    aio_co_wake(task->co);
>>>  }
>>>
>>>  static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset,
>>> @@ -278,9 +277,7 @@ static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset,
>>>
>>>          nfs_set_events(client);
>>>      }
>>> -    while (!task.complete) {
>>> -        qemu_coroutine_yield();
>>> -    }
>>> +    qemu_coroutine_yield();
>>>
>>>      if (task.ret < 0) {
>>>          return task.ret;
>>> @@ -328,9 +325,7 @@ static int coroutine_fn nfs_co_pwritev(BlockDriverState *bs, int64_t offset,
>>>
>>>          nfs_set_events(client);
>>>      }
>>> -    while (!task.complete) {
>>> -        qemu_coroutine_yield();
>>> -    }
>>> +    qemu_coroutine_yield();
>>>
>>>      if (my_buffer) {
>>>          g_free(buf);
>>> @@ -358,9 +353,7 @@ static int coroutine_fn nfs_co_flush(BlockDriverState *bs)
>>>
>>>          nfs_set_events(client);
>>>      }
>>> -    while (!task.complete) {
>>> -        qemu_coroutine_yield();
>>> -    }
>>> +    qemu_coroutine_yield();
>>>
>>>      return task.ret;
>>>  }
>>> @@ -723,8 +716,8 @@ nfs_get_allocated_file_size_cb(int ret, struct nfs_context *nfs, void *data,
>>>      if (task->ret < 0) {
>>>          error_report("NFS Error: %s", nfs_get_error(nfs));
>>>      }
>>> -    replay_bh_schedule_oneshot_event(task->client->aio_context,
>>> -                                     nfs_co_generic_bh_cb, task);
>>> +    /* Safe to call, see nfs_co_generic_cb() */
>>> +    aio_co_wake(task->co);
>>>  }
>>>
>>>  static int64_t coroutine_fn nfs_co_get_allocated_file_size(BlockDriverState *bs)
>>> @@ -748,9 +741,7 @@ static int64_t coroutine_fn nfs_co_get_allocated_file_size(BlockDriverState *bs)
>>>
>>>          nfs_set_events(client);
>>>      }
>>> -    while (!task.complete) {
>>> -        qemu_coroutine_yield();
>>> -    }
>>> +    qemu_coroutine_yield();
>>>
>>>      return (task.ret < 0 ? task.ret : st.st_blocks * 512);
>>>  }
>>> --
>>> 2.51.1
>>>
>>
>> Hello Hanna,
>>
>> I have been inactive in QEMU development for a long time and should have
>> checked this earlier, but it seems that this patch accidentally broke
>> libnfs usage with at least QEMU command-line tools like qemu-img. Again,
>> sorry for not testing this earlier, but I have only recently returned to
>> QEMU development.
>>
>> I found this glitch while working on a patch to include libnfs v6 support
>> in QEMU, to avoid dropping libnfs support.
>>
>> A simple call like:
>>
>> qemu-img create -f qcow2 nfs://nfsserver/image.qcow2 10G
>>
>> hangs forever in the second write request.
>
> I sent a patch for this, which is basically the same as yours below:
> https://lists.nongnu.org/archive/html/qemu-block/2026-01/msg00000.html
Thanks, I will rebase my NFS patches for libnfs v6 on top of that.

Do we need the same fix for iSCSI?

Peter
