> On 03.03.2026 at 20:24, Peter Lieven <[email protected]> wrote:
> 
> 
> 
>> On 10.11.2025 at 16:48, Hanna Czenczek <[email protected]> wrote:
>> 
>> Like in “rbd: Run co BH CB in the coroutine’s AioContext”, drop the
>> completion flag, yield exactly once, and run the BH in the coroutine’s
>> AioContext.
>> 
>> (Can be reproduced with multiqueue by adding a usleep(100000) before the
>> `while (!task.complete)` loops.)
>> 
>> Like in “iscsi: Run co BH CB in the coroutine’s AioContext”, this makes
>> nfs_co_generic_bh_cb() trivial, so we can drop it in favor of just
>> calling aio_co_wake() directly.
>> 
>> Cc: [email protected]
>> Signed-off-by: Hanna Czenczek <[email protected]>
>> ---
>> block/nfs.c | 41 ++++++++++++++++-------------------------
>> 1 file changed, 16 insertions(+), 25 deletions(-)
>> 
>> diff --git a/block/nfs.c b/block/nfs.c
>> index 0a7d38db09..1d3a34a30c 100644
>> --- a/block/nfs.c
>> +++ b/block/nfs.c
>> @@ -69,7 +69,6 @@ typedef struct NFSClient {
>> typedef struct NFSRPC {
>>     BlockDriverState *bs;
>>     int ret;
>> -    int complete;
>>     QEMUIOVector *iov;
>>     struct stat *st;
>>     Coroutine *co;
>> @@ -230,14 +229,6 @@ static void coroutine_fn nfs_co_init_task(BlockDriverState *bs, NFSRPC *task)
>>     };
>> }
>> 
>> -static void nfs_co_generic_bh_cb(void *opaque)
>> -{
>> -    NFSRPC *task = opaque;
>> -
>> -    task->complete = 1;
>> -    aio_co_wake(task->co);
>> -}
>> -
>> /* Called (via nfs_service) with QemuMutex held.  */
>> static void
>> nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
>> @@ -256,8 +247,16 @@ nfs_co_generic_cb(int ret, struct nfs_context *nfs, void *data,
>>     if (task->ret < 0) {
>>         error_report("NFS Error: %s", nfs_get_error(nfs));
>>     }
>> -    replay_bh_schedule_oneshot_event(task->client->aio_context,
>> -                                     nfs_co_generic_bh_cb, task);
>> +
>> +    /*
>> +     * Safe to call: nfs_service(), which called us, is only run from the FD
>> +     * handlers, never from the request coroutine.  The request coroutine in
>> +     * turn will yield unconditionally.
>> +     * No need to release the lock, even if we directly enter the coroutine, as
>> +     * the lock is never re-taken after yielding.  (Note: If we do enter the
>> +     * coroutine, @task will probably be dangling once aio_co_wake() returns.)
>> +     */
>> +    aio_co_wake(task->co);
>> }
>> 
>> static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset,
>> @@ -278,9 +277,7 @@ static int coroutine_fn nfs_co_preadv(BlockDriverState *bs, int64_t offset,
>> 
>>         nfs_set_events(client);
>>     }
>> -    while (!task.complete) {
>> -        qemu_coroutine_yield();
>> -    }
>> +    qemu_coroutine_yield();
>> 
>>     if (task.ret < 0) {
>>         return task.ret;
>> @@ -328,9 +325,7 @@ static int coroutine_fn nfs_co_pwritev(BlockDriverState *bs, int64_t offset,
>> 
>>         nfs_set_events(client);
>>     }
>> -    while (!task.complete) {
>> -        qemu_coroutine_yield();
>> -    }
>> +    qemu_coroutine_yield();
>> 
>>     if (my_buffer) {
>>         g_free(buf);
>> @@ -358,9 +353,7 @@ static int coroutine_fn nfs_co_flush(BlockDriverState *bs)
>> 
>>         nfs_set_events(client);
>>     }
>> -    while (!task.complete) {
>> -        qemu_coroutine_yield();
>> -    }
>> +    qemu_coroutine_yield();
>> 
>>     return task.ret;
>> }
>> @@ -723,8 +716,8 @@ nfs_get_allocated_file_size_cb(int ret, struct nfs_context *nfs, void *data,
>>     if (task->ret < 0) {
>>         error_report("NFS Error: %s", nfs_get_error(nfs));
>>     }
>> -    replay_bh_schedule_oneshot_event(task->client->aio_context,
>> -                                     nfs_co_generic_bh_cb, task);
>> +    /* Safe to call, see nfs_co_generic_cb() */
>> +    aio_co_wake(task->co);
>> }
>> 
>> static int64_t coroutine_fn nfs_co_get_allocated_file_size(BlockDriverState *bs)
>> @@ -748,9 +741,7 @@ static int64_t coroutine_fn nfs_co_get_allocated_file_size(BlockDriverState *bs)
>> 
>>         nfs_set_events(client);
>>     }
>> -    while (!task.complete) {
>> -        qemu_coroutine_yield();
>> -    }
>> +    qemu_coroutine_yield();
>> 
>>     return (task.ret < 0 ? task.ret : st.st_blocks * 512);
>> }
>> -- 
>> 2.51.1
>> 
> 
> Hello Hanna,
> 
> I have been inactive in QEMU development for a long time and should have 
> checked this earlier, but it seems that this patch accidentally broke libnfs 
> usage, at least with QEMU command-line tools like qemu-img. Again, sorry for 
> not testing this sooner; I have only recently become active in QEMU again.
> 
> I found this regression while working on a patch to add libnfs v6 support to 
> QEMU, in order to avoid libnfs support being dropped.
> 
> A simple call like:
> 
> qemu-img create -f qcow2 nfs://nfsserver/image.qcow2 10G
> 
> hangs forever in the second write request.
> 
> I have not confirmed it, but the same might hold true for libiscsi, whose 
> internal handling is quite similar (at least it was 3 years ago ;-))
> 
> You are right that the coroutine only yields once, but it seems that the 
> bottom half logic is still needed.
> Maybe you have an idea why? I remember that we originally had this without a 
> BH and only added it after Stefan pointed out that it might be needed.
> I will try to find the thread with that discussion.

The BH for the iscsi driver was actually introduced here:

commit 8b9dfe9098d91e06a3dd6376624307fe5fa13be8
Author: Peter Lieven <[email protected]>
Date:   Sat Dec 14 17:31:40 2013 +0100

    block/iscsi: use a bh to schedule co reentrance

I am not 100% sure whether the justification for this commit still applies, or 
whether the underlying issue was fixed by the switch to aio_co_wake() anyway.

In the meantime I have tested libiscsi support and it does not seem to hang, 
but there might still be a performance impact.

If someone has a clue, I could go ahead and send the libnfs v6 support patches.
The hang also happens with older libnfs versions, so it is unrelated to 
libnfs v6.

Best
Peter
