On 10.11.22 15:01, Kevin Wolf wrote:
On 07.11.2022 at 16:13, Hanna Reitz wrote:
Hi,
v1 cover letter:
https://lists.nongnu.org/archive/html/qemu-block/2022-09/msg00389.html
bdrv_replace_child_noperm() drains the child via
bdrv_parent_drained_{begin,end}_single(). When it removes a child, the
bdrv_parent_drained_end_single() at its end will be called on an empty
child, making the BDRV_POLL_WHILE() in it poll the main AioContext
(because c->bs is NULL).
That’s wrong, though, because it’s supposed to operate on the parent.
bdrv_parent_drained_end_single_no_poll() will have scheduled any BHs in
the parents’ AioContext, which may be anything, not necessarily the main
context. Therefore, we must poll the parent’s context.
Patch 3 does this for both bdrv_parent_drained_{begin,end}_single().
Patch 1 ensures that we can legally call
bdrv_child_get_parent_aio_context() from those I/O context functions,
and patch 2 fixes blk_do_set_aio_context() to not cause an assertion
failure if beginning a drain can end up in blk_get_aio_context()
before blk->ctx has been updated.
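
To illustrate the context-selection problem described above, here is a
minimal, self-contained toy model. It is not QEMU code: every type and
function name below (ToyAioContext, ToyNode, ToyChild,
toy_poll_ctx_from_node(), toy_poll_ctx_from_parent()) is hypothetical, and
the fallback to the main context is modelled only from the description in
this cover letter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the QEMU definitions. */
typedef struct ToyAioContext {
    const char *name;
} ToyAioContext;

typedef struct ToyNode {
    ToyAioContext *ctx;        /* context the node currently runs in */
} ToyNode;

typedef struct ToyChild {
    ToyNode *bs;               /* NULL once the child has been removed */
    ToyAioContext *parent_ctx; /* context in which the parent schedules BHs */
} ToyChild;

static ToyAioContext toy_main_ctx = { "main" };

/*
 * Problematic selection as described above: the context is derived from
 * c->bs, so an empty child (c->bs == NULL) falls back to the main context.
 */
static ToyAioContext *toy_poll_ctx_from_node(const ToyChild *c)
{
    assert(c != NULL);
    return c->bs ? c->bs->ctx : &toy_main_ctx;
}

/*
 * Selection after the fix: always poll the parent's context, regardless of
 * whether the child still has a node attached.
 */
static ToyAioContext *toy_poll_ctx_from_parent(const ToyChild *c)
{
    assert(c != NULL);
    return c->parent_ctx;
}
```

With a removed child whose parent lives in a non-main context, the first
helper returns the main context while the second returns the parent's
context, which is where the pending BHs would have been scheduled.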
Thanks, applied to the block branch.
Thanks!
I tested your drain series, and it does indeed fix the bug, too. (Sorry
for the delay, I thought it’d take less time to write an iotest...)
I would still be interested in a test case as a follow-up.
Got it working now and sent as “tests/stream-under-throttle: New test”.
Hanna