On 03/10/2016 10:03 AM, Christian Borntraeger wrote:
> On 03/10/2016 02:51 AM, Fam Zheng wrote:
> [...]
>> The aio_poll() inside "blk_set_aio_context(s->conf->conf.blk, s->ctx)" looks
>> suspicious:
>>
>>     main thread                                          iothread
>> ----------------------------------------------------------------------------
>> virtio_blk_handle_output()
>>  virtio_blk_data_plane_start()
>>   vblk->dataplane_started = true;
>>   blk_set_aio_context()
>>    bdrv_set_aio_context()
>>     bdrv_drain()
>>      aio_poll()
>>       <snip...>
>>        virtio_blk_handle_output()
>>         /* s->dataplane_started is true */
>> !!!   ->    virtio_blk_handle_request()
>>   event_notifier_set(ioeventfd)
>>                                                     aio_poll()
>>                                                       virtio_blk_handle_request()
>>
>> Christian, could you try the following patch? The aio_poll above is replaced
>> with a "limited aio_poll" that doesn't dispatch ioeventfd.
>>
>> (Note: perhaps moving "vblk->dataplane_started = true;" after
>> blk_set_aio_context() also *works around* this.)
>>
>> ---
>>
>> diff --git a/block.c b/block.c
>> index ba24b8e..e37e8f7 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -4093,7 +4093,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
>>
>>  void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
>>  {
>> -    bdrv_drain(bs); /* ensure there are no in-flight requests */
>> +    /* ensure there are no in-flight requests */
>> +    bdrv_drained_begin(bs);
>> +    bdrv_drained_end(bs);
>>
>>      bdrv_detach_aio_context(bs);
>>
>
> That seems to do the trick.

Or not. It crashed again :-(
Here is a trace with debugging enabled. The opaque argument is zero (NULL),
which is not good.

#0  0x0000000010329f98 in bdrv_co_do_rw (opaque=0x0) at block/io.c:2170
#1  0x00000000103b33a2 in coroutine_trampoline (i0=1023, i1=1946159824) at qemu/util/coroutine-ucontext.c:79
#2  0x000003ff7d9d150a in __makecontext_ret () from /lib64/libc.so.6

Still no idea why.
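
For reference, the re-entrancy Fam describes above can be shown in a standalone toy program. This is only a sketch under loud assumptions: none of the toy_* names are QEMU functions, the nested aio_poll() is reduced to a direct call back into the handler, there is no iothread, and it says nothing about why opaque ends up as 0x0. It only illustrates why setting the "started" flag before the context switch lets a nested handler call take the request path too early, and why setting it afterwards would work around that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not a QEMU structure. */
struct toy_dev {
    bool flag_first;     /* set "started" before the context switch? */
    bool starting;       /* guards against recursive start attempts */
    bool started;        /* plays the role of dataplane_started */
    bool ctx_ready;      /* true once the context switch has finished */
    int  early_requests; /* requests handled before ctx_ready was set */
};

static void toy_handle_output(struct toy_dev *d);

/* Stand-in for a nested event loop pass that re-dispatches the handler. */
static void toy_poll(struct toy_dev *d)
{
    toy_handle_output(d);
}

/* Stand-in for the context switch, which drains by polling first. */
static void toy_set_context(struct toy_dev *d)
{
    toy_poll(d);
    d->ctx_ready = true;
}

static void toy_start(struct toy_dev *d)
{
    if (d->starting) {
        return;              /* already in the middle of starting */
    }
    d->starting = true;
    if (d->flag_first) {
        d->started = true;   /* ordering shown in the diagram above */
    }
    toy_set_context(d);
    d->started = true;       /* ordering suggested as a workaround */
    d->starting = false;
}

static void toy_handle_output(struct toy_dev *d)
{
    if (d->started) {
        /* Request path: count it if the context is not ready yet. */
        if (!d->ctx_ready) {
            d->early_requests++;
        }
        return;
    }
    toy_start(d);
}

/* Run one guest notification and report how many requests ran too early. */
int toy_run(bool flag_first)
{
    struct toy_dev d = { .flag_first = flag_first };
    toy_handle_output(&d);
    assert(d.started && d.ctx_ready);
    return d.early_requests;
}
```

Calling toy_run(true) takes the early request path once during the nested poll, while toy_run(false) does not, because the nested call sees started == false and is stopped by the starting guard.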