On 02/04/20 04:44, Ying Fang wrote:
> Normal VM runtime is not affected by this hang, since there is always some
> timer timeout or a subsequent I/O worker that comes along and notifies the
> main thread.
> To fix this problem, a memory barrier is added to aio_ctx_prepare, and it
> has been proven to fix the hang in our tests.

Hi Ying Fang,

this part of the patch is correct, but I am not sure if a memory barrier
is needed in aio_poll too.  In addition, the memory barrier is quite slow
on x86 and not needed there.

I am sorry for dropping the ball on this bug; I had a patch using relaxed
atomics (atomic_set/atomic_read), but I never submitted it because I had
placed it in a larger series.  Let me post it now.

Thanks,

Paolo

> diff --git a/util/async.c b/util/async.c
> index b94518b..89a4f3e 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -250,7 +250,8 @@ aio_ctx_prepare(GSource *source, gint *timeout)
>      AioContext *ctx = (AioContext *) source;
>  
>      atomic_or(&ctx->notify_me, 1);
> -
> +    /* Make sure notify_me is set before aio_compute_timeout */
> +    smp_mb();
>      /* We assume there is no timeout already supplied */
>      *timeout = qemu_timeout_ns_to_ms(aio_compute_timeout(ctx));
>  
>
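
For illustration, here is a minimal standalone C11 sketch of the
store/load ordering at stake between the waiter (aio_ctx_prepare/aio_poll)
and the notifier (aio_notify).  This is not QEMU code; the names
notify_me, work_pending, waiter_should_block and notifier_kick are
stand-ins chosen for this example only:

/*
 * Dekker-style flag protocol: the waiter must publish notify_me before
 * reading the "work pending" state, and the notifier must publish that
 * state before reading notify_me.  If either fence is missing, the two
 * sides can each miss the other's store and the waiter blocks forever,
 * which is the hang described above.
 */
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int notify_me;      /* waiter: "I am about to block" */
static atomic_bool work_pending;  /* notifier: "there is work to do" */

/* Waiter side, analogous to aio_ctx_prepare()/aio_poll(). */
static bool waiter_should_block(void)
{
    atomic_store_explicit(&notify_me, 1, memory_order_relaxed);
    /* Order the notify_me store before the work_pending load;
     * pairs with the fence in notifier_kick(). */
    atomic_thread_fence(memory_order_seq_cst);
    return !atomic_load_explicit(&work_pending, memory_order_relaxed);
}

/* Notifier side, analogous to aio_notify(). */
static bool notifier_kick(void)
{
    atomic_store_explicit(&work_pending, true, memory_order_relaxed);
    /* Order the work_pending store before the notify_me load;
     * pairs with the fence in waiter_should_block(). */
    atomic_thread_fence(memory_order_seq_cst);
    /* If the waiter has announced itself, it must be woken up
     * (in QEMU, by writing to the EventNotifier). */
    return atomic_load_explicit(&notify_me, memory_order_relaxed) != 0;
}

The relaxed-atomics patch Paolo mentions presumably keeps this pairing
of explicit barriers but replaces the atomic_or() read-modify-write,
which is a LOCK-prefixed and therefore already fully ordered instruction
on x86, with plain atomic_read()/atomic_set() accesses, so the waiter's
fast path pays for only one serializing operation instead of two.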
