On Thu, Oct 13, 2016 at 07:34:19PM +0200, Paolo Bonzini wrote: > aio_poll is not thread safe; for example bdrv_drain can hang if > the last in-flight I/O operation is completed in the I/O thread after > the main thread has checked bs->in_flight. > > The bug remains latent as long as all of it is called within > aio_context_acquire/aio_context_release, but this will change soon. > > To fix this, if bdrv_drain is called from outside the I/O thread, > signal the main AioContext through a dummy bottom half. The event > loop then only runs in the I/O thread. > > Reviewed-by: Fam Zheng <[email protected]> > Signed-off-by: Paolo Bonzini <[email protected]> > --- > async.c | 1 + > block.c | 2 ++ > block/io.c | 7 +++++++ > include/block/block.h | 24 +++++++++++++++++++++--- > include/block/block_int.h | 1 + > 5 files changed, 32 insertions(+), 3 deletions(-) > > diff --git a/async.c b/async.c > index f30d011..fb37b03 100644 > --- a/async.c > +++ b/async.c > @@ -61,6 +61,7 @@ void aio_bh_schedule_oneshot(AioContext *ctx, QEMUBHFunc > *cb, void *opaque) > smp_wmb(); > ctx->first_bh = bh; > qemu_mutex_unlock(&ctx->bh_lock); > + aio_notify(ctx); > } > > QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque) > diff --git a/block.c b/block.c > index fbe485c..a17baab 100644 > --- a/block.c > +++ b/block.c > @@ -2090,7 +2090,9 @@ int bdrv_reopen_multiple(AioContext *ctx, > BlockReopenQueue *bs_queue, Error **er > > assert(bs_queue != NULL); > > + aio_context_release(ctx); > bdrv_drain_all(); > + aio_context_acquire(ctx); > > QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) { > if (bdrv_reopen_prepare(&bs_entry->state, bs_queue, &local_err)) { > diff --git a/block/io.c b/block/io.c > index 7d3dcfc..cd28909 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -474,8 +474,15 @@ void bdrv_inc_in_flight(BlockDriverState *bs) > atomic_inc(&bs->in_flight); > } > > +static void dummy_bh_cb(void *opaque) > +{ > +} > + > void bdrv_wakeup(BlockDriverState *bs) > { > + if (bs->wakeup) { > + aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL); > + } > }
Why use a dummy BH instead of aio_notify()?
>
> void bdrv_dec_in_flight(BlockDriverState *bs)
> diff --git a/include/block/block.h b/include/block/block.h
> index ba4318b..72d5d8e 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -343,9 +343,27 @@ void bdrv_drain_all(void);
> #define bdrv_poll_while(bs, cond) ({ \
> bool waited_ = false; \
> BlockDriverState *bs_ = (bs); \
> - while ((cond)) { \
> - aio_poll(bdrv_get_aio_context(bs_), true); \
> - waited_ = true; \
> + AioContext *ctx_ = bdrv_get_aio_context(bs_); \
> + if (aio_context_in_iothread(ctx_)) { \
> + while ((cond)) { \
> + aio_poll(ctx_, true); \
> + waited_ = true; \
> + } \
> + } else { \
> + assert(qemu_get_current_aio_context() == \
> + qemu_get_aio_context()); \
The assumption is that IOThread #1 will never call bdrv_poll_while() on
IOThread #2's AioContext. I believe that is true today. Is this what
you had in mind?
Please add a comment since it's not completely obvious from the assert
expression.
> + /* Ask bdrv_dec_in_flight to wake up the main \
> + * QEMU AioContext. \
> + */ \
> + assert(!bs_->wakeup); \
> + bs_->wakeup = true; \
> + while ((cond)) { \
> + aio_context_release(ctx_); \
> + aio_poll(qemu_get_aio_context(), true); \
> + aio_context_acquire(ctx_); \
> + waited_ = true; \
> + } \
> + bs_->wakeup = false; \
> } \
> waited_; })
>
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 11f877b..0516f62 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -470,6 +470,7 @@ struct BlockDriverState {
> NotifierWithReturnList before_write_notifiers;
>
> /* number of in-flight requests; overall and serialising */
> + bool wakeup;
> unsigned int in_flight;
> unsigned int serialising_in_flight;
>
> --
> 2.7.4
>
>
>
signature.asc
Description: PGP signature
