On 09/10/2012 17:37, Anthony Liguori wrote:
>>> In the very short term, I can imagine an aio fastpath that was only
>>> implemented in terms of the device API.  We could have a slow path that
>>> acquired the BQL.
>>
>> Not sure I follow.
>
> As long as the ioeventfd thread can acquire qemu_mutex in order to call
> bdrv_* functions.  The new device-only API could do this under the
> covers for everything but the linux-aio fast path initially.

Ok, so it's about the locking.  I'm not even sure we need locking if we
have cooperative multitasking.  For example, if bdrv_aio_readv/writev
is called from a VCPU thread, it can just schedule a bottom half for
itself in the appropriate AioContext.  The same works for block jobs.
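
To make the idea concrete, here is a minimal single-threaded sketch in C of
"submitting a request only schedules a bottom half; the work runs when the
owner of the context polls".  Every name in it (ToyContext, ToyBH,
toy_bh_schedule, toy_poll, toy_aio_readv) is hypothetical and not part of
QEMU.  A real cross-thread version would also need a thread-safe queue and a
way to wake up the polling thread, both of which are left out here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; none of these names exist in QEMU. */
typedef void ToyBHFunc(void *opaque);

typedef struct ToyBH {
    ToyBHFunc *cb;
    void *opaque;
    struct ToyBH *next;
} ToyBH;

typedef struct ToyContext {
    ToyBH *head;    /* oldest pending bottom half */
    ToyBH *tail;    /* newest pending bottom half */
} ToyContext;

/* Append a bottom half to the context's FIFO.  Assumption: the caller is
 * the only writer; a real multi-threaded queue would need atomics or a
 * lock here, plus a notification to the polling thread. */
static void toy_bh_schedule(ToyContext *ctx, ToyBH *bh,
                            ToyBHFunc *cb, void *opaque)
{
    assert(ctx && bh && cb);
    bh->cb = cb;
    bh->opaque = opaque;
    bh->next = NULL;
    if (ctx->tail) {
        ctx->tail->next = bh;
    } else {
        ctx->head = bh;
    }
    ctx->tail = bh;
}

/* Run all pending bottom halves.  Only the thread that owns ctx calls
 * this, so the callbacks never race with each other.  Returns how many
 * callbacks ran. */
static int toy_poll(ToyContext *ctx)
{
    int ran = 0;

    while (ctx->head) {
        ToyBH *bh = ctx->head;

        /* Unlink before invoking, so the callback may reschedule bh. */
        ctx->head = bh->next;
        if (!ctx->head) {
            ctx->tail = NULL;
        }
        bh->cb(bh->opaque);
        ran++;
    }
    return ran;
}

typedef struct ToyRequest {
    ToyBH bh;
    int sector;
    int completed;
} ToyRequest;

/* Bottom half body: this is where the real I/O would be started. */
static void toy_request_bh(void *opaque)
{
    ToyRequest *req = opaque;

    req->completed = 1;
}

/* Submission path: record the request and defer the work to the context. */
static void toy_aio_readv(ToyContext *ctx, ToyRequest *req, int sector)
{
    req->sector = sector;
    req->completed = 0;
    toy_bh_schedule(ctx, &req->bh, toy_request_bh, req);
}
```

The submitter never touches block-layer state directly; it only enqueues.
All request processing happens inside toy_poll(), in the owning thread,
which is why no lock around the request itself is needed in this model.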

The only part where I'm not sure how it would work is bdrv_read/write,
because of the strange "qemu_aio_wait() calls select with a lock taken"
behavior.  Maybe we can just forbid synchronous I/O when a non-default
AioContext is set.

This would be entirely hidden in the block layer.  For example the
following does it for bdrv_aio_readv/writev:

diff --git a/block.c b/block.c
index e95f613..7165e82 100644
--- a/block.c
+++ b/block.c
@@ -3712,15 +3712,6 @@ static AIOPool bdrv_em_co_aio_pool = {
     .cancel             = bdrv_aio_co_cancel_em,
 };
 
-static void bdrv_co_em_bh(void *opaque)
-{
-    BlockDriverAIOCBCoroutine *acb = opaque;
-
-    acb->common.cb(acb->common.opaque, acb->req.error);
-    qemu_bh_delete(acb->bh);
-    qemu_aio_release(acb);
-}
-
 /* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
 static void coroutine_fn bdrv_co_do_rw(void *opaque)
 {
@@ -3735,8 +3726,18 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
             acb->req.nb_sectors, acb->req.qiov, 0);
     }
 
-    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
-    qemu_bh_schedule(acb->bh);
+    acb->common.cb(acb->common.opaque, acb->req.error);
+    qemu_aio_release(acb);
+}
+
+static void bdrv_co_em_bh(void *opaque)
+{
+    BlockDriverAIOCBCoroutine *acb = opaque;
+    Coroutine *co;
+
+    qemu_bh_delete(acb->bh);
+    co = qemu_coroutine_create(bdrv_co_do_rw);
+    qemu_coroutine_enter(co, acb);
 }
 
 static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
@@ -3756,8 +3757,8 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
     acb->req.qiov = qiov;
     acb->is_write = is_write;
 
-    co = qemu_coroutine_create(bdrv_co_do_rw);
-    qemu_coroutine_enter(co, acb);
+    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
+    qemu_bh_schedule(acb->bh);
 
     return &acb->common;
 }


Then we can add a bdrv_aio_readv/writev_unlocked API to the protocols, which
would run outside the bottom half and provide the desired fast path.

Paolo

> That means that we can convert block devices to use the device-only API
> across the board (provided we make BQL recursive).
> 
> It also means we get at least some of the benefits of data-plane in the
> short term.

