On Wed, Oct 31, 2012 at 10:41 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 30/10/2012 20:13, Stefan Hajnoczi wrote:
>> On Fri, Oct 26, 2012 at 04:05:49PM +0200, Paolo Bonzini wrote:
>>> +static void event_notifier_ready(EventNotifier *notifier)
>>> +{
>>> +    ThreadPoolElement *elem, *next;
>>> +
>>> +    event_notifier_test_and_clear(notifier);
>>> +restart:
>>> +    QLIST_FOREACH_SAFE(elem, &head, all, next) {
>>> +        if (elem->state != THREAD_CANCELED && elem->state != THREAD_DONE) {
>>> +            continue;
>>> +        }
>>> +        if (elem->state == THREAD_DONE) {
>>> +            trace_thread_pool_complete(elem, elem->common.opaque, elem->ret);
>>> +        }
>>> +        if (elem->state == THREAD_DONE && elem->common.cb) {
>>> +            QLIST_REMOVE(elem, all);
>>> +            elem->common.cb(elem->common.opaque, elem->ret);
>>
>> This function didn't take the lock.  First it accessed elem->state and
>> now it reads elem->ret.  We need to take the lock to ensure both
>> elem->state and elem->ret have been set - otherwise we could read
>> elem->ret before the return value was stored.
>
> Right.  posix-aio-compat didn't need this because it only had ret.
> Just as important: the locking policy was not documented at all.
>
> I'm applying some changes.  Logically (and for ease of review) they are
> four patches on top of this one, but they'll be squashed in the next
> submission.  (Hmm, the fourth should be separate).
>
>>> +typedef struct ThreadPoolCo {
>>> +    Coroutine *co;
>>> +    int ret;
>>> +} ThreadPoolCo;
>>> +
>>> +static void thread_pool_co_cb(void *opaque, int ret)
>>> +{
>>> +    ThreadPoolCo *co = opaque;
>>> +
>>> +    co->ret = ret;
>>> +    qemu_coroutine_enter(co->co, NULL);
>>> +}
>>> +
>>> +int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg)
>>> +{
>>> +    ThreadPoolCo tpc = { .co = qemu_coroutine_self(), .ret = -EINPROGRESS };
>>> +    assert(qemu_in_coroutine());
>>> +    thread_pool_submit_aio(func, arg, thread_pool_co_cb, &tpc);
>>> +    qemu_coroutine_yield();
>>> +    return tpc.ret;
>>
>> It's important to understand that the submit_aio, yield, return ret
>> pattern works because we assume this function was called as part of the
>> main loop.
>>
>> If thread_pool_submit_co() was called outside the event loop and global
>> mutex, then there is a race between the submit_aio and yield steps where
>> thread_pool_co_cb() is called before this coroutine yields!
>
> Even before that, thread_pool_submit_aio would race on the
> non-thread-safe qemu_aio_get.  Also, head is protected by the BQL.
>
> event_notifier_ready needs to run under the BQL too, because it
> accesses head and also calls qemu_aio_release.
>
> qemu_aio_get and qemu_aio_release should be moved to AioContext, so
> that they can use the (upcoming) AioContext lock instead of the BQL.
> The thread pool needs to be per-AioContext instead of using globals,
> too.  However, this can be done later.
I just sent a patch to move from a manual freelist to g_slice_alloc().
This does away with manual pooling and is thread-safe.  It's possible
that a lock + freelist is more efficient, but I think g_slice_alloc()
is a good start.

Your patches look good, looking forward to the next spin.

Stefan