On Thu, Apr 19, 2018 at 4:08 PM, Jason Madden
<[email protected]> wrote:
> Hi all,
>
> With the support for fork merged and released last year, I've been
> able to start releasing gevent (Python coroutine [greenlet] networking
> library) with libuv support. Thanks!
>
> Earlier this year we discovered an issue [1] wherin a stopped check
> watcher kept getting called over and over in an infinite loop. This
> happened if there was more than one started check watcher when we
> entered uv__run_check, we stopped a check watcher in a check watcher
> callback, and a check watcher callback also incurred a greenlet
> switch.
>
> After some debugging it became clear what was going on. The gory
> details are in the bug report [2], but in summary this is an
> interaction between the facts that the QUEUE that is used to iterate
> through the check watchers[3] is stack allocated (local variables in
> the function uv__run_check function), and that switching greenlets
> saves and restores (portions of) the C stack[4] and that iterating
> through the queue temporarily stores the address of this
> stack-allocated queue into heap-allocated objects where they can be
> manipulated (e.g., by stopping a watcher). Combining switching
> greenlets with stopping watchers thus can result in the QUEUE getting
> restored to an invalid state (infinite loop in this case).
>
> Windows doesn't have this problem because there loop-watcher.c doesn't
> use a stack allocated QUEUE.
>
> We solved this problem by heap allocating the QUEUE[5] on each entry to
> uv__run_check so that it no longer got saved and restored by greenlet
> switches. This obviously adds some overhead, though I didn't try to
> quantify it.
>
> Potentially a better fix would be to have the QUEUE as part of the
> loop structure; I didn't do that here to keep the patch as small as
> possible (and I didn't know if that would hurt the chances of it
> getting upstreamed due to ABI concerns). Or maybe there are other
> approaches that are better? Feedback is welcome!
>
> Is a fix for this something that upstream would consider? I've seen
> other references to coroutine users and I don't know if they've seen
> anything like this, or maybe their coroutine libraries don't involve
> stack swapping. A potential negative to an upstream fix is that this
> is hard to test in pure C code.
>
> Thanks,
> Jason
>
> Footnotes:
> [1]  https://github.com/gevent/gevent/issues/1126
>
> [2]  https://github.com/gevent/gevent/issues/1126#issuecomment-369740928
>
> [3]  As well as prepare and idle watchers.
>
> [4]  https://github.com/python-greenlet/greenlet/blob/master/greenlet.c#L11
>
> [5]  
> https://github.com/gevent/gevent/commit/1ef16f38491000f4fdafdc01dc809d30fc785867

I suppose we could change that in libuv v2 (not v1 because of ABI) but
I suspect it would break again quickly because there is no good way to
regression-test it.  For instance, there's currently a pull request
under review that introduces a stack-allocated QUEUE in
uv_write()-related code.

-- 
You received this message because you are subscribed to the Google Groups 
"libuv" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/libuv.
For more options, visit https://groups.google.com/d/optout.

Reply via email to