On 9 September 2014 13:10, Marek Chalupa <[email protected]> wrote:
> > > On 4 September 2014 15:17, Pekka Paalanen <[email protected]> wrote: > >> On Fri, 29 Aug 2014 11:21:30 +0200 >> Marek Chalupa <[email protected]> wrote: >> >> > When wl_display_read_events() returns with errno == EAGAIN, we >> > naturally try to call it again. But this next call results in deadlock. >> > >> > Signed-off-by: Marek Chalupa <[email protected]> >> > --- >> > tests/display-test.c | 88 >> ++++++++++++++++++++++++++++++++++++++++++++++++++++ >> > 1 file changed, 88 insertions(+) >> > >> > diff --git a/tests/display-test.c b/tests/display-test.c >> > index 1289866..e1cf830 100644 >> > --- a/tests/display-test.c >> > +++ b/tests/display-test.c >> > @@ -32,6 +32,7 @@ >> > #include <sys/types.h> >> > #include <sys/stat.h> >> > #include <pthread.h> >> > +#include <poll.h> >> > >> > #include "wayland-private.h" >> > #include "wayland-server.h" >> > @@ -480,6 +481,93 @@ TEST(threading_cancel_read_tst) >> > display_destroy(d); >> > } >> > >> > +/* set this to 1 when it is right that >> > + * the thread is woken up */ >> > +static int threads_can_be_woken_up = 0; >> > + >> > +static void * >> > +thread_eagain_func(void *data) { >> > + struct client *c = data; >> > + >> > + register_reading(c->wl_display); >> > + >> > + c->display_stopped = 1; >> > + assert(wl_display_read_events(c->wl_display) == 0 >> > + && errno != EAGAIN); >> > + /* if thread was sleeping in read_events, this assert >> > + * must pass now */ >> > + assert(threads_can_be_woken_up && "Thread woke up too early"); >> > + >> > + pthread_exit(NULL); >> > +} >> > + >> > +static void >> > +threading_read_eagain(void) >> > +{ >> > + struct client *c = client_connect(); >> > + pthread_t th1, th2, th3; >> > + struct pollfd fds; >> > + int i; >> > + >> > + register_reading(c->wl_display); >> > + >> > + th1 = create_thread(c, thread_eagain_func); >> > + th2 = create_thread(c, thread_eagain_func); >> > + th3 = create_thread(c, thread_eagain_func); >> > + >> > + /* All the threads are sleeping, waiting until read or cancel >> > + * is called. Since we have no data on socket waiting, >> > + * the wl_connection_read should end up with error and set errno >> > + * to EAGAIN => wl_display_read_events() will return 0 and set >> errno >> > + * to EAGAIN. It should not wake up the sleeping threads, so that >> > + * this thread can try to read again */ >> > + alarm(3); >> > + assert(wl_display_read_events(c->wl_display) == 0); >> > + assert(errno == EAGAIN); >> >> I think there is a problem here. wl_display_read_events() returning 0 >> means success, and in that case errno may not be set if it was a real >> success. There is no EAGAIN to be handled, it's already done inside >> read_events(). >> >> In this test it won't be a real success, but you cannot code clients to >> check for errno after getting 0 from wl_display_read_events(). >> > >> > + >> > + /* the same should happen next times - but there was a bug when >> > + * the main thread got blocked! */ >> > + for (i = 0; i < 5; ++i) { >> > + assert(wl_display_read_events(c->wl_display) == 0); >> >> Therefore you are here calling wl_display_read_events() again after it >> succeeded! And that IMO is a developer error. >> > > yes > > >> Besides, I cannot see anywhere where it says that it is ok to call >> wl_display_read_events() again even if it failed, without calling >> wl_display_prepare_read() first. >> >> > Since original piece of code returned 0 on EAGAIN, then I suppose the > author > considered this as a success, so yes, wl_display_read_events() is not > probably meant > to be called again. But consider this: > if the thread that got EAGAIN (the only thread that is not sleeping) calls > wl_display_prepare_read() > + wl_display_read_events() again, it works. Problem is that it does not > have to call it > right after the failed wl_display_read_events and the other threads are > still sleeping. > Moreover, if there are not events on the socket for some time, this one > thread is looping in cycle > while the other threads are sleeping. > > What I meant to do was to give the programmer the choice whether to loop > or let > all threads try it again (cancel read). But as you said before, 0 is wrong > return value for that. > See end of the e-mail for conclusions. > > >> But I see something else: wl_display_read_events() does not decrement >> reader_count if it returns due to last_error. I think that needs a fix >> to guarantee the reader_count is decremented, so that the side-effects >> are consistent. >> > > Yes, we should fix this. There are cases that can cause a problem. > Actually, I can not make up any use-case where it is problem... do you have any? > > + assert(errno == EAGAIN); >> > + } >> > + >> > + /* generate something to read and wait until the event comes */ >> > + wl_proxy_marshal((struct wl_proxy *) c->tc, >> > + 0 /* STOP DISPLAY */, 1 /* number */); >> > + assert(wl_display_flush(c->wl_display) > 0); >> > + >> > + fds.fd = wl_display_get_fd(c->wl_display); >> > + fds.events = POLLIN; >> > + assert(poll(&fds, 1, 3) == 1); >> >> Is timeout of 3 milliseconds intended? >> > > Nope, my bad. > > >> >> > + >> > + /* threads can be woken up now */ >> > + threads_can_be_woken_up = 1; >> > + >> > + errno = 0; >> > + assert(wl_display_read_events(c->wl_display) == 0); >> > + assert(errno != EAGAIN); >> > + assert(wl_display_dispatch_pending(c->wl_display) == 1); >> > + >> > + pthread_join(th1, NULL); >> > + pthread_join(th2, NULL); >> > + pthread_join(th3, NULL); >> > + >> > + client_disconnect(c); >> > +} >> > + >> > +TEST(threading_read_eagain_tst) >> > +{ >> > + struct display *d = display_create(); >> > + >> > + client_create(d, threading_read_eagain); >> > + >> > + display_run(d); >> > + display_resume(d); >> > + >> > + display_destroy(d); >> > +} >> > + >> > static void * >> > thread_prepare_and_read2(void *data) >> > { >> >> >> Thanks, >> pq >> > > First, thanks for review. > > Summing it up, in my point of view we have these options: > > * Let it as it is: > this can lead to starving of other threads for some time. Moreover, > if the thread that > got EAGAIN decides to exit (or simply not to read again) instead of > another trial to read, the other threads will block. > * return -1 and set EAGAIN > programmer can either try to read again, or cancel read and let all > threads do another cycle. Good thing is > that programmer knows that there was an error and can cancel read in > the case that the thread is exiting > instead of another cycle. > * wake up threads on EAGAIN > this solution will force all threads to do another cycle (or whatever > they want) and should not have any side effects. It can lead to > big overhead if there are no events on the socket for some time > (waking up threads again and again in cycle) > > + fix the bug with last_error, of course > > Is there something that I'm missing? > > Thanks, > Marek >
_______________________________________________ wayland-devel mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/wayland-devel
