On Monday, January 2, 2017 4:50:28 PM CET Bruno Haible wrote:
> Hi Pavel,
>
> > One thing I'm afraid of is that writers could finish too
> > early. Could we artificially slow them down?
>
> In test_rwlock the test does this:
>
>   /* Wait for the threads to terminate.  */
>   for (i = 0; i < THREAD_COUNT; i++)
>     gl_thread_join (threads[i], NULL);
>   set_atomic_int_value (&rwlock_checker_done, 1);
>   for (i = 0; i < THREAD_COUNT; i++)
>     gl_thread_join (checkerthreads[i], NULL);
>
> It waits until all 10 mutator threads have terminated, then sets a
> lock-protected variable rwlock_checker_done to 1, which signals to the
> 10 checker threads that they can terminate at the next occasion, and
> then waits for them to terminate.
>
> Are you saying that the kernel will schedule the 10 checker threads
> with higher priority than the 10 mutator threads, although I have *not*
> specified anything about priorities? That would be a kernel bug, IMO.
That's what I'm not sure about. As discussed in [1], POSIX says (for
pthread_rwlock_wrlock()):

    Implementations may favor writers over readers to avoid writer
    starvation.

But that wording is "may", far from a "shall favor" requirement. And
when I had a look at my pthread_rwlockattr_setkind_np(3) man page, it
says:

    PTHREAD_RWLOCK_PREFER_READER_NP
        This is the default. A thread may hold multiple read locks;
        that is, read locks are recursive. According to The Single Unix
        Specification, the behavior is unspecified when a reader tries
        to place a lock, and there is no write lock but writers are
        waiting. Giving preference to the reader, as is set by
        PTHREAD_RWLOCK_PREFER_READER_NP, implies that the reader will
        receive the requested lock, even if a writer is waiting. As
        long as there are readers, the writer will be starved.

> Especially since the problem occurs only on one architecture.

I've been able to reproduce this on i686 in the meantime too, sorry --
I just reported what I observed :(. See [1].

> > Could we set PTHREAD_RWLOCK_PREFER_WRITER_NP (in test-lock.c) to avoid
> > those issues?
>
> I disagree. The test is a minimal test of the kernel's multithreading
> support. If, among 10 mutator threads and 10 checker threads, all started
> with the same priority, it has such a severe bias that the mutator threads
> never get to run, you have a kernel bug. I should not need a non-portable
> threading function in order to get 20 threads to run reasonably.
>
> Imagine what scenarios you would then get with an application server and
> 400 threads.

It might be a bug in libpthread, too, but based on the POSIX specs and
manual pages, I am not sure whether this can actually be considered a
bug.

[1] https://lists.fedoraproject.org/archives/list/de...@lists.fedoraproject.org/thread/PQD576JZLERFY6ROI3GF7UYXKZIRI33G/

Pavel