Torvald Riegel wrote [in private email; it should have been CCed to bug-gnulib], about the use of a lock in 'struct atomic_int' in test-lock.c:
> > The delivered code, with a lock, is correct, however, right?
>
> Not quite, unfortunately, because locks do not make any guarantees
> regarding fairness. Thus, there is no guarantee that critical sections
> querying the flag do not starve a critical section that wants to set
> the flag.
> Good implementations will try to avoid starvation to some extent, but
> that's not simple without potentially decreasing overall throughput.
> Thus, in practice, you can't expect to never observe starvation.

Good point. Yes, if I have multiple threads executing a loop

  for (;;)
    {
      pthread_mutex_lock (&lock);
      do something;
      pthread_mutex_unlock (&lock);
      do very little other things;
    }

it is very possible that a specific thread that also wants this lock will
have to wait far too long.

That's why I inserted the yield statements in lock_checker_thread:

  for (;;)
    {
      pthread_mutex_lock (&lock);
      do something;
      pthread_mutex_unlock (&lock);
      do very little other things;
      pthread_yield ();
    }

It should have (and in my experiments actually has) the effect that each
waiting thread gets an average chance to acquire the lock -- this relies on
a fair scheduler in the kernel, of course -- and thus keeps the execution
time of test-lock bounded:

  EXPLICIT_YIELD = 0  ->  3.2 ... 3.5 sec (especially slow in test_once)
  EXPLICIT_YIELD = 1  ->  1.8 sec

> POSIX makes no explicit guarantee regarding forward progress (i.e., things
> like guaranteeing no starvation), but the specification gives indication
> that concurrent sem_trywait or sem_wait should not starve a sem_post
> (IOW, simplified, sem_post should complete eventually). I'd also say
> that good-quality implementations of semaphores would not starve
> sem_post, simply because you'd want to implement sem_trywait as just
> atomic reads internally, and those won't starve concurrent atomic writes
> (as is, for example, explicitly required in the C11/C++11 memory model).
>
> Therefore, I'd suggest using semaphores for the completion flags.

What I actually need here, in lock_checker_done, is a notification mechanism
from the main thread to the checker threads. I used 'volatile' first - it
turned out to be very slow. Now I use a pthread_mutex_t, which is a bit of
an abuse. A sem_t may well be better than a pthread_mutex_t, but it is still
a bit of an abuse: semaphores are about taking and releasing access to a
resource, without the notion of an "owner". For a real notification
mechanism, a pthread_cond_t would probably be the right thing.

The problem here is: it's a unit test. If it fails, I don't want to have to
search for bugs in the condition variable subsystem, the semaphore subsystem,
AND the lock subsystem. I want the test to rely essentially only on the lock
subsystem.

> Second, the mutex test is incorrect because the checker thread's
> critical sections are allowed to starve the mutator threads' critical
> sections. I'd suggest to either use a fixed number of iterations or a
> fixed duration for both mutators and checkers, or to move the setting of
> the completion flag to before the pthread_join for the mutators.

I think the yield () calls handle this danger; see above.

> Third, the rwlock test is also incorrect because it is, in the general
> case, implementation-defined whether readers or writers take preference.
> I have explained why here:
> https://bugzilla.redhat.com/show_bug.cgi?id=1410052
> Fixing this requires either similar steps as for the exclusive/recursive
> mutex tests, or using something like
> PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP.

This was being discussed in the other thread.
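For illustration only: this is roughly how writer preference can be requested
on glibc through the non-portable attribute mentioned above. It is a minimal
sketch, not the actual gnulib code; the names init_writer_preferring_rwlock
and my_rwlock are made up for the example.

  /* Sketch, glibc-specific: create a rwlock that prefers writers, so that a
     steady stream of readers cannot starve a waiting writer.  Requires
     _GNU_SOURCE for the *_np declarations.  */
  #define _GNU_SOURCE
  #include <pthread.h>

  static pthread_rwlock_t my_rwlock;

  static int
  init_writer_preferring_rwlock (void)
  {
    pthread_rwlockattr_t attr;
    int err;

    err = pthread_rwlockattr_init (&attr);
    if (err != 0)
      return err;
    /* Note: on glibc, PTHREAD_RWLOCK_PREFER_WRITER_NP does not actually give
       writers preference; the NONRECURSIVE variant is the one that does.  */
    err = pthread_rwlockattr_setkind_np (&attr,
                                         PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
    if (err == 0)
      err = pthread_rwlock_init (&my_rwlock, &attr);
    pthread_rwlockattr_destroy (&attr);
    return err;
  }

Since this attribute is glibc-specific, a portable solution has to arrange
for writer preference in the wrapper layer itself, which is what the change
below does.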
The glthread_rwlock_* functions now give preference to the writer - on all
platforms, not only on some as before (Solaris and Mac OS X already did it
right).

Bruno