Rich Felker wrote:
> No, I'm saying that it's impossible to make a conforming
> implementation that prefers writers *in general*, and that, at least
> as you've described them, the above two implementations "prefer
> readers" most of the time in the heavily contended case with multiple
> readers (as in the test).

I agree with that.

Before Natanael's measurements yesterday, it was my understanding, from [1],
that
  * An implementation of rwlocks that always prefers writers will have
    readers starve and thus not be POSIX conforming,
  * An implementation of rwlocks that always prefers readers will have
    writers starve,
  * and that therefore to avoid both kinds of starvation, some middle-ground
    is needed.

I thought that musl's behaviour

  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for reading, the
    first one of them is granted.
    Otherwise, the first of the waiting write attempts is granted.
  ...
  This implementation does not globally prefer readers, only when releasing
  a reader lock.

would fit that "middle-ground" requirement.
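
In code, the policy as I understood it amounts to roughly the following
wake-up decision (an illustrative sketch only; the waiter queue and the
function name are invented, this is not musl's actual implementation):

  enum attempt { READ_ATTEMPT, WRITE_ATTEMPT };

  /* Sketch only: which enqueued lock attempt to grant when the last reader
     unlocks.  'waiters' is a hypothetical FIFO snapshot of the blocked lock
     attempts; returns an index into it, or -1 if nobody is waiting.  */
  static int
  pick_waiter_on_last_read_unlock (const enum attempt *waiters, int n)
  {
    for (int i = 0; i < n; i++)
      if (waiters[i] == READ_ATTEMPT)
        return i;                 /* first enqueued read attempt wins */
    return n > 0 ? 0 : -1;        /* otherwise the first write attempt */
  }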

But as Natanael's measurements show, this is not the case.

> They can only "prefer writers" in the relatively rare case where
> there's a single lock owner at unlock time. Thus they still have the
> same fundamental starvation issue, just covered up so that you're less
> likely to hit it rather than having it be obvious right away so that
> you have to make a proper, reliable fix.
> 
> If you think I'm mistaken and know of a way to make it so a writer is
> guaranteed to always get the lock even in the presence of hammering by
> readers, I'd be interested to hear about it.

I think you are right about this fundamental issue.

I did some more measurements, with musl libc and test-pthread-rwlock.c modified:
#define THREAD_COUNT 4
#define REPEAT_COUNT 100000
Timing on a machine with 8 CPUs (sufficient for the 2*THREAD_COUNT threads):
real    0m0.23s
user    0m0.62s
sys     0m1.10s
Whereas with
#define THREAD_COUNT 10
#define REPEAT_COUNT 50000
on a machine with 64 CPUs, Natanael had observed
real    30s or > 600s, depending on the scheduling policy.
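
For context, the shape of such a test is roughly the following (a simplified
standalone sketch, not the actual gnulib test-pthread-rwlock.c):

  /* Simplified reproducer: THREAD_COUNT reader threads and THREAD_COUNT
     writer threads each take the lock REPEAT_COUNT times.  */
  #include <pthread.h>
  #include <stdio.h>

  #define THREAD_COUNT 4
  #define REPEAT_COUNT 100000

  static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
  static unsigned long counter;

  static void *
  reader_func (void *arg)
  {
    for (int i = 0; i < REPEAT_COUNT; i++)
      {
        pthread_rwlock_rdlock (&lock);
        (void) counter;                   /* just read the shared state */
        pthread_rwlock_unlock (&lock);
      }
    return NULL;
  }

  static void *
  writer_func (void *arg)
  {
    for (int i = 0; i < REPEAT_COUNT; i++)
      {
        pthread_rwlock_wrlock (&lock);
        counter++;                        /* modify the shared state */
        pthread_rwlock_unlock (&lock);
      }
    return NULL;
  }

  int
  main (void)
  {
    pthread_t t[2 * THREAD_COUNT];
    for (int i = 0; i < THREAD_COUNT; i++)
      {
        pthread_create (&t[i], NULL, reader_func, NULL);
        pthread_create (&t[THREAD_COUNT + i], NULL, writer_func, NULL);
      }
    for (int i = 0; i < 2 * THREAD_COUNT; i++)
      pthread_join (t[i], NULL);
    printf ("counter = %lu\n", counter);
    return 0;
  }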

So the execution time appears to grow exponentially with the number of threads
(assuming a large enough number of CPUs, so that the "hammering" by the readers
is maximally effective).

I bet that with THREAD_COUNT = 20 or 30, on a machine with 64 CPUs, the writer
starvation will be even more extreme.

So, the problem is the behaviour during pthread_rwlock_rdlock, when (*)
  - the lock is already held by some readers, and
  - some writer is blocked on the lock.
[1] describes this situation as follows:
  "It would be unfair for R2 to jump in immediately, ahead of W;
   if that happened often enough, W would starve."

[1], section "Third readers–writers problem", purports to give a solution that
is fair to both sides. But I don't understand it.

Another solution for the problem is, in case (*), to not let the reader acquire
the lock immediately, with some probability > 0. The probabilities need to be
worked out in such a way that, in the end, the blocked writer actually has
a chance to acquire the lock. AFAICS, this would be POSIX-compliant: [2]
says "If the Thread Execution Scheduling option is not supported, it is
implementation-defined whether the calling thread acquires the lock when
a writer does not hold the lock and there are writers blocked on the lock."
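
As a rough sketch of what I mean (illustrative only: the blocked_writers
counter is a hypothetical addition maintained by a wrapper around
pthread_rwlock_wrlock, rand() is just a stand-in, and the probability 1/4
would need tuning; a real implementation would do this inside the lock
itself):

  #include <pthread.h>
  #include <sched.h>
  #include <stdatomic.h>
  #include <stdlib.h>

  static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
  static atomic_int blocked_writers;       /* writers currently waiting */

  static void
  write_lock (void)
  {
    atomic_fetch_add (&blocked_writers, 1);
    pthread_rwlock_wrlock (&lock);
    atomic_fetch_sub (&blocked_writers, 1);
  }

  static void
  read_lock (void)
  {
    /* While a writer is waiting, back off with probability 1/4 instead of
       acquiring immediately, so the writer gets a chance to run.  */
    while (atomic_load (&blocked_writers) > 0 && rand () % 4 == 0)
      sched_yield ();
    pthread_rwlock_rdlock (&lock);
  }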

Yet another solution is, in case (*), to do what POSIX [2] describes as
the behaviour for the SCHED_FIFO, SCHED_RR, and SCHED_SPORADIC scheduling
policies, assuming all threads have the same priority:
"... the calling thread shall not acquire the lock if
  a writer holds the lock
  or if the calling thread does not already hold a read lock
        and writers [...] are blocked on the lock;
 otherwise, the calling thread shall acquire the lock."
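
In code, that rule is essentially the following predicate (the struct and
field names are invented for illustration; this is not how a real
implementation would represent the lock state, but the decision logic is
the same):

  #include <stdbool.h>

  struct rwlock_state             /* hypothetical, for illustration only */
  {
    bool writer_holds_lock;       /* a writer currently owns the lock */
    int  blocked_writers;         /* writers blocked waiting for the lock */
  };

  /* May this pthread_rwlock_rdlock call acquire the lock right now?  */
  static bool
  reader_may_acquire (const struct rwlock_state *st, bool already_holds_read_lock)
  {
    if (st->writer_holds_lock)
      return false;
    if (!already_holds_read_lock && st->blocked_writers > 0)
      return false;
    return true;
  }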

So, IMO, that would be an improvement that musl libc could make, one that
would fix the writer starvation even with the default scheduling policy.

Bruno

[1] https://en.wikipedia.org/wiki/Readers%E2%80%93writers_problem
[2] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/pthread_rwlock_rdlock.html



