Paul Eggert wrote:
> > 1. Install a toolchain for creating binaries linked against musl-libc:
> >      $ sudo apt install musl-gcc
> > 2. Create a testdir for the module 'pthread-rwlock'.
> > 3. Build it with CC="gcc". You should still see the test failure after
> >    10 minutes.
> > 4. Build it with CC="musl-gcc". Run
> >      $ time gltests/test-pthread-rwlock
> >
> > If 4. takes more than 30 seconds, your OS is probably running on top of
> > a hypervisor, and that is causing the problem.
> >
> > If 4. takes just a few seconds, it looks like a glibc bug.
>
> 4. takes just a few seconds on my platform.
Thanks. So, it looks like a glibc bug. I have investigated this from three
different angles.

1) In maint-tools/ I have added a directory test-programs/pthread-rwlock/
   with a test program that analyzes how an implementation of POSIX rwlocks
   handles the wait queue. This is based on the same idea as
   m4/pthread_rwlock_rdlock.m4, but where that extracts only a single bit
   of information about one particular situation, the new test program
   exercises many situations, analyzes them, and provides a textual summary.

   Find here attached the results of this analysis.

   glibc ≥ 2.25 with the default pthread_rwlock_t initializer is particular:
   It is the *only* implementation for which the result contains the
   statement "This implementation always prefers readers."

2) I have redone the timings on most platforms, each in a VirtualBox VM
   with paravirtualization set to "minimal" (so as to avoid the bug
   mentioned in
   <https://lists.gnu.org/archive/html/bug-gnulib/2024-06/msg00318.html>).

   Find here attached the timings.

   While there is sometimes a big difference between the columns "1 CPU"
   and "2 CPUs", it can be attributed to single-thread optimizations.
   Therefore, what we need to look at are the columns "2 CPUs", "4 CPUs",
   "8 CPUs". While some platforms (Cygwin, Solaris 11, gnulib on mingw)
   show a moderate increase of the times in this progression, the result is:

   glibc ≥ 2.25 with the default pthread_rwlock_t initializer is particular:
   It is the *only* implementation for which the times in this progression
   grow by a factor of more than 10.

   So far, there is a correlation between the analysis results and the
   timings. Is this correlation a causal relation?

3) To investigate this question, I modified the gnulib implementation on
   mingw to prefer readers.
   Namely, in lib/windows-rwlock.c and lib/windows-timedrwlock.c, in the
   function glwthread_[timed]rwlock_unlock, I changed the line

     if (lock->waiting_writers.count > 0)

   to

     if (lock->waiting_writers.count > 0 && lock->waiting_readers.count == 0)

   As expected, the analysis results are:
   "This implementation always prefers readers."

   And the timings are:

     gnulib on mingw, modified to prefer readers
       1 CPU    2 CPUs   4 CPUs    8 CPUs
       0.787    2.962    385.999   > 7200

   That is, a steep increase in running time from 2 CPUs to 8 CPUs.

This provides enough justification for gnulib to override the glibc
behaviour here. I will therefore go ahead and override
  PTHREAD_RWLOCK_INITIALIZER
  pthread_rwlock_init
  pthread_rwlockattr_init
to use PREFER_WRITER or PREFER_WRITER_NONRECURSIVE, which both behave in a
sane way even in glibc ≥ 2.25.

We have known about the problem since 2018
<https://lists.gnu.org/archive/html/bug-gnulib/2018-06/msg00026.html>,
but had not realized the impact that it has on systems with 8 or more CPUs.

Thanks for your report, Paul!

Bruno
How do the implementations work?

glibc/Linux <2.25 default, glibc/Linux PREFER_WRITER, AIX, Android
  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for reading, the first
    one of them is granted. Otherwise, the first of the waiting write
    attempts is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, the first
    one of them is granted. Otherwise, one of the waiting read attempts is
    granted.
  This implementation does not globally prefer readers, only when releasing
  a reader lock.
  This implementation does not globally prefer writers, only when releasing
  a writer lock.

glibc/Linux PREFER_WRITER_NONRECURSIVE, FreeBSD, Cygwin
  When releasing the last reader lock:
    The first of the enqueued lock attempts is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, the first
    one of them is granted. Otherwise, one of the waiting read attempts is
    granted.
  This implementation does not prefer readers.
  This implementation does not globally prefer writers, only when releasing
  a writer lock.

glibc/Linux ≥2.25 default
  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for reading, the first
    one of them is granted. Otherwise, the first of the waiting write
    attempts is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for reading, one of
    them is granted. Otherwise, the first of the waiting write attempts is
    granted.
  This implementation always prefers readers.
  This implementation does not prefer writers.

glibc/Hurd
  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for reading, the first
    one of them is granted. Otherwise, the latest (LIFO!) waiting write
    attempt is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, the latest
    (LIFO!) one of them is granted. Otherwise, the latest (LIFO!) of the
    waiting read attempts is granted.
  This implementation does not globally prefer readers, only when releasing
  a reader lock.
  This implementation does not globally prefer writers, only when releasing
  a writer lock.
  This implementation is deterministic.

musl, OpenBSD
  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for reading, the first
    one of them is granted. Otherwise, the first of the waiting write
    attempts is granted.
  When releasing a writer lock:
    The first of the enqueued lock attempts is granted.
  This implementation does not globally prefer readers, only when releasing
  a reader lock.
  This implementation does not prefer writers.
  This implementation is deterministic.

macOS
  When releasing the last reader lock:
    The first of the enqueued lock attempts is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, the first
    of them is granted. Otherwise, one of the waiting read attempts is
    granted.
  This implementation does not prefer readers.
  This implementation does not prefer writers.

NetBSD
  When releasing the last reader lock:
    The first of the enqueued lock attempts is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, the first
    one of them is granted. Otherwise, the latest (LIFO!) of the waiting
    read attempts is granted.
  This implementation does not prefer readers.
  This implementation does not globally prefer writers, only when releasing
  a writer lock.
  This implementation is deterministic.

Solaris 10
  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for writing, the latest
    (LIFO!) one of them is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, one of the
    waiting write attempts is granted (not necessarily the first one).
    Otherwise, one of the waiting read attempts is granted.
  This implementation does not prefer readers.
  This implementation always prefers writers.

Solaris 11
  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for writing, one of
    them is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, one of the
    waiting write attempts is granted (not necessarily the first one).
    Otherwise, one of the waiting read attempts is granted.
  This implementation does not prefer readers.
  This implementation always prefers writers.

gnulib on native Windows
  When releasing the last reader lock:
    The first of the enqueued lock attempts is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for writing, the first
    one of them is granted. Otherwise, the first of the waiting read
    attempts is granted.
  This implementation does not prefer readers.
  This implementation does not globally prefer writers, only when releasing
  a writer lock.
  This implementation is deterministic.

gnulib on native Windows, modified to prefer readers
  When releasing the last reader lock:
    If at least one of the enqueued lock attempts is for reading, the first
    one of them is granted. Otherwise, the first of the waiting write
    attempts is granted.
  When releasing a writer lock:
    If at least one of the enqueued lock attempts is for reading, the first
    of them is granted. Otherwise, the first of the waiting write attempts
    is granted.
  This implementation always prefers readers.
  This implementation does not prefer writers.
  This implementation is deterministic.
Timings of test-pthread-rwlock, created from
  ./gnulib-tool --create-testdir --dir=../testdir1 --single-configure pthread-rwlock
in a VirtualBox VM (minimal paravirtualization) with 1, 2, 4, 8 CPUs.
Each value is the average of the real time of 3 runs of
"time ./test-pthread-rwlock", in seconds.

                                               1 CPU   2 CPUs  4 CPUs   8 CPUs
glibc/Linux <2.25 default (Debian 9.6)         0.712   0.884   0.589    0.970
glibc/Linux PREFER_WRITER (Ubuntu 22.04)       0.983   2.009   1.238    1.683
glibc/Linux PREFER_WRITER_NONRECURSIVE
                           (Ubuntu 22.04)      1.022   2.289   1.141    0.868
FreeBSD (FreeBSD 14.0)                         1.62    1.40    1.14     1.33
Cygwin (Cygwin 2.9.0)                          1.094   18.683  36.945   43.999
glibc/Linux ≥2.25 default (Ubuntu 22.04)       1.032   1.998   3.515    67.596
glibc/Hurd                                     0.433   0.520   0.713    0.630
musl (Alpine Linux 3.20)                       1.633   2.413   1.390    0.843
OpenBSD (OpenBSD 7.5)                          0.833   2.837   2.757    2.853
NetBSD (NetBSD 10.0)                           6.600   7.297   6.993    7.967
Solaris 11 (Solaris 11.4)                      1.997   19.697  22.924   28.039
gnulib on mingw                                0.763   9.769   15.542   15.954
gnulib on mingw, modified to prefer readers    0.787   2.962   385.999  > 7200