Pádraig Brady wrote:
> It failed again the same way with the latest gnulib.
> Note the test is run with `make -j24 check` from crontab.

Yeah, this test is known to fail under load. There is no way to make it
100% reliable under high load. Even if I were to choose a STEP_INTERVAL
of 1 second, it would probably fail under a load of 500 or so.

> I did however notice different output depending on
> whether the test was run in the foreground or background

This might hint at effects of how the scheduler works.

> If I run the test in the foreground with
>   ./gnulib-tests/test-pthread-rwlock-waitqueue
> I don't see any of the "... => ..." lines output.

These "... => ..." lines are debugging aids, printed in case some result is
unexpected. But this happens only when the load is high. For example:

  WRR => W1 R2 R3
  WRRR => W1 R2 R3 R4
  WRRRR => R2 W1 R4 R5 R3

In the last line, R2 precedes W1. This means that, after one thread enqueues
a request for locking as a writer (while no reader is present), and then,
after STEP_INTERVAL, another thread enqueues a request for locking as a
reader, the second thread gets the lock. This is only possible if the kernel
has not reacted to the first request within STEP_INTERVAL.

I'm committing this workaround, which has the effect of excluding the test
from coreutils (unless you are using --with-longrunning-tests).


2024-08-14  Bruno Haible  <br...@clisp.org>

	pthread-rwlock-extra-tests: Exclude this test from packages by default.
	* tests/test-pthread-rwlock-waitqueue.c (STEP_INTERVAL): Add comment.
	* modules/pthread-rwlock-extra-tests (Status): Mark as longrunning-test.

diff --git a/modules/pthread-rwlock-extra-tests b/modules/pthread-rwlock-extra-tests
index c14e3ed8ac..2d93488dfd 100644
--- a/modules/pthread-rwlock-extra-tests
+++ b/modules/pthread-rwlock-extra-tests
@@ -1,3 +1,6 @@
+Status:
+longrunning-test
+
 Files:
 tests/test-pthread-rwlock-waitqueue.c
 tests/macros.h
diff --git a/tests/test-pthread-rwlock-waitqueue.c b/tests/test-pthread-rwlock-waitqueue.c
index ad190b5491..6b800ea5cc 100644
--- a/tests/test-pthread-rwlock-waitqueue.c
+++ b/tests/test-pthread-rwlock-waitqueue.c
@@ -48,7 +48,23 @@ flavours of read-write locks.
  */
 
 /* Some platforms need a longer STEP_INTERVAL, otherwise some of the assertions
-   RRR, RRRR, RRRRR fail. */
+   RRR, RRRR, RRRRR fail.
+   Note: The probability of failing these assertions is higher when the machine
+   is under high load. It can be worked around by increasing the STEP_INTERVAL.
+   However, increasing the STEP_INTERVAL means to increase the total duration
+   of this test:
+     STEP_INTERVAL   Duration (on glibc/Linux)
+         10 ms          29 sec
+         20 ms          57 sec
+         50 ms         2.4 min
+        100 ms         4.8 min
+        200 ms         9.6 min
+   There is no way to have this test be reasonably fast and 100% reliable at the
+   same time. Therefore the compromise we have chosen is
+     - to pick STEP_INTERVAL so that the test succeeds on developer machines
+       with little load and on continuous integration machines,
+     - to exclude the test from packaging, unless the gnulib-tool option
+       '--with-longrunning-tests' is specified. */
 #if (defined __APPLE__ && defined __MACH__) /* macOS */
 # define STEP_INTERVAL 200000000 /* nanoseconds */