Re: test-pthread-rwlock failure on Pop!_OS 22.04 LTS

Bruno Haible Wed, 14 Aug 2024 07:54:03 -0700

Pádraig Brady wrote:
> It failed again the same way with the latest gnulib.
> Note the test is run with `make -j24 check` from crontab.


Yeah, this test is known to fail under load. There is no way to make it
100% reliable under high load. Even if I were to choose a STEP_INTERVAL of
1 second, it would probably fail under a load of 500 or so.
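
To put the 1-second figure in perspective, here is a rough sketch of how the
test duration scales with STEP_INTERVAL. It assumes the duration is simply
proportional to STEP_INTERVAL; the step count of about 2880 is inferred from
the duration table in the patch below (for example 288 sec / 100 ms), not
taken from the test source, and the helper name is made up for illustration.

```c
#include <assert.h>

/* Assumption: the test performs a roughly constant number of steps, each
   lasting one STEP_INTERVAL.  The value 2880 is inferred from the duration
   table in the patch (e.g. 288 sec / 100 ms); it is an estimate only.  */
#define APPROX_STEPS 2880L

/* Hypothetical helper: estimated duration of the test, in seconds, for a
   given STEP_INTERVAL in nanoseconds.  */
static double
estimated_duration_sec (long step_interval_ns)
{
  assert (step_interval_ns > 0);
  return (double) APPROX_STEPS * (double) step_interval_ns / 1e9;
}
```

Under this assumption, a STEP_INTERVAL of 1 second (1000000000 ns) would give
an estimate of about 2880 seconds, that is, roughly 48 minutes per run.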

> I did however notice different output depending on
> whether the test was run in the foreground or background

This might hint at effects of how the scheduler works.

> If I run the test in the foreground with 
> ./gnulib-tests/test-pthread-rwlock-waitqueue
> I don't see any of the "... => ..." lines output.

These "... => ..." lines are debugging aids, printed in case some result
is unexpected. But this happens only when the load is high. For example:

    WRR => W1 R2 R3
    WRRR => W1 R2 R3 R4
    WRRRR => R2 W1 R4 R5 R3

The last line means that, after one thread enqueues a request for locking
as a writer (while no reader is present) and then, after STEP_INTERVAL,
another thread enqueues a request for locking as a reader, the second
thread gets the lock first. This is only possible if the kernel has not
reacted to the first request within STEP_INTERVAL.
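
A minimal sketch of this scenario, for illustration only: it is not the code
of test-pthread-rwlock-waitqueue.c, and the names run_once, writer_thread and
reader_thread are made up here. One thread requests the lock as a writer; one
STEP_INTERVAL later, a second thread requests it as a reader; each thread
records the position at which it got the lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int next_position;

/* Each thread stores the position (0 = first) at which it got the lock.  */
static void *
writer_thread (void *arg)
{
  int *position = arg;
  if (pthread_rwlock_wrlock (&lock) == 0)
    {
      *position = atomic_fetch_add (&next_position, 1);
      pthread_rwlock_unlock (&lock);
    }
  return NULL;
}

static void *
reader_thread (void *arg)
{
  int *position = arg;
  if (pthread_rwlock_rdlock (&lock) == 0)
    {
      *position = atomic_fetch_add (&next_position, 1);
      pthread_rwlock_unlock (&lock);
    }
  return NULL;
}

/* Hypothetical helper: starts W1, waits one step, starts R2.
   On success returns 0, with position[0] set for the writer and
   position[1] set for the reader.  */
static int
run_once (long step_interval_ns, int position[2])
{
  pthread_t writer, reader;
  struct timespec step;

  assert (position != NULL);
  step.tv_sec = step_interval_ns / 1000000000L;
  step.tv_nsec = step_interval_ns % 1000000000L;
  atomic_store (&next_position, 0);
  position[0] = position[1] = -1;

  if (pthread_create (&writer, NULL, writer_thread, &position[0]) != 0)
    return -1;
  nanosleep (&step, NULL);      /* one STEP_INTERVAL between the requests */
  if (pthread_create (&reader, NULL, reader_thread, &position[1]) != 0)
    {
      pthread_join (writer, NULL);
      return -1;
    }
  pthread_join (writer, NULL);
  pthread_join (reader, NULL);
  return 0;
}
```

On an idle machine one would normally expect position[0] == 0 (the "W1 R2"
order). A result of position[0] == 1 corresponds to the "R2 W1" order above:
the writer thread did not get to run within the step interval.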

I'm committing this workaround, which has the effect of excluding the
test from coreutils (unless you are using --with-longrunning-tests).


2024-08-14  Bruno Haible  <br...@clisp.org>

        pthread-rwlock-extra-tests: Exclude this test from packages by default.
        * tests/test-pthread-rwlock-waitqueue.c (STEP_INTERVAL): Add comment.
        * modules/pthread-rwlock-extra-tests (Status): Mark as longrunning-test.

diff --git a/modules/pthread-rwlock-extra-tests b/modules/pthread-rwlock-extra-tests
index c14e3ed8ac..2d93488dfd 100644
--- a/modules/pthread-rwlock-extra-tests
+++ b/modules/pthread-rwlock-extra-tests
@@ -1,3 +1,6 @@
+Status:
+longrunning-test
+
 Files:
 tests/test-pthread-rwlock-waitqueue.c
 tests/macros.h
diff --git a/tests/test-pthread-rwlock-waitqueue.c b/tests/test-pthread-rwlock-waitqueue.c
index ad190b5491..6b800ea5cc 100644
--- a/tests/test-pthread-rwlock-waitqueue.c
+++ b/tests/test-pthread-rwlock-waitqueue.c
@@ -48,7 +48,23 @@
    flavours of read-write locks.  */
 
 /* Some platforms need a longer STEP_INTERVAL, otherwise some of the assertions
-   RRR, RRRR, RRRRR fail.  */
+   RRR, RRRR, RRRRR fail.
+   Note: The probability of failing these assertions is higher when the machine
+   is under high load.  It can be worked around by increasing the STEP_INTERVAL.
+   However, increasing the STEP_INTERVAL means to increase the total duration
+   of this test:
+     STEP_INTERVAL  Duration (on glibc/Linux)
+         10 ms           29 sec
+         20 ms           57 sec
+         50 ms          2.4 min
+        100 ms          4.8 min
+        200 ms          9.6 min
+   There is no way to have this test be reasonably fast and 100% reliable at the
+   same time.  Therefore the compromise we have chosen is
+     - to pick STEP_INTERVAL so that the test succeeds on developer machines
+       with little load and on continuous integration machines,
+     - to exclude the test from packaging, unless the gnulib-tool option
+       '--with-longrunning-tests' is specified.  */
 #if (defined __APPLE__ && defined __MACH__)
 /* macOS */
 # define STEP_INTERVAL 200000000 /* nanoseconds */
