[Adding Rich Felker to CC]

On Tue, 13 May 2025 18:05:50 +0200 Bruno Haible <br...@clisp.org> wrote:
> Natanael Copa wrote:
> > > So, you could try to install a different scheduler by default and
> > > repeat the test.
> >
> > It passed with chrt --fifo (I had to do it from outside the LXC
> > container):
> >
> > # time chrt --fifo 10 ./test-pthread-rwlock
> > Starting test_rwlock ... OK
> > real    0m 33.00s
> > user    6m 50.63s
> > sys     0m 16.23s
> >
> > I also verified that it still times out from outside the LXC
> > container with the default:
> >
> > # time ./test-pthread-rwlock
> > Starting test_rwlock ...Command terminated by signal 14
> > real    10m 0.01s
> > user    1h 46m 24s
> > sys     2m 59.39s
> >
> > # time chrt --rr 10 ./test-pthread-rwlock
> > Starting test_rwlock ... OK
> > real    0m 30.00s
> > user    6m 2.07s
> > sys     0m 19.16s
> >
> > # time chrt --rr 99 ./test-pthread-rwlock
> > Starting test_rwlock ... OK
> > real    0m 30.00s
> > user    6m 9.40s
> > sys     0m 13.37s
> >
> > So even if the CPU cores are slow, they appear to finish in ~30 sec.
> >
> > chrt --other and chrt --idle appear to trigger the deadlock.
>
> For comparison, some other data (details below):
>
> * On x86_64 (glibc), I see essentially no influence of the scheduling
>   policy on 'time ./test-pthread-rwlock'.
>
> * On x86_64 (Alpine Linux), the test performs about 25% faster
>   under SCHED_FIFO and SCHED_RR.
>
> * On three other riscv64 systems, the test needs less than 4 seconds
>   real time. Even on my QEMU-emulated riscv64 VM, it needs less
>   than 4 seconds.
>
> So, it seems that
> 1) Your riscv64 system is generally slower than the cfarm* ones.
> 2) The performance penalty of SCHED_OTHER compared to SCHED_FIFO and
>    SCHED_RR exists also on x86_64, but not to such an extreme extent.
>
> AFAICS, there are three differences in your setup compared to what I
> see in stock Ubuntu:
> - Linux is of a PREEMPT_DYNAMIC flavour.
> - musl libc.
> - the LXC container.

Note that there are 64 CPU cores. I have only tested with that many
cores on aarch64.

I don't think the LXC container should matter, nor should apps deadlock
when running on PREEMPT_DYNAMIC. I'm not sure what the difference is in
code paths compared to GNU libc.

I also don't get a timeout on a HiFive Premier P550 system:

alpine-p550:~/aports/main/gettext/src/gettext-0.24.1/gettext-tools/gnulib-tests$ time ./test-pthread-rwlock
Starting test_rwlock ... OK
real    0m 1.02s
user    0m 1.24s
sys     0m 2.80s

This has only 4 CPU cores though. Each core is faster than the cores on
the Sophgo system, but I don't think it is more than 2x faster per core.

$ uname -a
Linux alpine-p550 6.6.67-0-p550 #1-Alpine SMP PREEMPT_DYNAMIC 2024-12-28 06:23:29 riscv64 GNU/Linux

alpine-p550:~/aports/main/gettext/src/gettext-0.24.1/gettext-tools/gnulib-tests$ cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.22.0_alpha20250108
PRETTY_NAME="Alpine Linux edge"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"

And on a Banana Pi F3 (with 8 cores):

alpine-bpi-f3:/var/home/ncopa/aports/main/gettext/src/gettext-0.24.1/gettext-tools/gnulib-tests$ lscpu
Architecture:            riscv64
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               0x710
  Model name:            Spacemit(R) X60
    CPU family:          0x8000000058000001
    Model:               0x1000000049772200
    Thread(s) per core:  1
    Core(s) per socket:  8
    Socket(s):           1
    CPU(s) scaling MHz:  100%
    CPU max MHz:         1600.0000
    CPU min MHz:         614.4000
Caches (sum of all):
  L1d:                   256 KiB (8 instances)
  L1i:                   256 KiB (8 instances)
  L2:                    1 MiB (2 instances)

alpine-bpi-f3:/var/home/ncopa/aports/main/gettext/src/gettext-0.24.1/gettext-tools/gnulib-tests$ time ./test-pthread-rwlock
Starting test_rwlock ... OK
real    0m 5.36s
user    0m 17.25s
sys     0m 17.60s

I suspect the deadlock happens on systems with:
- musl libc
- more than 10 cores(?)
- slow CPU cores(?)

I'm not sure which code path it takes on GNU libc systems. Is it the
same as with musl libc?
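If this is plain writer starvation, I would expect a reduction like the
following to hang the same way under SCHED_OTHER. This is my own sketch,
not the code from gnulib-tests, and the thread count and busy-loop
length are arbitrary:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int done = 0;

static void *reader(void *arg)
{
    (void) arg;
    while (!atomic_load(&done)) {
        pthread_rwlock_rdlock(&lock);
        /* Hold the read lock long enough that the read sections of
           the many readers overlap. */
        for (volatile int i = 0; i < 10000; i++) {}
        pthread_rwlock_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[64];
    for (int i = 0; i < 64; i++)
        pthread_create(&t[i], NULL, reader, NULL);
    /* On a reader-preferring rwlock this may never return, as long as
       there is always at least one reader inside its critical section. */
    pthread_rwlock_wrlock(&lock);
    atomic_store(&done, 1);
    pthread_rwlock_unlock(&lock);
    for (int i = 0; i < 64; i++)
        pthread_join(t[i], NULL);
    puts("writer got the lock");
    return 0;
}

With 64 slow cores there is plausibly always at least one active reader,
which would match the symptoms: SCHED_FIFO/SCHED_RR change who gets
scheduled at the moment a reader unlocks, and fewer or faster cores
leave gaps the writer can slip into.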
> Note: Most Gnulib applications don't use pthread_rwlock directly, but
> the glthread_rwlock facility. On musl systems, it works around the
> possible writer starvation by reimplementing read-write locks based
> on condition variables. This may be slower for a single operation,
> but it is guaranteed to avoid writer starvation and therefore is
> preferable globally. This is why you don't see a timeout in
> './test-lock', only in './test-pthread-rwlock'.

Wait a second. The test does not exercise the gnulib locking? It just
tests the system (musl libc) pthread rwlock, while the app (gettext)
would use the gnulib implementation? I thought the test verified that
production code (gettext in this case) works as intended. Does this
test expose a deadlock that could happen in gettext in production? I'm
confused.
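Just so I understand the workaround: I guess the condition-variable
reimplementation is something along these lines? (My own sketch based
on your description, not the actual glthread_rwlock code.)

#include <pthread.h>

typedef struct {
    /* Initialize with PTHREAD_MUTEX_INITIALIZER /
       PTHREAD_COND_INITIALIZER and zeroed counters. */
    pthread_mutex_t mutex;
    pthread_cond_t  readers_gone;   /* signalled when the last reader leaves */
    pthread_cond_t  writer_gone;    /* broadcast when a writer leaves */
    int readers;                    /* threads holding a read lock */
    int writer;                     /* 1 while a thread holds the write lock */
    int writers_waiting;            /* writers blocked in wrlock */
} cv_rwlock;

static void cv_rdlock(cv_rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    /* New readers queue behind waiting writers; this is what rules
       out writer starvation. */
    while (l->writer || l->writers_waiting > 0)
        pthread_cond_wait(&l->writer_gone, &l->mutex);
    l->readers++;
    pthread_mutex_unlock(&l->mutex);
}

static void cv_rdunlock(cv_rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    if (--l->readers == 0)
        pthread_cond_signal(&l->readers_gone);
    pthread_mutex_unlock(&l->mutex);
}

static void cv_wrlock(cv_rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    l->writers_waiting++;
    while (l->writer || l->readers > 0)
        pthread_cond_wait(&l->readers_gone, &l->mutex);
    l->writers_waiting--;
    l->writer = 1;
    pthread_mutex_unlock(&l->mutex);
}

static void cv_wrunlock(cv_rwlock *l)
{
    pthread_mutex_lock(&l->mutex);
    l->writer = 0;
    /* Wake a queued writer if there is one; otherwise the blocked
       readers proceed. */
    pthread_cond_signal(&l->readers_gone);
    pthread_cond_broadcast(&l->writer_gone);
    pthread_mutex_unlock(&l->mutex);
}

If that is the idea, then every operation pays for a mutex and possibly
a condvar wakeup, but a writer can always make progress once it has
announced itself.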
-nc

> Bruno
>
> ========================= DETAILS =======================
>
> Ubuntu x86_64 (glibc):
>
> # time ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m0,161s
> user    0m0,268s
> sys     0m1,047s
>
> # time chrt --other 0 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m0,166s
> user    0m0,243s
> sys     0m1,046s
>
> # time chrt --fifo 10 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m0,161s
> user    0m0,249s
> sys     0m1,080s
>
> # time chrt --rr 10 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m0,164s
> user    0m0,307s
> sys     0m1,019s
>
> # time chrt --batch 0 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m0,151s
> user    0m0,217s
> sys     0m1,024s
>
> # time chrt --idle 0 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m0,195s
> user    0m0,264s
> sys     0m1,115s
>
> # time chrt --deadline 0 ./test-pthread-rwlock
> chrt: failed to set pid 0's policy: Invalid argument
>
>
> Alpine Linux 3.20 x86_64 in a VM:
>
> # time ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    1.54 s
>
> # time chrt --other 0 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    1.56 s
>
> # time chrt --fifo 10 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    1.24 s
>
> # time chrt --rr 10 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    1.25 s
>
> # time chrt --batch 0 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    1.59 s
>
> # time chrt --idle 0 ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    1.59 s
>
> # time chrt --deadline 0 ./test-pthread-rwlock
> chrt: failed to set pid 0's policy: Invalid argument
>
>
> cfarm91 (glibc):
>
> $ time ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m3,908s
> user    0m0,909s
> sys     0m6,863s
>
>
> cfarm94 (Alpine Linux):
>
> $ time ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m 0.84s
> user    0m 0.60s
> sys     0m 2.77s
>
>
> cfarm95 (glibc):
>
> $ time ./test-pthread-rwlock
> Starting test_rwlock ... OK
>
> real    0m2,166s
> user    0m9,287s
> sys     0m3,145s
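PS: I assume the reason I had to run chrt from outside the LXC container
is that switching to SCHED_FIFO/SCHED_RR needs CAP_SYS_NICE (or a
suitable RLIMIT_RTPRIO), which the container presumably doesn't have.
For reference, the in-process equivalent of 'chrt --fifo 10' would be
roughly this (a sketch; the test does not actually do this):

#include <sched.h>
#include <stdio.h>

int main(void)
{
    struct sched_param sp = { .sched_priority = 10 };
    /* pid 0 = the calling process; requires CAP_SYS_NICE. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");
    /* ... run the rwlock test workload here ... */
    return 0;
}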