https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99572
Bug ID: 99572
Summary: std::counting_semaphore coalescing wakes
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: michaelkuc6 at gmail dot com
Target Milestone: ---

Created attachment 50379
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50379&action=edit
Non-waking semaphores source code

When repeatedly calling `std::counting_semaphore::release()`, only a single thread is woken, unless the releasing thread is made to sleep between calls.

A minimal example of the error is at https://github.com/Crystalix007/GCC-Semaphores, or as the attached file (compiled with `g++ --std=c++20 -o Sem -lpthread source.cpp`).

Inspecting with GDB myself, the semaphore release function calls `std::__atomic_semaphore<int>::_M_release()`, implemented at https://github.com/gcc-mirror/gcc/blob/5987d8a79cda1069c774e5c302d5597310270026/libstdc%2B%2B-v3/include/bits/semaphore_base.h#L270. Notably, the semaphore release only wakes a single thread, which would be OK if a different thread were woken each time.

Eventually the call chain ends up performing a futex syscall at https://github.com/gcc-mirror/gcc/blob/5987d8a79cda1069c774e5c302d5597310270026/libstdc%2B%2B-v3/include/bits/atomic_wait.h#L112. As far as I can tell, this doesn't correctly handle repeated wake-ups.

Testing against the POSIX semaphore implementation, GCC's implementation seems to wake up only a single thread (i.e. coalescing all wakeup events into one).

By sleeping the releasing thread between semaphore releases, the program can actually finish. If the delay is too short, sometimes only some of the threads are woken, but with a sufficient delay, all threads are woken correctly.

Unless I'm misunderstanding, is it possible that the woken thread is not marked as woken, or the futex wake just doesn't care that the woken thread is already awake?

GCC Version (compiled from commit a18ebd6c439):

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/11.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/michaelkuc6/.cache/yay/gcc-git/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --with-isl --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-install-libiberty --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-libunwind-exceptions --disable-werror gdc_include_dir=/usr/include/dlang/gdc
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.0.1 20210307 (experimental) (GCC)

System Type: Linux (x86_64) (specifically Arch Linux kernel 5.11.2)
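
For reference, the POSIX comparison mentioned above can be sketched roughly as follows. This is not the attached source, only a minimal illustration of the expected behaviour: the helper name `count_woken_posix` and its parameter are made up for this sketch. It starts several waiters on one `sem_t`, posts the semaphore the same number of times with no delay in between, and reports how many waiters got through.

```cpp
#include <semaphore.h>  // POSIX: sem_t, sem_init, sem_wait, sem_post, sem_destroy

#include <atomic>
#include <cassert>
#include <cerrno>
#include <thread>
#include <vector>

// Hypothetical helper (not taken from the attachment): returns the number
// of waiting threads that were released, or -1 if the semaphore could not
// be created.
int count_woken_posix(int n_waiters) {
    assert(n_waiters > 0);

    sem_t sem;
    if (sem_init(&sem, /*pshared=*/0, /*value=*/0) != 0) {
        return -1;
    }

    std::atomic<int> woken{0};
    std::vector<std::thread> waiters;
    waiters.reserve(n_waiters);
    for (int i = 0; i < n_waiters; ++i) {
        waiters.emplace_back([&sem, &woken] {
            // Retry only when the wait was interrupted by a signal.
            while (sem_wait(&sem) != 0) {
                if (errno != EINTR) {
                    return;
                }
            }
            woken.fetch_add(1);
        });
    }

    // Back-to-back posts, no sleep in between: each post should let
    // exactly one waiter through.
    for (int i = 0; i < n_waiters; ++i) {
        sem_post(&sem);
    }

    // join() returns only after every waiter has left its lambda.
    for (std::thread& t : waiters) {
        t.join();
    }
    sem_destroy(&sem);
    return woken.load();
}
```

With POSIX semaphores, `count_woken_posix(8)` returns 8. Swapping `sem_t` for `std::counting_semaphore<>` (with `acquire()` and `release()` in place of `sem_wait` and `sem_post`) gives the shape of the failing case described in this report, where the joins would not be expected to complete on the affected build.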