https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104928
--- Comment #19 from GCC Commits <cvs-commit at gcc dot gnu.org> --- The master branch has been updated by Jonathan Wakely <r...@gcc.gnu.org>: https://gcc.gnu.org/g:7be4913b8f8b1e1474751656d45b301aa0c20790 commit r16-1102-g7be4913b8f8b1e1474751656d45b301aa0c20790 Author: Jonathan Wakely <jwak...@redhat.com> Date: Mon Jun 2 13:06:27 2025 +0100 libstdc++: Fix std::counting_semaphore::acquire deadlock [PR104928] There's a deadlock in std::counting_semaphore that occurs when the semaphore is under contention. The bug happens when one thread tries to acquire the mutex, calling __semaphore_base::_S_do_try_acquire to atomically decrement the counter using compare_exchange_strong. If the counter is non-zero (and so should be possible to decrement) but another thread changes it (either incrementing or decrementing it) then the compare_exchange fails and _S_do_try_acquire returns false. Because that function is used by the predicate passed to __atomic_wait_address, when it returns false the thread does a futex wait until the value changes. However, when the predicate is false because the compare_exchange failed due to not matching the expected value, waiting for the value to change is incorrect. The correct behaviour would be to retry the compare_exchange using the new value (as long as it's still non-zero). Waiting for the value to change again means we can block forever, because it might never change again. The predicate should only test the value, not also attempt to alter it, and its return value should mean only one thing, not conflate a busy semaphore that cannot be acquired with a contended one that can be acquired by retrying. The correct behaviour of __semaphore_base::_M_acquire would be to attempt the decrement, and to retry immediately if it failed due to contention on the variable (i.e. due to the variable not having the expected value). It should only wait for the value to change when the value is zero, because that's the only time we can't decrement it. This commit moves the _S_do_try_acquire call out of the predicate and loops while it is false, only doing an atomic wait when the counter's value is zero. The predicate used for the atomic wait now only checks whether the value is decrementable (non-zero), without also trying to perform that decrement. In order for the caller to tell whether it should retry a failed _S_do_try_acquire or should wait for the value to be non-zero, the value obtained by a failed compare_exchange needs to be passed back to the caller. _S_do_try_acquire is changed to take its parameter by reference, so that the caller gets the new value and can check whether it's zero. In order to avoid doing another atomic load after returning from an atomic wait, the predicate is also changed to capture the local __val by reference, and then assign to __val when it sees a non-zero value. That makes the new value available to _M_acquire, so it can be passed to _S_do_try_acquire as the expected value of the compare_exchange. Although this means that the predicate is modifying data again, not just checking a value, this modification is safe. It's not changing the semaphore's counter, only changing a local variable in the caller to avoid a redundant atomic load. Equivalent changes are made to _M_try_acquire_until and _M_try_acquire_for. They have the same bug, although they can escape the deadlock if the wait is interrupted by timing out. For _M_acquire there's no time out so it potentially waits forever. _M_try_acquire also has the same bug, but can be simplified to just calling _M_try_acquire_for(0ns). A timeout of zero results in calling __wait_impl with the __spin_only flag set, so that the value is loaded and checked in a spin loop but there is no futex wait. This means that _M_try_acquire can still succeed under light contention if the counter is being changed concurrently, at the cost of a little extra overhead. It would be possible to implement _M_try_acquire as nothing more than an atomic load and a compare_exchange, but it would fail when there is any contention. libstdc++-v3/ChangeLog: PR libstdc++/104928 * include/bits/semaphore_base.h (_S_do_try_acquire): Take old value by reference. (_M_acquire): Move _S_do_try_acquire call out of the predicate and loop on its result. Make the predicate capture and update the local copy of the value. (_M_try_acquire_until, _M_try_acquire_for): Likewise. (_M_try_acquire): Just call _M_try_acquire_for. * testsuite/30_threads/semaphore/104928-2.cc: New test. * testsuite/30_threads/semaphore/104928.cc: New test. Reviewed-by: Tomasz KamiÅski <tkami...@redhat.com>