On Mon, 11 Dec 2023, Nate Eldredge wrote:

To fix, we need something like `__args._M_old = __val;` inside the loop in __atomic_wait_address(), so that we always wait on the exact value that the predicate __pred() rejected. Again, there are similar instances in atomic_timed_wait.h.

Thinking through this, there's another problem. The main loop in __atomic_wait_address() would then be:

      while (!__pred(__val))
        {
          __args._M_old = __val;
          __detail::__wait_impl(__wait_addr, &__args);
          __val = __vfn();
        }

The idea being that we only call __wait_impl to wait on a value that the predicate said was unacceptable. But looking for instance at its caller __atomic_semaphore::_M_acquire() in bits/semaphore_base.h, the predicate passed in is _S_do_try_acquire(), whose code is:

    _S_do_try_acquire(__detail::__platform_wait_t* __counter,
                      __detail::__platform_wait_t __old) noexcept
    {
      if (__old == 0)
        return false;

      return __atomic_impl::compare_exchange_strong(__counter,
                                                    __old, __old - 1,
                                                    memory_order::acquire,
                                                    memory_order::relaxed);
    }

It returns false if the value passed in was unacceptable (i.e. zero), *or* if it was nonzero (let's say 1) but the compare_exchange failed because another thread swooped in and modified the semaphore counter. In that latter case, __atomic_wait_address() would pass 1 to __wait_impl(), which is likewise bad. If the counter is externally changed back to 1 just before we call __platform_wait (that's the futex call), we would go to sleep waiting on a semaphore that is already available: deadlock.

I guess there's a couple ways to fix it.

You could have the "predicate" callback instead return a tri-state value: "all done, stop waiting" (like current true), "value passed is not acceptable" (like current false), and "value was acceptable but something else went wrong". Only the second case should result in calling __wait_impl(). In the third case, __atomic_wait_address() should just reload the value (using __vfn()) and loop again.

Or, make the callback __pred() a pure predicate that only tests its input value for acceptability, without any other side effects. Then have __atomic_wait_address() simply return as soon as __pred(__val) returns true. It would be up to the caller to actually decrement the semaphore or whatever, and to call __atomic_wait_address() again if this fails. In that case, __atomic_wait_address should probably return the final value that was read, so the caller can immediately do something like a compare-exchange using it, and not have to do an additional load and predicate test.

Or, make __pred() a pure predicate as before, and give __atomic_wait_address yet one more callback function argument, call it __taker(), whose job is to acquire the semaphore etc, and have __atomic_wait_address call it after __pred(__val) returns true.

--
Nate Eldredge
n...@thatsmathematics.com

Reply via email to