On Fri, 10 Jan 2025 at 21:29, Jonathan Wakely wrote:
>
> This represents a major refactoring of the previous atomic::wait
> and atomic::notify implementation detail. The aim of this change
> is to simplify the implementation details and position the resulting
> implementation so that much of the current header-only detail
> can be moved into the shared library, while also accounting for
> anticipated changes to wait/notify functionality for C++26.
>
> The previous implementation implemented spin logic in terms of
> the types __default_spin_policy, __timed_backoff_spin_policy, and
> the free function __atomic_spin. These are replaced by two new
> free functions: __spin_impl and __spin_until_impl. These currently
> inline free functions are expected to be moved into the
> libstdc++ shared library in a future commit.
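For readers skimming the patch, the bounded spin that __spin_impl performs can be sketched roughly as follows; the names and constants are illustrative stand-ins, not the actual libstdc++ internals:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// Illustrative two-phase spin: briefly poll the address, relaxing at
// first and yielding later, and report whether the value moved away
// from 'old' before the spin budget ran out.
std::pair<bool, int> spin_sketch(const std::atomic<int>* addr, int old)
{
  constexpr int spin_count = 16;   // total polls before giving up
  constexpr int relax_count = 12;  // polls before we start yielding
  for (int i = 0; i < spin_count; ++i)
    {
      int val = addr->load(std::memory_order_acquire);
      if (val != old)
        return {true, val};        // wait satisfied during the spin
      if (i >= relax_count)
        std::this_thread::yield(); // a cheaper "pause" hint fits earlier
    }
  return {false, old};             // caller falls back to a real wait
}
```

On failure the pair still carries the last observed value, matching the pair-shaped __wait_result_type used later in the patch.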
>
> The previous implementation derived untimed and timed wait
> implementation detail from __detail::__waiter_pool_base. This
> is-a relationship is removed in the new version and the previous
> implementation detail is renamed to reflect this change. The
> static _S_for member has been renamed as well to indicate that it
> returns the __waiter_pool_impl entry in the static 'side table'
> for a given awaited address.
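The 'side table' mentioned here is a small fixed array of waiter records indexed by a hash of the awaited address; a rough sketch with hypothetical names, mirroring the 16-entry table behind _S_impl_for:

```cpp
#include <cassert>
#include <cstdint>

// A waiter record: contention counter plus a proxy version word for
// types that cannot be waited on directly. Purely illustrative.
struct waiter_state
{
  int      waiters = 0;
  unsigned version = 0;
};

// Map any awaited address to one of 16 static entries. Low bits are
// shifted off first since most awaited objects are 4-byte aligned.
waiter_state& state_for(const void* addr)
{
  constexpr std::uintptr_t count = 16;
  static waiter_state table[count];
  auto key = (reinterpret_cast<std::uintptr_t>(addr) >> 2) % count;
  return table[key];
}
```

Distinct addresses may collide on one entry; that is harmless because waiters re-check the awaited value after waking.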
>
> This new implementation replaces all of the non-templated waiting
> detail of __waiter_base, __waiter_pool, __waiter, __enters_wait, and
> __bare_wait with the __wait_impl free function, and the supporting
> __wait_flags enum and __wait_args struct. This currently inline free
> function is expected to be moved into the libstdc++ shared library
> in a future commit.
>
> This new implementation replaces all of the non-templated notifying
> detail of __waiter_base, __waiter_pool, and __waiter with the
> __notify_impl free function. This currently inline free function
> is expected to be moved into the libstdc++ shared library in a
> future commit.
>
> The __atomic_wait_address template function is updated to account
> for the above changes and to support the expected C++26 change to
> pass the most recent observed value to the caller supplied predicate.
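Sketched as a plain loop, the anticipated C++26 shape looks roughly like this: the predicate is handed the most recently observed value rather than re-reading the atomic itself (names here are hypothetical):

```cpp
#include <atomic>
#include <cassert>

// Wait until pred(value) holds, passing each freshly observed value to
// the predicate. A real implementation would block between reloads;
// this sketch just re-reads.
template<typename T, typename Pred, typename ValFn>
T wait_until_pred(Pred pred, ValFn vfn)
{
  T val = vfn();
  while (!pred(val))
    val = vfn();   // blocking wait elided; reload the observed value
  return val;      // the last observed value is available to the caller
}
```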
>
> A new non-templated __atomic_wait_address_v free function is added
> that works only for atomic types that operate directly on
> __platform_wait_t and requires the caller to supply a memory order.
> This is intended to be the simplest code path for such types.
>
> The __atomic_wait_address_v template function is now implemented in
> terms of the new __atomic_wait_address template and continues to accept
> a user supplied "value function" to retrieve the current value of
> the atomic.
>
> The __atomic_notify_address template function is updated to account
> for the above changes.
>
> The template __platform_wait_until_impl is renamed to
> __platform_wait_until. The previous __platform_wait_until template is
> deleted and the functionality it provided is moved to the new template
> function __wait_until. A similar change is made to the
> __cond_wait_until_impl/__cond_wait_until implementation.
>
> This new implementation similarly replaces all of the non-templated
> waiting detail of __timed_waiter_pool, __timed_waiter, etc. with
> the new __wait_until_impl free function. This currently inline free
> function is expected to be moved into the libstdc++ shared library
> in a future commit.
>
> This implementation replaces all templated waiting functions that
> manage clock conversion as well as relative waiting (wait_for) with
> the new template functions __wait_until and __wait_for.
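The clock handling these templates centralize is the usual two-step: convert the caller's deadline to the wait clock, and on a wait-clock timeout double-check against the caller's clock before reporting a timeout. A compilable sketch of just that decision, with the actual blocking elided:

```cpp
#include <cassert>
#include <chrono>

// Decide whether a wait against 'atime' (expressed in the caller's
// Clock) should report a timeout, after converting the deadline to the
// steady "wait clock". The block itself is stubbed out.
template<typename Clock, typename Dur>
bool caller_timed_out(const std::chrono::time_point<Clock, Dur>& atime)
{
  using wait_clock = std::chrono::steady_clock;
  auto delta = atime - Clock::now();
  auto deadline = wait_clock::now()
      + std::chrono::ceil<wait_clock::duration>(delta);
  (void)deadline;  // a real wait would block until this deadline
  // Conversion can round, so only the caller's own clock decides.
  return !(Clock::now() < atime);
}
```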
>
> Similarly the previous implementation detail for the various
> __atomic_wait_address_Xxx templates is adjusted to account for the
> implementation changes outlined above.
>
> All of the "bare wait" versions of __atomic_wait_Xxx have been removed
> and replaced with a defaulted boolean __bare_wait parameter on the
> new version of these templates.
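In effect, the removed *_bare entry points collapse into a flag computed up front; a minimal sketch of that computation (flag values mirror the patch's __wait_flags, but the helper itself is hypothetical):

```cpp
#include <cassert>

// Illustrative subset of the wait flags from the patch.
enum wait_flags : unsigned
{
  track_contention = 2,  // maintain the side-table waiter count
  do_spin          = 4,  // try a short spin before blocking
};

// A "bare" wait opts out of contention tracking; everything else is
// unchanged, so no separate *_bare function is needed.
unsigned flags_for(bool bare_wait)
{
  unsigned flags = do_spin;
  if (!bare_wait)
    flags |= track_contention;
  return flags;
}
```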
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/atomic_timed_wait.h:
> (__detail::__platform_wait_until_impl): Rename to
> __platform_wait_until.
> (__detail::__platform_wait_until): Remove previous
> definition.
> (__detail::__cond_wait_until_impl): Rename to
> __cond_wait_until.
> (__detail::__cond_wait_until): Remove previous
> definition.
> (__detail::__spin_until_impl): New function.
> (__detail::__wait_until_impl): New function.
> (__detail::__wait_until): New function.
> (__detail::__wait_for): New function.
> (__detail::__timed_waiter_pool): Remove type.
> (__detail::__timed_backoff_spin_policy): Remove type.
> (__detail::__timed_waiter): Remove type.
> (__detail::__enters_timed_wait): Remove type alias.
> (__detail::__bare_timed_wait): Remove type alias.
> (__atomic_wait_address_until): Adjust to new implementation
> detail.
> (__atomic_wait_address_until_v): Likewise.
> (__atomic_wait_address_bare): Remove.
> (__atomic_wait_address_for): Adjust to new implementation
> detail.
> (__atomic_wait_address_for_v): Likewise.
> (__atomic_wait_address_for_bare): Remove.
> * include/bits/atomic_wait.h: Include bits/stl_pair.h.
> (__detail::__default_spin_policy): Remove type.
> (__detail::__atomic_spin): Remove function.
> (__detail::__waiter_pool_base): Rename to __waiter_pool_impl.
> Remove _M_notify. Rename _S_for to _S_impl_for.
> (__detail::__waiter_base): Remove type.
> (__detail::__waiter_pool): Remove type.
> (__detail::__waiter): Remove type.
> (__detail::__enters_wait): Remove type alias.
> (__detail::__bare_wait): Remove type alias.
> (__detail::__wait_flags): New enum.
> (__detail::__wait_args): New struct.
> (__detail::__wait_result_type): New type alias.
> (__detail::__spin_impl): New function.
> (__detail::__wait_impl): New function.
> (__atomic_wait_address): Adjust to new implementation detail.
> (__atomic_wait_address_v): Likewise.
> (__atomic_notify_address): Likewise.
> (__atomic_wait_address_bare): Delete.
> (__atomic_notify_address_bare): Likewise.
> * include/bits/semaphore_base.h: Adjust implementation to
> use new __atomic_wait_address_v contract.
> * include/std/barrier: Adjust implementation to use new
> __atomic_wait contract.
> * include/std/latch: Adjust implementation to use new
> __atomic_wait contract.
> * testsuite/29_atomics/atomic/wait_notify/100334.cc (main):
> Adjust for the __detail::__waiter_pool_base renaming.
> ---
> libstdc++-v3/include/bits/atomic_timed_wait.h | 549 ++++++++----------
> libstdc++-v3/include/bits/atomic_wait.h | 486 ++++++++--------
> libstdc++-v3/include/bits/semaphore_base.h | 53 +-
> libstdc++-v3/include/std/barrier | 6 +-
> libstdc++-v3/include/std/latch | 5 +-
> .../29_atomics/atomic/wait_notify/100334.cc | 4 +-
> 6 files changed, 514 insertions(+), 589 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h b/libstdc++-v3/include/bits/atomic_timed_wait.h
> index 9a6ac95b7d0e..196548484024 100644
> --- a/libstdc++-v3/include/bits/atomic_timed_wait.h
> +++ b/libstdc++-v3/include/bits/atomic_timed_wait.h
> @@ -76,62 +76,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> #ifdef _GLIBCXX_HAVE_LINUX_FUTEX
> #define _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
> // returns true if wait ended before timeout
> - template<typename _Dur>
> - bool
> - __platform_wait_until_impl(const __platform_wait_t* __addr,
> - __platform_wait_t __old,
> - const chrono::time_point<__wait_clock_t, _Dur>&
> - __atime) noexcept
> - {
> - auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
> - auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
> + bool
> + __platform_wait_until(const __platform_wait_t* __addr,
> + __platform_wait_t __old,
> + const __wait_clock_t::time_point& __atime) noexcept
> + {
> + auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
> + auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
>
> - struct timespec __rt =
> + struct timespec __rt =
> {
> static_cast<std::time_t>(__s.time_since_epoch().count()),
> static_cast<long>(__ns.count())
> };
>
> - auto __e = syscall (SYS_futex, __addr,
> - static_cast<int>(__futex_wait_flags::
> - __wait_bitset_private),
> - __old, &__rt, nullptr,
> - static_cast<int>(__futex_wait_flags::
> - __bitset_match_any));
> -
> - if (__e)
> - {
> - if (errno == ETIMEDOUT)
> - return false;
> - if (errno != EINTR && errno != EAGAIN)
> - __throw_system_error(errno);
> - }
> - return true;
> - }
> -
> - // returns true if wait ended before timeout
> - template<typename _Clock, typename _Dur>
> - bool
> - __platform_wait_until(const __platform_wait_t* __addr, __platform_wait_t __old,
> - const chrono::time_point<_Clock, _Dur>& __atime)
> - {
> - if constexpr (is_same_v<__wait_clock_t, _Clock>)
> - {
> - return __platform_wait_until_impl(__addr, __old, __atime);
> - }
> - else
> - {
> - if (!__platform_wait_until_impl(__addr, __old,
> - __to_wait_clock(__atime)))
> - {
> - // We got a timeout when measured against __clock_t but
> - // we need to check against the caller-supplied clock
> - // to tell whether we should return a timeout.
> - if (_Clock::now() < __atime)
> - return true;
> - }
> + auto __e = syscall (SYS_futex, __addr,
> + static_cast<int>(__futex_wait_flags::__wait_bitset_private),
> + __old, &__rt, nullptr,
> + static_cast<int>(__futex_wait_flags::__bitset_match_any));
> + if (__e)
> + {
> + if (errno == ETIMEDOUT)
> return false;
> - }
> + if (errno != EINTR && errno != EAGAIN)
> + __throw_system_error(errno);
> + }
> + return true;
> }
> #else
> // define _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT and implement __platform_wait_until()
> @@ -141,15 +111,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
> #ifdef _GLIBCXX_HAS_GTHREADS
> // Returns true if wait ended before timeout.
> - // _Clock must be either steady_clock or system_clock.
> - template<typename _Clock, typename _Dur>
> - bool
> - __cond_wait_until_impl(__condvar& __cv, mutex& __mx,
> - const chrono::time_point<_Clock, _Dur>& __atime)
> - {
> - static_assert(std::__is_one_of<_Clock, chrono::steady_clock,
> - chrono::system_clock>::value);
> -
> + bool
> + __cond_wait_until(__condvar& __cv, mutex& __mx,
> + const __wait_clock_t::time_point& __atime)
> + {
> auto __s = chrono::time_point_cast<chrono::seconds>(__atime);
> auto __ns = chrono::duration_cast<chrono::nanoseconds>(__atime - __s);
>
> @@ -160,293 +125,261 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> };
>
> #ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
> - if constexpr (is_same_v<chrono::steady_clock, _Clock>)
> + if constexpr (is_same_v<chrono::steady_clock, __wait_clock_t>)
> __cv.wait_until(__mx, CLOCK_MONOTONIC, __ts);
> else
> #endif
> __cv.wait_until(__mx, __ts);
> - return _Clock::now() < __atime;
> - }
> -
> - // returns true if wait ended before timeout
> - template<typename _Clock, typename _Dur>
> - bool
> - __cond_wait_until(__condvar& __cv, mutex& __mx,
> - const chrono::time_point<_Clock, _Dur>& __atime)
> - {
> -#ifdef _GLIBCXX_USE_PTHREAD_COND_CLOCKWAIT
> - if constexpr (is_same_v<_Clock, chrono::steady_clock>)
> - return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
> - else
> -#endif
> - if constexpr (is_same_v<_Clock, chrono::system_clock>)
> - return __detail::__cond_wait_until_impl(__cv, __mx, __atime);
> - else
> - {
> - if (__cond_wait_until_impl(__cv, __mx,
> - __to_wait_clock(__atime)))
> - {
> - // We got a timeout when measured against __clock_t but
> - // we need to check against the caller-supplied clock
> - // to tell whether we should return a timeout.
> - if (_Clock::now() < __atime)
> - return true;
> - }
> - return false;
> - }
> + return __wait_clock_t::now() < __atime;
> }
> #endif // _GLIBCXX_HAS_GTHREADS
>
> - struct __timed_waiter_pool : __waiter_pool_base
> + inline __wait_result_type
> + __spin_until_impl(const __platform_wait_t* __addr, __wait_args __args,
> + const __wait_clock_t::time_point& __deadline)
> {
> - // returns true if wait ended before timeout
> - template<typename _Clock, typename _Dur>
> - bool
> - _M_do_wait_until(__platform_wait_t* __addr, __platform_wait_t __old,
> - const chrono::time_point<_Clock, _Dur>& __atime)
> + auto __t0 = __wait_clock_t::now();
> + using namespace literals::chrono_literals;
> +
> + __platform_wait_t __val;
> + auto __now = __wait_clock_t::now();
> + for (; __now < __deadline; __now = __wait_clock_t::now())
> + {
> + auto __elapsed = __now - __t0;
> +#ifndef _GLIBCXX_NO_SLEEP
> + if (__elapsed > 128ms)
> + {
> + this_thread::sleep_for(64ms);
> + }
> + else if (__elapsed > 64us)
> + {
> + this_thread::sleep_for(__elapsed / 2);
> + }
> + else
> +#endif
> + if (__elapsed > 4us)
> + {
> + __thread_yield();
> + }
> + else
> + {
> + auto __res = __detail::__spin_impl(__addr, __args);
> + if (__res.first)
> + return __res;
> + }
> +
> + __atomic_load(__addr, &__val, __args._M_order);
> + if (__val != __args._M_old)
> + return make_pair(true, __val);
> + }
> + return make_pair(false, __val);
> + }
> +
> + inline __wait_result_type
> + __wait_until_impl(const __platform_wait_t* __addr, __wait_args __args,
> + const __wait_clock_t::time_point& __atime)
> + {
> +#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
> + __waiter_pool_impl* __pool = nullptr;
> +#else
> + // if we don't have __platform_wait, we always need the side-table
> + __waiter_pool_impl* __pool = &__waiter_pool_impl::_S_impl_for(__addr);
> +#endif
> +
> + __platform_wait_t* __wait_addr;
> + if (__args & __wait_flags::__proxy_wait)
> {
> #ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
> - return __platform_wait_until(__addr, __old, __atime);
> -#else
> - __platform_wait_t __val;
> - __atomic_load(__addr, &__val, __ATOMIC_RELAXED);
> - if (__val == __old)
> - {
> - lock_guard<mutex> __l(_M_mtx);
> - return __cond_wait_until(_M_cv, _M_mtx, __atime);
> - }
> - else
> - return true;
> -#endif // _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
> + __pool = &__waiter_pool_impl::_S_impl_for(__addr);
> +#endif
> + __wait_addr = &__pool->_M_ver;
> + __atomic_load(__wait_addr, &__args._M_old, __args._M_order);
> }
> - };
> + else
> + __wait_addr = const_cast<__platform_wait_t*>(__addr);
>
> - struct __timed_backoff_spin_policy
> - {
> - __wait_clock_t::time_point _M_deadline;
> - __wait_clock_t::time_point _M_t0;
> + if (__args & __wait_flags::__do_spin)
> + {
> + auto __res = __detail::__spin_until_impl(__wait_addr, __args, __atime);
> + if (__res.first)
> + return __res;
> + if (__args & __wait_flags::__spin_only)
> + return __res;
> + }
>
> - template<typename _Clock, typename _Dur>
> - __timed_backoff_spin_policy(chrono::time_point<_Clock, _Dur>
> - __deadline = _Clock::time_point::max(),
> - chrono::time_point<_Clock, _Dur>
> - __t0 = _Clock::now()) noexcept
> - : _M_deadline(__to_wait_clock(__deadline))
> - , _M_t0(__to_wait_clock(__t0))
> - { }
> -
> - bool
> - operator()() const noexcept
> + if (!(__args & __wait_flags::__track_contention))
> {
> - using namespace literals::chrono_literals;
> - auto __now = __wait_clock_t::now();
> - if (_M_deadline <= __now)
> - return false;
> -
> - // FIXME: this_thread::sleep_for not available #ifdef _GLIBCXX_NO_SLEEP
> -
> - auto __elapsed = __now - _M_t0;
> - if (__elapsed > 128ms)
> - {
> - this_thread::sleep_for(64ms);
> - }
> - else if (__elapsed > 64us)
> - {
> - this_thread::sleep_for(__elapsed / 2);
> - }
> - else if (__elapsed > 4us)
> - {
> - __thread_yield();
> - }
> - else
> - return false;
> - return true;
> + // caller does not externally track contention
> +#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
> + __pool = (__pool == nullptr) ? &__waiter_pool_impl::_S_impl_for(__addr)
> + : __pool;
> +#endif
> + __pool->_M_enter_wait();
> }
> - };
>
> - template<typename _EntersWait>
> - struct __timed_waiter : __waiter_base<__timed_waiter_pool>
> + __wait_result_type __res;
> +#ifdef _GLIBCXX_HAVE_PLATFORM_TIMED_WAIT
> + if (__platform_wait_until(__wait_addr, __args._M_old, __atime))
> + __res = make_pair(true, __args._M_old);
> + else
> + __res = make_pair(false, __args._M_old);
> +#else
> + __platform_wait_t __val;
> + __atomic_load(__wait_addr, &__val, __args._M_order);
> + if (__val == __args._M_old)
> + {
> + lock_guard<mutex> __l{ __pool->_M_mtx };
> + __atomic_load(__wait_addr, &__val, __args._M_order);
> + if (__val == __args._M_old &&
> + __cond_wait_until(__pool->_M_cv, __pool->_M_mtx, __atime))
> + __res = make_pair(true, __val);
> + }
> + else
> + __res = make_pair(false, __val);
> +#endif
> +
> + if (!(__args & __wait_flags::__track_contention))
> + // caller does not externally track contention
> + __pool->_M_leave_wait();
> + return __res;
> + }
> +
> + template<typename _Clock, typename _Dur>
> + __wait_result_type
> + __wait_until(const __platform_wait_t* __addr, __wait_args __args,
> + const chrono::time_point<_Clock, _Dur>& __atime) noexcept
> {
> - using __base_type = __waiter_base<__timed_waiter_pool>;
> -
> - template<typename _Tp>
> - __timed_waiter(const _Tp* __addr) noexcept
> - : __base_type(__addr)
> - {
> - if constexpr (_EntersWait::value)
> - _M_w._M_enter_wait();
> - }
> -
> - ~__timed_waiter()
> - {
> - if constexpr (_EntersWait::value)
> - _M_w._M_leave_wait();
> - }
> -
> - // returns true if wait ended before timeout
> - template<typename _Tp, typename _ValFn,
> - typename _Clock, typename _Dur>
> - bool
> - _M_do_wait_until_v(_Tp __old, _ValFn __vfn,
> - const chrono::time_point<_Clock, _Dur>&
> - __atime) noexcept
> + if constexpr (is_same_v<__wait_clock_t, _Clock>)
> + return __detail::__wait_until_impl(__addr, __args, __atime);
> + else
> {
> - __platform_wait_t __val;
> - if (_M_do_spin(__old, std::move(__vfn), __val,
> - __timed_backoff_spin_policy(__atime)))
> - return true;
> - return __base_type::_M_w._M_do_wait_until(__base_type::_M_addr, __val, __atime);
> - }
> + auto __res = __detail::__wait_until_impl(__addr, __args,
> + __to_wait_clock(__atime));
> + if (!__res.first)
> + {
> + // We got a timeout when measured against __clock_t but
> + // we need to check against the caller-supplied clock
> + // to tell whether we should return a timeout.
> + if (_Clock::now() < __atime)
> + return make_pair(true, __res.second);
> + }
> + return __res;
> + }
> + }
>
> - // returns true if wait ended before timeout
> - template<typename _Pred,
> - typename _Clock, typename _Dur>
> - bool
> - _M_do_wait_until(_Pred __pred, __platform_wait_t __val,
> - const chrono::time_point<_Clock, _Dur>&
> - __atime) noexcept
> - {
> - for (auto __now = _Clock::now(); __now < __atime;
> - __now = _Clock::now())
> - {
> - if (__base_type::_M_w._M_do_wait_until(
> - __base_type::_M_addr, __val, __atime)
> - && __pred())
> - return true;
> -
> - if (__base_type::_M_do_spin(__pred, __val,
> - __timed_backoff_spin_policy(__atime, __now)))
> - return true;
> - }
> - return false;
> - }
> -
> - // returns true if wait ended before timeout
> - template<typename _Pred,
> - typename _Clock, typename _Dur>
> - bool
> - _M_do_wait_until(_Pred __pred,
> - const chrono::time_point<_Clock, _Dur>&
> - __atime) noexcept
> - {
> - __platform_wait_t __val;
> - if (__base_type::_M_do_spin(__pred, __val,
> - __timed_backoff_spin_policy(__atime)))
> - return true;
> - return _M_do_wait_until(__pred, __val, __atime);
> - }
> -
> - template<typename _Tp, typename _ValFn,
> - typename _Rep, typename _Period>
> - bool
> - _M_do_wait_for_v(_Tp __old, _ValFn __vfn,
> - const chrono::duration<_Rep, _Period>&
> - __rtime) noexcept
> - {
> - __platform_wait_t __val;
> - if (_M_do_spin_v(__old, std::move(__vfn), __val))
> - return true;
> -
> - if (!__rtime.count())
> - return false; // no rtime supplied, and spin did not acquire
> -
> - auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
> -
> - return __base_type::_M_w._M_do_wait_until(
> - __base_type::_M_addr,
> - __val,
> - chrono::steady_clock::now() + __reltime);
> - }
> -
> - template<typename _Pred,
> - typename _Rep, typename _Period>
> - bool
> - _M_do_wait_for(_Pred __pred,
> - const chrono::duration<_Rep, _Period>& __rtime) noexcept
> - {
> - __platform_wait_t __val;
> - if (__base_type::_M_do_spin(__pred, __val))
> - return true;
> -
> - if (!__rtime.count())
> - return false; // no rtime supplied, and spin did not acquire
> -
> - auto __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
> -
> - return _M_do_wait_until(__pred, __val,
> - chrono::steady_clock::now() + __reltime);
> - }
> - };
> -
> - using __enters_timed_wait = __timed_waiter<std::true_type>;
> - using __bare_timed_wait = __timed_waiter<std::false_type>;
> + template<typename _Rep, typename _Period>
> + __wait_result_type
> + __wait_for(const __platform_wait_t* __addr, __wait_args __args,
> + const chrono::duration<_Rep, _Period>& __rtime) noexcept
> + {
> + if (!__rtime.count())
> + // no rtime supplied, just spin a bit
> + return __detail::__wait_impl(__addr, __args | __wait_flags::__spin_only);
> + auto const __reltime = chrono::ceil<__wait_clock_t::duration>(__rtime);
> + auto const __atime = chrono::steady_clock::now() + __reltime;
> + return __detail::__wait_until(__addr, __args, __atime);
> + }
> } // namespace __detail
>
> // returns true if wait ended before timeout
> + template<typename _Tp,
> + typename _Pred, typename _ValFn,
> + typename _Clock, typename _Dur>
> + bool
> + __atomic_wait_address_until(const _Tp* __addr, _Pred&& __pred,
> + _ValFn&& __vfn,
> + const chrono::time_point<_Clock, _Dur>& __atime,
> + bool __bare_wait = false) noexcept
> + {
> + const auto __wait_addr =
> + reinterpret_cast<const __detail::__platform_wait_t*>(__addr);
> + __detail::__wait_args __args{ __addr, __bare_wait };
> + _Tp __val = __vfn();
> + while (!__pred(__val))
> + {
> + auto __res = __detail::__wait_until(__wait_addr, __args, __atime);
> + if (!__res.first)
> + // timed out
> + return __res.first; // C++26 will also return last observed __val
> + __val = __vfn();
> + }
> + return true; // C++26 will also return last observed __val
> + }
> +
> + template<typename _Clock, typename _Dur>
> + bool
> + __atomic_wait_address_until_v(const __detail::__platform_wait_t* __addr,
> + __detail::__platform_wait_t __old,
> + int __order,
> + const chrono::time_point<_Clock, _Dur>& __atime,
> + bool __bare_wait = false) noexcept
> + {
> + __detail::__wait_args __args{ __addr, __old, __order, __bare_wait };
> + auto __res = __detail::__wait_until(__addr, __args, __atime);
> + return __res.first; // C++26 will also return last observed __val
> + }
> +
> template<typename _Tp, typename _ValFn,
> typename _Clock, typename _Dur>
> bool
> __atomic_wait_address_until_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
> - const chrono::time_point<_Clock, _Dur>&
> - __atime) noexcept
> + const chrono::time_point<_Clock, _Dur>& __atime,
> + bool __bare_wait = false) noexcept
> {
> - __detail::__enters_timed_wait __w{__addr};
> - return __w._M_do_wait_until_v(__old, __vfn, __atime);
> + auto __pfn = [&](const _Tp& __val)
> + { return !__detail::__atomic_compare(__old, __val); };
> + return __atomic_wait_address_until(__addr, __pfn, forward<_ValFn>(__vfn),
> + __atime, __bare_wait);
> }
>
> - template<typename _Tp, typename _Pred,
> - typename _Clock, typename _Dur>
> - bool
> - __atomic_wait_address_until(const _Tp* __addr, _Pred __pred,
> - const chrono::time_point<_Clock, _Dur>&
> - __atime) noexcept
> - {
> - __detail::__enters_timed_wait __w{__addr};
> - return __w._M_do_wait_until(__pred, __atime);
> - }
> + template<typename _Tp,
> + typename _Pred, typename _ValFn,
> + typename _Rep, typename _Period>
> + bool
> + __atomic_wait_address_for(const _Tp* __addr, _Pred&& __pred,
> + _ValFn&& __vfn,
> + const chrono::duration<_Rep, _Period>& __rtime,
> + bool __bare_wait = false) noexcept
> + {
> + const auto __wait_addr =
> + reinterpret_cast<const __detail::__platform_wait_t*>(__addr);
> + __detail::__wait_args __args{ __addr, __bare_wait };
> + _Tp __val = __vfn();
> + while (!__pred(__val))
> + {
> + auto __res = __detail::__wait_for(__wait_addr, __args, __rtime);
> + if (!__res.first)
> + // timed out
> + return __res.first; // C++26 will also return last observed __val
> + __val = __vfn();
> + }
> + return true; // C++26 will also return last observed __val
> + }
>
> - template<typename _Pred,
> - typename _Clock, typename _Dur>
> + template<typename _Rep, typename _Period>
> bool
> - __atomic_wait_address_until_bare(const __detail::__platform_wait_t* __addr,
> - _Pred __pred,
> - const chrono::time_point<_Clock, _Dur>&
> - __atime) noexcept
> - {
> - __detail::__bare_timed_wait __w{__addr};
> - return __w._M_do_wait_until(__pred, __atime);
> - }
> + __atomic_wait_address_for_v(const __detail::__platform_wait_t* __addr,
> + __detail::__platform_wait_t __old,
> + int __order,
> + const chrono::duration<_Rep, _Period>& __rtime,
> + bool __bare_wait = false) noexcept
> + {
> + __detail::__wait_args __args{ __addr, __old, __order, __bare_wait };
> + auto __res = __detail::__wait_for(__addr, __args, __rtime);
> + return __res.first; // C++26 will also return last observed __val
> + }
>
> template<typename _Tp, typename _ValFn,
> typename _Rep, typename _Period>
> bool
> __atomic_wait_address_for_v(const _Tp* __addr, _Tp&& __old, _ValFn&& __vfn,
> - const chrono::duration<_Rep, _Period>& __rtime) noexcept
> + const chrono::duration<_Rep, _Period>& __rtime,
> + bool __bare_wait = false) noexcept
> {
> - __detail::__enters_timed_wait __w{__addr};
> - return __w._M_do_wait_for_v(__old, __vfn, __rtime);
> - }
> -
> - template<typename _Tp, typename _Pred,
> - typename _Rep, typename _Period>
> - bool
> - __atomic_wait_address_for(const _Tp* __addr, _Pred __pred,
> - const chrono::duration<_Rep, _Period>& __rtime) noexcept
> - {
> -
> - __detail::__enters_timed_wait __w{__addr};
> - return __w._M_do_wait_for(__pred, __rtime);
> - }
> -
> - template<typename _Pred,
> - typename _Rep, typename _Period>
> - bool
> - __atomic_wait_address_for_bare(const __detail::__platform_wait_t* __addr,
> - _Pred __pred,
> - const chrono::duration<_Rep, _Period>& __rtime) noexcept
> - {
> - __detail::__bare_timed_wait __w{__addr};
> - return __w._M_do_wait_for(__pred, __rtime);
> + auto __pfn = [&](const _Tp& __val)
> + { return !__detail::__atomic_compare(__old, __val); };
> + return __atomic_wait_address_for(__addr, __pfn, forward<_ValFn>(__vfn),
> + __rtime, __bare_wait);
> }
> _GLIBCXX_END_NAMESPACE_VERSION
> } // namespace std
> diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
> index 6d1554f68a56..18cfc2ef7bd2 100644
> --- a/libstdc++-v3/include/bits/atomic_wait.h
> +++ b/libstdc++-v3/include/bits/atomic_wait.h
> @@ -50,7 +50,8 @@
> # include <bits/functexcept.h>
> #endif
>
> -# include <bits/std_mutex.h> // std::mutex, std::__condvar
> +#include <bits/stl_pair.h>
> +#include <bits/std_mutex.h> // std::mutex, std::__condvar
>
> namespace std _GLIBCXX_VISIBILITY(default)
> {
> @@ -134,7 +135,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> __thread_yield() noexcept
> {
> #if defined _GLIBCXX_HAS_GTHREADS && defined _GLIBCXX_USE_SCHED_YIELD
> - __gthread_yield();
> + __gthread_yield();
> #endif
> }
>
> @@ -151,38 +152,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> inline constexpr auto __atomic_spin_count_relax = 12;
> inline constexpr auto __atomic_spin_count = 16;
>
> - struct __default_spin_policy
> - {
> - bool
> - operator()() const noexcept
> - { return false; }
> - };
> -
> - template<typename _Pred,
> - typename _Spin = __default_spin_policy>
> - bool
> - __atomic_spin(_Pred& __pred, _Spin __spin = _Spin{ }) noexcept
> - {
> - for (auto __i = 0; __i < __atomic_spin_count; ++__i)
> - {
> - if (__pred())
> - return true;
> -
> - if (__i < __atomic_spin_count_relax)
> - __detail::__thread_relax();
> - else
> - __detail::__thread_yield();
> - }
> -
> - while (__spin())
> - {
> - if (__pred())
> - return true;
> - }
> -
> - return false;
> - }
> -
> // return true if equal
> template<typename _Tp>
> bool __atomic_compare(const _Tp& __a, const _Tp& __b)
> @@ -191,7 +160,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> return __builtin_memcmp(&__a, &__b, sizeof(_Tp)) == 0;
> }
>
> - struct __waiter_pool_base
> + struct __waiter_pool_impl
> {
> // Don't use std::hardware_destructive_interference_size here because we
> // don't want the layout of library types to depend on compiler options.
> @@ -208,7 +177,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> #ifndef _GLIBCXX_HAVE_PLATFORM_WAIT
> __condvar _M_cv;
> #endif
> - __waiter_pool_base() = default;
> + __waiter_pool_impl() = default;
>
> void
> _M_enter_wait() noexcept
> @@ -226,256 +195,271 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> return __res != 0;
> }
>
> - void
> - _M_notify(__platform_wait_t* __addr, [[maybe_unused]] bool __all,
> - bool __bare) noexcept
> - {
> -#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> - if (__addr == &_M_ver)
> - {
> - __atomic_fetch_add(__addr, 1, __ATOMIC_SEQ_CST);
> - __all = true;
> - }
> -
> - if (__bare || _M_waiting())
> - __platform_notify(__addr, __all);
> -#else
> - {
> - lock_guard<mutex> __l(_M_mtx);
> - __atomic_fetch_add(__addr, 1, __ATOMIC_RELAXED);
> - }
> - if (__bare || _M_waiting())
> - _M_cv.notify_all();
> -#endif
> - }
> -
> - static __waiter_pool_base&
> - _S_for(const void* __addr) noexcept
> + static __waiter_pool_impl&
> + _S_impl_for(const void* __addr) noexcept
> {
> constexpr __UINTPTR_TYPE__ __ct = 16;
> - static __waiter_pool_base __w[__ct];
> + static __waiter_pool_impl __w[__ct];
> auto __key = ((__UINTPTR_TYPE__)__addr >> 2) % __ct;
> return __w[__key];
> }
> };
>
> - struct __waiter_pool : __waiter_pool_base
> + enum class __wait_flags : __UINT_LEAST32_TYPE__
> {
> - void
> - _M_do_wait(const __platform_wait_t* __addr, __platform_wait_t __old) noexcept
> - {
> -#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> - __platform_wait(__addr, __old);
> -#else
> - __platform_wait_t __val;
> - __atomic_load(__addr, &__val, __ATOMIC_SEQ_CST);
> - if (__val == __old)
> - {
> - lock_guard<mutex> __l(_M_mtx);
> - __atomic_load(__addr, &__val, __ATOMIC_RELAXED);
> - if (__val == __old)
> - _M_cv.wait(_M_mtx);
> - }
> -#endif // __GLIBCXX_HAVE_PLATFORM_WAIT
> - }
> + __abi_version = 0,
> + __proxy_wait = 1,
> + __track_contention = 2,
> + __do_spin = 4,
> + __spin_only = 8 | __do_spin, // implies __do_spin
> + __abi_version_mask = 0xffff0000,
> };
>
> - template<typename _Tp>
> - struct __waiter_base
> + struct __wait_args
> + {
> + __platform_wait_t _M_old;
> + int _M_order = __ATOMIC_ACQUIRE;
> + __wait_flags _M_flags;
> +
> + template<typename _Tp>
> + explicit __wait_args(const _Tp* __addr,
> + bool __bare_wait = false) noexcept
> + : _M_flags{ _S_flags_for(__addr, __bare_wait) }
> + { }
> +
> + __wait_args(const __platform_wait_t* __addr, __platform_wait_t __old,
> + int __order, bool __bare_wait = false) noexcept
> + : _M_old{ __old }
> + , _M_order{ __order }
> + , _M_flags{ _S_flags_for(__addr, __bare_wait) }
> + { }
> +
> + __wait_args(const __wait_args&) noexcept = default;
> + __wait_args&
> + operator=(const __wait_args&) noexcept = default;
> +
> + bool
> + operator&(__wait_flags __flag) const noexcept
> {
> - using __waiter_type = _Tp;
> + using __t = underlying_type_t<__wait_flags>;
> + return static_cast<__t>(_M_flags)
> + & static_cast<__t>(__flag);
> + }
>
> - __waiter_type& _M_w;
> - __platform_wait_t* _M_addr;
> + __wait_args
> + operator|(__wait_flags __flag) const noexcept
> + {
> + using __t = underlying_type_t<__wait_flags>;
> + __wait_args __res{ *this };
> + const auto __flags = static_cast<__t>(__res._M_flags)
> + | static_cast<__t>(__flag);
> + __res._M_flags = __wait_flags{ __flags };
> + return __res;
> + }
>
> - template<typename _Up>
> - static __platform_wait_t*
> - _S_wait_addr(const _Up* __a, __platform_wait_t* __b)
> - {
> - if constexpr (__platform_wait_uses_type<_Up>)
> - return reinterpret_cast<__platform_wait_t*>(const_cast<_Up*>(__a));
> - else
> - return __b;
> - }
> + private:
> + static int
> + constexpr _S_default_flags() noexcept
> + {
> + using __t = underlying_type_t<__wait_flags>;
> + return static_cast<__t>(__wait_flags::__abi_version)
> + | static_cast<__t>(__wait_flags::__do_spin);
> + }
>
> - static __waiter_type&
> - _S_for(const void* __addr) noexcept
> + template<typename _Tp>
> + static int
> + constexpr _S_flags_for(const _Tp*, bool __bare_wait) noexcept
> {
> - static_assert(sizeof(__waiter_type) == sizeof(__waiter_pool_base));
> - auto& res = __waiter_pool_base::_S_for(__addr);
> - return reinterpret_cast<__waiter_type&>(res);
> + auto __res = _S_default_flags();
> + if (!__bare_wait)
> + __res |= static_cast<int>(__wait_flags::__track_contention);
> + if constexpr (!__platform_wait_uses_type<_Tp>)
> + __res |= static_cast<int>(__wait_flags::__proxy_wait);
> + return __res;
> }
>
> - template<typename _Up>
> - explicit __waiter_base(const _Up* __addr) noexcept
> - : _M_w(_S_for(__addr))
> - , _M_addr(_S_wait_addr(__addr, &_M_w._M_ver))
> - { }
> -
> - void
> - _M_notify(bool __all, bool __bare = false) noexcept
> - { _M_w._M_notify(_M_addr, __all, __bare); }
> -
> - template<typename _Up, typename _ValFn,
> - typename _Spin = __default_spin_policy>
> - static bool
> - _S_do_spin_v(__platform_wait_t* __addr,
> - const _Up& __old, _ValFn __vfn,
> - __platform_wait_t& __val,
> - _Spin __spin = _Spin{ })
> - {
> - auto const __pred = [=]
> - { return !__detail::__atomic_compare(__old, __vfn()); };
> -
> - if constexpr (__platform_wait_uses_type<_Up>)
> - {
> - __builtin_memcpy(&__val, &__old, sizeof(__val));
> - }
> - else
> - {
> - __atomic_load(__addr, &__val, __ATOMIC_ACQUIRE);
> - }
> - return __atomic_spin(__pred, __spin);
> - }
> -
> - template<typename _Up, typename _ValFn,
> - typename _Spin = __default_spin_policy>
> - bool
> - _M_do_spin_v(const _Up& __old, _ValFn __vfn,
> - __platform_wait_t& __val,
> - _Spin __spin = _Spin{ })
> - { return _S_do_spin_v(_M_addr, __old, __vfn, __val, __spin); }
> -
> - template<typename _Pred,
> - typename _Spin = __default_spin_policy>
> - static bool
> - _S_do_spin(const __platform_wait_t* __addr,
> - _Pred __pred,
> - __platform_wait_t& __val,
> - _Spin __spin = _Spin{ })
> - {
> - __atomic_load(__addr, &__val, __ATOMIC_ACQUIRE);
> - return __atomic_spin(__pred, __spin);
> - }
> -
> - template<typename _Pred,
> - typename _Spin = __default_spin_policy>
> - bool
> - _M_do_spin(_Pred __pred, __platform_wait_t& __val,
> - _Spin __spin = _Spin{ })
> - { return _S_do_spin(_M_addr, __pred, __val, __spin); }
> - };
> -
> - template<typename _EntersWait>
> - struct __waiter : __waiter_base<__waiter_pool>
> - {
> - using __base_type = __waiter_base<__waiter_pool>;
> -
> - template<typename _Tp>
> - explicit __waiter(const _Tp* __addr) noexcept
> - : __base_type(__addr)
> - {
> - if constexpr (_EntersWait::value)
> - _M_w._M_enter_wait();
> - }
> -
> - ~__waiter()
> + template<typename _Tp>
> + static int
> + _S_memory_order_for(const _Tp*, int __order) noexcept
> {
> - if constexpr (_EntersWait::value)
> - _M_w._M_leave_wait();
> + if constexpr (__platform_wait_uses_type<_Tp>)
> + return __order;
> + return __ATOMIC_ACQUIRE;
> + }
> + };
> +
> + using __wait_result_type = pair<bool, __platform_wait_t>;
> + inline __wait_result_type
> + __spin_impl(const __platform_wait_t* __addr, __wait_args __args)
> + {
> + __platform_wait_t __val;
> + for (auto __i = 0; __i < __atomic_spin_count; ++__i)
> + {
> + __atomic_load(__addr, &__val, __args._M_order);
> + if (__val != __args._M_old)
> + return make_pair(true, __val);
> + if (__i < __atomic_spin_count_relax)
> + __detail::__thread_relax();
> + else
> + __detail::__thread_yield();
> + }
> + return make_pair(false, __val);
> + }
> +
> + inline __wait_result_type
> + __wait_impl(const __platform_wait_t* __addr, __wait_args __args)
> + {
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __waiter_pool_impl* __pool = nullptr;
> +#else
> + // if we don't have __platform_wait, we always need the side-table
> + __waiter_pool_impl* __pool = &__waiter_pool_impl::_S_impl_for(__addr);
> +#endif
> +
> + __platform_wait_t* __wait_addr;
> + if (__args & __wait_flags::__proxy_wait)
> + {
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __pool = &__waiter_pool_impl::_S_impl_for(__addr);
> +#endif
> + __wait_addr = &__pool->_M_ver;
> + __atomic_load(__wait_addr, &__args._M_old, __args._M_order);
> + }
> + else
> + __wait_addr = const_cast<__platform_wait_t*>(__addr);
> +
> + if (__args & __wait_flags::__do_spin)
> + {
> + auto __res = __detail::__spin_impl(__wait_addr, __args);
> + if (__res.first)
> + return __res;
> + if (__args & __wait_flags::__spin_only)
> + return __res;
> }
>
> - template<typename _Tp, typename _ValFn>
> - void
> - _M_do_wait_v(_Tp __old, _ValFn __vfn)
> - {
> - do
> - {
> - __platform_wait_t __val;
> - if (__base_type::_M_do_spin_v(__old, __vfn, __val))
> - return;
> - __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
> - }
> - while (__detail::__atomic_compare(__old, __vfn()));
> - }
> + if (!(__args & __wait_flags::__track_contention))
> + {
> + // caller does not externally track contention
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __pool = (__pool == nullptr) ? &__waiter_pool_impl::_S_impl_for(__addr)
> + : __pool;
> +#endif
> + __pool->_M_enter_wait();
> + }
>
> - template<typename _Pred>
> - void
> - _M_do_wait(_Pred __pred) noexcept
> - {
> - do
> - {
> - __platform_wait_t __val;
> - if (__base_type::_M_do_spin(__pred, __val))
> - return;
> - __base_type::_M_w._M_do_wait(__base_type::_M_addr, __val);
> - }
> - while (!__pred());
> - }
> - };
> + __wait_result_type __res;
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __platform_wait(__wait_addr, __args._M_old);
> + __res = make_pair(false, __args._M_old);
> +#else
> + __platform_wait_t __val;
> + __atomic_load(__wait_addr, &__val, __args._M_order);
> + if (__val == __args._M_old)
> + {
> + lock_guard<mutex> __l{ __pool->_M_mtx };
> + __atomic_load(__wait_addr, &__val, __args._M_order);
> + if (__val == __args._M_old)
> + __pool->_M_cv.wait(__pool->_M_mtx);
> + }
> + __res = make_pair(false, __val);
> +#endif
>
> - using __enters_wait = __waiter<std::true_type>;
> - using __bare_wait = __waiter<std::false_type>;
> + if (!(__args & __wait_flags::__track_contention))
> + // caller does not externally track contention
> + __pool->_M_leave_wait();
> + return __res;
> + }
> +
> + inline void
> + __notify_impl(const __platform_wait_t* __addr, [[maybe_unused]] bool __all,
> + __wait_args __args)
> + {
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __waiter_pool_impl* __pool = nullptr;
> +#else
> + // if we don't have __platform_notify, we always need the side-table
> + __waiter_pool_impl* __pool = &__waiter_pool_impl::_S_impl_for(__addr);
> +#endif
> +
> + if (!(__args & __wait_flags::__track_contention))
> + {
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __pool = &__waiter_pool_impl::_S_impl_for(__addr);
> +#endif
> + if (!__pool->_M_waiting())
> + return;
> + }
> +
> + __platform_wait_t* __wait_addr;
> + if (__args & __wait_flags::__proxy_wait)
> + {
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __pool = (__pool == nullptr) ? &__waiter_pool_impl::_S_impl_for(__addr)
> + : __pool;
> +#endif
> + __wait_addr = &__pool->_M_ver;
> + __atomic_fetch_add(__wait_addr, 1, __ATOMIC_RELAXED);
> + __all = true;
> + }
> +
> +#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> + __platform_notify(__wait_addr, __all);
> +#else
> + lock_guard<mutex> __l{ __pool->_M_mtx };
> + __pool->_M_cv.notify_all();
> +#endif
> + }
> } // namespace __detail
>
> + template<typename _Tp,
> + typename _Pred, typename _ValFn>
> + void
> + __atomic_wait_address(const _Tp* __addr,
> + _Pred&& __pred, _ValFn&& __vfn,
> + bool __bare_wait = false) noexcept
> + {
> + const auto __wait_addr =
> + reinterpret_cast<const __detail::__platform_wait_t*>(__addr);
> + __detail::__wait_args __args{ __addr, __bare_wait };
> + _Tp __val = __vfn();
> + while (!__pred(__val))
> + {
There's a serious race condition here, resulting in missed
notifications and causing __atomic_wait_address to block forever.
The problem case is for proxied waits, where __wait_impl waits for the
value of _M_ver to be incremented by __notify_impl.
Let's say thread A checks __pred(__val) here and determines it hasn't
changed yet, then gets time-sliced and thread B runs. Thread B changes
__val and calls notify() which increments _M_ver. Then thread A
continues, calling __wait_impl which loads _M_ver twice, decides it
hasn't changed, and goes to sleep forever waiting for it to change.
Thread A will never wake up, because thread B already sent its
notification. Thread A will never notice that __val has changed,
because it's blocked waiting for _M_ver to change, so never gets to
check __pred(__val) again.
Before this refactoring, the load of _M_ver happened before checking
__pred(__val), which avoids the problem (as long as the value of
_M_ver doesn't wrap around to exactly the same value while the waiter
is preempted between the two steps, which is not a realistic concern).
> + __detail::__wait_impl(__wait_addr, __args);
> + __val = __vfn();
> + }
> + // C++26 will return __val
> + }
> +
> + inline void
> + __atomic_wait_address_v(const __detail::__platform_wait_t* __addr,
> + __detail::__platform_wait_t __old,
> + int __order)
> + {
> + __detail::__wait_args __args{ __addr, __old, __order };
> + // C++26 will not ignore the return value here
> + __detail::__wait_impl(__addr, __args);
> + }
> +
> template<typename _Tp, typename _ValFn>
> void
> __atomic_wait_address_v(const _Tp* __addr, _Tp __old,
> _ValFn __vfn) noexcept
> {
> - __detail::__enters_wait __w(__addr);
> - __w._M_do_wait_v(__old, __vfn);
> - }
> -
> - template<typename _Tp, typename _Pred>
> - void
> - __atomic_wait_address(const _Tp* __addr, _Pred __pred) noexcept
> - {
> - __detail::__enters_wait __w(__addr);
> - __w._M_do_wait(__pred);
> - }
> -
> - // This call is to be used by atomic types which track contention
> externally
> - template<typename _Pred>
> - void
> - __atomic_wait_address_bare(const __detail::__platform_wait_t* __addr,
> - _Pred __pred) noexcept
> - {
> -#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> - do
> - {
> - __detail::__platform_wait_t __val;
> - if (__detail::__bare_wait::_S_do_spin(__addr, __pred, __val))
> - return;
> - __detail::__platform_wait(__addr, __val);
> - }
> - while (!__pred());
> -#else // !_GLIBCXX_HAVE_PLATFORM_WAIT
> - __detail::__bare_wait __w(__addr);
> - __w._M_do_wait(__pred);
> -#endif
> + auto __pfn = [&](const _Tp& __val)
> + { return !__detail::__atomic_compare(__old, __val); };
> + __atomic_wait_address(__addr, __pfn, forward<_ValFn>(__vfn));
> }
>
> template<typename _Tp>
> void
> - __atomic_notify_address(const _Tp* __addr, bool __all) noexcept
> + __atomic_notify_address(const _Tp* __addr, bool __all,
> + bool __bare_wait = false) noexcept
> {
> - __detail::__bare_wait __w(__addr);
> - __w._M_notify(__all);
> + const auto __wait_addr =
> + reinterpret_cast<const __detail::__platform_wait_t*>(__addr);
> + __detail::__wait_args __args{ __addr, __bare_wait };
> + __detail::__notify_impl(__wait_addr, __all, __args);
> }
> -
> - // This call is to be used by atomic types which track contention
> externally
> - inline void
> - __atomic_notify_address_bare(const __detail::__platform_wait_t* __addr,
> - bool __all) noexcept
> - {
> -#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
> - __detail::__platform_notify(__addr, __all);
> -#else
> - __detail::__bare_wait __w(__addr);
> - __w._M_notify(__all, true);
> -#endif
> - }
> _GLIBCXX_END_NAMESPACE_VERSION
> } // namespace std
> #endif // __glibcxx_atomic_wait
> diff --git a/libstdc++-v3/include/bits/semaphore_base.h b/libstdc++-v3/include/bits/semaphore_base.h
> index d8f9bd8982bf..444a1589fb5a 100644
> --- a/libstdc++-v3/include/bits/semaphore_base.h
> +++ b/libstdc++-v3/include/bits/semaphore_base.h
> @@ -181,10 +181,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> __atomic_semaphore(const __atomic_semaphore&) = delete;
> __atomic_semaphore& operator=(const __atomic_semaphore&) = delete;
>
> - static _GLIBCXX_ALWAYS_INLINE bool
> - _S_do_try_acquire(__detail::__platform_wait_t* __counter) noexcept
> + static _GLIBCXX_ALWAYS_INLINE __detail::__platform_wait_t
> + _S_get_current(__detail::__platform_wait_t* __counter) noexcept
> + {
> + return __atomic_impl::load(__counter, memory_order::acquire);
> + }
> +
> + static _GLIBCXX_ALWAYS_INLINE bool
> + _S_do_try_acquire(__detail::__platform_wait_t* __counter,
> + __detail::__platform_wait_t __old) noexcept
> {
> - auto __old = __atomic_impl::load(__counter, memory_order::acquire);
> if (__old == 0)
> return false;
>
> @@ -197,17 +203,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> _GLIBCXX_ALWAYS_INLINE void
> _M_acquire() noexcept
> {
> - auto const __pred =
> - [this] { return _S_do_try_acquire(&this->_M_counter); };
> - std::__atomic_wait_address_bare(&_M_counter, __pred);
> + auto const __vfn = [this]{ return _S_get_current(&this->_M_counter); };
> + auto const __pred = [this](__detail::__platform_wait_t __cur)
> + { return _S_do_try_acquire(&this->_M_counter, __cur); };
> + std::__atomic_wait_address(&_M_counter, __pred, __vfn, true);
> }
>
> bool
> _M_try_acquire() noexcept
> {
> - auto const __pred =
> - [this] { return _S_do_try_acquire(&this->_M_counter); };
> - return std::__detail::__atomic_spin(__pred);
> + auto const __vfn = [this]{ return _S_get_current(&this->_M_counter); };
> + auto const __pred = [this](__detail::__platform_wait_t __cur)
> + { return _S_do_try_acquire(&this->_M_counter, __cur); };
> + return __atomic_wait_address_for(&_M_counter, __pred, __vfn,
> + __detail::__wait_clock_t::duration(),
> + true);
> }
>
> template<typename _Clock, typename _Duration>
> @@ -215,21 +225,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> _M_try_acquire_until(const chrono::time_point<_Clock,
> _Duration>& __atime) noexcept
> {
> - auto const __pred =
> - [this] { return _S_do_try_acquire(&this->_M_counter); };
> -
> - return __atomic_wait_address_until_bare(&_M_counter, __pred, __atime);
> + auto const __vfn = [this]{ return _S_get_current(&this->_M_counter); };
> + auto const __pred = [this](__detail::__platform_wait_t __cur)
> + { return _S_do_try_acquire(&this->_M_counter, __cur); };
> + return std::__atomic_wait_address_until(&_M_counter,
> + __pred, __vfn, __atime, true);
> }
>
> template<typename _Rep, typename _Period>
> _GLIBCXX_ALWAYS_INLINE bool
> - _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime)
> - noexcept
> + _M_try_acquire_for(const chrono::duration<_Rep, _Period>& __rtime) noexcept
> {
> - auto const __pred =
> - [this] { return _S_do_try_acquire(&this->_M_counter); };
> -
> - return __atomic_wait_address_for_bare(&_M_counter, __pred, __rtime);
> + auto const __vfn = [this]{ return _S_get_current(&this->_M_counter); };
> + auto const __pred = [this](__detail::__platform_wait_t __cur)
> + { return _S_do_try_acquire(&this->_M_counter, __cur); };
> + return std::__atomic_wait_address_for(&_M_counter,
> + __pred, __vfn, __rtime, true);
> }
>
> _GLIBCXX_ALWAYS_INLINE void
> @@ -238,9 +249,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> if (0 < __atomic_impl::fetch_add(&_M_counter, __update,
> memory_order_release))
> return;
> if (__update > 1)
> - __atomic_notify_address_bare(&_M_counter, true);
> + __atomic_notify_address(&_M_counter, true, true);
> else
> - __atomic_notify_address_bare(&_M_counter, true);
> + __atomic_notify_address(&_M_counter, true, true);
> // FIXME - Figure out why this does not wake a waiting thread
> // __atomic_notify_address_bare(&_M_counter, false);
> }
> diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
> index 6c3cfd44697c..62b03d0223f4 100644
> --- a/libstdc++-v3/include/std/barrier
> +++ b/libstdc++-v3/include/std/barrier
> @@ -194,11 +194,7 @@ It looks different from literature pseudocode for two main reasons:
> wait(arrival_token&& __old_phase) const
> {
> __atomic_phase_const_ref_t __phase(_M_phase);
> - auto const __test_fn = [=]
> - {
> - return __phase.load(memory_order_acquire) != __old_phase;
> - };
> - std::__atomic_wait_address(&_M_phase, __test_fn);
> + __phase.wait(__old_phase, memory_order_acquire);
> }
>
> void
> diff --git a/libstdc++-v3/include/std/latch b/libstdc++-v3/include/std/latch
> index 9220580613d2..de0afd8989bf 100644
> --- a/libstdc++-v3/include/std/latch
> +++ b/libstdc++-v3/include/std/latch
> @@ -78,8 +78,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> _GLIBCXX_ALWAYS_INLINE void
> wait() const noexcept
> {
> - auto const __pred = [this] { return this->try_wait(); };
> - std::__atomic_wait_address(&_M_a, __pred);
> + auto const __vfn = [this] { return this->try_wait(); };
> + auto const __pred = [this](bool __b) { return __b; };
> + std::__atomic_wait_address(&_M_a, __pred, __vfn);
> }
>
> _GLIBCXX_ALWAYS_INLINE void
> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc
> index 018c0c98d0ec..ec596e316500 100644
> --- a/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc
> +++ b/libstdc++-v3/testsuite/29_atomics/atomic/wait_notify/100334.cc
> @@ -54,8 +54,8 @@ main()
> atom->store(0);
> }
>
> - auto a = &std::__detail::__waiter_pool_base::_S_for(reinterpret_cast<char *>(atomics.a[0]));
> - auto b = &std::__detail::__waiter_pool_base::_S_for(reinterpret_cast<char *>(atomics.a[1]));
> + auto a = &std::__detail::__waiter_pool_impl::_S_impl_for(reinterpret_cast<char *>(atomics.a[0]));
> + auto b = &std::__detail::__waiter_pool_impl::_S_impl_for(reinterpret_cast<char *>(atomics.a[1]));
> VERIFY( a == b );
>
> auto fut0 = std::async(std::launch::async, [&] { atomics.a[0]->wait(0); });
> --
> 2.47.1
>