Re: [PATCH v2] libstdc++: Make atomic>::wait sensitive to stored pointer only changes [PR118757]

Jonathan Wakely Fri, 17 Oct 2025 22:24:24 -0700

On Tue, 14 Oct 2025 at 11:38 +0100, Jonathan Wakely wrote:

On Thu, 21 Aug 2025 at 08:22 +0200, Tomasz Kamiński wrote:

Previously, atomic<shared_ptr<T>>::wait (and the weak_ptr version) was
equivalent to waiting directly on _M_val, which corresponds to the pointer to
the control block (_M_pi). Consequently, wakeups were not triggered if the 
stored
pointer value was changed to a pointer that uses the same control block but
stores pointer to a different object. Such a pointer can be constructed using an
aliasing constructor.


To address this, wait now uses a generic proxy wait std::__atomic_wait_address
function, which supports waiting until any predicate is satisfied. The provided
predicate now compares both the control block (_M_pi) and the stored pointer
(_M_ptr). Comparing the latter requires locking the pointer.

Since this function operates on raw pointers, the type of _M_val was changed
from __atomic_base<uintptr_t> to uninptr. Invocations of the corresponding


s/uninptr/uintptr_t/

member functions are now replaced with direct use of __atomic builtins.


The text above doesn't fit in 80 columns, and will look even worse in
the output of 'git log' where it's indented by another 4 columns.

See https://cbea.ms/git-commit/#wrap-72 for recommendations on commit
message formatting.


        PR libstdc++/118757

libstdc++-v3/ChangeLog:

        * include/bits/shared_ptr_atomic.h (_Atomic_count::_M_wait_unlock):
        Add parameter capturing reference to _M_ptr. Reimplement in terms
        of __atomic_wait_address.
        (_Atomic_count::~_Atomic_count, _Atomic_count::lock)
        (_Atomic_count::unlock, _Atomic_count::_M_swap_unlock): Replace
        invocation of atomic member funcitons with __atomic builtins.
        (_Atomic_count::notify_one, _Atomic_count::notify_all):
        Use __atomic_notify_address.
        (_Sp_atomic::element_type): Define.
        (_Sp_atomic::_M_val): Change type to uintptr_t.
        (_Sp_atomic::wait): Pass _M_ptr to _M_wait_unlock.
        * python/libstdcxx/v6/printers.py:
        * testsuite/20_util/shared_ptr/atomic/pr118757.cc: New test.
        * testsuite/20_util/weak_ptr/pr118757.cc: New test.

Signed-off-by: Tomasz Kamiński <[email protected]>
---
Changes in v2:
* update pretty printers
* use __atomic_sub_fetch in _M_wait_unlock

OK for trunk?
As atomic wait fixes are in v16, not planning to backport this one.

libstdc++-v3/include/bits/shared_ptr_atomic.h | 60 +++++++++++++------
libstdc++-v3/python/libstdcxx/v6/printers.py  |  2 +-
.../20_util/shared_ptr/atomic/pr118757.cc     | 46 ++++++++++++++
.../testsuite/20_util/weak_ptr/pr118757.cc    | 47 +++++++++++++++
4 files changed, 137 insertions(+), 18 deletions(-)
create mode 100644 libstdc++-v3/testsuite/20_util/shared_ptr/atomic/pr118757.cc
create mode 100644 libstdc++-v3/testsuite/20_util/weak_ptr/pr118757.cc

diff --git a/libstdc++-v3/include/bits/shared_ptr_atomic.h 
b/libstdc++-v3/include/bits/shared_ptr_atomic.h
index cc7841a8775..5f1e1b50514 100644
--- a/libstdc++-v3/include/bits/shared_ptr_atomic.h
+++ b/libstdc++-v3/include/bits/shared_ptr_atomic.h
@@ -392,6 +392,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   class _Sp_atomic
   {
     using value_type = _Tp;
+      using element_type = typename _Tp::element_type;

     friend struct atomic<_Tp>;

@@ -420,7 +421,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

        ~_Atomic_count()
        {
-         auto __val = _M_val.load(memory_order_relaxed);
+         auto __val = __atomic_load_n(&_M_val, int(memory_order_relaxed));


Would it be cleaner to use the __atomic_impl::xxx wrappers (or just
use std::atomic_ref) instead of the raw __atomic_xxx built-ins?

You'd still need to use __atomic_wait_address directly, but everything
else could use the nicer API, and not need to cast memory_order to
int.

          _GLIBCXX_TSAN_MUTEX_DESTROY(&_M_val);
          __glibcxx_assert(!(__val & _S_lock_bit));
          if (auto __pi = reinterpret_cast<pointer>(__val))


Unrelated to this patch, but I wonder if we should do:

 __val = __val & ~_S_lock_bit;

here, so that if the destructor _does_ get run while it's locked (and
assertions are disabled), we don't get a segfault or bus error for the
unaligned pointer access.

That would mean potentially decrementing the refcount while another
thread thinks it has it locked, which is still bad, and still a memory
safety issue.

The safest thing to do here would be to acquire the lock in the
destructor, but that would add overhead for correct programs.

@@ -442,21 +443,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
        {
          // To acquire the lock we flip the LSB from 0 to 1.

-         auto __current = _M_val.load(memory_order_relaxed);
+         auto __current = __atomic_load_n(&_M_val, int(memory_order_relaxed));
          while (__current & _S_lock_bit)
            {
#if __glibcxx_atomic_wait
              __detail::__thread_relax();
#endif
-             __current = _M_val.load(memory_order_relaxed);
+             __current = __atomic_load_n(&_M_val, int(memory_order_relaxed));
            }

          _GLIBCXX_TSAN_MUTEX_TRY_LOCK(&_M_val);

-         while (!_M_val.compare_exchange_strong(__current,
-                                                __current | _S_lock_bit,
-                                                __o,
-                                                memory_order_relaxed))
+         while (!__atomic_compare_exchange_n(
+                   &_M_val, &__current, __current | _S_lock_bit, 0,
+                   int(__o), int(memory_order_relaxed)))
            {
              _GLIBCXX_TSAN_MUTEX_TRY_LOCK_FAILED(&_M_val);
#if __glibcxx_atomic_wait
@@ -474,7 +474,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
        unlock(memory_order __o) const noexcept
        {
          _GLIBCXX_TSAN_MUTEX_PRE_UNLOCK(&_M_val);
-         _M_val.fetch_sub(1, __o);
+         __atomic_fetch_sub(&_M_val, 1, int(__o));
          _GLIBCXX_TSAN_MUTEX_POST_UNLOCK(&_M_val);
        }

@@ -487,7 +487,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
            __o = memory_order_release;
          auto __x = reinterpret_cast<uintptr_t>(__c._M_pi);
          _GLIBCXX_TSAN_MUTEX_PRE_UNLOCK(&_M_val);
-         __x = _M_val.exchange(__x, __o);
+         __x = __atomic_exchange_n(&_M_val, __x, int(__o));
          _GLIBCXX_TSAN_MUTEX_POST_UNLOCK(&_M_val);
          __c._M_pi = reinterpret_cast<pointer>(__x & ~_S_lock_bit);
        }
@@ -495,19 +495,45 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
#if __glibcxx_atomic_wait
        // Precondition: caller holds lock!
        void
-       _M_wait_unlock(memory_order __o) const noexcept
+       _M_wait_unlock(const element_type* const& __ptr, memory_order __o) 
const noexcept
        {
+         auto __old_ptr = __ptr;
          _GLIBCXX_TSAN_MUTEX_PRE_UNLOCK(&_M_val);
-         auto __v = _M_val.fetch_sub(1, memory_order_relaxed);
+         auto __old_pi
+           = __atomic_sub_fetch(&_M_val, 1, int(memory_order_relaxed));
          _GLIBCXX_TSAN_MUTEX_POST_UNLOCK(&_M_val);
-         _M_val.wait(__v & ~_S_lock_bit, __o);
+
+         // Ensure that the correct value of _M_ptr is visible after locking,
+         // by upgrading relaxed or consume to acquire.
+         auto __lo = __o;
+         if (__o != memory_order_seq_cst)
+           __lo = memory_order_acquire;


I noticed that if the user calls wait(memory_order::acq_rel) then this
will downgrade it to just acquire. But wait has a precondition that
the order is not release or acq_rel, so that's fine.

+
+         std::__atomic_wait_address(
+           &_M_val,
+           [=, &__ptr, this](uintptr_t __new_pi)
+             {
+               if (__old_pi != (__new_pi & ~_S_lock_bit))
+                 // control block changed, we can wake up
+                 return true;
+
+               // control block is same, we need to check if ptr changed,
+               // the lock needs to be taken first, the value of pi may have
+               // also been updated in meantime, so reload it
+               __new_pi = reinterpret_cast<uintptr_t>(this->lock(__lo));
+               auto __new_ptr = __ptr;
+               this->unlock(memory_order_relaxed);
+               // wake up if either of the values changed
+               return __new_pi != __old_pi || __new_ptr != __old_ptr;
+             },
+           [__o, this] { return __atomic_load_n(&_M_val, int(__o)); });
        }

        void
        notify_one() noexcept
        {
          _GLIBCXX_TSAN_MUTEX_PRE_SIGNAL(&_M_val);
-         _M_val.notify_one();
+         std::__atomic_notify_address(&_M_val, false);
          _GLIBCXX_TSAN_MUTEX_POST_SIGNAL(&_M_val);
        }

@@ -515,17 +541,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
        notify_all() noexcept
        {
          _GLIBCXX_TSAN_MUTEX_PRE_SIGNAL(&_M_val);
-         _M_val.notify_all();
+         std::__atomic_notify_address(&_M_val, true);
          _GLIBCXX_TSAN_MUTEX_POST_SIGNAL(&_M_val);
        }
#endif

     private:
-       mutable __atomic_base<uintptr_t> _M_val{0};
+       mutable uintptr_t _M_val{0};
        static constexpr uintptr_t _S_lock_bit{1};
     };

-      typename _Tp::element_type* _M_ptr = nullptr;
+      element_type* _M_ptr = nullptr;
     _Atomic_count _M_refcount;

     static typename _Atomic_count::pointer
@@ -608,7 +634,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     {
        auto __pi = _M_refcount.lock(memory_order_acquire);
        if (_M_ptr == __old._M_ptr && __pi == __old._M_refcount._M_pi)
-         _M_refcount._M_wait_unlock(__o);
+         _M_refcount._M_wait_unlock(_M_ptr, __o);
        else
          _M_refcount.unlock(memory_order_relaxed);
     }
diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 5f5963cb595..51defbce78b 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -287,7 +287,7 @@ class SharedPointerPrinter(printer_base):
   def _get_refcounts(self):
       if self._typename == 'std::atomic':
           # A tagged pointer is stored as uintptr_t.
-            ptr_val = self._val['_M_refcount']['_M_val']['_M_i']
+            ptr_val = self._val['_M_refcount']['_M_val']


This will make the new printer unable to debug the old version of
std::atomic<std::shared_ptr<T>>, which can happen if somebody compiles
a C++20 TU with GCC 15 and links it to a C++17 TU compiled with GCC
16, and runs the resulting executable with the libstdc++.so (and GDB
printers) from GCC 16. That's a valid use case, which doesn't violate
the ODR by having two ABI-incompatible versions of
std::atomic<std::shared_ptr<T>> in the program (there's only one TU
that uses the type, and it uses the old definition).

The simplest way to handle that is probably:

       if self._typename == 'std::atomic':
           # A tagged pointer is stored as uintptr_t.
           val = self._val['_M_refcount']['_M_val']
           if val.type.is_scalar: # GCC 16 stores uintptr_t
               ptr_val = val
           else: # GCC 12-15 stores std::atomic<uintptr_t>
               ptr_val = val['_M_i']

That should work correctly for all versions.


The gdb.Type API (including the is_scalar member) is documented at:
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Types-In-Python.html

I also frequently refer to the gdb.Value docs:
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Values-From-Inferior.html

           ptr_val = ptr_val - (ptr_val % 2)  # clear lock bit
           ptr_type = find_type(self._val['_M_refcount'].type, 'pointer')
           return ptr_val.cast(ptr_type)
diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/atomic/pr118757.cc 
b/libstdc++-v3/testsuite/20_util/shared_ptr/atomic/pr118757.cc
new file mode 100644
index 00000000000..9c3dc01255c
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/atomic/pr118757.cc
@@ -0,0 +1,46 @@
+// Copyright (C) 2014-2025 Free Software Foundation, Inc.


I don't think this test is based on anything from 2014, is it?

And if you're contributing under the DCO not under the Red Hat
corporate assignment then this isn't copyright FSF anyway, so
shouldn't have any of this text.

+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++20 } }
+// { dg-require-gthreads "" }
+// { dg-require-effective-target hosted }
+
+#include <memory>
+#include <chrono>
+#include <thread>
+#include <barrier>
+
+std::shared_ptr<int> q = std::make_shared<int>(42);
+std::atomic<std::shared_ptr<int>> p = q;
+
+std::barrier bar(2);
+
+void signaler()


"signaller" with two Ls

Same comments for the weak_ptr test.

+{
+  std::this_thread::sleep_for(std::chrono::seconds(1));
+  p.store(std::shared_ptr<int>(q, nullptr));
+  p.notify_one();
+  bar.arrive_and_wait();
+}
+
+int main(int, char**)
+{
+  std::thread thr(signaler);
+  p.wait(q);
+  bar.arrive_and_wait();
+  thr.join();
+}
diff --git a/libstdc++-v3/testsuite/20_util/weak_ptr/pr118757.cc 
b/libstdc++-v3/testsuite/20_util/weak_ptr/pr118757.cc
new file mode 100644
index 00000000000..c1110241f09
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/weak_ptr/pr118757.cc
@@ -0,0 +1,47 @@
+// Copyright (C) 2014-2025 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++20 } }
+// { dg-require-gthreads "" }
+// { dg-require-effective-target hosted }
+
+#include <memory>
+#include <chrono>
+#include <thread>
+#include <barrier>
+
+std::shared_ptr<int> s = std::make_shared<int>(42);
+std::weak_ptr<int> q = s;
+std::atomic<std::weak_ptr<int>> p = q;
+
+std::barrier bar(2);
+
+void signaler()
+{
+  std::this_thread::sleep_for(std::chrono::seconds(1));
+  p.store(std::shared_ptr<int>(s, nullptr));
+  p.notify_one();
+  bar.arrive_and_wait();
+}
+
+int main(int, char**)
+{
+  std::thread thr(signaler);
+  p.wait(q);
+  bar.arrive_and_wait();
+  thr.join();
+}
--
2.50.1

Re: [PATCH v2] libstdc++: Make atomic>::wait sensitive to stored pointer only changes [PR118757]

Reply via email to