This change doubles the effective range of the std::shared_ptr and std::weak_ptr reference counts for most 64-bit targets.
The counter type, _Atomic_word, is usually a signed 32-bit int (except on Solaris v9 where it is a signed 64-bit long). The return type of std::shared_ptr::use_count() is long. For targets where long is wider than _Atomic_word (most 64-bit targets) we can treat the _Atomic_word reference counts as unsigned and allow them to wrap around from their most positive value to their most negative value without any problems. The logic that operates on the counts only cares if they are zero or non-zero, and never performs relational comparisons. The atomic fetch_add operations on integers are required by the standard to behave like unsigned types, so that overflow is well-defined: "the result is as if the object value and parameters were converted to their corresponding unsigned types, the computation performed on those types, and the result converted back to the signed type." So if we allow the counts to wrap around to negative values, all we need to do is cast the value to make_unsigned_t<_Atomic_word> before returning it as long from the use_count() function. In practice even exceeding INT_MAX is extremely unlikely, as it would require billions of shared_ptr or weak_ptr objects to have been constructed and never destroyed. However, if that happens we now have double the range before the count returns to zero and causes problems. Some of the member functions for the _Sp_counted_base<_S_single> specialization are adusted to use the __atomic_add and __exchange_and_add helpers instead of plain ++ and -- operations. This is done because those helpers use unsigned arithmetic, where the plain increments and decrements would have undefined behaviour on overflow. libstdc++-v3/ChangeLog: PR libstdc++/71945 * include/bits/shared_ptr_base.h (_Sp_counted_base::_M_get_use_count): Cast _M_use_count to unsigned before returning as long. (_Sp_counted_base<_S_single>::_M_add_ref_copy): Use atomic helper function to adjust ref count using unsigned arithmetic. (_Sp_counted_base<_S_single>::_M_weak_release): Likewise. (_Sp_counted_base<_S_single>::_M_get_use_count): Cast _M_use_count to unsigned before returning as long. (_Sp_counted_base<_S_single>::_M_add_ref_lock_nothrow): Use _M_add_ref_copy to do increment using unsigned arithmetic. (_Sp_counted_base<_S_single>::_M_release): Use atomic helper and _M_weak_release to do decrements using unsigned arithmetic. (_Sp_counted_base<_S_mutex>::_M_release): Add comment. (_Sp_counted_base<_S_single>::_M_weak_add_ref): Remove specialization. --- v2: This now only changes the ref counts to be treated as unsigned. Trapping on counter overflow is moved to PATCH 3/3. Tested powerpc64le-linux. libstdc++-v3/include/bits/shared_ptr_base.h | 70 ++++++++++++--------- 1 file changed, 40 insertions(+), 30 deletions(-) diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h index fb868e7afc36..2ea53ae15998 100644 --- a/libstdc++-v3/include/bits/shared_ptr_base.h +++ b/libstdc++-v3/include/bits/shared_ptr_base.h @@ -224,9 +224,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION long _M_get_use_count() const noexcept { + // If long is wider than _Atomic_word then we can treat _Atomic_word + // as unsigned, and so double its usable range. If the widths are the + // same then casting to unsigned and then to long is a no-op. + using _Up = typename make_unsigned<_Atomic_word>::type; + // No memory barrier is used here so there is no synchronization // with other threads. - return __atomic_load_n(&_M_use_count, __ATOMIC_RELAXED); + return (_Up) __atomic_load_n(&_M_use_count, __ATOMIC_RELAXED); } private: @@ -237,6 +242,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _Atomic_word _M_weak_count; // #weak + (#shared != 0) }; + // We use __atomic_add_single and __exchange_and_add_single in the _S_single + // member specializations because they use unsigned arithmetic and so avoid + // undefined overflow. + template<> + inline void + _Sp_counted_base<_S_single>::_M_add_ref_copy() + { __gnu_cxx::__atomic_add_single(&_M_use_count, 1); } + + template<> + inline void + _Sp_counted_base<_S_single>::_M_weak_release() noexcept + { + if (__gnu_cxx::__exchange_and_add_single(&_M_weak_count, -1) == 1) + _M_destroy(); + } + + template<> + inline long + _Sp_counted_base<_S_single>::_M_get_use_count() const noexcept + { + using _Up = typename make_unsigned<_Atomic_word>::type; + return (_Up) _M_use_count; + } + + template<> inline bool _Sp_counted_base<_S_single>:: @@ -244,7 +274,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { if (_M_use_count == 0) return false; - ++_M_use_count; + _M_add_ref_copy(); return true; } @@ -256,6 +286,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __gnu_cxx::__scoped_lock sentry(*this); if (__gnu_cxx::__exchange_and_add_dispatch(&_M_use_count, 1) == 0) { + // Count was zero, so we cannot lock it to get a shared_ptr. + // Reset to zero. This isn't racy, because there are no shared_ptr + // objects using this count and any other weak_ptr objects using it + // must call this function to modify _M_use_count, so would be + // synchronized by the mutex. _M_use_count = 0; return false; } @@ -282,20 +317,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return true; } - template<> - inline void - _Sp_counted_base<_S_single>::_M_add_ref_copy() - { ++_M_use_count; } - template<> inline void _Sp_counted_base<_S_single>::_M_release() noexcept { - if (--_M_use_count == 0) + if (__gnu_cxx::__exchange_and_add_single(&_M_use_count, -1) == 1) { - _M_dispose(); - if (--_M_weak_count == 0) - _M_destroy(); + _M_dispose(); + _M_weak_release(); } } @@ -364,25 +393,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #pragma GCC diagnostic pop } - template<> - inline void - _Sp_counted_base<_S_single>::_M_weak_add_ref() noexcept - { ++_M_weak_count; } - - template<> - inline void - _Sp_counted_base<_S_single>::_M_weak_release() noexcept - { - if (--_M_weak_count == 0) - _M_destroy(); - } - - template<> - inline long - _Sp_counted_base<_S_single>::_M_get_use_count() const noexcept - { return _M_use_count; } - - // Forward declarations. template<typename _Tp, _Lock_policy _Lp = __default_lock_policy> class __shared_ptr; -- 2.51.0