https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111589
--- Comment #3 from Jonathan Wakely <redi at gcc dot gnu.org> --- (In reply to Jonathan Wakely from comment #2) > The interesting question is whether all of these can be relaxed or if we > need to stop using __atomic_add_dispatch for shared_ptr copies: > > include/bits/cow_string.h: > __gnu_cxx::__atomic_add_dispatch(&this->_M_refcount, 1); This is basic_string::_M_add_ref_copy() which increases the refcount on a COW string. I think this can be relaxed. We only need to synchronize with decrements of that refcount, and any change to the existing string that causes a decrement needs to be synchronized explicitly with the copy anyway. The refcount increment isn't such a synchronization operation, so can be relaxed. > include/bits/cow_string.h: > __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1); This is the COW string move constructor when --enable-fully-dynamic-string is used to configure libstdc++. This can be relaxed for the same reasons. > include/bits/ios_base.h: _M_add_reference() { > __gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); } This is increases the refcount of callbacks in ios::copyfmt. This can be relaxed. We only need to synchrnonize with decrements, which only happen in the ~ios destructor, which cannot be potentially concurrent with copying the ios object. > include/bits/locale_classes.h: { > __gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); } locale::facet::_M_add_reference(). Can be relaxed, because any remove_reference call cannot happen concurrently on the same object, so doesn't need to synchronize with the increment. > include/bits/locale_classes.h: { > __gnu_cxx::__atomic_add_dispatch(&_M_refcount, 1); } locale::id::_M_add_reference(). Same again. > include/bits/shared_ptr_base.h: { > __gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); } That's the subject of this PR. > include/bits/shared_ptr_base.h: { > __gnu_cxx::__atomic_add_dispatch(&_M_weak_count, 1); } I think the same logic applies to this as well. > include/ext/atomicity.h: // __atomic_add_dispatch > include/ext/atomicity.h: __atomic_add_dispatch(_Atomic_word* __mem, int > __val) That's the actual function we're talking about changing. > include/ext/pool_allocator.h: __atomic_add_dispatch(&_S_force_new, > 1); > include/ext/pool_allocator.h: __atomic_add_dispatch(&_S_force_new, > -1); This just looks like a data race, without changing anything. The getenv call can probably be considered a synchronization point, and I don't think we need to synchronize with other threads here anyway. We should consider an relaxed atomic load for _S_force_new as well, to avoid the race. In any case, it can be a relaxed inc/dec. > include/ext/rc_string_base.h: > __atomic_add_dispatch(&_M_info._M_refcount, 1); Equivalent to the COW string _M_refcopy(). > include/tr1/shared_ptr.h: { > __gnu_cxx::__atomic_add_dispatch(&_M_use_count, 1); } > include/tr1/shared_ptr.h: { > __gnu_cxx::__atomic_add_dispatch(&_M_weak_count, 1); } Equivalent to std::shared_ptr and std::weak_ptr. > libsupc++/eh_atomics.h: __gnu_cxx::__atomic_add_dispatch (__count, 1); This looks like it needs acq_rel. We could use __exchange_and_add to get acq_rel ordering. > src/c++98/ios_init.cc: __gnu_cxx::__atomic_add_dispatch(&_S_refcount, 1); I think there's a race here, independent of the ordering used for this increment.