https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119796
Bug ID: 119796 Summary: Atomic Operations Can Deadlock Without Hardware Support Product: gcc Version: 14.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ Assignee: unassigned at gcc dot gnu.org Reporter: alpha.and.omega.programmer at gmail dot com Target Milestone: --- Created attachment 61107 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61107&action=edit Code that demonstrates deadlock This is actually a problem in libatomic and seems to affect the atomics implementation of both C and C++. I haven't tested other languages because I don't know other languages, but I suspect any language using libatomics is also impacted. When using any functionality from std::atomic without hardware atomic support, atomic operations are simulated at the software level. In particular, the current locking step of atomic operations is prone to deadlocks. The problem with the current implementation is that it relies on a hash of the lower 12 bits of the pointer address to index an array of mutexes, and because the index can wrap around the array, 2 threads trying to perform an atomic operation on data requiring more than half of the locks in the array will have potential overlap in the range of locks they attempt to acquire. Because of the wrap around, the competing threads may start locking threads at the end of the other thread's range, which causes a deadlock since neither thread would be able to acquire all of the locks it needs. Any locking scheme that attempts to be performant by using multiple mutexes can not be threadsafe because of race conditions, so the locking implementation must be protected by a single mutex controlling access to these fine-grained locks. However, the atomic operation that is being performed is likely to be much faster than trying to lock and unlock multiple mutexes based on some algorithm, so a simple global mutex to only allow a single atomic operation at a time would be ideal both for performance as well as safety.