https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119796

            Bug ID: 119796
           Summary: Atomic Operations Can Deadlock Without Hardware
                    Support
           Product: gcc
           Version: 14.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alpha.and.omega.programmer at gmail dot com
  Target Milestone: ---

Created attachment 61107
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=61107&action=edit
Code that demonstrates deadlock

This is actually a problem in libatomic and seems to affect the atomics
implementation of both C and C++. I haven't tested other languages because I
don't know other languages, but I suspect any language using libatomics is also
impacted.

When using any functionality from std::atomic without hardware atomic support,
atomic operations are simulated at the software level. In particular, the
current locking step of atomic operations is prone to deadlocks.

The problem with the current implementation is that it relies on a hash of the
lower 12 bits of the pointer address to index an array of mutexes, and because
the index can wrap around the array, 2 threads trying to perform an atomic
operation on data requiring more than half of the locks in the array will have
potential overlap in the range of locks they attempt to acquire. Because of the
wrap around, the competing threads may start locking threads at the end of the
other thread's range, which causes a deadlock since neither thread would be
able to acquire all of the locks it needs.

Any locking scheme that attempts to be performant by using multiple mutexes can
not be threadsafe because of race conditions, so the locking implementation
must be protected by a single mutex controlling access to these fine-grained
locks. However, the atomic operation that is being performed is likely to be
much faster than trying to lock and unlock multiple mutexes based on some
algorithm, so a simple global mutex to only allow a single atomic operation at
a time would be ideal both for performance as well as safety.

Reply via email to