https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105708

            Bug ID: 105708
           Summary: libgcc: aarch64: init_lse_atomics can race with
                    user-defined constructors
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgcc
          Assignee: unassigned at gcc dot gnu.org
          Reporter: keno at juliacomputing dot com
  Target Milestone: ---

Recent gcc versions provide the `-moutline-atomics` option that outlines
aarch64 atomics into calls to libgcc that dispatch to either lse atomics or
legacy ll/sc atomics depending on the availability of the feature on the target
platform.

This is useful for performance (since lse atomics have better performance
characteristics), but also for projects like the rr (https://rr-project.org/)
userspace record-and-replay debugger, which emulates an aarch64 machine without
ll/sc instructions (because ll/sc introduces non-deterministic control flow
divergences that rr cannot record).

The feature detection uses the following function in libgcc
(config/aarch64/lse-init.c):
```
static void __attribute__((constructor))
init_have_lse_atomics (void)
{
  unsigned long hwcap = __getauxval (AT_HWCAP);
  __aarch64_have_lse_atomics = (hwcap & HWCAP_ATOMICS) != 0;
}
```

Unfortunately, the order of `init_have_lse_atomics` is not defined with
respect to other uses of `__attribute__((constructor))`. As a result, other
constructors that use atomics may run before the flag is set and end up using
ll/sc instructions, even if lse atomics are supported on the target (usually
only a performance penalty, but, as mentioned above, a significant concern for
projects like rr).
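
To make the failure mode concrete, here is a minimal sketch that forces the
unlucky ordering deterministically. It is not the libgcc code: `have_lse_flag`,
`detection_done`, `user_ctor`, `init_flag` and `user_ctor_took_fallback` are
hypothetical names, and detection is assumed to report that lse is available.
It relies on GCC running constructors that have an explicit priority before
those that have none.
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for libgcc's __aarch64_have_lse_atomics flag.
   Zero-initialized, which reads as "no lse" until detection has run.  */
static bool have_lse_flag;
static bool detection_done;
static bool user_ctor_saw_lse;

/* Hypothetical user constructor.  The explicit priority makes it run
   before the unprioritized detection constructor below, reproducing
   the unlucky ordering on purpose.  */
static void __attribute__ ((constructor (101)))
user_ctor (void)
{
  user_ctor_saw_lse = have_lse_flag;   /* detection has not run yet */
}

/* Stand-in for init_have_lse_atomics; assumes the CPU supports lse.  */
static void __attribute__ ((constructor))
init_flag (void)
{
  have_lse_flag = true;
  detection_done = true;
}

/* True if the user constructor would have taken the ll/sc path even
   though lse turned out to be available.  Call from main () or later.  */
bool
user_ctor_took_fallback (void)
{
  assert (detection_done);   /* all constructors have finished by now */
  return have_lse_flag && !user_ctor_saw_lse;
}
```
In the real case neither constructor has a priority, so which of the two
outcomes occurs depends on link and load order.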

Worse, the initialization order can change with minor changes in the
environment. E.g. recent binary builds of Debian testing initialize lse too
late in libpthread, breaking rr. Earlier builds had the opposite initialization
order, allowing rr to work without issue.

I can see two possibilities to introduce more determinism here:
1. Use `__attribute__((constructor(100)))` (100 being the system library
priority used e.g. in libstdc++ as well) for `init_have_lse_atomics`, forcing a
deterministic initialization order wrt user (or libc)-defined constructors.
There are still ordering concerns wrt libstdc++ which also uses init_order 100,
but as far as I can tell does not use atomics in these constructors. If this
changes, the priority here could be further reduced in future iterations of
libgcc.

2. Switch the outlined atomics to probe lse support on first use rather than
using a constructor. If this is possible, I think it would be preferable to
avoid any possibility of initialization order problems, but I understand that
there may be code size and performance concerns.
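
The two options can be sketched side by side as follows. This is only an
illustration under stated assumptions, not a proposed patch: `probe_lse` is a
hypothetical stand-in for the `__getauxval (AT_HWCAP)` check and is assumed to
report lse as available, and all other names are made up for the example. The
real outlined atomics are written in assembly, so option 2 would look
different there. Priority 101 is used because it is the lowest value open to
user code; libgcc itself could use the reserved value 100.
```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the getauxval (AT_HWCAP) & HWCAP_ATOMICS
   check; assumed here to report that lse is available.  */
static bool probe_lse (void) { return true; }

/* ---- Option 1: detection in a prioritized constructor ---- */
static bool eager_done;
static bool eager_have_lse;
static bool user_ctor_saw_lse;

static void __attribute__ ((constructor (101)))
init_flag_early (void)
{
  eager_have_lse = probe_lse ();
  eager_done = true;
}

/* Ordinary constructor without a priority: runs after prioritized ones.  */
static void __attribute__ ((constructor))
user_ctor (void)
{
  assert (eager_done);               /* detection has already happened */
  user_ctor_saw_lse = eager_have_lse;
}

bool user_ctor_observed_lse (void) { return user_ctor_saw_lse; }

/* ---- Option 2: probe on first use, no constructor involved ---- */
enum { LSE_UNKNOWN = 0, LSE_ABSENT = 1, LSE_PRESENT = 2 };

static _Atomic int lse_state = LSE_UNKNOWN;
static int lazy_probe_calls;         /* illustration only, not thread-safe */

bool
have_lse_lazy (void)
{
  /* Relaxed accesses suffice: every thread that races here computes the
     same answer, so a duplicate probe is harmless.  */
  int s = atomic_load_explicit (&lse_state, memory_order_relaxed);
  if (s == LSE_UNKNOWN)
    {
      lazy_probe_calls++;
      s = probe_lse () ? LSE_PRESENT : LSE_ABSENT;
      atomic_store_explicit (&lse_state, s, memory_order_relaxed);
    }
  return s == LSE_PRESENT;
}

int lazy_probe_count (void) { return lazy_probe_calls; }
```
Option 2 adds a load and a branch to every call, which is the code size and
performance concern mentioned above; in exchange the result no longer depends
on when any constructor runs.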
