https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105708
Bug ID: 105708
Summary: libgcc: aarch64: init_lse_atomics can race with user-defined constructors
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgcc
Assignee: unassigned at gcc dot gnu.org
Reporter: keno at juliacomputing dot com
Target Milestone: ---

Recent gcc versions provide the `-moutline-atomics` option, which outlines aarch64 atomics into calls to libgcc that dispatch to either lse atomics or legacy ll/sc atomics, depending on whether the feature is available on the target platform. This is useful for performance (since lse atomics have better performance characteristics), but also for projects like the rr (https://rr-project.org/) userspace record-and-replay debugger, which emulates an aarch64 machine without ll/sc instructions (because ll/sc introduces non-deterministic control flow divergences that rr cannot record).

The feature detection uses the following function in libgcc (config/aarch64/lse-init.c):

```
static void __attribute__((constructor))
init_have_lse_atomics (void)
{
  unsigned long hwcap = __getauxval (AT_HWCAP);
  __aarch64_have_lse_atomics = (hwcap & HWCAP_ATOMICS) != 0;
}
```

Unfortunately, the order in which `init_have_lse_atomics` runs is not defined with respect to other uses of `__attribute__((constructor))`. As a result, other constructors that use atomics may end up executing ll/sc instructions even if lse atomics are supported on the target (usually only a performance penalty, but, as mentioned above, a significant concern for projects like rr). Worse, the initialization order can change with minor changes in the environment. E.g. recent binary builds of debian testing initialize lse too late in libpthread, breaking rr. Earlier builds had the opposite initialization order, allowing rr to work without issue.

I can see two possibilities to introduce more determinism here:

1. Use `__attribute__((constructor(100)))` (100 being the system library priority used e.g. in libstdc++ as well) for `init_have_lse_atomics`, forcing a deterministic initialization order wrt user (or libc)-defined constructors. There are still ordering concerns wrt libstdc++, which also uses priority 100, but as far as I can tell it does not use atomics in these constructors. If this changes, the priority here could be further reduced in future iterations of libgcc.

2. Switch the outlined atomics to probe lse support on first use rather than using a constructor. If this is possible, I think it would be preferable, since it avoids any possibility of initialization order problems, but I understand that there may be code size and performance concerns.
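
To make option 2 concrete, here is a rough user-space sketch of a probe-on-first-use check. It is not the libgcc implementation: the outlined helpers are written in assembly, and the names `lse_state`, `have_lse_atomics`, `user_ctor` and `seen_in_ctor` are hypothetical. It assumes Linux with glibc (`getauxval` from `<sys/auxv.h>`), and it falls back to the aarch64 bit value for `HWCAP_ATOMICS` when the header does not define it, so on other architectures the result only shows that the lazy path is self-consistent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <sys/auxv.h>   /* getauxval, AT_HWCAP (Linux/glibc assumed) */

#ifndef HWCAP_ATOMICS
/* aarch64 HWCAP bit for LSE; on other architectures this bit means
   something else, so the value is only a placeholder there. */
# define HWCAP_ATOMICS (1UL << 8)
#endif

/* Hypothetical stand-in for __aarch64_have_lse_atomics:
   -1 = not probed yet, 0 = no LSE, 1 = LSE available. */
static _Atomic signed char lse_state = -1;

/* Probe on first use. Concurrent first calls may each query the auxv,
   but they all store the same value, so the duplicate work is harmless.
   Relaxed loads and stores are used so that the probe itself should not
   need a read-modify-write atomic. */
static int have_lse_atomics(void)
{
    signed char s = atomic_load_explicit(&lse_state, memory_order_relaxed);
    if (s < 0) {
        s = (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0;
        atomic_store_explicit(&lse_state, s, memory_order_relaxed);
    }
    return s;
}

/* What a user-defined constructor observed, recorded for inspection. */
static int seen_in_ctor = -1;

/* A user constructor with default priority. Because the probe is lazy,
   it gets a resolved answer no matter which constructors ran before it. */
__attribute__((constructor))
static void user_ctor(void)
{
    seen_in_ctor = have_lse_atomics();
    assert(seen_in_ctor >= 0);
}
```

The cost is the extra load and branch on every call to the check, which is the code size and performance concern mentioned in option 2.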