https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56538
--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Should be fixed with glibc 2.32 by checking __libc_single_threaded before using atomics in shared_ptr.

I don't understand why the original example would use atomics anyway, though: __gthread_active_p() should have been false. I think what this example is showing is not that the atomics are expensive, but that the compiler can optimise away the dispatching code if both branches use the same instructions.
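
For readers unfamiliar with the dispatch being discussed, here is a minimal sketch of the idea, not the actual libstdc++ code. The helper names `is_single_threaded` and `exchange_and_add_dispatch` are hypothetical and chosen for illustration. The sketch assumes GCC or Clang (for the `__atomic_fetch_add` builtin) and uses `__libc_single_threaded` only when `<sys/single_threaded.h>` is available (glibc 2.32 or later); otherwise it conservatively takes the atomic path.

```cpp
#include <cassert>

// Use glibc's flag only if the header exists (glibc >= 2.32).
#if defined(__has_include)
# if __has_include(<sys/single_threaded.h>)
#  include <sys/single_threaded.h>
#  define SKETCH_HAVE_SINGLE_THREADED 1
# endif
#endif

// Hypothetical helper: true only when the process is known to have one thread.
inline bool is_single_threaded() noexcept {
#ifdef SKETCH_HAVE_SINGLE_THREADED
    return __libc_single_threaded != 0;
#else
    return false;  // assumption: without the flag, treat threads as possible
#endif
}

// Hypothetical helper: add delta to *count and return the previous value,
// skipping the atomic read-modify-write when no other thread can exist.
inline int exchange_and_add_dispatch(int* count, int delta) noexcept {
    if (is_single_threaded()) {
        int old = *count;   // plain load
        *count = old + delta;  // plain store
        return old;
    }
    // GCC/Clang builtin: atomic fetch-and-add with acquire-release ordering.
    return __atomic_fetch_add(count, delta, __ATOMIC_ACQ_REL);
}
```

A reference-count increment would call `exchange_and_add_dispatch(&count, 1)` and a decrement `exchange_and_add_dispatch(&count, -1)`. If a target happens to emit the same instructions for both branches, the compiler is free to merge them and drop the check, which is the effect the comment above attributes to the original example.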