https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102566
Marko Mäkelä <marko.makela at mariadb dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marko.makela at mariadb dot com

--- Comment #31 from Marko Mäkelä <marko.makela at mariadb dot com> ---
Much of this seems to work in GCC 12.2.0 as well as in clang++-15. For clang
there is a related ticket https://github.com/llvm/llvm-project/issues/37322

I noticed a missed optimization in both g++-12 and clang++-15: some operations
involving bit 31 degrade to loops around lock cmpxchg. I compiled the test
below with "-c -O2" (AMD64) or "-c -O2 -m32 -march=i686" (IA-32).

#include <atomic>
#include <cstdint>
template<uint32_t b>
void lock_bts(std::atomic<uint32_t> &a) { while (!(a.fetch_or(b) & b)); }
template<uint32_t b>
void lock_btr(std::atomic<uint32_t> &a) { while (a.fetch_and(~b) & b); }
template<uint32_t b>
void lock_btc(std::atomic<uint32_t> &a) { while (a.fetch_xor(b) & b); }
// bit 30: no lock cmpxchg loop
template void lock_bts<1U<<30>(std::atomic<uint32_t> &a);
template void lock_btr<1U<<30>(std::atomic<uint32_t> &a);
template void lock_btc<1U<<30>(std::atomic<uint32_t> &a);
// bug: uses lock cmpxchg
template void lock_bts<1U<<31>(std::atomic<uint32_t> &a);
template void lock_btr<1U<<31>(std::atomic<uint32_t> &a);
template void lock_btc<1U<<31>(std::atomic<uint32_t> &a);
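
For reference, here is a minimal sketch of what the three operations compute at the source level. It is not part of the reproducer above, and the helper names (test_and_set, test_and_reset, test_and_complement, exercise) are hypothetical. Each helper returns whether the selected bit was set before the atomic update, which is the only part of the fetched value that the loops above inspect. The sequence behaves the same for bit 30 and bit 31, so any difference between the two masks should be confined to the generated instructions.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helpers for illustration; the names do not come from the
// bug report. Each one returns the previous state of the bit selected by b.
template<uint32_t b>
bool test_and_set(std::atomic<uint32_t> &a)
{ return (a.fetch_or(b) & b) != 0; }

template<uint32_t b>
bool test_and_reset(std::atomic<uint32_t> &a)
{ return (a.fetch_and(~b) & b) != 0; }

template<uint32_t b>
bool test_and_complement(std::atomic<uint32_t> &a)
{ return (a.fetch_xor(b) & b) != 0; }

// Runs the same sequence for a single-bit mask b and reports whether
// every step observed the expected previous bit state.
template<uint32_t b>
bool exercise()
{
  std::atomic<uint32_t> a{0};
  const bool r1 = test_and_set<b>(a);        // bit was clear, is now set
  const bool r2 = test_and_set<b>(a);        // bit was already set
  const bool r3 = test_and_reset<b>(a);      // bit was set, is now clear
  const bool r4 = test_and_complement<b>(a); // bit was clear, is now set
  const bool r5 = test_and_complement<b>(a); // bit was set, is now clear
  return !r1 && r2 && r3 && !r4 && r5 && a.load() == 0;
}
```

Calling exercise<1U<<30>() and exercise<1U<<31>() should both return true. To see how a given compiler lowers the helpers, the disassembly of the -O2 object file (for example with objdump -d) can be checked for lock bts, lock btr and lock btc versus a lock cmpxchg loop.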