https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95228
Bug ID: 95228
Summary: Failure to optimize register allocation around atomic loads/stores
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---

    int x;
    int y;

    int f()
    {
        int ret;
        __atomic_load(&y, &ret, 0);
        int val = 0;
        __atomic_store(&y, &val, 5);
        return ret;
    }

With -O3, GCC outputs this:

    f():
      mov r8d, DWORD PTR y[rip]
      xor eax, eax
      xchg eax, DWORD PTR y[rip]
      mov eax, r8d
      ret

LLVM outputs this:

    f(): # @f()
      mov eax, dword ptr [rip + .Ly$local]
      xor ecx, ecx
      xchg dword ptr [rip + .Ly$local], ecx
      ret

The store could use ecx instead of eax, as in the LLVM output. That would let the loaded value of y go directly into eax rather than into a separate register (r8d) followed by an extra mov.
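For readers unfamiliar with the numeric memory-order arguments, the sketch below restates the test case using GCC's predefined macros. It assumes the usual GCC values, where 0 is __ATOMIC_RELAXED and 5 is __ATOMIC_SEQ_CST. The name f_named is hypothetical and is not part of the report.

```c
int y;

/* Same logic as f() in the report, with the memory orders named.
   f_named is a hypothetical name used only for this illustration. */
int f_named(void)
{
    int ret;
    /* Relaxed load: written as memory order 0 in the report. */
    __atomic_load(&y, &ret, __ATOMIC_RELAXED);
    int val = 0;
    /* Sequentially consistent store: written as memory order 5 in the report. */
    __atomic_store(&y, &val, __ATOMIC_SEQ_CST);
    return ret;
}
```

The sequentially consistent store is what both compilers implement with xchg in the listings above, so the choice of scratch register for that instruction is where the two outputs differ.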