https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94622
--- Comment #1 from acsawdey at gcc dot gnu.org ---
Compiling with -dap we see:

        sync                      # 7   [c=12 l=4]  *hwsync
        plq 8,.LANCHOR0@pcrel     # 8   [c=8 l=12]  load_quadpti
        mr 10,9                   # 9   [c=4 l=4]  *movdi_internal64/2
        mr 11,8                   # 10  [c=4 l=4]  *movdi_internal64/2

I think the problem is that atomic_load<mode> thinks it always needs to do a
doubleword swap if little endian for TImode, which is true for lq, but not for
plq.
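
To make the reasoning concrete, here is a small C model of the register-pair ordering. It is only a sketch and not GCC code. It takes the register ordering for lq and plq from the statement above rather than from the ISA text, and it assumes that little-endian TImode keeps the low doubleword in the lower-numbered register. All type and function names (le_quad, reg_pair, model_lq, model_plq, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 128-bit value as it sits in little-endian memory: low doubleword first. */
typedef struct { uint64_t lo, hi; } le_quad;

/* An even/odd GPR pair, e.g. r8/r9 in the dump above. */
typedef struct { uint64_t even, odd; } reg_pair;

/* ASSUMPTION (from the comment): on little endian, lq leaves the high
   doubleword in the even register, so a swap is needed afterwards. */
static reg_pair model_lq(const le_quad *m)
{
    reg_pair r = { m->hi, m->lo };
    return r;
}

/* ASSUMPTION (from the comment): on little endian, plq already leaves the
   low doubleword in the even register, so no swap is needed. */
static reg_pair model_plq(const le_quad *m)
{
    reg_pair r = { m->lo, m->hi };
    return r;
}

/* What the two mr instructions (insns 9 and 10) do to the pair. */
static reg_pair swap_doublewords(reg_pair r)
{
    reg_pair s = { r.odd, r.even };
    return s;
}

/* ASSUMPTION: little-endian TImode wants the low doubleword in the
   lower-numbered (even) register. */
static bool matches_le_timode(reg_pair r, const le_quad *m)
{
    return r.even == m->lo && r.odd == m->hi;
}

/* The swap is right after lq, and wrong after plq. */
static void check_swap_rule(void)
{
    le_quad m = { 0x0123456789abcdefULL, 0xfedcba9876543210ULL };

    assert(matches_le_timode(swap_doublewords(model_lq(&m)), &m));
    assert(matches_le_timode(model_plq(&m), &m));
    assert(!matches_le_timode(swap_doublewords(model_plq(&m)), &m));
}
```

Under these assumptions, applying the swap unconditionally leaves the two halves of the result reversed whenever plq is selected, which corresponds to the mr 10,9 / mr 11,8 pair following plq in the dump.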