https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121266
--- Comment #4 from Kang-Che Sung <Explorer09 at gmail dot com> ---
Sounds like a microcode issue in processors, no? I remember that `xor eax, eax`, a common pattern for setting eax to 0, is special-cased during decode so that it doesn't create a dependency on eax.

It's a pity that this little `or -1, eax` optimization is gone (except in '-Oz' mode). Note that gcc 15.1 uses two instructions, `push -1; pop rax`, for this. Clang 20.1.0 does the same according to my testing.
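For anyone who wants to reproduce the comparison, here is a minimal sketch in C. The function names `zero`, `all_ones` and `self_check` are made up for illustration, and the instruction sequences in the comments are only the candidates discussed in this comment, annotated with their standard x86-64 encodings. What a given compiler actually emits depends on its version and flags.

```c
#include <assert.h>

/* Returns 0. On x86-64 this is commonly compiled to `xor eax, eax`
 * (2 bytes: 31 C0), which modern cores treat as a zeroing idiom that
 * does not depend on the previous value of eax. */
int zero(void)
{
    return 0;
}

/* Returns -1 (all bits set). Candidate x86-64 sequences (Intel syntax):
 *
 *   mov  eax, -1         5 bytes (B8 FF FF FF FF), no input dependency
 *   or   eax, -1         3 bytes (83 C8 FF), reads eax, so it waits on
 *                        whatever last wrote eax
 *   push -1 ; pop rax    3 bytes (6A FF 58), two instructions, no
 *                        dependency on the old rax
 *
 * Which one appears in the output is an assumption to verify locally. */
int all_ones(void)
{
    return -1;
}

/* Sanity check that both helpers return the expected constants. */
void self_check(void)
{
    assert(zero() == 0);
    assert(all_ones() == -1);
}
```

Compiling this file with something like `gcc -O2 -S -masm=intel` and again with `-Oz` in place of `-O2`, then diffing the two assembly outputs, should show which sequence each mode picks for `all_ones`.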