https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96034
sshannin at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from sshannin at gmail dot com ---
Ah, yes, you're correct on both counts.

For future reference, if anybody comes across this: I can confirm on both a
Sandy Bridge and a Skylake machine that the pxor does actually make it faster.
I should've checked first; I got too excited by "fewer instructions = better".

As for the ABI, I'm certainly not an expert, and if you say that the upper bits
are undefined, I defer to you. As you intuited, I was checking against LLVM
output (and it does omit the sign extension).

Sorry for the bother, and thanks for such a helpful response.