https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99520
--- Comment #4 from ktkachov at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #3)
> Consider e.g.
> unsigned foo (unsigned x)
> {
>   return (x<<24) + ((x<<8)&0xff0000) + ((x>>8)&0xff00) + (x>>24) +
> (((x&0xff00)<<16)>>8);
> }
> as example that should not be optimized into __builtin_bswap32 (but should
> be with | instead of +).

interesting. That said, clang does do a better job than GCC on that too:

foo(unsigned int):                      // @foo(unsigned int)
        lsl     w9, w0, #8
        rev     w8, w0
        and     w9, w9, #0xff0000
        add     w0, w9, w8
        ret
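
For reference, a small sketch of why the + form is not a plain byte swap while the | form is. The last term, (((x&0xff00)<<16)>>8), is the same value as ((x<<8)&0xff0000), so byte 1 of x is contributed twice: with | the duplicate is absorbed, with + it is added a second time. That is what the clang output above computes (rev plus the masked shift). The names foo_add, foo_or and check_examples are illustrative only and do not come from the PR; uint32_t is used instead of unsigned so the width is explicit, and __builtin_bswap32 assumes GCC or clang.

```c
#include <assert.h>
#include <stdint.h>

/* The expression from comment #3, with '+' joining the terms. */
uint32_t foo_add(uint32_t x)
{
    return (x << 24) + ((x << 8) & 0xff0000) + ((x >> 8) & 0xff00) + (x >> 24)
           + (((x & 0xff00) << 16) >> 8);
}

/* The same terms joined with '|'; the duplicated byte is absorbed. */
uint32_t foo_or(uint32_t x)
{
    return (x << 24) | ((x << 8) & 0xff0000) | ((x >> 8) & 0xff00) | (x >> 24)
           | (((x & 0xff00) << 16) >> 8);
}

/* Hypothetical self-check, not part of the PR. */
void check_examples(void)
{
    uint32_t x = 0x12345678u;

    /* '|' form is exactly a byte swap. */
    assert(foo_or(x) == __builtin_bswap32(x));

    /* '+' form is the byte swap plus byte 1 of x placed at bits 16..23. */
    assert(foo_add(x) == __builtin_bswap32(x) + ((x << 8) & 0xff0000));
    assert(foo_add(x) != __builtin_bswap32(x));
}
```

With x = 0x12345678 the | form gives 0x78563412 and the + form gives 0x78ac3412, since 0x56 is added into bits 16..23 twice.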