https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821
--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #17)
> Hm, even with the latest patch, the testcase from comment #5 still compiles
> to:
>
>         movl    %esi, %eax
>         movw    %si, (%rdi)
>         notl    %esi
>         notl    %eax
>         movb    %sil, 3(%rdi)
>         movb    %ah, 2(%rdi)
>         ret

The reason for that is that the IL is something the bswap framework can't
handle.  Let's look just at the simplified:

void
baz (char *buf, unsigned int data)
{
  buf[2] = ~data >> 8;
  buf[3] = ~data;
}

which gives:

  _1 = ~data_6(D);
  _2 = _1 >> 8;
  _3 = (char) _2;
  MEM[(char *)buf_7(D) + 2B] = _3;
  _4 = (char) data_6(D);
  _5 = ~_4;
  MEM[(char *)buf_7(D) + 3B] = _5;

If it was instead:

  _1 = ~data_6(D);
  _2 = _1 >> 8;
  _3 = (char) _2;
  MEM[(char *)buf_7(D) + 2B] = _3;
  _4 = (char) _1;
  MEM[(char *)buf_7(D) + 3B] = _4;

then it would handle that.  So I think it is a missed optimization in FRE or
whatever else does SCCVN, or something match.pd should handle.

As for:

> void baz (char *buf, unsigned int data)
> {
>   buf[0] = data >> 8;
>   buf[1] = data;
> }

not using movbew, that is something that should be done in the backend.  For
the middle-end, we don't have bswap16 and consider {L,R}ROTATE_EXPR by 8 as
the canonical 16-bit byte swap.

Please also have a look at:

unsigned short
baz (unsigned short *buf)
{
  unsigned short a = buf[0];
  return ((unsigned short) (a >> 8)) | (unsigned short) (a << 8);
}

where we could also emit movbew instead of movw + rolw (if it is actually a
win).  Thus, I think i386.md should provide patterns for combine (or
peephole2, if the former doesn't work for some reason) for this.