https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88097
Bug ID: 88097
Summary: Missing optimization of endian conversion
Product: gcc
Version: 8.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: bugzi...@poradnik-webmastera.com
Target Milestone: ---

I have found some old network code which looked like this:

[code]
#include <stdint.h>
#include <arpa/inet.h>

struct Test
{
    uint16_t Word1;
    uint16_t Word2;
};

uint32_t test(Test* ip)
{
    return ((ntohs(ip->Word1) << 16) | ntohs(ip->Word2));
}
[/code]

gcc 8.2 compiles it in the following way (with -O3):

[asm]
test(Test*):
  movzx eax, WORD PTR [rdi]
  movzx edx, WORD PTR [rdi+2]
  rorw $8, ax
  rorw $8, dx
  sal eax, 16
  movzx edx, dx
  or eax, edx
  ret
[/asm]

clang 7.0.0 recognizes that the two 16-bit fields are adjacent in memory, so a single 32-bit byte swap can be used:

[asm]
test(Test*): # @test(Test*)
  mov eax, dword ptr [rdi]
  bswap eax
  ret
[/asm]

And this is with -mmovbe added:

[asm]
test(Test*): # @test(Test*)
  movbe eax, dword ptr [rdi]
  ret
[/asm]
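For reference, the equivalence that clang exploits can be written out by hand. The following is only a minimal sketch, not part of the original report: the function names test_two_swaps and test_one_swap are hypothetical, and the memcpy-based load is an assumption about how one might express the merged access in portable code. It also assumes the struct has no padding, so that Word1 and Word2 occupy four consecutive bytes.

```cpp
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

struct Test
{
    uint16_t Word1;
    uint16_t Word2;
};

// Same computation as in the report: two 16-bit conversions, then a merge.
// The cast to uint32_t is added here so the shift is done on an unsigned value.
uint32_t test_two_swaps(const Test* ip)
{
    return (static_cast<uint32_t>(ntohs(ip->Word1)) << 16) | ntohs(ip->Word2);
}

// Hypothetical hand-merged form: one 32-bit load followed by one 32-bit conversion.
// memcpy is used to read the four bytes without violating aliasing rules.
uint32_t test_one_swap(const Test* ip)
{
    uint32_t raw;
    memcpy(&raw, ip, sizeof raw);
    return ntohl(raw);
}
```

Both functions should return the same value for any input on either byte order, since each one reads the four bytes as a big-endian 32-bit integer. The request in this report is for gcc to derive the second form from the first automatically.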