https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71847
            Bug ID: 71847
           Summary: powerpc64le: Potential rlwinm optimisation
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: anton at samba dot org
  Target Milestone: ---

The following code:

unsigned long foo(unsigned long x)
{
        unsigned long y = (x & 0xffffffffUL);
        return y | (y << 32);
}

Results in 3 instructions on gcc:

foo:
        rldicl 3,3,0,32
        sldi 9,3,32
        or 3,9,3
        blr

Clang does it in 1, knowing that rotate32 starts with two 32 bit copies of the
data:

foo:
        rlwinm 3, 3, 0, 1, 0
        blr
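For reference, the single-instruction form works because, as the Power ISA
defines rlwinm in 64-bit mode, the rotate operates on the low 32 bits of the
source replicated into both halves of a doubleword, and a mask with MB=1, ME=0
wraps around to cover every bit. The C sketch below models that behaviour
under those assumptions. The helper names (mask64, rlwinm_model,
check_samples) are made up for illustration and are not GCC or Clang
internals.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the report, written with a fixed-width type. */
static uint64_t foo(uint64_t x)
{
        uint64_t y = x & 0xffffffffULL;
        return y | (y << 32);
}

/*
 * MASK(mstart, mstop) in IBM bit numbering (bit 0 is the most significant).
 * Ones in bits mstart..mstop; if mstart > mstop the range wraps around.
 * Both arguments are assumed to be in 0..63.
 */
static uint64_t mask64(unsigned mstart, unsigned mstop)
{
        uint64_t from_start = ~UINT64_C(0) >> mstart;      /* bits mstart..63 */
        uint64_t to_stop = ~UINT64_C(0) << (63 - mstop);   /* bits 0..mstop   */

        return (mstart <= mstop) ? (from_start & to_stop)
                                 : (from_start | to_stop);
}

/*
 * Illustrative model of "rlwinm RA,RS,SH,MB,ME" on a 64-bit implementation:
 * rotate the doubleword made of two copies of RS[32:63] left by SH, then
 * AND with MASK(MB+32, ME+32). SH, MB and ME are assumed to be in 0..31.
 */
static uint64_t rlwinm_model(uint64_t rs, unsigned sh, unsigned mb, unsigned me)
{
        uint64_t lo = rs & 0xffffffffULL;
        uint64_t dup = lo | (lo << 32);
        /* Avoid a shift by 64, which is undefined in C, when sh == 0. */
        uint64_t r = sh ? (dup << sh) | (dup >> (64 - sh)) : dup;

        return r & mask64(mb + 32, me + 32);
}

/* Compare the model of "rlwinm 3,3,0,1,0" against foo() on a few inputs. */
static void check_samples(void)
{
        static const uint64_t samples[] = {
                0, 1, 0xffffffffULL, 0x80000000ULL,
                0x123456789abcdef0ULL, ~UINT64_C(0),
        };

        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
                assert(rlwinm_model(samples[i], 0, 1, 0) == foo(samples[i]));
}
```

With SH=0, MB=1, ME=0 the mask is MASK(33, 32), which wraps to all ones, so
the result is just the duplicated low word, which is the same value foo()
computes.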