http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57690
            Bug ID: 57690
           Summary: bextr sometimes used instead of shr
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org

unsigned int bar (void);

unsigned long
foo (unsigned int x)
{
  return bar () >> 2;
}

With -O2 -mtbm we get:

   0:	48 83 ec 08          	sub    $0x8,%rsp
   4:	e8 00 00 00 00       	callq  9 <foo+0x9>
			5: R_X86_64_PC32	bar-0x4
   9:	48 83 c4 08          	add    $0x8,%rsp
   d:	8f ea f8 10 c0 02 1e 	bextr  $0x1e02,%rax,%rax
  14:	00 00
  16:	c3                   	retq

while without it:

   0:	48 83 ec 08          	sub    $0x8,%rsp
   4:	e8 00 00 00 00       	callq  9 <foo+0x9>
			5: R_X86_64_PC32	bar-0x4
   9:	48 83 c4 08          	add    $0x8,%rsp
   d:	c1 e8 02             	shr    $0x2,%eax
  10:	c3                   	retq

which is much shorter (the shr is 3 bytes, the bextr is 9).

On the other hand, bextr with an immediate gives more freedom to the
register allocator, because it is a non-destructive source instruction.
So perhaps we want a peephole2 which will transform some forms of the
immediate TARGET_TBM tbm_bextr* (those where the upper bits of a SImode
or DImode value are extracted and where the destination is the same as
the source) into shrl. For -Os maybe it would be even shorter to emit
movl + shrl.
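
To make the equivalence concrete: in the immediate form of TBM bextr, bits 7:0 of the control word give the start bit and bits 15:8 give the field length, so $0x1e02 means "30 bits starting at bit 2", i.e. bits 2..31 of %rax. That is what shr $0x2,%eax computes, including the zero-extension into the upper half. The C sketch below is only a software model of that reading. The names bextr_emul, bextr_is_upper_bits and pr57690_example_matches are hypothetical helpers made up for illustration, not GCC code, and the predicate is just one possible way to phrase the "upper bits are extracted" condition such a peephole2 might test.

```c
#include <stdint.h>

/* Hypothetical helper: software model of immediate-form bextr on a 64-bit
   source.  Control bits 7:0 = start bit, bits 15:8 = field length.  */
static uint64_t
bextr_emul (uint64_t src, uint32_t control)
{
  unsigned start = control & 0xff;
  unsigned len = (control >> 8) & 0xff;

  if (start >= 64 || len == 0)
    return 0;                      /* nothing to extract */
  src >>= start;                   /* bits above bit 63 read as zero */
  if (len >= 64)
    return src;
  return src & ((UINT64_C (1) << len) - 1);
}

/* Hypothetical predicate: does CONTROL extract exactly the upper bits of a
   WIDTH-bit value (WIDTH is 32 for SImode, 64 for DImode)?  If so, and the
   destination equals the source, a logical right shift by the start bit
   produces the same result.  */
static int
bextr_is_upper_bits (uint32_t control, unsigned width)
{
  unsigned start = control & 0xff;
  unsigned len = (control >> 8) & 0xff;

  return len != 0 && start + len == width;
}

/* The case from this report: bextr $0x1e02,%rax,%rax versus
   shr $0x2,%eax.  The upper 32 bits of RAX may be garbage after the call.  */
static int
pr57690_example_matches (uint64_t rax)
{
  uint64_t via_bextr = bextr_emul (rax, 0x1e02);
  uint64_t via_shr = (uint32_t) rax >> 2;   /* 32-bit shr zero-extends */

  return via_bextr == via_shr;
}
```

For 0x1e02, bextr_is_upper_bits returns true for a width of 32 and false for 64, which matches the shrl (not shrq) replacement suggested above.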