http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57690

            Bug ID: 57690
           Summary: bextr sometimes used instead of shr
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org

unsigned int bar (void);
unsigned long foo (unsigned int x) { return bar () >> 2; }

With -O2 -mtbm we get:
   0:    48 83 ec 08              sub    $0x8,%rsp
   4:    e8 00 00 00 00           callq  9 <foo+0x9>
            5: R_X86_64_PC32    bar-0x4
   9:    48 83 c4 08              add    $0x8,%rsp
   d:    8f ea f8 10 c0 02 1e     bextr  $0x1e02,%rax,%rax
  14:    00 00 
  16:    c3                       retq   
while without it:
   0:    48 83 ec 08              sub    $0x8,%rsp
   4:    e8 00 00 00 00           callq  9 <foo+0x9>
            5: R_X86_64_PC32    bar-0x4
   9:    48 83 c4 08              add    $0x8,%rsp
   d:    c1 e8 02                 shr    $0x2,%eax
  10:    c3                       retq   
which is much shorter (3 bytes for shr versus 9 bytes for bextr).  On the other
hand, bextr with an immediate gives more freedom to the register allocator,
because it is a non-destructive source instruction.  So perhaps we want a
peephole2 which will transform some forms of the immediate TARGET_TBM
tbm_bextr* (those where the upper bits of a SImode or DImode value are
extracted and where the destination is the same as the source) into shrl.
For -Os maybe it would be even shorter to emit movl + shrl.
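
To illustrate why such a peephole2 would be safe for the case above, here is a
minimal sketch in C that models both instructions in software.  It assumes the
usual encoding of the bextr immediate (bits 7:0 are the start bit, bits 15:8
the field length), so 0x1e02 means "30 bits starting at bit 2".  The function
names are hypothetical and exist only for this illustration; nothing here is
GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of the immediate form of bextr on a 64-bit register.
   Assumption: control bits 7:0 = start bit, bits 15:8 = field length. */
uint64_t bextr_model(uint64_t src, uint32_t control)
{
    unsigned start = control & 0xff;
    unsigned len = (control >> 8) & 0xff;

    if (start >= 64)
        return 0;                       /* field lies entirely past bit 63 */
    uint64_t shifted = src >> start;
    if (len >= 64)
        return shifted;                 /* avoid an out-of-range shift below */
    return shifted & ((UINT64_C(1) << len) - 1);
}

/* Model of "shr $0x2,%eax": a 32-bit logical shift whose result is
   zero-extended into %rax. */
uint64_t shr2_eax_model(uint64_t rax)
{
    return (uint64_t)((uint32_t)rax >> 2);
}

/* Nonzero when both models agree for the given register contents. */
int same_result(uint64_t rax)
{
    return bextr_model(rax, 0x1e02) == shr2_eax_model(rax);
}

/* Spot checks, including a value with garbage in the upper 32 bits. */
void check_samples(void)
{
    assert(same_result(0));
    assert(same_result(UINT64_C(0x12345678)));
    assert(same_result(UINT64_C(0xffffffffffffffff)));
    assert(same_result(UINT64_C(0xdeadbeefcafef00d)));
}
```

Because the extracted field ends exactly at bit 31, both models ignore the
upper half of the source register, which is the "upper bits of a SImode value"
condition described above.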
