https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91523

            Bug ID: 91523
           Summary: Register allocation picks sub-optimal alternative with
                    scratch registers
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, ra
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rearnsha at gcc dot gnu.org
                CC: vmakarov at redhat dot com, wdijkstr at arm dot com
  Target Milestone: ---
            Target: arm-none-eabi

Consider this simple testcase:
_Bool f (unsigned a, unsigned b) { return a | b; }

When compiled for arm-eabi with options -O2 -march=armv7-a -mthumb, the
compiler generates the following code sequence:

        orrs    r3, r0, r1      @ 8     [c=4 l=4]  *iorsi3_compare0_scratch/2
        ite     ne
        movne   r0, #1  @ 24    [c=8 l=6]  *p *thumb2_movsi_insn/1
        moveq   r0, #0  @ 25    [c=8 l=6]  *p *thumb2_movsi_insn/1
        bx      lr      @ 28    [c=8 l=4]  *thumb2_return

This is correct, but sub-optimal: the alternative selected for the orrs
pattern (/2 in the annotation above) is the 32-bit one, where a 16-bit one
would do.

The pattern being matched is

(define_insn "*iorsi3_compare0_scratch"
  [(set (reg:CC_NOOV CC_REGNUM)
        (compare:CC_NOOV
         (ior:SI (match_operand:SI 1 "s_register_operand" "%r,0,r")
                 (match_operand:SI 2 "arm_rhs_operand" "I,l,r"))
         (const_int 0)))
   (clobber (match_scratch:SI 0 "=r,l,r"))]
  "TARGET_32BIT"
  "orrs%?\\t%0, %1, %2"
  [(set_attr "conds" "set")
   (set_attr "arch" "*,t2,*")
   (set_attr "length" "4,2,4")
   (set_attr "type" "logics_imm,logics_reg,logics_reg")]
)

and the insn matching it is

(insn 7 21 22 2 (parallel [
            (set (reg:CC_NOOV 100 cc)
                (compare:CC_NOOV (ior:SI (reg:SI 119)
                        (reg:SI 120))
                    (const_int 0 [0])))
            (clobber (scratch:SI))
        ]) "/home/rearnsha/work/pdtools/gcc-tests/cmpdi.c":3:35 96 {*iorsi3_compare0_scratch}
     (expr_list:REG_DEAD (reg:SI 120)
        (expr_list:REG_DEAD (reg:SI 119)
            (nil))))

Given that both input operands are dead, there is no reason why the compiler
can't pick alternative 1 (counting from 0) and use a scratch that clobbers one
of the inputs (either directly, or via a commutative swap).  However, it insists
on allocating a different register (r3) and then using alternative 2, which is
more expensive (a 32-bit insn instead of a 16-bit insn).
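
To make the expectation concrete, here is the testcase again with the current
and the hoped-for orrs written out as comments.  The hoped-for line is a
hand-written sketch, not compiler output: it assumes the scratch gets tied to
r0, which is what the "0"/"l"/"=l" constraints of alternative 1 would allow.
The compiler name in the build line is also an assumption based on the target
triplet; -dp is the option that adds the pattern/alternative annotations to the
assembly.

```c
/* cmpdi.c: the testcase from this report, annotated.
   Suggested build line (compiler name is an assumption):
     arm-none-eabi-gcc -O2 -march=armv7-a -mthumb -dp -S cmpdi.c  */

/* Both a and b are dead after the OR, so the flag-setting orrs is free
   to overwrite either of them with its scratch result.

   Currently generated (alternative 2, 32-bit encoding):
        orrs    r3, r0, r1

   Hoped for (alternative 1, 16-bit encoding; hand-written sketch,
   assumes the scratch is tied to r0):
        orrs    r0, r0, r1  */
_Bool f (unsigned a, unsigned b) { return a | b; }
```

Only the orrs differs between the two sequences; the ite/movne/moveq/bx tail
would stay as it is.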

Note that the pattern shown above was updated in r274822 to add the Thumb-2
alternative that we want to match here.
