https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71022

            Bug ID: 71022
           Summary: GCC prefers register moves over move immediate
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wdijkstr at arm dot com
  Target Milestone: ---

When assigning the same immediate value to different registers, GCC will always
CSE the immediate and emit a register move for subsequent uses. This creates
unnecessary register dependencies and increases latency. When the cost of an
immediate move is the same as a register move (which should be true for most
targets), GCC should prefer the immediate move. A register move is better only
when the immediate requires multiple instructions, or when its encoding is
larger and we are optimizing for size (-Os).
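
The rule above can be sketched roughly as follows. This is only an
illustration of the intended decision, not GCC code: the struct, its fields
and the function name are hypothetical and do not correspond to any existing
GCC cost hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost summary for one immediate; not a GCC data structure. */
struct move_costs {
    int imm_insns;  /* instructions needed to materialise the immediate */
    int imm_bytes;  /* encoded size of the immediate move sequence */
    int reg_bytes;  /* encoded size of a register-to-register move */
};

/* Return true when re-materialising the immediate is preferable to copying
   it from a register that already holds the same value. */
bool prefer_immediate(const struct move_costs *c, bool optimize_size)
{
    assert(c->imm_insns >= 1);

    /* A multi-instruction immediate is more expensive than one copy. */
    if (c->imm_insns > 1)
        return false;

    /* With -Os, keep the register move if the immediate form is larger. */
    if (optimize_size && c->imm_bytes > c->reg_bytes)
        return false;

    /* Same cost, and no dependency on the register holding the value. */
    return true;
}
```

Under these assumptions, a single-instruction immediate of the same size as a
register move (as in the AArch64 "mov w2, 1" case) would always be
re-materialised, with or without -Os.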

It's not obvious where this is best done. The various cprop phases before IRA
do the right thing, but cse2 (which runs later) then undoes it. And
cprop_hardreg doesn't appear to be able to deal with immediates.

int f1(int x)
{
  int y = 1, z = 1;
  while (x--)
    {
      y += z;
      z += x;
    }
  return y + z;
}

void g(float, float);
void f2(void) { g(1.0, 1.0); g(3.3, 3.3); }

On AArch64 I get the following (lines marked *** show the instruction I would
expect instead):

f1:
        sub     w1, w0, #1
        cbz     w0, .L12
        mov     w0, 1
        mov     w2, w0     *** mov w2, 1
        .p2align 2
.L11:
        add     w2, w2, w0
        add     w0, w0, w1
        sub     w1, w1, #1
        cmn     w1, #1
        bne     .L11
        add     w0, w2, w0
        ret
.L12:
        mov     w0, 2
        ret

f2:
        fmov    s1, 1.0e+0
        str     x30, [sp, -16]!
        fmov    s0, s1    *** fmov s0, 1.0
        bl      g
        adrp    x0, .LC1
        ldr     x30, [sp], 16
        ldr     s1, [x0, #:lo12:.LC1]
        fmov    s0, s1    *** ldr s0, [x0, #:lo12:.LC1] 
        b       g
