Compile the following code with the options -march=armv7-a -mthumb -Os:

struct S1
{
    int f1;
    int f2;
    int f3[6];
};

struct S2
{
    struct S1* pS1;
};

/* Assumed prototype; the original test case does not declare foo. */
void foo(int);

void aaaaaaaaa(struct S2* pS2, int count)
{
    int idx;
    for (idx = 0; idx < count; idx++) {
        struct S1* pS1 = &pS2->pS1[idx];
        foo(pS1->f1);
        pS1->f2 = 6;
    }
}

GCC generates:

aaaaaaaaa:
        push    {r4, r5, r6, r7, r8, lr}
        mov     r4, r0
        mov     r7, r1
        movs    r5, #0
        movs    r6, #6
        b       .L2
.L3:
        ldr     r2, [r4, #0]
        lsls    r3, r5, #5       // A
        adds    r5, r5, #1
        add     r8, r2, r3       // B
        ldr     r0, [r2, r3]     // C
        bl      foo
        str     r6, [r8, #4]
.L2:
        cmp     r5, r7
        blt     .L3
        pop     {r4, r5, r6, r7, r8, pc}

Instructions A and B can be merged into a single instruction (placed before
r5 is incremented), and C should be modified accordingly:

        add     r8, r2, r5, lsl #5
        ldr     r0, [r8]
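
The shift by 5 comes from sizeof(struct S1) being 32 when int is 4 bytes, as
it is on this target. A minimal sketch of that equivalence follows; the
function names addr_by_index and addr_by_shift are made up for illustration
and are not part of the test case.

```c
#include <assert.h>
#include <stdint.h>

struct S1 {
    int f1;
    int f2;
    int f3[6];
};

/* Address of base[idx] as written in the source: &pS2->pS1[idx]. */
uintptr_t addr_by_index(struct S1 *base, int idx)
{
    return (uintptr_t)&base[idx];
}

/* The same address written as the shift and add of instructions A and B.
   Assumption: int is 4 bytes, so sizeof(struct S1) == 32 == 1 << 5. */
uintptr_t addr_by_shift(struct S1 *base, int idx)
{
    assert(sizeof(struct S1) == 32);
    return (uintptr_t)base + ((uintptr_t)idx << 5);
}
```

For any non-negative idx inside the array both functions return the same
value, which is why a single add with a shifted operand is enough.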

The relevant RTL insns before the fwprop2 pass are:

(insn 13 12 14 3 src/to.c:13 (set (reg:SI 143)
        (ashift:SI (reg/v:SI 135 [ idx ])
            (const_int 5 [0x5]))) 119 {*arm_shiftsi3} (nil))

(insn 15 14 16 3 src/to.c:17 (set (reg/v/f:SI 137 [ pS1 ])
        (plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
            (reg:SI 143))) 4 {*arm_addsi3} (expr_list:REG_DEAD (reg/f:SI 144 [ pS2_4(D)->pS1 ])
        (expr_list:REG_DEAD (reg:SI 143)
            (nil))))

(insn 16 15 17 3 src/to.c:18 (set (reg:SI 0 r0)
        (mem/s:SI (reg/v/f:SI 137 [ pS1 ]) [5 pS1_8->f1+0 S4 A32])) 661 {*thumb2_movsi_insn} (nil))

This looks like something the combine pass could handle. However, the fwprop2
pass propagates the following expression into the memory load:

    (plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
            (reg:SI 143))

So now we get:

(insn 13 12 14 3 src/to.c:13 (set (reg:SI 143)
        (ashift:SI (reg/v:SI 135 [ idx ])
            (const_int 5 [0x5]))) 119 {*arm_shiftsi3} (nil))

(insn 15 14 16 3 src/to.c:17 (set (reg/v/f:SI 137 [ pS1 ])
        (plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
            (reg:SI 143))) 4 {*arm_addsi3} (expr_list:REG_DEAD (reg/f:SI 144 [ pS2_4(D)->pS1 ])
        (expr_list:REG_DEAD (reg:SI 143)
            (nil))))

(insn 16 15 17 3 src/to.c:18 (set (reg:SI 0 r0)
        (mem/s:SI (plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
                (reg:SI 143)) [5 pS1_8->f1+0 S4 A32])) 661 {*thumb2_movsi_insn} (nil))

Now r143 is used in both insn 15 and insn 16, so combining insn 13 into
insn 15 no longer brings any benefit: insn 13 must be kept to feed insn 16.
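
To make the use counts concrete, here is a toy model of the three insns
before and after fwprop2. The struct toy_insn, the two tables, and
toy_count_uses are hypothetical names invented for this sketch; they are not
GCC data structures.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified insn: one destination register and up
   to two source registers (-1 marks an unused slot). */
struct toy_insn {
    int uid;
    int dest;
    int srcs[2];
};

/* Insns 13, 15 and 16 before fwprop2. */
const struct toy_insn toy_before[3] = {
    { 13, 143, { 135,  -1 } },  /* r143 = idx << 5       */
    { 15, 137, { 144, 143 } },  /* r137 = r144 + r143    */
    { 16,   0, { 137,  -1 } },  /* r0   = mem[r137]      */
};

/* The same insns after fwprop2 rewrote the address in insn 16. */
const struct toy_insn toy_after[3] = {
    { 13, 143, { 135,  -1 } },
    { 15, 137, { 144, 143 } },
    { 16,   0, { 144, 143 } },  /* r0 = mem[r144 + r143] */
};

/* Count how many source slots read the given register. */
int toy_count_uses(const struct toy_insn *insns, size_t n, int regno)
{
    int uses = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < 2; j++)
            if (insns[i].srcs[j] == regno)
                uses++;
    return uses;
}
```

Counting r143 in toy_before gives one use and in toy_after gives two, which
is the change that stops the shift from being folded into the add.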

So in function fwprop_addr, before deciding to propagate an expression, should
we also check whether this is the only use of the corresponding def?
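
A rough sketch of the kind of check being suggested, written against a toy
model rather than the real fwprop data structures. Everything here
(struct toy_def, its fields, toy_should_propagate_addr) is hypothetical and
only illustrates the proposed policy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one definition of a pseudo register. */
struct toy_def {
    int regno;     /* register that is defined, e.g. 137        */
    int num_uses;  /* number of insns that read this definition */
};

/* Proposed policy (an assumption, not existing GCC behaviour): forward an
   address expression into a memory reference only when that reference is
   the sole use of the def. Then the defining insn becomes dead after the
   propagation; otherwise it stays and its operands gain an extra use. */
bool toy_should_propagate_addr(const struct toy_def *def)
{
    return def != NULL && def->num_uses == 1;
}
```

In the example above r137 is read by both the load in insn 16 and the later
store to pS1->f2, so num_uses would be 2 and the propagation would be
skipped, leaving insns 13 and 15 available to combine.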


-- 
           Summary: Combine separate shift and add instructions into a
                    single one
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: carrot at google dot com
 GCC build triplet: i686-linux
  GCC host triplet: i686-linux
GCC target triplet: arm-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44883
