Compile the following code with options -march=armv7-a -mthumb -Os:
void foo(int);  /* assumed prototype; the report omits the declaration */

struct S1
{
    int f1;
    int f2;
    int f3[6];
};
struct S2
{
    struct S1* pS1;
};
void aaaaaaaaa(struct S2* pS2, int count)
{
    int idx;
    for (idx = 0; idx < count; idx++) {
        struct S1* pS1 = &pS2->pS1[idx];
        foo(pS1->f1);
        pS1->f2 = 6;
    }
}
GCC generates:
aaaaaaaaa:
        push {r4, r5, r6, r7, r8, lr}
        mov r4, r0
        mov r7, r1
        movs r5, #0
        movs r6, #6
        b .L2
.L3:
        ldr r2, [r4, #0]
        lsls r3, r5, #5 // A
        adds r5, r5, #1
        add r8, r2, r3 // B
        ldr r0, [r2, r3] // C
        bl foo
        str r6, [r8, #4]
.L2:
        cmp r5, r7
        blt .L3
        pop {r4, r5, r6, r7, r8, pc}
Instructions A and B can be merged into a single instruction (placed before
the increment of r5), with C modified accordingly:
        add r8, r2, r5, lsl #5
        ldr r0, [r8]
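The merge works because struct S1 holds eight ints, so with a 4-byte int it is 32 bytes and &pS2->pS1[idx] equals the base pointer plus idx << 5. The following is only a small sketch of that arithmetic, assuming a 4-byte int as on arm-eabi; the helper names elem_addr_shifted and elem_addr_indexed are hypothetical and do not appear in the test case.

```c
#include <stdint.h>

struct S1 {
    int f1;
    int f2;
    int f3[6];
};

/* Assumption: int is 4 bytes, so the struct is 8 * 4 = 32 bytes. */
_Static_assert(sizeof(struct S1) == 32, "struct S1 is expected to be 32 bytes");

/* Hypothetical helper: the address the merged instruction would compute,
   base + (idx << 5), i.e. "add r8, r2, r5, lsl #5". */
static uintptr_t elem_addr_shifted(const struct S1 *base, unsigned idx)
{
    return (uintptr_t)base + ((uintptr_t)idx << 5);
}

/* Hypothetical helper: the address the C source asks for, &base[idx]. */
static uintptr_t elem_addr_indexed(const struct S1 *base, unsigned idx)
{
    return (uintptr_t)&base[idx];
}
```

For any in-range idx the two helpers should return the same value, which is why a single add with a shifted operand can replace the separate shift and add.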
The related RTL insns before the fwprop2 pass are:
(insn 13 12 14 3 src/to.c:13 (set (reg:SI 143)
(ashift:SI (reg/v:SI 135 [ idx ])
(const_int 5 [0x5]))) 119 {*arm_shiftsi3} (nil))
(insn 15 14 16 3 src/to.c:17 (set (reg/v/f:SI 137 [ pS1 ])
(plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
(reg:SI 143))) 4 {*arm_addsi3} (expr_list:REG_DEAD (reg/f:SI 144 [
pS2_4(D)->pS1 ])
(expr_list:REG_DEAD (reg:SI 143)
(nil))))
(insn 16 15 17 3 src/to.c:18 (set (reg:SI 0 r0)
(mem/s:SI (reg/v/f:SI 137 [ pS1 ]) [5 pS1_8->f1+0 S4 A32])) 661
{*thumb2_movsi_insn} (nil))
It looks as if this could be handled by the combine pass. But the fwprop2 pass
propagates the following expression into the memory load:
(plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
(reg:SI 143))
So now we get:
(insn 13 12 14 3 src/to.c:13 (set (reg:SI 143)
(ashift:SI (reg/v:SI 135 [ idx ])
(const_int 5 [0x5]))) 119 {*arm_shiftsi3} (nil))
(insn 15 14 16 3 src/to.c:17 (set (reg/v/f:SI 137 [ pS1 ])
(plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
(reg:SI 143))) 4 {*arm_addsi3} (expr_list:REG_DEAD (reg/f:SI 144 [
pS2_4(D)->pS1 ])
(expr_list:REG_DEAD (reg:SI 143)
(nil))))
(insn 16 15 17 3 src/to.c:18 (set (reg:SI 0 r0)
(mem/s:SI (plus:SI (reg/f:SI 144 [ pS2_4(D)->pS1 ])
(reg:SI 143)) [5 pS1_8->f1+0 S4 A32])) 661 {*thumb2_movsi_insn}
(nil))
Now r143 is used in both insn 15 and insn 16, so combining insn 13 with insn 15
no longer brings any benefit: the shift in insn 13 has to be kept for insn 16.
So in function fwprop_addr, before deciding to propagate an expression, should
we also check whether it is the only use of the corresponding def?
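The idea behind the proposed check can be illustrated with a toy model outside of GCC. Everything below (struct toy_insn, count_uses, should_propagate_addr, the loop_body table) is hypothetical and is not GCC's internal API; a real implementation would consult the dataflow information GCC already maintains. The store to pS1->f2 is not shown in the RTL dumps above, so its entry (uid 99) is made up from the assembly.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SRCS 2

/* Toy instruction: one destination register and up to two source registers.
   A value of -1 means "no register". */
struct toy_insn {
    int uid;
    int dest;
    int srcs[MAX_SRCS];
};

/* Toy version of the loop body before fwprop2. */
static const struct toy_insn loop_body[] = {
    { 13, 143, { 135,  -1 } },  /* r143 = r135 << 5              */
    { 15, 137, { 144, 143 } },  /* r137 = r144 + r143            */
    { 16,   0, { 137,  -1 } },  /* r0 = [r137]                   */
    { 99,  -1, { 137,  -1 } },  /* [r137 + 4] = 6 (uid invented) */
};

/* Count how many times reg is read across the instruction list. */
static int count_uses(const struct toy_insn *insns, size_t n, int reg)
{
    int uses = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SRCS; j++)
            if (insns[i].srcs[j] == reg)
                uses++;
    return uses;
}

/* Proposed heuristic: propagate the address computed by def only when the
   register it sets has a single use. Otherwise def stays live and its
   operands (here r143) gain an extra use. */
static bool should_propagate_addr(const struct toy_insn *insns, size_t n,
                                  const struct toy_insn *def)
{
    return count_uses(insns, n, def->dest) == 1;
}
```

In this model r137 is read by both the load and the store, so should_propagate_addr rejects propagating insn 15 into insn 16, leaving r143 with a single use that combine could then fold into insn 15.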
--
Summary: Combine separate shift and add instructions into a
single one
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: carrot at google dot com
GCC build triplet: i686-linux
GCC host triplet: i686-linux
GCC target triplet: arm-eabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44883