Compile the following code with options -march=armv7-a -mthumb -Os:

    struct S {
        int f1;
        int reserved[3];
    };

    void foo(struct S *);   /* defined elsewhere; prototype assumed */

    void ts()
    {
        struct S map;
        map.f1 = 0;
        foo(&map);
    }

GCC 4.6 generates:

    ts:
        push    {r0, r1, r2, r3, r4, lr}
        add     r0, sp, #16          // A
        movs    r3, #0
        str     r3, [r0, #-16]!      // B
        mov     r0, sp               // C
        bl      foo
        add     sp, sp, #20
        pop     {pc}

After instruction B, register r0 already contains the value of sp (the pre-indexed store writes r0 - 16 back into r0), so instruction C is not required. The code could be:

    ts:
        push    {r0, r1, r2, r3, r4, lr}
        add     r0, sp, #16
        movs    r3, #0
        str     r3, [r0, #-16]!
        bl      foo
        add     sp, sp, #20
        pop     {pc}

The RTL insns before IRA:

    (insn 12 5 6 2 (set (reg/f:SI 134)
            (reg/f:SI 25 sfp)) src/ts.c:9 694 {*thumb2_movsi_insn}
         (nil))

    (insn 6 12 8 2 (set (mem/s/c:SI (pre_modify:SI (reg/f:SI 134)
                    (plus:SI (reg/f:SI 134)
                        (const_int -16 [0xfffffffffffffff0]))) [3 map.f1+0 S4 A64])
            (reg:SI 133)) src/ts.c:9 694 {*thumb2_movsi_insn}
         (expr_list:REG_DEAD (reg:SI 133)
            (expr_list:REG_INC (reg/f:SI 134)
                (expr_list:REG_EQUAL (const_int 0 [0])
                    (nil)))))

    (insn 8 6 9 2 (set (reg:SI 0 r0)
            (reg/f:SI 134)) src/ts.c:10 694 {*thumb2_movsi_insn}
         (expr_list:REG_DEAD (reg/f:SI 134)
            (expr_list:REG_EQUAL (plus:SI (reg/f:SI 25 sfp)
                    (const_int -16 [0xfffffffffffffff0]))
                (nil))))

This shows that the address register of insn 6 can be used directly in insn 8. At the RA stage, physical register r0 is assigned to pseudo register r134, so insn 8 should become

    mov r0, r0

which a later pass would remove. But gcc also finds from the REG_EQUAL note that r134 is equal to (sfp - 16), which at that point is equal to sp. So it generates

    mov r0, sp

instead.

There is an even better result:

    ts:
        push    {r0, r1, r2, r3, r4, lr}
        movs    r3, #0
        str     r3, [sp]
        mov     r0, sp
        bl      foo
        add     sp, sp, #20
        pop     {pc}

It contains the same number of instructions as the version without instruction C, but the instructions are simpler and shorter.
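The observation that r0 already equals sp after instruction B can be checked with a small model of instructions A and B. The C sketch below is only an illustration: the struct regs type and the three helper functions are hypothetical names introduced here, not part of GCC or of any ARM toolchain, and the memory write performed by the store is not modelled, only the address arithmetic and the writeback.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two registers involved; not GCC or ARM code. */
struct regs {
    uint32_t sp;
    uint32_t r0;
};

/* Models "add r0, sp, #16" (instruction A): r0 = sp + 16. */
static void add_r0_sp_16(struct regs *r)
{
    r->r0 = r->sp + 16u;
}

/* Models the addressing of "str r3, [r0, #-16]!" (instruction B).
 * Pre-indexed with writeback: the address r0 - 16 is computed first,
 * the store uses that address, and r0 is updated to it.
 * Returns the effective address of the store. */
static uint32_t str_preindex_m16(struct regs *r)
{
    r->r0 = r->r0 - 16u;
    return r->r0;
}

/* Runs A then B for a given sp and returns the value left in r0. */
static uint32_t r0_after_a_and_b(uint32_t sp)
{
    struct regs r = { sp, 0u };
    uint32_t addr;

    add_r0_sp_16(&r);
    addr = str_preindex_m16(&r);
    assert(addr == sp);   /* the store lands on map.f1 at [sp] */
    return r.r0;
}
```

For any starting sp, r0_after_a_and_b(sp) returns sp itself, for example r0_after_a_and_b(0x1000) returns 0x1000. This is why "mov r0, sp" adds nothing after instruction B.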
Actually the IL was in this form after expand:

    (insn 5 2 6 2 (set (reg:SI 133)
            (const_int 0 [0])) src/ts.c:9 694 {*thumb2_movsi_insn}
         (nil))

    (insn 6 5 7 2 (set (mem/s/c:SI (plus:SI (reg/f:SI 25 sfp)
                    (const_int -16 [0xfffffffffffffff0])) [3 map.f1+0 S4 A64])
            (reg:SI 133)) src/ts.c:9 694 {*thumb2_movsi_insn}
         (expr_list:REG_DEAD (reg:SI 133)
            (expr_list:REG_EQUAL (const_int 0 [0])
                (nil))))

    (insn 7 6 8 2 (set (reg/f:SI 134)
            (plus:SI (reg/f:SI 25 sfp)
                (const_int -16 [0xfffffffffffffff0]))) src/ts.c:10 4 {*arm_addsi3}
         (nil))

    (insn 8 7 9 2 (set (reg:SI 0 r0)
            (reg/f:SI 134)) src/ts.c:10 694 {*thumb2_movsi_insn}
         (expr_list:REG_DEAD (reg/f:SI 134)
            (expr_list:REG_EQUAL (plus:SI (reg/f:SI 25 sfp)
                    (const_int -16 [0xfffffffffffffff0]))
                (nil))))

In pass auto_inc_dec, (sfp - 16) is identified as an opportunity for auto_inc_dec optimization. But it doesn't bring any benefit in this case, and it results in more complex instructions.

-- 
           Summary: unnecessary register move
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: carrot at google dot com
 GCC build triplet: i686-linux
  GCC host triplet: i686-linux
GCC target triplet: arm-eabi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45252