On ARM, the following C code:

    void whatever(const char *pqp)
    {
        volatile unsigned int *uart_thr = (typeof(uart_thr))0xE000C000;
        unsigned int ch;

        while ((ch = *pqp++))
            *uart_thr = ch;
    }

generates this assembler output (with -mcpu=arm7tdmi -O2):

    whatever:
            @ args = 0, pretend = 0, frame = 0
            @ frame_needed = 0, uses_anonymous_args = 0
            @ link register save eliminated.
            ldrb    r2, [r0, #0]    @ zero_extendqisi2
            cmp     r2, #0
            @ lr needed for prologue
            bxeq    lr
    .L4:
            mov     r3, #-536870912
            add     r3, r3, #49152
            str     r2, [r3, #0]
            ldrb    r2, [r0, #1]!   @ zero_extendqisi2
            cmp     r2, #0
            bne     .L4
            bx      lr

The relevant part is the bne .L4: since r3 is preserved across the loop,
GCC could optimize for speed without any space penalty by generating this
instead:

    .L4:
            mov     r3, #-536870912
            add     r3, r3, #49152
    .L5:
            str     r2, [r3, #0]
            ldrb    r2, [r0, #1]!   @ zero_extendqisi2
            cmp     r2, #0
            bne     .L5
            bx      lr

... or, in other words, generating the constant only once, which saves at
least two cycles per iteration.

-- 
           Summary: ARM: Constant generation inside a loop: Missed
                    optimization opportunity
           Product: gcc
           Version: 4.2.3
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: alexandre dot nunes at gmail dot com
  GCC host triplet: i686-unknow-linux
GCC target triplet: arm-*-elf

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35141
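
For background on why the constant costs two instructions in the first place: an ARM-mode data-processing immediate is an 8-bit value rotated right by an even amount, and 0xE000C000 does not fit that form, so the compiler builds it as 0xE0000000 (the mov of #-536870912) plus 0xC000 (the add of #49152). The sketch below checks this on any host. The helper names rotl32, arm_imm_encodable and check_uart_constant are made up for illustration and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n < 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Hypothetical helper: returns 1 if v can be written as an 8-bit value
 * rotated right by an even amount, i.e. as a single ARM-mode
 * data-processing immediate; returns 0 otherwise. */
static int arm_imm_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by 'rot'; the result must fit in 8 bits. */
        if (rotl32(v, rot) <= 0xFFu)
            return 1;
    }
    return 0;
}

/* Checks the split used in the generated code from the report. */
void check_uart_constant(void)
{
    uint32_t hi = (uint32_t)-536870912; /* mov r3, #-536870912 */
    uint32_t lo = 49152u;               /* add r3, r3, #49152  */

    assert(hi + lo == 0xE000C000u);          /* the two pieces give the UART address */
    assert(arm_imm_encodable(hi));           /* 0xE0000000 = 0x0E rotated right by 4 */
    assert(arm_imm_encodable(lo));           /* 0x0000C000 = 0x03 rotated right by 18 */
    assert(!arm_imm_encodable(0xE000C000u)); /* the full address needs two instructions */
}
```

Calling check_uart_constant() from a small main() on the host should pass silently. Since both instructions depend only on constants, their result is loop-invariant, which is why placing them before the loop label is safe.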