https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63206

            Bug ID: 63206
           Summary: Gcc 4.9.1 Generated code needlessly stacks r3
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alexandre.nunes at gmail dot com

Created attachment 33459
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33459&action=edit
testcase

When compiling the following testcase code:

    #include <stdint.h>
    #include <string.h>

    static uint16_t lb[32];

    void copy(uint16_t buffer[32])
    {
        register uint32_t ret asm("r3");
        asm volatile ("mrs %0, cpsr\nmsr cpsr_c, #0xDF" : "=r" (ret));
        memcpy(buffer, lb, 32 * sizeof(uint16_t));
        asm volatile ("msr cpsr_c, %0" : : "r" (ret));
    }

with -mcpu=arm7tdmi -mno-thumb-interwork -O2, GCC generates code such as:

    stmfd   sp!, {r3, lr}
    mrs     r3, cpsr
    msr     cpsr_c, #0xDF
    ldr     r1, .L3
    mov     r2, #64
    bl      memcpy
    msr     cpsr_c, r3
    ldmfd   sp!, {r3, pc}

What's suboptimal is the r3 push/pop. There's no need to preserve/restore r3,
as it's a scratch register.

The testcase forces r3 to be used because I couldn't make a minimal testcase
that would pick it automatically, but I observed this in actual code (which is
very close, if not identical, to the testcase).

Without that (register var + asm("r3")), the generated code picks r4 and is
otherwise identical to the above, which kind of makes sense: r4 does have to be
stacked (though one could argue about why it would pick a call-preserved
register anyway; a different bug, IMHO).

GCC was a vanilla 4.9.1 built for arm-none-eabi:

    arm-none-eabi-gcc -v
    Using built-in specs.
    COLLECT_GCC=/usr/local/arm-none-eabi/bin/arm-none-eabi-gcc
    COLLECT_LTO_WRAPPER=/usr/local/arm-none-eabi-4.9.1/libexec/gcc/arm-none-eabi/4.9.1/lto-wrapper
    Target: arm-none-eabi
    Configured with: ../gcc-4.9.1/configure --target=arm-none-eabi --prefix=/usr/local/arm-none-eabi-4.9.1 --enable-interwork --enable-languages=c,c++ --with-newlib --with-headers=../newlib-20140525/newlib/libc/include --with-float=soft --enable-long-long
    Thread model: single
    gcc version 4.9.1 (GCC)
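
For comparison, the variant mentioned in the report (the same function without the register var + asm("r3") pin) might look roughly like the sketch below. This is not the attached testcase: the name copy_unpinned and the preprocessor guard with its non-ARM fallback are assumptions added here so the snippet also builds on a non-ARM host, and __asm__ is used instead of asm only so that it compiles under strict -std modes. Only the ARM branch mirrors the reported code.

```c
#include <stdint.h>
#include <string.h>

static uint16_t lb[32];

/* Hypothetical variant of copy() from the testcase: 'ret' is an ordinary
 * local, so the compiler is free to choose which register holds it. */
void copy_unpinned(uint16_t buffer[32])
{
    uint32_t ret = 0;

#if defined(__arm__) && !defined(__thumb__)
    /* Save CPSR in 'ret', then write 0xDF to the control field:
     * IRQ and FIQ masked, System mode (same sequence as the testcase). */
    __asm__ volatile ("mrs %0, cpsr\n\tmsr cpsr_c, #0xDF" : "=r" (ret));
#endif

    memcpy(buffer, lb, 32 * sizeof(uint16_t));

#if defined(__arm__) && !defined(__thumb__)
    /* Restore the saved control field. */
    __asm__ volatile ("msr cpsr_c, %0" : : "r" (ret));
#else
    /* Assumption: on non-ARM hosts the function is a plain copy. */
    (void) ret;
#endif
}
```

Compiling this with arm-none-eabi-gcc -mcpu=arm7tdmi -mno-thumb-interwork -O2 -S should make it possible to check which register holds ret across the memcpy call; according to the report, 4.9.1 picks r4 there.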