https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63206
Bug ID: 63206
Summary: Gcc 4.9.1 Generated code needlessly stacks r3
Product: gcc
Version: 4.9.1
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: alexandre.nunes at gmail dot com
Created attachment 33459
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33459&action=edit
testcase
When compiling the following testcase:
#include <stdint.h>
#include <string.h>
static uint16_t lb[32];
void copy(uint16_t buffer[32])
{
    register uint32_t ret asm("r3");
    /* Save CPSR in ret, then switch to System mode with IRQ/FIQ masked. */
    asm volatile ("mrs %0, cpsr\nmsr cpsr_c, #0xDF" : "=r" (ret));
    memcpy(buffer, lb, 32 * sizeof(uint16_t));
    /* Restore the saved CPSR control field. */
    asm volatile ("msr cpsr_c, %0" : : "r" (ret));
}
With -mcpu=arm7tdmi -mno-thumb-interwork -O2, GCC generates code such as:
    stmfd sp!, {r3, lr}
    mrs r3, cpsr
    msr cpsr_c, #0xDF
    ldr r1, .L3
    mov r2, #64
    bl memcpy
    msr cpsr_c, r3
    ldmfd sp!, {r3, pc}
What's suboptimal is the r3 push/pop. There's no need to preserve/restore r3 as
it's a scratch register.
The testcase forces r3 to be used, because I couldn't make a minimal testcase
that would automatically do it, but I observed this in actual code (which is
very close, if not identical, to the testcase).
Without that (register var + asm("r3")), GCC picks r4 and generates code
otherwise identical to the above, which kind of makes sense: r4 does need to
be stacked (though one could argue about why it picks a callee-saved reg in
the first place; a different bug IMHO).
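For comparison, here is a minimal sketch of the variant without the register binding, i.e. the same function with a plain local instead of register ... asm("r3"). The name copy_unpinned, the HAVE_ARM_CPSR guard and the zero initializer are illustrative assumptions, not part of the attached testcase. The guard only exists so the sketch also builds on a non-ARM host, where the CPSR instructions are skipped.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: emit the CPSR instructions only when building for 32-bit ARM
 * in ARM (not Thumb) state; elsewhere they are skipped so the sketch still
 * compiles and the copy itself can be exercised on a host machine. */
#if defined(__arm__) && !defined(__thumb__)
#define HAVE_ARM_CPSR 1
#else
#define HAVE_ARM_CPSR 0
#endif

static uint16_t lb[32];

/* Same body as the testcase, minus the asm("r3") binding, so the compiler
 * chooses the register that holds `ret` across the memcpy call. */
void copy_unpinned(uint16_t buffer[32])
{
    uint32_t ret = 0; /* initializer only matters when the asm is skipped */
#if HAVE_ARM_CPSR
    /* Spelled __asm__ so it also builds under strict -std=c99/-std=c11. */
    __asm__ volatile ("mrs %0, cpsr\nmsr cpsr_c, #0xDF" : "=r" (ret));
#endif
    memcpy(buffer, lb, 32 * sizeof(uint16_t));
#if HAVE_ARM_CPSR
    __asm__ volatile ("msr cpsr_c, %0" : : "r" (ret));
#endif
    (void)ret;
}
```

Compiling this for the target with the same flags and inspecting the assembly should show which register holds ret. As described above, 4.9.1 picked r4, but other versions may allocate differently.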
GCC was a vanilla 4.9.1 built for arm-none-eabi:
arm-none-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=/usr/local/arm-none-eabi/bin/arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/usr/local/arm-none-eabi-4.9.1/libexec/gcc/arm-none-eabi/4.9.1/lto-wrapper
Target: arm-none-eabi
Configured with: ../gcc-4.9.1/configure --target=arm-none-eabi
--prefix=/usr/local/arm-none-eabi-4.9.1 --enable-interwork
--enable-languages=c,c++ --with-newlib
--with-headers=../newlib-20140525/newlib/libc/include --with-float=soft
--enable-long-long
Thread model: single
gcc version 4.9.1 (GCC)