https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63206

            Bug ID: 63206
           Summary: Gcc 4.9.1 Generated code needlessly stacks r3
           Product: gcc
           Version: 4.9.1
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: alexandre.nunes at gmail dot com

Created attachment 33459
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33459&action=edit
testcase

When compiling the following testcase:
#include <stdint.h>
#include <string.h>

static uint16_t lb[32];

void copy(uint16_t buffer[32])
{
  register uint32_t ret asm("r3");
  asm volatile ("mrs %0, cpsr\nmsr cpsr_c, #0xDF" : "=r" (ret));
  memcpy(buffer, lb, 32 * sizeof(uint16_t));
  asm volatile ("msr cpsr_c, %0" : : "r" (ret));
}


with -mcpu=arm7tdmi -mno-thumb-interwork -O2, GCC generates code such as:

stmfd    sp!, {r3, lr}
mrs r3, cpsr
msr cpsr_c, #0xDF
ldr    r1, .L3
mov    r2, #64
bl    memcpy
msr cpsr_c, r3
ldmfd    sp!, {r3, pc}

What's suboptimal is the r3 push/pop. There's no need to preserve/restore r3 as
it's a scratch register.

The testcase forces r3 to be used, because I couldn't make a minimal testcase
that would automatically do it, but I observed this in actual code (which is
very close, if not identical, to the testcase).

Without that (register var + asm("r3")), the generated code picks r4 and is
otherwise identical to the above, which kind of makes sense: r4 does have to
be stacked (though one could argue about why it picks a callee-saved reg at
all; a different bug IMHO).
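
For reference, the variant without the forced register would look roughly like
the sketch below. The function name copy_no_fixed_reg and the non-ARM fallback
branch are assumptions added here so the snippet also builds on a host
compiler; they are not part of the attached testcase. On the ARM target, the
only difference from the testcase is that ret is an ordinary local, so the
register allocator is free to choose where it lives.

```c
#include <stdint.h>
#include <string.h>

static uint16_t lb[32];

/* Hypothetical name; same body as copy() but without asm("r3") on ret. */
void copy_no_fixed_reg(uint16_t buffer[32])
{
  uint32_t ret;
#if defined(__arm__) && !defined(__thumb__)
  /* Save CPSR, then switch mode / mask interrupts as in the testcase. */
  __asm__ volatile ("mrs %0, cpsr\nmsr cpsr_c, #0xDF" : "=r" (ret));
#else
  /* Assumption: host-only fallback so the sketch compiles off-target. */
  ret = 0;
#endif
  memcpy(buffer, lb, 32 * sizeof(uint16_t));
#if defined(__arm__) && !defined(__thumb__)
  /* Restore the saved CPSR control bits. */
  __asm__ volatile ("msr cpsr_c, %0" : : "r" (ret));
#else
  (void) ret;
#endif
}
```

Compiling this with the same flags as above and inspecting the assembly (-S)
should make the r4 case easy to compare against the r3 case.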

GCC was a vanilla 4.9.1 built for arm-none-eabi:
arm-none-eabi-gcc -v
Using built-in specs.
COLLECT_GCC=/usr/local/arm-none-eabi/bin/arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/usr/local/arm-none-eabi-4.9.1/libexec/gcc/arm-none-eabi/4.9.1/lto-wrapper
Target: arm-none-eabi
Configured with: ../gcc-4.9.1/configure --target=arm-none-eabi
--prefix=/usr/local/arm-none-eabi-4.9.1 --enable-interwork
--enable-languages=c,c++ --with-newlib
--with-headers=../newlib-20140525/newlib/libc/include --with-float=soft
--enable-long-long
Thread model: single
gcc version 4.9.1 (GCC)
