http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60480
Bug ID: 60480
Summary: gcc 4.8.2 fails to do optimization on global register
variables when compiling on x86_64 Linux.
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: ganboing at gmail dot com
gcc 4.8.2 fails to optimize accesses to global register variables when
compiling on x86_64 Linux.
Consider the following code:
#include <stdint.h>
register uint64_t i0_BP __asm__ ("r14");
register uint64_t i0_SP __asm__ ("r15");
void test(void) {
    *((uint64_t*) (i0_SP - 8)) = i0_BP;
    i0_BP = i0_SP - 0x8;
    i0_SP -= 0x100;
    i0_SP = i0_BP;
    i0_BP = *((uint64_t*) i0_SP);
    i0_SP += 0x8;
    return;
}
With either the ‘-O3’ or the ‘-Os’ option, the resulting object file
disassembles to the same code:
<test>:
0: lea 0xfffffffffffffff8(%r15),%rcx
4: mov %r14,%rdx
7: mov %r15,%rax
a: mov %r14,0xfffffffffffffff8(%r15)
e: mov %rcx,%r14
11: mov %rcx,%r15
14: mov %rdx,%r14
17: mov %rax,%r15
1a: retq
Here we are just trying to emulate a frame establishment. The object file
clearly contains many redundant ‘mov’s between registers. This seems to be a
bug in gcc, since we have already applied the highest optimization level
available.
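
To make the redundancy concrete, the statements in test() can be traced by hand: the value of i0_BP is stored at i0_SP - 8, and by the end both variables appear to hold their original values again. The following is a minimal sketch of that trace, assuming ordinary static variables in place of the r14/r15 register variables so that it builds with any compiler. The names model_bp, model_sp, model_test and model_check are hypothetical and are not part of the test case above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for i0_BP (r14) and i0_SP (r15); plain statics
   are assumed here so the sketch does not depend on register variables. */
static uint64_t model_bp;
static uint64_t model_sp;

/* Same statement sequence as test() in the report. */
static void model_test(void) {
    *((uint64_t *)(uintptr_t)(model_sp - 8)) = model_bp;
    model_bp = model_sp - 0x8;
    model_sp -= 0x100;
    model_sp = model_bp;
    model_bp = *((uint64_t *)(uintptr_t)model_sp);
    model_sp += 0x8;
}

/* Returns 1 if, after model_test(), both variables hold their original
   values and the slot just below the original SP holds the original BP. */
static int model_check(void) {
    uint64_t stack[64] = {0};               /* scratch memory for the emulated frame */
    uint64_t old_bp = 0x1122334455667788ULL;
    uint64_t old_sp;

    model_bp = old_bp;
    model_sp = (uint64_t)(uintptr_t)&stack[32];
    old_sp = model_sp;

    model_test();

    return model_bp == old_bp && model_sp == old_sp && stack[31] == old_bp;
}
```

If this model is right, the only effect of test() that must remain is the single store to i0_SP - 8, so the register-to-register moves in the listing above should not be needed.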
Environment:
On CentOS 5.10 (Linux 2.6.18 x86_64) using GCC 4.8.2
Using built-in specs.
COLLECT_GCC=gcc4
COLLECT_LTO_WRAPPER=/usr/local/GNU/gcc-4.8.2/libexec/gcc/x86_64-unknown-linux-gnu/4.8.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.8.2/configure --prefix=/usr/local/GNU/gcc-4.8.2
--enable-clocale=generic
Thread model: posix
gcc version 4.8.2 (GCC)