FW: GCC global variable register optimization issue

2014-03-09 Thread ganboing
gcc 4.8.2 fails to optimize code that uses global register variables when
compiling on x86_64 Linux.

Consider the following code:

-
#include <stdint.h>

register uint64_t i0_BP __asm__ ("r14");
register uint64_t i0_SP __asm__ ("r15");

void test(void) {
    *((uint64_t*) (i0_SP - 8)) = i0_BP;
    i0_BP = i0_SP - 0x8;
    i0_SP -= 0x100;
    i0_SP = i0_BP;
    i0_BP = *((uint64_t*) i0_SP);
    i0_SP += 0x8;
    return;
}
-

With either the ‘-O3’ or the ‘-Os’ option, gcc produces the same object
code:

-
<test>:
   0:   lea    0xfff8(%r15),%rcx
   4:   mov    %r14,%rdx
   7:   mov    %r15,%rax
   a:   mov    %r14,0xfff8(%r15)
   e:   mov    %rcx,%r14
  11:   mov    %rcx,%r15
  14:   mov    %rdx,%r14
  17:   mov    %rax,%r15
  1a:   retq
-

Here we just try to emulate a function call. The object file contains
several apparently redundant ‘mov’s between registers: r14 and r15 are
copied to rdx and rax, overwritten, and then restored. This seems to be a
bug in gcc, since we have already applied the highest optimization level
available.
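
To make the redundancy concrete, here is a minimal sketch that checks what the statement sequence in test() actually changes. It assumes ordinary globals in place of the r14/r15 register variables (so that it builds without reserving registers) and a 64-bit target. The names test_reduced, run, struct state and same_effect are hypothetical helpers introduced only for this illustration; they are not part of the original report.

```c
#include <stdint.h>

/* Assumption: ordinary globals stand in for the r14/r15 global register
   variables so the sketch builds with any C compiler on a 64-bit target. */
static uint64_t i0_BP;
static uint64_t i0_SP;

/* Same statement sequence as test() in the report. */
static void test(void) {
    *((uint64_t *) (uintptr_t) (i0_SP - 8)) = i0_BP; /* save BP below SP */
    i0_BP = i0_SP - 0x8;                             /* BP = frame base */
    i0_SP -= 0x100;                                  /* reserve locals */
    i0_SP = i0_BP;                                   /* release locals */
    i0_BP = *((uint64_t *) (uintptr_t) i0_SP);       /* restore BP */
    i0_SP += 0x8;                                    /* restore SP */
}

/* Hypothetical reduced form: only the store to memory is kept. */
static void test_reduced(void) {
    *((uint64_t *) (uintptr_t) (i0_SP - 8)) = i0_BP;
}

/* Final state after running a function on a scratch stack. */
struct state {
    uint64_t bp;        /* final value of i0_BP */
    uint64_t sp_delta;  /* final i0_SP minus initial i0_SP */
    uint64_t slot;      /* contents of the 8 bytes below the initial SP */
};

/* Hypothetical helper: set up a scratch stack, run fn, report the state. */
static struct state run(void (*fn)(void)) {
    uint64_t stack[64] = {0};
    uint64_t sp0 = (uint64_t) (uintptr_t) &stack[32];
    struct state s;

    i0_BP = 0x1234;
    i0_SP = sp0;
    fn();
    s.bp = i0_BP;
    s.sp_delta = i0_SP - sp0;
    s.slot = stack[31];   /* stack[31] lives at sp0 - 8 */
    return s;
}

/* Returns 1 if test() and test_reduced() leave the same final state. */
static int same_effect(void) {
    struct state a = run(test);
    struct state b = run(test_reduced);
    return a.bp == b.bp && a.sp_delta == b.sp_delta && a.slot == b.slot
        && a.bp == 0x1234 && a.sp_delta == 0 && a.slot == 0x1234;
}
```

If same_effect() returns 1, both registers end up with their initial values and the only lasting change is the store below the initial stack pointer. Under these assumptions, that suggests the function could in principle compile down to a single store followed by a return.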

Environment:

On CentOS 5.10 (Linux 2.6.18 x86_64) using GCC 4.8.2

Using built-in specs.
COLLECT_GCC=gcc4
COLLECT_LTO_WRAPPER=/usr/local/GNU/gcc-4.8.2/libexec/gcc/x86_64-unknown-linux-gnu/4.8.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.8.2/configure --prefix=/usr/local/GNU/gcc-4.8.2
--enable-clocale=generic
Thread model: posix
gcc version 4.8.2 (GCC)




[Bug target/60480] New: gcc 4.8.2 fails to do optimization on global register variables when compiling on x86_64 Linux.

2014-03-09 Thread ganboing at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60480

    Bug ID: 60480
   Summary: gcc 4.8.2 fails to do optimization on global register
            variables when compiling on x86_64 Linux.
   Product: gcc
   Version: 4.8.2
    Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: ganboing at gmail dot com

gcc 4.8.2 fails to optimize code that uses global register variables when
compiling on x86_64 Linux.

Consider the following code:

#include <stdint.h>

register uint64_t i0_BP __asm__ ("r14");
register uint64_t i0_SP __asm__ ("r15");

void test(void) {
    *((uint64_t*) (i0_SP - 8)) = i0_BP;
    i0_BP = i0_SP - 0x8;
    i0_SP -= 0x100;
    i0_SP = i0_BP;
    i0_BP = *((uint64_t*) i0_SP);
    i0_SP += 0x8;
    return;
}

With either the ‘-O3’ or the ‘-Os’ option, gcc produces the same object
code:

<test>:
   0:   lea    0xfff8(%r15),%rcx
   4:   mov    %r14,%rdx
   7:   mov    %r15,%rax
   a:   mov    %r14,0xfff8(%r15)
   e:   mov    %rcx,%r14
  11:   mov    %rcx,%r15
  14:   mov    %rdx,%r14
  17:   mov    %rax,%r15
  1a:   retq

Here we just try to emulate a frame establishment. The object file contains
several apparently redundant ‘mov’s between registers: r14 and r15 are
copied to rdx and rax, overwritten, and then restored. This seems to be a
bug in gcc, since we have already applied the highest optimization level
available.

Environment:

On CentOS 5.10 (Linux 2.6.18 x86_64) using GCC 4.8.2

Using built-in specs.
COLLECT_GCC=gcc4
COLLECT_LTO_WRAPPER=/usr/local/GNU/gcc-4.8.2/libexec/gcc/x86_64-unknown-linux-gnu/4.8.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.8.2/configure --prefix=/usr/local/GNU/gcc-4.8.2
--enable-clocale=generic
Thread model: posix
gcc version 4.8.2 (GCC)

[Bug target/60480] gcc 4.8.2 fails to do optimization on global register variables when compiling on x86_64 Linux.

2014-03-10 Thread ganboing at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60480

--- Comment #2 from ganboing at gmail dot com ---
(In reply to Andrew Pinski from comment #1)
> This is due to x86 being a small register class target.

The thing is that x86_64 has 16 GPRs, and registers r12-r15 are preserved
across function calls (SysV ABI for x86_64). There should be no reason that
such an optimization can't be done.