In the provided testcase, gcc spills an xmm register onto the stack even though
there is only one register being used. This does not occur with similar code
using general purpose registers.
const void* test(int action, void* ptr)
{
    static void * const addrs[] = {&&L1, &&L2};
    if (action == 0) {
        return addrs;
    } else {
        char* ip = ptr;
        register double reg_f_a;
        double reg_f[1];
        reg_f_a = 0.0;
        reg_f[0] = 0.0;
        goto *ip;
    L1: {
            int t1 = *(int*)(++ip);
            reg_f_a = reg_f_a + reg_f[t1];
            goto *(++ip);
        }
    L2:
        *(double*)ptr = reg_f_a;
    }
    return 0;
}
The above code, compiled with -O3 -march=i686 -msse2 -mfpmath=sse, produces the
following bit of assembly:
    movl    1(%ebx), %eax
    addl    $2, %ebx
    movsd   -32(%ebp), %xmm0
    addsd   -16(%ebp,%eax,8), %xmm0
    movl    %ebx, %eax
    movsd   %xmm0, -32(%ebp)
    jmp     *%eax
The xmm0 register should remain the home register for reg_f_a, so there should
be no need for the store/load. Any other uses of xmm0 should be placed in xmm1
instead. So the output should read:
    movl    1(%ebx), %eax
    addl    $2, %ebx
    addsd   -16(%ebp,%eax,8), %xmm0
    movl    %ebx, %eax
    jmp     *%eax
As a possibly related issue, there is also no reason why a copy of %ebx is made
prior to performing the jump. This could just as easily be:
    movl    1(%ebx), %eax
    addl    $2, %ebx
    addsd   -16(%ebp,%eax,8), %xmm0
    jmp     *%ebx
So, as you can see, three of the seven instructions can be removed, as well
as two of the four memory references.
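
For comparison with the general purpose register case mentioned at the top,
here is a rough sketch of what an integer analogue of the testcase might look
like. This is an assumption for illustration only: the name test_int and the
variables reg_i_a and reg_i are hypothetical, it is not the code actually used
for the comparison, and its generated assembly has not been checked here. Like
the original, it relies on the GNU C labels-as-values extension.

```c
#include <stddef.h>

/* Hypothetical integer analogue of the reported testcase (not taken from
   the report). The double accumulator is replaced by an int so that the
   value would be kept in a general purpose register rather than in xmm0. */
const void* test_int(int action, void* ptr)
{
    static void * const addrs[] = {&&L1, &&L2};
    if (action == 0) {
        /* Hand the label table back to the caller. */
        return addrs;
    } else {
        char* ip = ptr;
        register int reg_i_a;
        int reg_i[1];
        reg_i_a = 0;
        reg_i[0] = 0;
        goto *ip;
    L1: {
            int t1 = *(int*)(++ip);
            reg_i_a = reg_i_a + reg_i[t1];
            goto *(++ip);
        }
    L2:
        *(int*)ptr = reg_i_a;
    }
    return 0;
}
```

Compiling this with the same flags as above and comparing the code around L1
with the xmm version should show whether the accumulator stays in a register.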
The version of gcc is 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
--
Summary: register allocation spills floats needlessly
Product: gcc
Version: 4.2.3
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jstrother9109 at gmail dot com
GCC build triplet: i686-pc-linux-gnu
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37488