https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65847
Bug ID: 65847
Summary: SSE2 code for adding two structs is much worse at -O3 than at -O2
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jay.foad at gmail dot com
On x86_64 I get decent code at -O2:
$ cat zplus.c
typedef struct { double a, b; } Z;
Z zplus(Z x, Z y) { return (Z){ x.a + y.a, x.b + y.b }; }
$ gcc -O2 -S -o - zplus.c
...
zplus:
.LFB0:
.cfi_startproc
addsd %xmm3, %xmm1
addsd %xmm2, %xmm0
ret
.cfi_endproc
.LFE0:
...
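but awful code at -O3, where the arguments are spilled to the stack, added with a single addpd, and the result is reloaded through memory: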
$ gcc -O3 -S -o - zplus.c
...
zplus:
.LFB0:
.cfi_startproc
movq %xmm0, -40(%rsp)
movq %xmm1, -32(%rsp)
movq %xmm2, -56(%rsp)
movq %xmm3, -48(%rsp)
movupd -40(%rsp), %xmm1
movupd -56(%rsp), %xmm0
addpd %xmm0, %xmm1
movaps %xmm1, -72(%rsp)
movq -72(%rsp), %rdx
movq -64(%rsp), %rax
movq %rdx, -72(%rsp)
movsd -72(%rsp), %xmm0
movq %rax, -72(%rsp)
movsd -72(%rsp), %xmm1
ret
...
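For readers comparing the two listings, here is a rough C sketch of what each one computes, written with SSE2 intrinsics. It is only an illustration inferred from the assembly above, not what GCC generates internally; the names zplus_scalar and zplus_packed are made up for this example, and it assumes an x86 target with <emmintrin.h> available.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86/x86_64 only */

typedef struct { double a, b; } Z;

/* Scalar form: one add per field, as in the two addsd
   instructions of the -O2 output. */
Z zplus_scalar(Z x, Z y)
{
    return (Z){ x.a + y.a, x.b + y.b };
}

/* Packed form (hypothetical helper): roughly the shape of the -O3
   output. Each struct is loaded from memory as one 128-bit vector,
   the two vectors are added with a single packed add (addpd), and
   the result is stored back to memory before being returned. */
Z zplus_packed(Z x, Z y)
{
    __m128d vx = _mm_loadu_pd((const double *)&x);  /* x.a, x.b */
    __m128d vy = _mm_loadu_pd((const double *)&y);  /* y.a, y.b */
    Z r;
    _mm_storeu_pd((double *)&r, _mm_add_pd(vx, vy));
    return r;
}
```

Both functions return the same values. The packed form saves one add but, because the arguments arrive and the result leaves in separate registers, it pays for the extra trips through memory seen in the -O3 listing.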
I see similarly bad code from several versions of GCC, starting around
version 4.8:
gcc-4.8 (Ubuntu 4.8.3-12ubuntu3) 4.8.3
gcc (Ubuntu 4.9.1-16ubuntu6) 4.9.1
gcc (GCC) 6.0.0 20150422 (experimental)