https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65847
--- Comment #2 from Vincent Lefèvre <vincent-gcc at vinc17 dot net> ---
I've just found the same issue. The code is a bit different (here, AFAIK,
this is AVX), but I assume that the cause is the same.

With -O2:

foo:
.LFB0:
        .cfi_startproc
        vaddsd  %xmm3, %xmm1, %xmm1
        vaddsd  %xmm2, %xmm0, %xmm0
        ret
        .cfi_endproc

With -O3:

foo:
.LFB0:
        .cfi_startproc
        vmovq   %xmm0, -40(%rsp)
        vmovq   %xmm1, -32(%rsp)
        vmovapd -40(%rsp), %xmm5
        vmovq   %xmm2, -24(%rsp)
        vmovq   %xmm3, -16(%rsp)
        vaddpd  -24(%rsp), %xmm5, %xmm4
        vmovaps %xmm4, -40(%rsp)
        vmovsd  -32(%rsp), %xmm1
        vmovsd  -40(%rsp), %xmm0
        ret
        .cfi_endproc

Tested with:
gcc (Debian 20190102-1) 9.0.0 20190102 (experimental) [trunk revision 267505]
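
The source that produced these listings is not included in this comment. As a rough sketch only, a function of the following shape would be consistent with the -O2 output under the x86-64 System V calling convention, where a struct of two doubles is passed and returned in a pair of SSE registers (a in %xmm0/%xmm1, b in %xmm2/%xmm3, result in %xmm0/%xmm1). The type name `pair` and the body of `foo` are assumptions made for illustration, not the reporter's actual test case.

```c
/* Hypothetical reproducer: the real test case is not shown in the
   comment, so the type name and the function body are assumptions. */
typedef struct { double x, y; } pair;

/* Component-wise addition of two structs passed by value. */
pair foo(pair a, pair b)
{
    pair r;
    r.x = a.x + b.x;   /* matches vaddsd %xmm2, %xmm0, %xmm0 at -O2 */
    r.y = a.y + b.y;   /* matches vaddsd %xmm3, %xmm1, %xmm1 at -O2 */
    return r;
}
```

Assuming the file is saved as test.c, the two listings could be compared with something like "gcc -O2 -mavx -S test.c" and "gcc -O3 -mavx -S test.c". The -mavx flag is also an assumption, based on the VEX-encoded (v-prefixed) instructions in the output. The exact code generated may differ between GCC versions.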