https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62283
--- Comment #1 from Henrik Holst <holst at matmech dot com> ---
(forgot to indent the end statement above.)

The expected assembler code should be something like:

        movaps  %xmm0, %xmm1
        movups  (%rdi), %xmm0
        shufps  $0, %xmm1, %xmm1
        movups  (%rsi), %xmm2
        mulps   %xmm1, %xmm0
        addps   %xmm2, %xmm0
        movups  %xmm0, (%rsi)
        ret
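
For readers who prefer intrinsics to assembler, the sequence above can be read as a four-element single-precision a*x + y update. The following is only a sketch of that reading, not the test case from this report: the function name `saxpy4` and its C signature are hypothetical, and it assumes the System V x86-64 convention, where the scalar arrives in %xmm0 and the two pointers in %rdi and %rsi.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper (name and signature are assumptions):
   computes y[i] = a * x[i] + y[i] for i = 0..3. */
void saxpy4(float a, const float *x, float *y)
{
    __m128 va = _mm_set1_ps(a);   /* broadcast a to all four lanes (cf. movaps + shufps $0) */
    __m128 vx = _mm_loadu_ps(x);  /* unaligned load of x[0..3]     (cf. movups (%rdi)) */
    __m128 vy = _mm_loadu_ps(y);  /* unaligned load of y[0..3]     (cf. movups (%rsi)) */

    vy = _mm_add_ps(_mm_mul_ps(vx, va), vy);  /* a*x + y           (cf. mulps, addps) */

    _mm_storeu_ps(y, vy);         /* unaligned store back to y     (cf. movups to (%rsi)) */
}
```

The exact instructions and register allocation a compiler produces for this function may differ from the listing; the comments only indicate which step each intrinsic is meant to correspond to.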