https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79359
--- Comment #2 from Raphael C <drraph at gmail dot com> ---
As an additional data point in relation to Part 2 (that is, without
-ffast-math): in gcc 7, -O3 -ffinite-math-only gives

f:
        movq    QWORD PTR [rsp-16], xmm0
        movss   xmm3, DWORD PTR [rsp-12]
        movss   xmm2, DWORD PTR [rsp-16]
        movaps  xmm1, xmm3
        movaps  xmm0, xmm2
        jmp     __mulsc3

whereas in clang trunk it gives

f:                                      # @f
        movaps  xmm1, xmm0
        shufps  xmm1, xmm1, 229         # xmm1 = xmm1[1,1,2,3]
        movaps  xmm2, xmm0
        mulss   xmm2, xmm1
        addss   xmm2, xmm2
        mulss   xmm0, xmm0
        mulss   xmm1, xmm1
        subss   xmm0, xmm1
        unpcklps        xmm0, xmm2      # xmm0 = xmm0[0],xmm2[0],xmm0[1],xmm2[1]
        ret

I am no longer convinced ICC is handling NaN and Inf correctly, so I have
posted a query to their forum. However, it looks like gcc is not optimising
as well as it could when -ffinite-math-only is enabled.
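For reference, here is a rough C sketch of what the clang listing appears to compute. It assumes the test function squares a single complex float, which is what the register usage suggests but is not restated in this comment. The names square_builtin, square_finite and check_square_example are made up for illustration.

```c
#include <assert.h>
#include <complex.h>

/* Assumed shape of the test function: square one complex float.
   Per the listing above, gcc 7 tail-calls __mulsc3 for this. */
float complex square_builtin(float complex x)
{
    return x * x;
}

/* Hand-written equivalent of the clang listing, valid when no
   NaN or Inf is involved: (a + bi)^2 = (a*a - b*b) + (2ab)i. */
float complex square_finite(float complex x)
{
    float a = crealf(x);        /* low lane of xmm0 */
    float b = cimagf(x);        /* lane 1, extracted by shufps */
    float ab = a * b;           /* mulss xmm2, xmm1 */
    float im = ab + ab;         /* addss xmm2, xmm2 */
    float re = a * a - b * b;   /* mulss, mulss, subss */
    return re + im * I;         /* unpcklps packs re and im together */
}

/* Small self-check on a finite input: (3 + 4i)^2 = -7 + 24i. */
void check_square_example(void)
{
    float complex r = square_finite(3.0f + 4.0f * I);
    assert(crealf(r) == -7.0f);
    assert(cimagf(r) == 24.0f);
}
```

For finite inputs the two functions should agree up to rounding; they can only differ in how NaN and Inf results are treated, which is the case -ffinite-math-only lets the compiler ignore.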