https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713
--- Comment #24 from Chris Elrod <elrodc at gmail dot com> ---
The dump looks like this:

  vect__67.78_217 = SQRT (vect__213.77_225);
  vect_ui33_68.79_248 = { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 } / vect__67.78_217;
  vect__71.80_249 = vect__246.59_65 * vect_ui33_68.79_248;
  vect_u13_73.81_250 = vect__187.71_14 * vect_ui33_68.79_248;
  vect_u23_75.82_251 = vect__200.74_5 * vect_ui33_68.79_248;

so the vrsqrt optimization happens later. g++ shows the same problems with
weird code generation. However this:

  /* sqrt(a)  = -0.5 * a * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0)
     rsqrt(a) = -0.5     * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0) */

does not match this:

        vrsqrt14ps      %zmm1, %zmm2 # comparison and mask removed
        vmulps  %zmm1, %zmm2, %zmm0
        vmulps  %zmm2, %zmm0, %zmm1
        vmulps  %zmm6, %zmm0, %zmm0
        vaddps  %zmm7, %zmm1, %zmm1
        vmulps  %zmm0, %zmm1, %zmm1
        vrcp14ps        %zmm1, %zmm0
        vmulps  %zmm1, %zmm0, %zmm1
        vmulps  %zmm1, %zmm0, %zmm1
        vaddps  %zmm0, %zmm0, %zmm0
        vsubps  %zmm1, %zmm0, %zmm0

Recommendations on the next place to look for what's going on?