https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713

--- Comment #24 from Chris Elrod <elrodc at gmail dot com> ---
The dump looks like this:

  vect__67.78_217 = SQRT (vect__213.77_225);
  vect_ui33_68.79_248 = { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0,
    1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0,
    1.0e+0 } / vect__67.78_217;
  vect__71.80_249 = vect__246.59_65 * vect_ui33_68.79_248;
  vect_u13_73.81_250 = vect__187.71_14 * vect_ui33_68.79_248;
  vect_u23_75.82_251 = vect__200.74_5 * vect_ui33_68.79_248;

so the vrsqrt optimization happens later. g++ shows the same problems with
weird code generation. However this:

 /* sqrt(a)  = -0.5 * a * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0)
    rsqrt(a) = -0.5     * rsqrtss(a) * (a * rsqrtss(a) * rsqrtss(a) - 3.0) */

does not match this:

        vrsqrt14ps      %zmm1, %zmm2 # comparison and mask removed
        vmulps  %zmm1, %zmm2, %zmm0
        vmulps  %zmm2, %zmm0, %zmm1
        vmulps  %zmm6, %zmm0, %zmm0
        vaddps  %zmm7, %zmm1, %zmm1
        vmulps  %zmm0, %zmm1, %zmm1
        vrcp14ps        %zmm1, %zmm0
        vmulps  %zmm1, %zmm0, %zmm1
        vmulps  %zmm1, %zmm0, %zmm1
        vaddps  %zmm0, %zmm0, %zmm0
        vsubps  %zmm1, %zmm0, %zmm0
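
To make the mismatch concrete, here is a scalar sketch of how the assembly above reads to me. It assumes %zmm6 holds -0.5 and %zmm7 holds -3.0 (the constant loads are not in the listing, so that is a guess). Under that assumption the first six instructions compute the sqrt(a) formula from the comment, and the vrcp14ps tail then takes a Newton step for the reciprocal, r' = 2*r - s*r*r, instead of using the rsqrt(a) formula directly. The function names and the rough() helper are made up for illustration and are not GCC code.

```c
#include <assert.h>

/* Stand-in for the hardware estimates (vrsqrt14ps / vrcp14ps): perturb an
   exact value by a small relative error. Hypothetical helper. */
float rough(float exact) { return exact * 1.0001f; }

/* From the GCC comment, with x ~ rsqrt(a):
   rsqrt(a) = -0.5 * x * (a * x * x - 3.0) */
float rsqrt_refine(float a, float x) { return -0.5f * x * (a * x * x - 3.0f); }

/* From the GCC comment, with x ~ rsqrt(a):
   sqrt(a) = -0.5 * a * x * (a * x * x - 3.0) */
float sqrt_refine(float a, float x) { return -0.5f * a * x * (a * x * x - 3.0f); }

/* Newton step for a reciprocal, with r ~ 1/s: 2*r - s*r*r.
   Mirrors the vrcp14ps, vmulps, vmulps, vaddps, vsubps tail. */
float rcp_refine(float s, float r) { return (r + r) - s * r * r; }

/* What the rsqrt formula alone would give: one estimate, one refinement. */
float one_over_sqrt_direct(float a, float x) {
    assert(a > 0.0f);
    return rsqrt_refine(a, x);
}

/* What the assembly appears to do (assuming zmm6 = -0.5, zmm7 = -3.0):
   refine to sqrt(a) first, then estimate and refine its reciprocal. */
float one_over_sqrt_two_step(float a, float x) {
    assert(a > 0.0f);
    float s = sqrt_refine(a, x);   /* first six instructions */
    float r = rough(1.0f / s);     /* vrcp14ps */
    return rcp_refine(s, r);       /* last four instructions */
}
```

Both paths should land on about the same value of 1/sqrt(a); the two-step one just spends an extra estimate and refinement to get there, which would be consistent with the 1.0 / SQRT form still present in the dump.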

Any recommendations on where to look next to find out what's going on?