[Bug target/125174] [17 Regression] 5% slowdown of tonto on Zen3 since r17-223-ga22b31304e0a1a

rguenth at gcc dot gnu.org via Gcc-bugs Wed, 06 May 2026 04:02:10 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125174


--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I'll note that SIMD clone vector calls are not costed at all ...

-shell2.fppized.f90:971:24: optimized: loop vectorized using 32 byte vectors and unroll factor 4
+shell2.fppized.f90:971:24: optimized: loop vectorized using 16 byte vectors and unroll factor 2
 shell2.fppized.f90:971:24: optimized:  loop versioned for vectorization because of possible aliasing
-shell2.fppized.f90:971:24: optimized: epilogue loop vectorized using 16 byte vectors and unroll factor 2

is one difference in the vectorization report, shown for this loop:

                  do k = 1, k_max
                    k1 = k_x(k);    k2 = k_y(k);    k3 = k_z(k)
                    dot1 = k1*P1+k2*P2+k3*P3
                    dot2 = g4 * (k1*k1+k2*k2+k3*k3)
                    res_ij(k) = res_ij(k) + therm(k) * (fac1 * exp(cmplx(dot2,dot1,kind=kind((1.0d0,1.0d0)))))
                  end do
