https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120457
--- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #1)
> double __attribute__((noinline,noclone))
> compute_integral (double w_1[18])
> {
>   double A = 0;
>   double t33[2][6] = {{0.0, 0.0, 0.0, 0.0, 0.0, 0.0},
>                       {0.0, 0.0, 0.0, 0.0, 0.0, 0.0}};
>   double t43[2] = {0.0, 0.0};
>   double t31[2][2] = {{1.0, 1.0}, {1.0, 1.0}};
>   double t32[2][3] = {{0.0, 0.0, 1.0}, {0.0, 0.0, 1.0}};
>
>   for (int ip_1 = 0; ip_1 < 2; ++ip_1)
>     {
> #pragma GCC unroll 0
>       for (int i_0 = 0; i_0 < 6; ++i_0)
>         t33[ip_1][i_0] = ((w_1[i_0*3] * t32[ip_1][0])
>           + (w_1[i_0*3+2] * t32[ip_1][2])); --- the loop is
> not vectorized anymore for power

W/o -fno-vect-cost-model, power doesn't vectorize the loop, because it's
suboptimal.
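
For anyone who wants to check this locally, a reduced kernel along the following lines may help. It is a hypothetical sketch, not the test case from this report: the name strided_combine and its parameters are made up for illustration, and it only preserves the stride-3 load pattern of the inner loop quoted above. Building it with -O3 and -fopt-info-vec-missed, once with and once without -fno-vect-cost-model, should indicate whether the cost model is what rejects vectorization on a given target.

```c
/* Hypothetical reduced kernel (assumption: NOT the test case from the
   report).  It keeps only the access pattern of the quoted inner loop:
   stride-3 loads from w, scaled by per-row coefficients.  */
#define N_ROWS 2
#define N_COLS 6

void
strided_combine (const double w[18], double coef[N_ROWS][3],
                 double out[N_ROWS][N_COLS])
{
  for (int ip = 0; ip < N_ROWS; ++ip)
    {
      /* As in the quoted code: ask GCC not to unroll the inner loop.  */
#pragma GCC unroll 0
      for (int i = 0; i < N_COLS; ++i)
        out[ip][i] = (w[i * 3] * coef[ip][0]) + (w[i * 3 + 2] * coef[ip][2]);
    }
}

/* One possible way to compare the vectorizer's decisions with GCC
   (kernel.c is a placeholder file name):
     gcc -O3 -fopt-info-vec-missed -c kernel.c
     gcc -O3 -fno-vect-cost-model -fopt-info-vec-missed -c kernel.c  */
```

With the second command the cost model is disabled, so a loop that is only rejected as unprofitable would be expected to vectorize there and not in the first build.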