https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115841
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2024-07-16
             Status|UNCONFIRMED                   |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Reduced testcase, fails with -Ofast -mavx512vl -mtune=znver4 --param
vect-partial-vector-usage=1 -fcommon (-fcommon is the key from fortran so we
can't re-align xl).

For an arch with "proper" costs (aligned loads cheaper) one would swap the
static/not static but even with -fno-vect-cost-model which should make three
loads aligned here it doesn't reproduce.

unsigned char xl[192];
static unsigned char A170[192*3];
void jerate (unsigned char *, unsigned char *);
float foo (unsigned n)
{
  jerate (xl, A170);

  unsigned i = 32;
  int kr = 1;
  float sfn11s = 0.f;
  float sfn12s = 0.f;
  do
    {
      int krm1 = kr - 1;
      long j = krm1;
      float a = (*(float(*)[n])A170)[j];
      float b = (*(float(*)[n])xl)[j];
      float c = a * b;
      float d = c * 6.93149983882904052734375e-1f;
      float e = (*(float(*)[n])A170)[j+48];
      float f = (*(float(*)[n])A170)[j+96];
      float g = d * e;
      sfn11s = sfn11s + g;
      float h = f * d;
      sfn12s = sfn12s + h;
      kr++;
    }
  while (--i != 0);
  float tem = sfn11s + sfn12s;
  return tem;
}
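
The testcase above leaves jerate undefined, so it cannot be run as posted. The following is only a sketch of a possible runtime driver, not something taken from the bug report: the jerate body, the fill values (1.0f, 2.0f, 3.0f), the helper names (store_float, load_float, foo_ref, close_enough, check_result) and the 1e-3 relative tolerance are all assumptions made for illustration. It recomputes the same sums with a plain scalar loop so a vectorized result can be compared against it.

```c
#include <assert.h>
#include <string.h>

/* Multiplier copied from the reduced testcase. */
#define K 6.93149983882904052734375e-1f

/* Hypothetical helpers: access float slot i of a byte buffer via memcpy. */
static void store_float (unsigned char *buf, long i, float v)
{
  memcpy (buf + i * sizeof (float), &v, sizeof (float));
}

static float load_float (const unsigned char *buf, long i)
{
  float v;
  memcpy (&v, buf + i * sizeof (float), sizeof (float));
  return v;
}

/* Hypothetical jerate: the report does not show its body.  Fills the 48
   floats of x with 1.0f and the three 48-float thirds of a with 1.0f,
   2.0f and 3.0f.  */
void jerate (unsigned char *x, unsigned char *a)
{
  for (long i = 0; i < 48; i++)
    store_float (x, i, 1.0f);
  for (long i = 0; i < 144; i++)
    store_float (a, i, (float) (1 + i / 48));
}

/* Scalar reference for the loop in foo; the volatile accumulators are
   meant to discourage vectorization and reassociation.  */
float foo_ref (const unsigned char *x, const unsigned char *a)
{
  volatile float s11 = 0.f, s12 = 0.f;
  for (long j = 0; j < 32; j++)
    {
      float d = load_float (a, j) * load_float (x, j) * K;
      s11 = s11 + d * load_float (a, j + 48);
      s12 = s12 + load_float (a, j + 96) * d;
    }
  return s11 + s12;
}

/* Assumed tolerance: -Ofast may reorder the additions, so compare
   with a relative error bound instead of exact equality.  */
int close_enough (float got, float want)
{
  float diff = got > want ? got - want : want - got;
  float mag = want < 0.f ? -want : want;
  return diff <= 1e-3f * mag;
}

/* Regenerate the input in local buffers and compare against the
   reference; asserts if the result is off.  */
void check_result (float got)
{
  unsigned char x[192], a[192 * 3];
  jerate (x, a);
  assert (close_enough (got, foo_ref (x, a)));
}
```

A driver linked against the testcase could then call something like check_result (foo (128)) from main, with the testcase built using the flags from the comment. The value 128 is an arbitrary choice, since n appears only in the array types of the casts. Whether this particular input actually exposes the failure has not been verified.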