https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2012-10-06 07:54:57         |2021-2-11

--- Comment #30 from Richard Biener <rguenth at gcc dot gnu.org> ---
For the non-reduced testcase the problem is (still) that there is no grouped
store, the only stores left at the point of vectorization are

  grid(i,j,k) = grid(i,j,k) + s01
  grid(i,j2,k) = grid(i,j2,k) + s03
  grid(i,j,k2) = grid(i,j,k2) + s02
  grid(i,j2,k2) = grid(i,j2,k2) + s04

so the coef_xy and coef_x arrays are completely elided.  And the above
stores are not contiguous.  The approaches to start from arbitrary seeds
with SLP vectorization would eventually help here (likewise of course
starting from the loads which is something that's brought up at some
points).
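
To illustrate why the four grid stores are not contiguous, here is a minimal C sketch that computes where such elements land in memory. It relies on Fortran storing arrays in column-major order, so only the first index moves between adjacent elements, while the four stores differ in their second and third indices. The helper names, the extents n1 and n2, and any concrete values chosen for j, j2, k and k2 are hypothetical; the comment does not give the real values from the testcase, and this is not how GCC itself performs the check.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based linear offset of element (i,j,k) in a column-major (Fortran-order)
   array whose first two extents are n1 and n2 (hypothetical helper). */
static size_t col_major_offset(size_t i, size_t j, size_t k,
                               size_t n1, size_t n2)
{
    assert(i < n1 && j < n2);
    return i + n1 * (j + n2 * k);
}

/* Returns 1 if the n offsets are distinct and cover n adjacent elements
   (in any order), else 0. */
static int offsets_are_contiguous(const size_t *offs, size_t n)
{
    size_t lo, hi;

    assert(n > 0);
    lo = hi = offs[0];
    for (size_t a = 0; a < n; ++a) {
        for (size_t b = a + 1; b < n; ++b)
            if (offs[a] == offs[b])
                return 0;               /* duplicates cannot form a group */
        if (offs[a] < lo) lo = offs[a];
        if (offs[a] > hi) hi = offs[a];
    }
    /* n distinct offsets are adjacent exactly when they span n - 1 elements */
    return hi - lo == n - 1;
}

/* Checks the four store targets grid(i,j,k), grid(i,j2,k), grid(i,j,k2),
   grid(i,j2,k2); all index values are assumptions made for illustration. */
static int grid_stores_contiguous(size_t i, size_t j, size_t j2,
                                  size_t k, size_t k2,
                                  size_t n1, size_t n2)
{
    const size_t offs[4] = {
        col_major_offset(i, j,  k,  n1, n2),
        col_major_offset(i, j2, k,  n1, n2),
        col_major_offset(i, j,  k2, n1, n2),
        col_major_offset(i, j2, k2, n1, n2),
    };
    return offsets_are_contiguous(offs, 4);
}
```

With any first extent n1 greater than 1, stores that differ only in j are at least n1 elements apart and those that differ in k are at least n1*n2 apart, so grid_stores_contiguous returns 0 and the four stores cannot be treated as one grouped store.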