https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115340
--- Comment #4 from rdapp.gcc at gmail dot com ---
> That said - if DR analysis could, say, "force" a particular VF where it
> knows that gaps are closed we might "virtually" unroll this and thus
> detect it as a group of contiguous 16 stores.  Now we'd need to do the
> same virtual unrolling for all other stmts of course.
>
> I think it would be easier if we'd somehow detect this situation beforehand
> and actually perform the unrolling - we might want to do it with a
> if (.LOOP_VECTORIZED (...)) versioning scheme though.  I do wonder how
> common such loops are though.
>
> It might be also possible to override cost considerations of early
> unrolling with -O3 (aka when vectorization is enabled) and when the
> number of iterations matches the gap of related DRs (but as said, it
> looks like a very special thing to do).
>
> That said - I do plan to change the vectorizer from iterating over modes
> to iterating over VFs which means we could perform the unrolling implied
> by the VF on the vectorizer IL (SLP) and (re-)perform group discovery
> afterwards.
>
> For a more general loop we'd essentially apply blocking with the desired
> VF, unroll that blocking loop and apply BB vectorization.
>
> So to make the point - I don't like how handling this special case within
> the current vectorizer framework pays off with the cost this will have
> (I'm not sure it's really feasible to add even).  Instead this looks
> like in need of a vectorization enablement pre-transform to me.

OK, sounds reasonable.  And yeah, I wouldn't claim this kind of loop is
common, it's obviously an x264 thing.  Perhaps in other codecs but I haven't
really checked.

Another thought I had as we already know that SLP handles this more
gracefully:  Would it make sense to "just" defer to BB vectorization and have
loop vectorization not do anything, provided we could detect the pattern with
certainty?  That would still be special casing the situation but potentially
less intrusive than "Hail Mary" unrolling.
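
For illustration only, here is a minimal C sketch of the kind of loop shape being discussed.  It is not the testcase from this PR: the function names, the array layout and the arithmetic are all made up, and the only assumption that matters is that the trip count (4) equals the stride between the stores of one iteration.  In the rolled form each iteration's stores have gaps between them; after unrolling by the trip count the same stores cover 16 contiguous elements.

```c
#include <stdint.h>

#define N 4  /* Hypothetical trip count, assumed equal to the store stride.  */

/* Rolled form: iteration i writes out[i], out[i + 4], out[i + 8] and
   out[i + 12], so the four stores of one iteration are 4 elements apart.  */
void
store_rolled (uint32_t *out, const uint32_t *in)
{
  for (int i = 0; i < N; i++)
    {
      uint32_t a = in[2 * i];
      uint32_t b = in[2 * i + 1];
      out[i] = a + b;
      out[i + N] = a - b;
      out[i + 2 * N] = a ^ b;
      out[i + 3 * N] = a & b;
    }
}

/* The same computation fully unrolled by hand and ordered by destination
   index: the stores now visibly form one group out[0] ... out[15].  */
void
store_unrolled (uint32_t *out, const uint32_t *in)
{
  out[0] = in[0] + in[1];
  out[1] = in[2] + in[3];
  out[2] = in[4] + in[5];
  out[3] = in[6] + in[7];
  out[4] = in[0] - in[1];
  out[5] = in[2] - in[3];
  out[6] = in[4] - in[5];
  out[7] = in[6] - in[7];
  out[8] = in[0] ^ in[1];
  out[9] = in[2] ^ in[3];
  out[10] = in[4] ^ in[5];
  out[11] = in[6] ^ in[7];
  out[12] = in[0] & in[1];
  out[13] = in[2] & in[3];
  out[14] = in[4] & in[5];
  out[15] = in[6] & in[7];
}
```

Both functions write the same 16 values.  The second form is roughly what either early unrolling or the "virtual" unrolling from the quoted text would need to expose before the stores can be seen as a single contiguous group.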