https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980
--- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
>
> Now, the testcase shows a missed optimization - we are unnecessarily
> using a large VF because of
>
> t.c:2:21: note: ==> examining phi: ivtmp_21 = PHI <ivtmp_20(7), 8(2)>
> t.c:2:21: note: get vectype for scalar type: unsigned int
> t.c:2:21: note: vectype: vector(8) unsigned int
>
> and this causes the duplication in the first place. If we solve this
> issue the issue will appear less often (but nothing prevents it in
> principle - we just need a "real" reason to have a smaller data type
> involved). I believe we have a bugreport for this already (using
> vector inductions to get the scalar on-exits IVs).

That's PR119860, note that it may still give you the two loads as the
cost model may decide that V2DI isn't as profitable and unroll it to
fully use the load bandwidth before the compare.