https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123225
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Victor Do Nascimento from comment #13)
> > So rather than restricting to PGO we could just handle the cases above and
> > restrict uncounted loops to cases that don't require a forced epilogue.
> Forgive my ignorance here, but surely we are talking about 2 separate
> (though closely-related) problems...
>
> If we can elide the epilogue, I understand we are definitely making the
> vectorized code cheaper to execute (and smaller, improving the resulting
> code-size), but surely we still need to make sure we get costing right, no?

Yes.  Just with the epilog the purported idea of a vector iteration being
cheaper than two scalar iterations does not work out - we'd at least execute
another scalar iteration's worth of work in the epilog.

> No expensive epilogue will mean the loop becomes profitable faster, yes, but
> we still need to either:
>
> 1. know whether we will execute enough iterations to reach that
> profitability threshold (which is where the PGO idea comes in) or
> 2. ensure we have a conservative enough assumption about min iterations
> (e.g. going back to Richi's idea that the vectorized loop should be no more
> expensive than 2 scalar iterations) so that we always reject loops that will
> need too many iterations for profitability.
>
> The idea I have been working with was that we effectively apply both
> approaches above:
> 1. Use PGO info when available or
> 2. Apply the very conservative cost requirement when PGO data not available.
>
> And have the epilog eliding just ensuring more of our vectorized loops pass
> these cost tests... Am I wrong in my thinking?

I think 2 will not work out w/o eliding the epilog (unless you ignore that
we have the epilog in that heuristic).  I'd say 1. works, but we'd need to err
on the safe side (assume the worst) when PGO info is not available.
Meaning it would be very nice if we get the simple cases where we can elide the epilog without extra work done, to validate if 2. works in practice.
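
To make the argument concrete, here is a toy sketch of the conservative check
(heuristic 2) and why a forced epilog breaks it.  This is not the vectorizer's
actual cost model: the function name, the abstract cost units and the
"epilog costs one scalar iteration" assumption are all hypothetical, chosen
only to mirror the reasoning above.

```python
def passes_conservative_check(vec_iter_cost, scalar_iter_cost, forced_epilog):
    """Toy version of 'one vector iteration must cost no more than
    two scalar iterations' (hypothetical, not GCC's real costing).

    Assumption: a forced epilog executes at least one more scalar
    iteration's worth of work after the vector loop exits.
    """
    epilog_cost = scalar_iter_cost if forced_epilog else 0
    # Worst case considered: the loop exits during its first vector iteration.
    return vec_iter_cost + epilog_cost <= 2 * scalar_iter_cost


# Vector iteration costs 3 units, scalar iteration costs 2 units.
print(passes_conservative_check(3, 2, forced_epilog=False))  # True:  3 <= 4
print(passes_conservative_check(3, 2, forced_epilog=True))   # False: 3 + 2 > 4
```

With the epilog counted, the check only passes when a vector iteration is no
more expensive than a single scalar iteration, which is why the heuristic is
only meaningful for loops where the epilog can be elided.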
