https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123225

--- Comment #13 from Victor Do Nascimento <victorldn at gcc dot gnu.org> ---
> So rather than restricting to PGO we could just handle the cases above and
> restrict uncounted loops to cases that don't require a forced epilogue.
Forgive my ignorance here, but surely we are talking about two separate (though
closely related) problems...

If we can elide the epilogue, I understand we are definitely making the
vectorized code cheaper to execute (and smaller, improving the resulting
code-size), but surely we still need to make sure we get costing right, no?

No expensive epilogue will mean the loop becomes profitable faster, yes, but we
still need to either:

1. know whether we will execute enough iterations to reach that profitability
threshold (which is where the PGO idea comes in) or
2. ensure we have a conservative enough assumption about min iterations (e.g.
going back to Richi's idea that the vectorized loop should be no more expensive
than 2 scalar iterations) so that we always reject loops that will need too
many iterations for profitability.

The idea I have been working with was that we effectively apply both approaches
above:
1. Use PGO info when available or
2. Apply the very conservative cost requirement when PGO data is not available.

And have the epilogue elision simply ensure that more of our vectorized loops
pass these cost tests... Am I wrong in my thinking?
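
To make the two-pronged check concrete, here is a rough sketch in C. Everything
in it is hypothetical: the struct, the field names and the function names are
made up for illustration and are not the vectorizer's actual cost-model API.
It also assumes one particular reading of the conservative rule, namely that
setup overhead plus one vector iteration must cost no more than two scalar
iterations.

```c
#include <stdbool.h>

/* Hypothetical cost summary for one loop; not a GCC data structure.  */
struct loop_cost
{
  unsigned scalar_iter_cost;  /* cost of one scalar iteration */
  unsigned vector_iter_cost;  /* cost of one vector iteration */
  unsigned vector_overhead;   /* setup plus epilogue cost; shrinks when the
                                 epilogue can be elided */
  unsigned vf;                /* vectorization factor */
};

/* Cost of running NITERS iterations in the scalar loop.  */
static unsigned long
scalar_cost (const struct loop_cost *c, unsigned long niters)
{
  return niters * c->scalar_iter_cost;
}

/* Cost of covering NITERS scalar iterations with the vector loop,
   rounding the number of vector iterations up.  */
static unsigned long
vector_cost (const struct loop_cost *c, unsigned long niters)
{
  unsigned long vec_iters = (niters + c->vf - 1) / c->vf;
  return c->vector_overhead + vec_iters * c->vector_iter_cost;
}

/* Decide whether to vectorize an uncounted loop.
   1. With profile data, compare costs at the expected iteration count.
   2. Without it, apply the conservative rule (assumed reading): overhead
      plus one vector iteration must not exceed two scalar iterations.  */
static bool
uncounted_loop_profitable_p (const struct loop_cost *c, bool have_profile,
                             unsigned long expected_niters)
{
  if (have_profile)
    return vector_cost (c, expected_niters)
           <= scalar_cost (c, expected_niters);

  return vector_cost (c, 1) <= scalar_cost (c, 2);
}
```

With made-up costs of 4 per scalar iteration and 6 per vector iteration, an
overhead of 10 fails the conservative test (16 > 8), while an overhead of 2,
as one might get with the epilogue elided, passes it (8 <= 8). That is the
sense in which eliding the epilogue lets more loops through without replacing
the cost test itself.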
