https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79460

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> In this case it is complete unrolling that can estimate the non-vector code
> to constant fold but not the vectorized code.  OTOH it's quite excessive
> work done by the unroller when doing this for large N...
> 
> And yes, SCEV final value replacement doesn't know how to handle float
> reductions
> (we have a different PR for that).

It handles neither float reductions nor vector (integer or float) reductions.
Even the vector ones would be useful: if e.g. every iteration adds a VECTOR_CST
or similar to a vector, the final value could still be nicely computed.
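
As a concrete sketch of that vector case (a hypothetical example using the GNU
vector extension, not a testcase from this PR): each iteration adds the same
VECTOR_CST, so the final value is just the initial vector plus the trip count
times the constant:
  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  sum_const (v4si v0)
  {
    v4si v = v0;
    for (int i = 0; i < 1000; i++)
      v += (v4si) { 1, 2, 3, 4 };  /* adds a VECTOR_CST every iteration */
    /* Final value replacement could compute v0 + 1000 * { 1, 2, 3, 4 }.  */
    return v;
  }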

For the 202 case, it seems we are generating a scalar loop epilogue (not needed
for 200), and something in the vectorizer apparently is able to figure out the
floating point final value, because we get:
  # p_2 = PHI <2.01e+2(5), p_12(7)>
  # i_3 = PHI <200(5), i_13(7)>
on the scalar loop epilogue.  So if something in the vectorizer is able to
figure it out, why can't it just use that even in the case where no epilogue
loop is needed?
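
For reference, a hedged guess at the shape of the 200/202 testcase (an
assumption, not copied from the PR): a simple float reduction with a constant
trip count.  With N == 202 the vector loop would cover the first 200
iterations and the scalar epilogue the remaining 2, which is where the PHIs
above would come from:
  #define N 202  /* vs. 200, where no scalar epilogue is needed */

  float
  reduce (void)
  {
    float p = 1.0f;
    for (int i = 0; i < N; i++)
      p += 1.0f;  /* plain float reduction */
    return p;
  }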
