[Bug tree-optimization/79460] gcc fails to optimise out a trivial additive loop for seemingly arbitrary numbers of iterations

amker at gcc dot gnu.org Mon, 13 Feb 2017 03:33:24 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79460


--- Comment #5 from amker at gcc dot gnu.org ---
(In reply to Jakub Jelinek from comment #4)
> (In reply to Richard Biener from comment #3)
> > In this case it is complete unrolling that can estimate the non-vector code
> > to constant fold but not the vectorized code.  OTOH it's quite excessive
> > work done by the unroller when doing this for large N...
> > 
> > And yes, SCEV final value replacement doesn't know how to handle float
> > reductions
> > (we have a different PR for that).
> 
> Doesn't handle float reductions nor vector (integer or vector) reductions.
> Even the vector ones would be useful, if e.g. to a vector every iteration
> adds a VECTOR_CST or similar, then it could be still nicely optimized.
The integer version should already be supported now.

> 
> For the 202 case, it seems we are generating a scalar loop epilogue (not
> needed for 200) and somehow it seems something in the vector is actually
> able to figure out the floating point final value, because we get:
>   # p_2 = PHI <2.01e+2(5), p_12(7)>
>   # i_3 = PHI <200(5), i_13(7)>
> on the scalar loop epilogue.  So if something in the vectorizer is able to
> figure it out, why can't it just use that even in the case where no epilogue
> loop is needed?
IIUC, scev-ccp should be turned into a query-based interface so that it can be
called for each loop-closed PHI at different compilation stages.  It also needs
to be extended to cover basic floating-point cases like this one.  Effectively,
it needs to do the same transformation as the vectorizer does now; I just think
scev-ccp might be a better place to do that.
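
To make the floating-point case concrete, here is a minimal sketch in C.  The
loop shape is an assumption reconstructed from the PHI values quoted above
(p is 201 when i is 200, so p presumably starts at 1.0 and grows by 1.0 per
iteration); the function names are hypothetical and this is not the testcase
from the PR.  The closed form is the kind of value a final value replacement
could produce, and it only matches the loop as long as every intermediate sum
is exactly representable in float.

```c
#include <assert.h>

/* Assumed shape of the reduction: start at 1.0f, add 1.0f n times.  */
float loop_sum (int n)
{
  float p = 1.0f;
  for (int i = 0; i < n; i++)
    p += 1.0f;
  return p;
}

/* Closed form for the final value.  It equals the loop result only while
   each partial sum is an integer below 2^24, i.e. exactly representable
   in float, so no rounding happens in any iteration.  */
float closed_form (int n)
{
  return 1.0f + (float) n;
}

/* Compare both versions over a small range of trip counts, including
   the 200 and 202 cases mentioned in this PR.  */
void check_small_trip_counts (void)
{
  for (int n = 0; n <= 1000; n++)
    assert (loop_sum (n) == closed_form (n));
}
```

For much larger trip counts the two versions can differ, because the loop
rounds on every addition, so any such replacement has to prove exactness (or
be allowed to reassociate) first.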
