https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81196
--- Comment #3 from amker at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> Probably some more elaborate handling in number_of_iterations_cond is
> required:
>
>   /* We can handle the case when neither of the sides of the comparison is
>      invariant, provided that the test is NE_EXPR.  This rarely occurs in
>      practice, but it is simple enough to manage.  */
>   if (!integer_zerop (iv0->step) && !integer_zerop (iv1->step))
>     {
>       tree step_type = POINTER_TYPE_P (type) ? sizetype : type;
>       if (code != NE_EXPR)
>         return false;
>
>       iv0->step = fold_binary_to_constant (MINUS_EXPR, step_type,
>                                            iv0->step, iv1->step);
>       iv0->no_overflow = false;
>       iv1->step = build_int_cst (step_type, 0);
>       iv1->no_overflow = true;
>     }
>
> I think this exit is premature and the following works for the testcase.
> I suppose exiting is still required but can be moved to a later point,
> or the helpers now fully handle the case of non-constant iv1 ...
> Vectorization still fails with this due to runtime aliasing so it
> probably exposes some wrong-code issue.  CCing Bin who is now most
> familiar with the niter code.
>
> Index: gcc/tree-ssa-loop-niter.c
> ===================================================================
> --- gcc/tree-ssa-loop-niter.c (revision 249638)
> +++ gcc/tree-ssa-loop-niter.c (working copy)
> @@ -1674,14 +1674,14 @@ number_of_iterations_cond (struct loop *
>    if (!integer_zerop (iv0->step) && !integer_zerop (iv1->step))
>      {
>        tree step_type = POINTER_TYPE_P (type) ? sizetype : type;
> -      if (code != NE_EXPR)
> -        return false;
> -
> -      iv0->step = fold_binary_to_constant (MINUS_EXPR, step_type,
> -                                           iv0->step, iv1->step);
> -      iv0->no_overflow = false;
> -      iv1->step = build_int_cst (step_type, 0);
> -      iv1->no_overflow = true;
> +      if (code == NE_EXPR)
> +        {
> +          iv0->step = fold_binary_to_constant (MINUS_EXPR, step_type,
> +                                               iv0->step, iv1->step);
> +          iv0->no_overflow = false;
> +          iv1->step = build_int_cst (step_type, 0);
> +          iv1->no_overflow = true;
> +        }

No, at the very least we would need to adjust for other codes such as
LT_EXPR/LE_EXPR too.  The code that follows can't handle a comparison in
which both sides are IVs with non-zero steps.  I guess that is why it only
handles NE_EXPR; otherwise it's possible to end up with a wrong result
because of wrapping behavior.  Consider the test below:

  unsigned int i = 0xfffffff0, j = 0xfffffff8;
  for (; i < j; i++, j += 2)

It iterates only 4 times before j wraps to 0.  That is not equivalent to:

  unsigned int i = 0xfffffff0, j = 0xfffffff8;
  for (; i < j; i--)

which is what folding the two steps into iv0 (1 - 2 = -1) would give.
The tricky part is to identify the safe cases.  I will try to improve this.

Thanks.
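For anyone who wants to check the two loops above by hand, here is a minimal standalone sketch (not GCC code). It assumes a 32-bit unsigned type, spelled uint32_t here; the helper names count_both_ivs and count_folded and the cap parameter are made up for illustration. The cap exists only so that the folded loop can be cut short instead of running to completion.

```c
#include <stdint.h>

/* Original form: both sides of the comparison are IVs with non-zero
   step (i advances by 1, j by 2).  Unsigned arithmetic wraps, so j
   eventually becomes smaller than i and the loop exits.  */
static uint64_t
count_both_ivs (uint32_t i, uint32_t j)
{
  uint64_t n = 0;
  for (; i < j; i++, j += 2)
    n++;
  return n;
}

/* Folded form: the step difference 1 - 2 = -1 is applied to i and j is
   treated as invariant.  CAP is a hypothetical safety limit so callers
   can stop early; it is not part of the loop being modelled.  */
static uint64_t
count_folded (uint32_t i, uint32_t j, uint64_t cap)
{
  uint64_t n = 0;
  for (; i < j && n < cap; i--)
    n++;
  return n;
}
```

With i = 0xfffffff0 and j = 0xfffffff8, count_both_ivs returns 4: the fifth update moves j from 0xfffffffe to 0, and 0xfffffff4 < 0 is false. count_folded with the same start values and a cap of 1000 hits the cap; without a cap, i would walk all the way down to 0 (0xfffffff1 iterations) before wrapping to 0xffffffff and failing the test. So for LT_EXPR the folded loop does not have the same iteration count as the original.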