https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81196

--- Comment #3 from amker at gcc dot gnu.org ---
(In reply to Richard Biener from comment #1)
> Probably some more elaborate handling in number_of_iterations_cond is
> required:
> 
>   /* We can handle the case when neither of the sides of the comparison is
>      invariant, provided that the test is NE_EXPR.  This rarely occurs in
>      practice, but it is simple enough to manage.  */
>   if (!integer_zerop (iv0->step) && !integer_zerop (iv1->step))
>     {
>       tree step_type = POINTER_TYPE_P (type) ? sizetype : type;
>       if (code != NE_EXPR)
>         return false;
> 
>       iv0->step = fold_binary_to_constant (MINUS_EXPR, step_type,
>                                            iv0->step, iv1->step);
>       iv0->no_overflow = false;
>       iv1->step = build_int_cst (step_type, 0);
>       iv1->no_overflow = true;
>     }
> 
> I think this exit is premature and the following works for the testcase.
> I suppose exiting is still required but can be moved to a later point,
> or the helpers now fully handle the case of non-constant iv1 ...
> Vectorization still fails with this due to runtime aliasing so it
> probably exposes some wrong-code issue.  CCing Bin who is now most
> familiar with the niter code.
> 
> Index: gcc/tree-ssa-loop-niter.c
> ===================================================================
> --- gcc/tree-ssa-loop-niter.c   (revision 249638)
> +++ gcc/tree-ssa-loop-niter.c   (working copy)
> @@ -1674,14 +1674,14 @@ number_of_iterations_cond (struct loop *
>    if (!integer_zerop (iv0->step) && !integer_zerop (iv1->step))
>      {
>        tree step_type = POINTER_TYPE_P (type) ? sizetype : type;
> -      if (code != NE_EXPR)
> -       return false;
> -
> -      iv0->step = fold_binary_to_constant (MINUS_EXPR, step_type,
> -                                          iv0->step, iv1->step);
> -      iv0->no_overflow = false;
> -      iv1->step = build_int_cst (step_type, 0);
> -      iv1->no_overflow = true;
> +      if (code == NE_EXPR)
> +       {
> +         iv0->step = fold_binary_to_constant (MINUS_EXPR, step_type,
> +                                              iv0->step, iv1->step);
> +         iv0->no_overflow = false;
> +         iv1->step = build_int_cst (step_type, 0);
> +         iv1->no_overflow = true;
> +       }
No, at the very least we would also need to adjust for other codes such as
LT_EXPR/LE_EXPR.  The code that follows cannot handle a comparison in which
both sides are IVs with non-zero steps.  I guess folding the two steps into
one is only valid for NE_EXPR; for other codes it can give a wrong result
because of wrapping behavior.  Consider the test below:

unsigned int i = 0xfffffff0, j=0xfffffff8;
for (; i < j; i++, j+=2)
This iterates only 4 times before j wraps to 0.  It is not equivalent to:
unsigned int i = 0xfffffff0, j=0xfffffff8;
for (; i < j; i--)
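
The difference can be checked with a minimal sketch.  It is scaled down to
16 bits (0xfff0/0xfff8 instead of 0xfffffff0/0xfffffff8) so that the
long-running loops finish quickly; the uint16_t substitution and all function
names are assumptions made for illustration and do not come from the GCC
sources.  It also counts the NE_EXPR variants, where folding the steps should
be harmless because equality is preserved under modular arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit stand-ins for the 32-bit start values in the example above.  */
#define I_START ((uint16_t) 0xfff0)
#define J_START ((uint16_t) 0xfff8)

/* LT, original shape: both sides of the comparison are moving IVs.  */
unsigned long
count_lt_both_moving (void)
{
  uint16_t i = I_START, j = J_START;
  unsigned long n = 0;
  for (; i < j; i++, j += 2)
    n++;
  return n;
}

/* LT, folded shape: i gets step 1 - 2 = -1 and j becomes invariant.  */
unsigned long
count_lt_folded (void)
{
  uint16_t i = I_START, j = J_START;
  unsigned long n = 0;
  for (; i < j; i--)
    n++;
  return n;
}

/* NE, original shape.  */
unsigned long
count_ne_both_moving (void)
{
  uint16_t i = I_START, j = J_START;
  unsigned long n = 0;
  for (; i != j; i++, j += 2)
    n++;
  return n;
}

/* NE, folded shape.  */
unsigned long
count_ne_folded (void)
{
  uint16_t i = I_START, j = J_START;
  unsigned long n = 0;
  for (; i != j; i--)
    n++;
  return n;
}

/* Hypothetical helper: folding keeps the count for NE but not for LT.  */
void
check_iteration_counts (void)
{
  assert (count_ne_both_moving () == count_ne_folded ());
  assert (count_lt_both_moving () != count_lt_folded ());
}
```

With these start values the LT loop with both IVs moving runs 4 times, while
the folded LT loop runs until i wraps below zero; the two NE loops agree.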

The tricky part is to identify safe cases.  I will try to improve this.

Thanks.
