Richard Biener <[email protected]> writes:
> On Thu, 13 Jan 2022, Jiufu Guo wrote:
...
>
>> - /* No need to check sign of the new step since below code takes care
>> - of this well. */
>> + /* Like cases shown in PR100740/102131, negtive step is not safe. */
>> + if (tree_int_cst_sign_bit (step))
>> + return false;
>> +
>> if (code != NE_EXPR
>> && (TREE_CODE (step) != INTEGER_CST
>> || !iv0->no_overflow || !iv1->no_overflow))
>
> I think for NE_EXPR the sign is not relevant. I think the key is
> that we require that iv0->no_overflow and for non-equality checks
> adjusting X + C1 < Y + C2 to X + C1 - C2 < Y is only valid if that
> does not cause any overflow on either side. On the LHS this is
Hi Richard,
Thanks a lot for your comments and ideas!
Right! The adjusting is safe only if we can make sure there is
no overflow/wrap on either side or say there is no overflow/wrap
on three 'iv's: {X,C1}, {Y,C2} and {X, C1 - C2}.
Or it may also ok if we can compute an assumption, under which
all three ivs are not overflowed/wrapped.
> only guaranteed if the absolute value of C1 - C2 is smaller than
> that of C1 and if it has the same sign.
I'm thinking this in another way:
When trying to do this transform in number_of_iterations_cond,
GT/GE is inverted to LT/LE, then the compare code would be:
LT/LE or NE.
For LT/LE, like {X, C1} < {Y, C2}, we can look it as iv0 is
chasing iv1. We would able to assume X < Y (may_be_zero would
be set later via number_of_iterations_lt/le).
1. If C1 < C2, iv0 can never catch up iv1. For examples:
{X, 1} < {Y, 2}; {X, -2} < {Y, -1}; {X, -2} < {Y, 1}.
And there must be at least one overflow/wrap in iv0,iv1, or iv.
This indicates, if the sign of (C1 - C1) is negative, then the
transform would be incorrect.
2. If C1 > C2, we still need to make sure all the ivs (iv0,
iv1 and combined iv) are not wrapped.
For C2 > 0, {Y,C2} should not cross MAX before {X, C1} catch up.
the assumption may like : (MAX-Y)/C2 > (Y-X)/(C1-C1)
For C1 < 0, {X,C1} should not down cross MIN
the assumption may like : (X-MIN)/-C1 > (Y-X)/(C1-C1)
For C1 > 0 and C2 < 0, iv0 and iv1 are walking to each other,
it would be almost safe.
For NE, it seems more interesting. The transformation depends
on 3 things: 1. the relation between X and Y; 2 the sign
of (C1-C2); 3. if iv0 and iv1 can be equal finally.
The 3rd one may be more special.
The good news is, number_of_iterations_ne seems able to handle NE.
>
> With the same reasoning we then know the new IV0 doesn't overflow.
>
> So something like the following. IIRC I've proposed sth similar
> a while back. I'm going to give it some testing, collecting
> testcases from related PRs.
>
> diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
> index b767056aeb0..74fa4f66ee2 100644
> --- a/gcc/tree-ssa-loop-niter.cc
> +++ b/gcc/tree-ssa-loop-niter.cc
> @@ -1890,17 +1890,28 @@ number_of_iterations_cond (class loop *loop,
> tree step = fold_binary_to_constant (MINUS_EXPR, step_type,
> iv0->step, iv1->step);
>
> - /* No need to check sign of the new step since below code takes
> care
> - of this well. */
> - if (code != NE_EXPR
> - && (TREE_CODE (step) != INTEGER_CST
> - || !iv0->no_overflow || !iv1->no_overflow))
> - return false;
> + /* For code other than NE_EXPR we have to ensure moving the
> evolution
> + of IV1 to that of IV0 does not introduce overflow. */
> + if (TREE_CODE (step) != INTEGER_CST
> + || !iv0->no_overflow || !iv1->no_overflow)
> + {
I was also trying to leverage no_overflow of iv0 and iv1. While it seems
the computation logic of no_overflow is related to the type of IV. If the
type of IV is signed, the C semantics may be used, overflow in signed
IV are treated UB, and then no_overflow would be true.
For unsigned IV, no_overflow would be false, even for the cases which
looks like:
"{10, 2} < {20, 1}", which would be ok to compute niter.
BR,
Jiufu
> + if (code != NE_EXPR)
> + return false;
> + iv0->no_overflow = false;
> + }
> + /* If the new step of IV0 has changed sign or is of greater
> + magnitude then we do not know whether IV0 does overflow
> + and thus the transform is not valid for code other than NE_EXPR
> */
> + else if (tree_int_cst_sign_bit (step) != tree_int_cst_sign_bit
> (iv0->step)
> + || wi::gtu_p (wi::abs (wi::to_widest (step)),
> + wi::abs (wi::to_widest (iv0->step))))
> + {
> + if (code != NE_EXPR)
> + return false;
> + iv0->no_overflow = false;
> + }
>
> iv0->step = step;
> - if (!POINTER_TYPE_P (type))
> - iv0->no_overflow = false;
> -
> iv1->step = build_int_cst (step_type, 0);
> iv1->no_overflow = true;
> }
>
>
>> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr100740.c
>> b/gcc/testsuite/gcc.c-torture/execute/pr100740.c
>> new file mode 100644
>> index 00000000000..381cdeb947a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.c-torture/execute/pr100740.c
>> @@ -0,0 +1,13 @@
>> +/* PR tree-optimization/100740 */
>> +
>> +unsigned a, b;
>> +int
>> +main ()
>> +{
>> + unsigned c = 0;
>> + for (a = 0; a < 2; a++)
>> + for (b = 0; b < 2; b++)
>> + if (++c < a)
>> + __builtin_abort ();
>> + return 0;
>> +}
>>