Richard Biener <rguent...@suse.de> writes:

> On Thu, 2 Sep 2021, Jiufu Guo wrote:
>
>> When transform
>>   {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step}
>> to
>>   {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0}
>> 
>> There would be error if 'iv0.step - iv1.step' in negative,
>> for which means run until wrap/overflow.
>> 
>> For example:
>>    {1, +, 1} <= {4, +, 3} => {1, +, -2} <= {4, +, 0}
>> 
>> This patch avoid this kind transform.
>> 
>> Bootstrap and regtest pass on ppc64le.
>> Is this ok for trunk?
>
> This looks like PR100740, see the discussion starting at
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571570.html
Oh, thanks!
>
> We seem to be at a dead end figuring what's exactly required
> to make the transform valid and I have my doubts that arguing
> with overflow we're not running in circles.
>
> My last attempt was
>
> +      if (code != NE_EXPR)
> +       {
> +         if (TREE_CODE (step) != INTEGER_CST)
> +           return false;
> +         if (!iv0->no_overflow || !iv1->no_overflow)
> +           return false;
> +         /* The new IV does not overflow if the step has the same
> +            sign and is less in magnitude.  */
> +         if (tree_int_cst_sign_bit (iv0->step) != tree_int_cst_sign_bit 
> (step)
> +             || wi::geu_p (wi::abs (wi::to_wide (step)),
> +                           wi::abs (wi::to_wide (iv0->step))))
> +           return false;
> +       }
>
> where your patch at least misses { 1, +, -1 } < { -3, + , -3 }
> transforming to { 1, +, +2 } < -3, thus a positive step but
> we're still iterating unti wrap?  That is, the sign of the
> combined step cannot really be the issue at hand.
hmm, this transform may be invalid for a few cases.
If the "iv0->step - iv1->step" is possitive, there may be some
cases are not valid on LE/LT, like the example you said (or
{0, +, 5} < {MAX - 30, 3}, iv0 walks faster, but iv1 overflow
early).
Also it maybe circles or inifinite occur on NE cases.

Similary with this patch to check if combined STEP is negative,
I find in the links, there is also talking about 
'if (tree_int_cst_lt (iv0->step, iv1->step))'
For this kind of cases LE/LT, the transformation seems always
invalid.  right?

>
> Bin argued my condition is too strict and I agreed, see
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574334.html
> which is the last mail in the thread sofar (still without a fix :/)
>
> Somewhere it was said that we eventually should simply put
> preconditions into ->assumptions rather than trying to
> check ->no_overflow and friends, not sure if that's really
> applicable here.
Yeap, adding to ->assumptions could help some kind of cases,
the assumption may include "checking on iv0.base/iv1.base".

Like the example "{0, +, 5} < {MAX - 30, 3}", it become
"{0, +, 2} < {MAX - 30, 0}", this calls number_of_iterations_lt
and get a niter, which is not same with the real niter
(the original real niter maybe 10: 30/3 or less if iv1 does not
overflow). 
For, "{base0, +, 5} < {base1, 3}", we may get a niter from
"{base0, +, 0} < {base1, 3}", and then use 'assumption' which
checking if the niter is valid.

And for "NE_EXPR", the assumption may be more complicate
on possible circles/inifinite loops.

BR.
Jiufu

>
> Richard.
>
>> BR.
>> Jiufu
>> 
>> gcc/ChangeLog:
>> 
>> 2021-09-02  Jiufu Guo  <guoji...@linux.ibm.com>
>> 
>>      PR tree-optimization/102131
>>      * tree-ssa-loop-niter.c (number_of_iterations_cond):
>>      Avoid transform until wrap condition
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> 2021-09-02  Jiufu Guo  <guoji...@linux.ibm.com>
>> 
>>      PR tree-optimization/102131
>>      * gcc.dg/pr102131.c: New test.
>> 
>> ---
>>  gcc/tree-ssa-loop-niter.c       | 18 +++++++++
>>  gcc/testsuite/gcc.dg/pr102131.c | 69 +++++++++++++++++++++++++++++++++
>>  2 files changed, 87 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/pr102131.c
>> 
>> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
>> index 7af92d1c893..52ce517af89 100644
>> --- a/gcc/tree-ssa-loop-niter.c
>> +++ b/gcc/tree-ssa-loop-niter.c
>> @@ -1866,6 +1866,24 @@ number_of_iterations_cond (class loop *loop,
>>            || !iv0->no_overflow || !iv1->no_overflow))
>>      return false;
>>  
>> +      /* GT/GE has been transformed to LT/LE already.
>> +    cmp_code could be LT, LE or NE
>> +
>> +    For LE/LT transform
>> +    {iv0.base, iv0.step} LT/LE {iv1.base, iv1.step}
>> +    to
>> +    {iv0.base, iv0.step - iv1.step} LT/LE {iv1.base, 0}
>> +    Negative iv0.step - iv1.step means decreasing until wrap,
>> +    then the transform is not accurate.
>> +
>> +    For example:
>> +    {1, +, 1} <= {4, +, 3}
>> +    is not same with
>> +    {1, +, -2} <= {4, +, 0}
>> +       */
>> +      if ((code == LE_EXPR || code == LT_EXPR) && tree_int_cst_sign_bit 
>> (step))
>> +    return false;
>> +
>>        iv0->step = step;
>>        if (!POINTER_TYPE_P (type))
>>      iv0->no_overflow = false;
>> diff --git a/gcc/testsuite/gcc.dg/pr102131.c 
>> b/gcc/testsuite/gcc.dg/pr102131.c
>> new file mode 100644
>> index 00000000000..0fcfaa132da
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr102131.c
>> @@ -0,0 +1,69 @@
>> +/* { dg-do run } */
>> +
>> +unsigned int a;
>> +int
>> +fun1 ()
>> +{
>> +  unsigned b = 0;
>> +  int c = 1;
>> +  for (; b < 3; b++)
>> +    {
>> +      while (c < b)
>> +    __builtin_abort ();
>> +      for (a = 0; a < 3; a++)
>> +    c++;
>> +    }
>> +  return 0;
>> +}
>> +
>> +unsigned b;
>> +int
>> +fun2 ()
>> +{
>> +  unsigned c = 0;
>> +  for (a = 0; a < 2; a++)
>> +    for (b = 0; b < 2; b++)
>> +      if (++c < a)
>> +    __builtin_abort ();
>> +  return 0;
>> +}
>> +
>> +int
>> +fun3 (void)
>> +{
>> +  unsigned a, b, c = 0;
>> +  for (a = 0; a < 10; a++)
>> +    {
>> +      for (b = 0; b < 2; b++)
>> +    {
>> +      c++;
>> +      if (c < a)
>> +        {
>> +          __builtin_abort ();
>> +        }
>> +    }
>> +    }
>> +  return 0;
>> +}
>> +
>> +int
>> +fun4 ()
>> +{
>> +  unsigned i;
>> +  for (i = 0; i < 3; ++i)
>> +    {
>> +      if (i > i * 2)
>> +    __builtin_abort ();
>> +    }
>> +  return 0;
>> +}
>> +
>> +int
>> +main ()
>> +{
>> +  fun1 ();
>> +  fun2 ();
>> +  fun3 ();
>> +  fun4 ();
>> +  return 0;
>> +}
>> 

Reply via email to