On Fri, Aug 16, 2019 at 2:17 PM Robin Dapp <rd...@linux.ibm.com> wrote:
>
> > So - what are you really after? (sorry if I don't remeber, testcase(s)
> > are missing
> > from this patch)
> >
> > To me it seems that 1) loses information if A + CST was done in a signed 
> > type
> > and we know that overflow doesn't happen because of that.  For the reverse
> > transformation we don't.  Btw, if you make A == A' + CST' then
> > you get (T)A + CST -> (T)(A' + CST' + CST) which is again trivially handled
> > so why do you need both transforms again?
>
> As far as I know the second transformation resulted from some suggestion
> to split what was the status back then.  Some things have changed in
> match.pd so I checked again:
>
> What I want to achieve in the end is getting rid of
>  add r1,-1
>  extend r1
>  add r1,1.
> emitted via doloop.
>
> We have
> (1)       (T)(A + CST1) + CST2 -> (T)(A) + (T)CST1 + CST2

Here + CST2 isn't part of the transform - is it just to restrict the
"first half" to a specific subset of cases?

> (2)       (T)(A) + (T)CST1 + CST2 -> (T)(A) + (T)(CST1 + CST2)

Likewise for (T)(A).

> (2better) (A +- CST1) +- CST2 -> A + CST3 (the one Marc mentioned)

But that's (1) with A == (T)(A' + CST1)

> (3)       (T)(A) + CST -> T(A + CST)
>
> Considering
>  unsigned long foo(unsigned int a)
>  {
>    if (a > 3 && a < INT_MAX - 100)
>      return (unsigned long)(a - 1) + 1;

So you want to either compute everything
in unsigned long or everything in unsigned int.
IMHO unsigned int is preferable.  So that
asks for

 (T)A + 1 -> (T)(A + 1)

if T is wider.  We are already performing the reverse
of T is narrower, (T)(A + 1) -> (T)A + 1 (IIRC).

>  }.
>
> This results in s390 assembly
>  ahi    %r2,-1
>  llgfr  %r2,%r2
>  aghi   %r2,1
>  br     %r14.
>
> With the second transform it becomes just
>  br     %r14
> due to
>
> (3) Applying pattern match.pd:2091, gimple-match.c:36515
>     Matching expression match.pd:111, gimple-match.c:65
> (2better) Applying pattern match.pd:1980, gimple-match.c:1011
>
> in evrp.
>
> I believe when I last tried long time ago it would not work that way but
> now it does, so (3) is basically an enabler of (2better).  As far as I
> can tell (2better) does not make use of range information but I didn't
> really now why it works the way it does using (3).
>
>
> Now for
>
>  unsigned long bar(unsigned int a)
>  {
>    if (a > 3 && a < INT_MAX - 100)
>      return (unsigned long)(a + 1) + ULONG_MAX;

Here we can't perform the add in the narrower type.  But
we do not want, generally, (T)(a + CST) -> (T)A + CST
when T is a widening conversion.  So for this case it makes
sense to restrict it to the case where we can combine the
constants (but I wonder if this needs to be done in a pass
like reassoc to be effective).

So - which case is it?  IIRC we want to handle small signed
constants but the code can end up unsigned.  For the
above we could write (unsigned long)((int)a + 1 - 1) and thus
sign-extend?  Or even avoid this if we know the range.
That is, it becomes the first case again (operation performed
in a smaller type).

>  }
>
> we have
>
> Folded into: _2 = a_7 + 1;
> Folding statement: _3 = (long unsigned int) _2;
> Folded into: _3 = (long unsigned int) _2;
> Folding statement: _6 = _3 + 18446744073709551615;
> (1) Applying pattern match.pd:2060, gimple-match.c:36437
>
> in vrp1.
>
> This is not being handled by (2alt) but by (1).  The same applies for
> long/int instead of ulong/uint except that we don't need range
> information there.
>
> Now, when removing (2) from my patch I run into nasty out-of-mem errors
> during bootstrap which I didn't find the reason for yet.  I thought of
> something like two match.md patterns that keep reverting the other's
> result but I think we have a recursion depth check in place there.
>
> > Also do
> >
> >   if ((did_replace || fold_all_stmts)
> >      && fold_stmt (...))
> >    {
> >    }
> >
> > to avoid extra work when folding does nothing.
>
> Does this work like this? We want to mark the statement as modified even
> if folding wasn't successful, I guess, and the gimple_set_modified ()
> call should not be that expensive? Could do a separate
>
>  if (fold_all_stmts && fold_stmt (...))

Sure, I was asking for a way to make fold_stmt conditional, it's of course
not 100% as simple as I quoted.

Richard.

> of course.
>
> Regards
>  Robin
>

Reply via email to