On Fri, Aug 16, 2019 at 2:17 PM Robin Dapp <[email protected]> wrote:
>
> > So - what are you really after? (sorry if I don't remeber, testcase(s)
> > are missing
> > from this patch)
> >
> > To me it seems that 1) loses information if A + CST was done in a signed
> > type
> > and we know that overflow doesn't happen because of that. For the reverse
> > transformation we don't. Btw, if you make A == A' + CST' then
> > you get (T)A + CST -> (T)(A' + CST' + CST) which is again trivially handled
> > so why do you need both transforms again?
>
> As far as I know the second transformation resulted from some suggestion
> to split what was the status back then. Some things have changed in
> match.pd so I checked again:
>
> What I want to achieve in the end is getting rid of
> add r1,-1
> extend r1
> add r1,1.
> emitted via doloop.
>
> We have
> (1) (T)(A + CST1) + CST2 -> (T)(A) + (T)CST1 + CST2
Here + CST2 isn't part of the transform - is it just to restrict the
"first half" to a specific subset of cases?
> (2) (T)(A) + (T)CST1 + CST2 -> (T)(A) + (T)(CST1 + CST2)
Likewise for (T)(A).
> (2better) (A +- CST1) +- CST2 -> A + CST3 (the one Marc mentioned)
But that's (1) with A == (T)(A' + CST1)
> (3) (T)(A) + CST -> T(A + CST)
>
> Considering
> unsigned long foo(unsigned int a)
> {
> if (a > 3 && a < INT_MAX - 100)
> return (unsigned long)(a - 1) + 1;
So you want to either compute everything
in unsigned long or everything in unsigned int.
IMHO unsigned int is preferable. So that
asks for
(T)A + 1 -> (T)(A + 1)
if T is wider. We are already performing the reverse
of T is narrower, (T)(A + 1) -> (T)A + 1 (IIRC).
> }.
>
> This results in s390 assembly
> ahi %r2,-1
> llgfr %r2,%r2
> aghi %r2,1
> br %r14.
>
> With the second transform it becomes just
> br %r14
> due to
>
> (3) Applying pattern match.pd:2091, gimple-match.c:36515
> Matching expression match.pd:111, gimple-match.c:65
> (2better) Applying pattern match.pd:1980, gimple-match.c:1011
>
> in evrp.
>
> I believe when I last tried long time ago it would not work that way but
> now it does, so (3) is basically an enabler of (2better). As far as I
> can tell (2better) does not make use of range information but I didn't
> really now why it works the way it does using (3).
>
>
> Now for
>
> unsigned long bar(unsigned int a)
> {
> if (a > 3 && a < INT_MAX - 100)
> return (unsigned long)(a + 1) + ULONG_MAX;
Here we can't perform the add in the narrower type. But
we do not want, generally, (T)(a + CST) -> (T)A + CST
when T is a widening conversion. So for this case it makes
sense to restrict it to the case where we can combine the
constants (but I wonder if this needs to be done in a pass
like reassoc to be effective).
So - which case is it? IIRC we want to handle small signed
constants but the code can end up unsigned. For the
above we could write (unsigned long)((int)a + 1 - 1) and thus
sign-extend? Or even avoid this if we know the range.
That is, it becomes the first case again (operation performed
in a smaller type).
> }
>
> we have
>
> Folded into: _2 = a_7 + 1;
> Folding statement: _3 = (long unsigned int) _2;
> Folded into: _3 = (long unsigned int) _2;
> Folding statement: _6 = _3 + 18446744073709551615;
> (1) Applying pattern match.pd:2060, gimple-match.c:36437
>
> in vrp1.
>
> This is not being handled by (2alt) but by (1). The same applies for
> long/int instead of ulong/uint except that we don't need range
> information there.
>
> Now, when removing (2) from my patch I run into nasty out-of-mem errors
> during bootstrap which I didn't find the reason for yet. I thought of
> something like two match.md patterns that keep reverting the other's
> result but I think we have a recursion depth check in place there.
>
> > Also do
> >
> > if ((did_replace || fold_all_stmts)
> > && fold_stmt (...))
> > {
> > }
> >
> > to avoid extra work when folding does nothing.
>
> Does this work like this? We want to mark the statement as modified even
> if folding wasn't successful, I guess, and the gimple_set_modified ()
> call should not be that expensive? Could do a separate
>
> if (fold_all_stmts && fold_stmt (...))
Sure, I was asking for a way to make fold_stmt conditional, it's of course
not 100% as simple as I quoted.
Richard.
> of course.
>
> Regards
> Robin
>