On Fri, Aug 16, 2019 at 2:17 PM Robin Dapp <rd...@linux.ibm.com> wrote: > > > So - what are you really after? (sorry if I don't remeber, testcase(s) > > are missing > > from this patch) > > > > To me it seems that 1) loses information if A + CST was done in a signed > > type > > and we know that overflow doesn't happen because of that. For the reverse > > transformation we don't. Btw, if you make A == A' + CST' then > > you get (T)A + CST -> (T)(A' + CST' + CST) which is again trivially handled > > so why do you need both transforms again? > > As far as I know the second transformation resulted from some suggestion > to split what was the status back then. Some things have changed in > match.pd so I checked again: > > What I want to achieve in the end is getting rid of > add r1,-1 > extend r1 > add r1,1. > emitted via doloop. > > We have > (1) (T)(A + CST1) + CST2 -> (T)(A) + (T)CST1 + CST2
Here + CST2 isn't part of the transform - is it just to restrict the "first half" to a specific subset of cases? > (2) (T)(A) + (T)CST1 + CST2 -> (T)(A) + (T)(CST1 + CST2) Likewise for (T)(A). > (2better) (A +- CST1) +- CST2 -> A + CST3 (the one Marc mentioned) But that's (1) with A == (T)(A' + CST1) > (3) (T)(A) + CST -> T(A + CST) > > Considering > unsigned long foo(unsigned int a) > { > if (a > 3 && a < INT_MAX - 100) > return (unsigned long)(a - 1) + 1; So you want to either compute everything in unsigned long or everything in unsigned int. IMHO unsigned int is preferable. So that asks for (T)A + 1 -> (T)(A + 1) if T is wider. We are already performing the reverse of T is narrower, (T)(A + 1) -> (T)A + 1 (IIRC). > }. > > This results in s390 assembly > ahi %r2,-1 > llgfr %r2,%r2 > aghi %r2,1 > br %r14. > > With the second transform it becomes just > br %r14 > due to > > (3) Applying pattern match.pd:2091, gimple-match.c:36515 > Matching expression match.pd:111, gimple-match.c:65 > (2better) Applying pattern match.pd:1980, gimple-match.c:1011 > > in evrp. > > I believe when I last tried long time ago it would not work that way but > now it does, so (3) is basically an enabler of (2better). As far as I > can tell (2better) does not make use of range information but I didn't > really now why it works the way it does using (3). > > > Now for > > unsigned long bar(unsigned int a) > { > if (a > 3 && a < INT_MAX - 100) > return (unsigned long)(a + 1) + ULONG_MAX; Here we can't perform the add in the narrower type. But we do not want, generally, (T)(a + CST) -> (T)A + CST when T is a widening conversion. So for this case it makes sense to restrict it to the case where we can combine the constants (but I wonder if this needs to be done in a pass like reassoc to be effective). So - which case is it? IIRC we want to handle small signed constants but the code can end up unsigned. For the above we could write (unsigned long)((int)a + 1 - 1) and thus sign-extend? Or even avoid this if we know the range. That is, it becomes the first case again (operation performed in a smaller type). > } > > we have > > Folded into: _2 = a_7 + 1; > Folding statement: _3 = (long unsigned int) _2; > Folded into: _3 = (long unsigned int) _2; > Folding statement: _6 = _3 + 18446744073709551615; > (1) Applying pattern match.pd:2060, gimple-match.c:36437 > > in vrp1. > > This is not being handled by (2alt) but by (1). The same applies for > long/int instead of ulong/uint except that we don't need range > information there. > > Now, when removing (2) from my patch I run into nasty out-of-mem errors > during bootstrap which I didn't find the reason for yet. I thought of > something like two match.md patterns that keep reverting the other's > result but I think we have a recursion depth check in place there. > > > Also do > > > > if ((did_replace || fold_all_stmts) > > && fold_stmt (...)) > > { > > } > > > > to avoid extra work when folding does nothing. > > Does this work like this? We want to mark the statement as modified even > if folding wasn't successful, I guess, and the gimple_set_modified () > call should not be that expensive? Could do a separate > > if (fold_all_stmts && fold_stmt (...)) Sure, I was asking for a way to make fold_stmt conditional, it's of course not 100% as simple as I quoted. Richard. > of course. > > Regards > Robin >