> So - what are you really after? (sorry if I don't remeber, testcase(s) > are missing > from this patch) > > To me it seems that 1) loses information if A + CST was done in a signed type > and we know that overflow doesn't happen because of that. For the reverse > transformation we don't. Btw, if you make A == A' + CST' then > you get (T)A + CST -> (T)(A' + CST' + CST) which is again trivially handled > so why do you need both transforms again?
As far as I know the second transformation resulted from some suggestion to split what was the status back then. Some things have changed in match.pd so I checked again: What I want to achieve in the end is getting rid of add r1,-1 extend r1 add r1,1. emitted via doloop. We have (1) (T)(A + CST1) + CST2 -> (T)(A) + (T)CST1 + CST2 (2) (T)(A) + (T)CST1 + CST2 -> (T)(A) + (T)(CST1 + CST2) (2better) (A +- CST1) +- CST2 -> A + CST3 (the one Marc mentioned) (3) (T)(A) + CST -> T(A + CST) Considering unsigned long foo(unsigned int a) { if (a > 3 && a < INT_MAX - 100) return (unsigned long)(a - 1) + 1; }. This results in s390 assembly ahi %r2,-1 llgfr %r2,%r2 aghi %r2,1 br %r14. With the second transform it becomes just br %r14 due to (3) Applying pattern match.pd:2091, gimple-match.c:36515 Matching expression match.pd:111, gimple-match.c:65 (2better) Applying pattern match.pd:1980, gimple-match.c:1011 in evrp. I believe when I last tried long time ago it would not work that way but now it does, so (3) is basically an enabler of (2better). As far as I can tell (2better) does not make use of range information but I didn't really now why it works the way it does using (3). Now for unsigned long bar(unsigned int a) { if (a > 3 && a < INT_MAX - 100) return (unsigned long)(a + 1) + ULONG_MAX; } we have Folded into: _2 = a_7 + 1; Folding statement: _3 = (long unsigned int) _2; Folded into: _3 = (long unsigned int) _2; Folding statement: _6 = _3 + 18446744073709551615; (1) Applying pattern match.pd:2060, gimple-match.c:36437 in vrp1. This is not being handled by (2alt) but by (1). The same applies for long/int instead of ulong/uint except that we don't need range information there. Now, when removing (2) from my patch I run into nasty out-of-mem errors during bootstrap which I didn't find the reason for yet. I thought of something like two match.md patterns that keep reverting the other's result but I think we have a recursion depth check in place there. > Also do > > if ((did_replace || fold_all_stmts) > && fold_stmt (...)) > { > } > > to avoid extra work when folding does nothing. Does this work like this? We want to mark the statement as modified even if folding wasn't successful, I guess, and the gimple_set_modified () call should not be that expensive? Could do a separate if (fold_all_stmts && fold_stmt (...)) of course. Regards Robin