http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58742
--- Comment #26 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #25)
> VERSION=0 and VERSION=1 are the same speed for me now,

They aren't quite for me (2.5 vs 2.7) but

> VERSION=2 is a lot slower still.

that's the part I am concerned with here.

> Yeah, the issue is that while FRE does some expression simplification it
> doesn't wire into a common gimple pattern matcher (something I'd like to
> fix for 4.10). That is, the simplification forwprop performs should be
> done by FRE already. See tree-ssa-sccvn.c:simplify_binary_expression.

Ah, ok, that makes sense. I assume it would also have basic CCP-like
functionality (forwprop can create constants but doesn't always fold the
resulting constant operations). Looking forward to that!

> > VRP2 is too late if we hope to vectorize, and in
> > any case it fails to remove the range checks, because it is confused by the
> > new shape of the loops (possibly related to PR 25643, or not). The VRP2
> > failure looks funny with these consecutive lines:
> >
> >   # ivtmp.80_92 = PHI <ivtmp.80_53(9), ivtmp.80_83(8)>
> >   # RANGE [10101, 989898] NONZERO 0x000000000000fffff
> >   _23 = ivtmp.80_92;
> >   if (ivtmp.80_92 > 999999)
> >
> > Really, we don't know that the comparison returns false?
>
> Well, _23 is simply dead at this point and VRP computed _92 to be
> varying.

Yes. I just meant that, as a hack, for 2 SSA_NAME defined in the same BB where
one is a copy of the other, we could merge their range info (in both
directions) and it might in this special case work around the fact that VRP2
is confused by the loop. But that would be too fragile and hackish.

> From the no-undefined-overflow branch I'd take the idea of adding op
> variants with known no overflow. That is, add MULTNV_EXPR, PLUSNV_EXPR,
> MINUSNV_EXPR that can be used on unsigned types, too (you'd of course
> have to define what overflow means there - if a - b does not overflow
> then a + (-b) will - negate of x will always overflow if x is not zero).

Ah, yes, I'd forgotten about those. I always wondered if it is better to have
many different tree codes or a single one with "options". Like MULT_EXPR with
a parameter saying what happens on overflow: undefined, saturate, wrap, other
(seems hard to handle "jump to this location" in the same). Or COMPARISON_EXPR
with several bools telling what the return value is if a<b, a==b, a>b, one is
NaN, and if it can raise exceptions (we don't have the corresponding 32 tree
codes). Or the 5 DIV_EXPR variants (counting only integers). I guess it
doesn't really matter.
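
For reference, the quoted parenthetical about unsigned types can be checked with a small C sketch. It assumes one possible definition of "overflow" for unsigned operands, namely that the mathematical result does not fit in the type. The helper names (usub_overflows, uadd_overflows, uneg_overflows, example_holds) are hypothetical, made up for this illustration, and are not taken from the no-undefined-overflow branch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed definition: an unsigned operation "overflows" when its exact
   mathematical result is not representable in unsigned int.  */

/* a - b overflows iff the exact difference would be negative.  */
static bool usub_overflows(unsigned a, unsigned b) { return a < b; }

/* a + b overflows iff the exact sum would exceed UINT_MAX.  */
static bool uadd_overflows(unsigned a, unsigned b) { return a > UINT_MAX - b; }

/* -x overflows iff x is not zero (the exact result would be negative).  */
static bool uneg_overflows(unsigned x) { return x != 0; }

/* Returns true when a - b does not overflow, yet both the negation of b
   and the addition a + (-b) do, while the wrapped results still agree.
   Only meaningful for b != 0 and a >= b.  */
static bool example_holds(unsigned a, unsigned b)
{
    unsigned nb = 0u - b;              /* wraps modulo UINT_MAX + 1 */
    return !usub_overflows(a, b)
           && uneg_overflows(b)
           && uadd_overflows(a, nb)
           && a - b == a + nb;
}
```

Under that definition, example_holds(7, 3) is true: 7 - 3 needs no wrap, while -3 and 7 + (-3) both wrap even though the final value is the same. So a MINUSNV_EXPR on unsigned operands could not simply be rewritten as a PLUSNV_EXPR of a negation.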