http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58742
--- Comment #27 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 3 Feb 2014, glisse at gcc dot gnu.org wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58742
>
> --- Comment #26 from Marc Glisse <glisse at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #25)
> > VERSION=0 and VERSION=1 are the same speed for me now,
>
> They aren't quite for me (2.5 vs 2.7) but
>
> > VERSION=2 is a lot slower still.
>
> that's the part I am concerned with here.
>
> > Yeah, the issue is that while FRE does some expression simplification it
> > doesn't wire into a common gimple pattern matcher (something I'd like to
> > fix for 4.10). That is, the simplification forwprop performs should be
> > done by FRE already. See tree-ssa-sccvn.c:simplify_binary_expression.
>
> Ah, ok, that makes sense. I assume it would also have basic CCP-like
> functionality (forwprop can create constants but doesn't always fold the
> resulting constant operations). Looking forward to that!
>
> > > VRP2 is too late if we hope to vectorize, and in
> > > any case it fails to remove the range checks, because it is confused by
> > > the new shape of the loops (possibly related to PR 25643, or not). The
> > > VRP2 failure looks funny with these consecutive lines:
> > >
> > >   # ivtmp.80_92 = PHI <ivtmp.80_53(9), ivtmp.80_83(8)>
> > >   # RANGE [10101, 989898] NONZERO 0x000000000000fffff
> > >   _23 = ivtmp.80_92;
> > >   if (ivtmp.80_92 > 999999)
> > >
> > > Really, we don't know that the comparison returns false?
> >
> > Well, _23 is simply dead at this point and VRP computed _92 to be
> > varying.
>
> Yes. I just meant that, as a hack, for 2 SSA_NAME defined in the same BB where
> one is a copy of the other, we could merge their range info (in both
> directions) and it might in this special case work around the fact that VRP2
> is confused by the loop. But that would be too fragile and hackish.
>
> > From the no-undefined-overflow branch I'd take the idea of adding op
> > variants with known no overflow. That is, add MULTNV_EXPR, PLUSNV_EXPR,
> > MINUSNV_EXPR that can be used on unsigned types, too (you'd of course
> > have to define what overflow means there - if a - b does not overflow
> > then a + (-b) will - negate of x will always overflow if x is not zero).
>
> Ah, yes, I'd forgotten about those. I always wondered if it is better to have
> many different tree codes or a single one with "options". Like MULT_EXPR with
> a parameter saying what happens on overflow: undefined, saturate, wrap, other
> (seems hard to handle "jump to this location" in the same). Or COMPARISON_EXPR
> with several bools telling what the return value is if a<b, a==b, a>b, one is
> NaN, and if it can raise exceptions (we don't have the corresponding 32 tree
> codes). Or the 5 DIV_EXPR variants (counting only integers). I guess it
> doesn't really matter.

It matters for convenience with existing code like fold-const.c which
takes decomposed expression trees. You'd need to add a bunch of flags
there and pass them through appropriately. Much easier to encode it
in enum tree_code directly.

Richard.
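
For reference, the remark about unsigned no-overflow variants can be checked with a small standalone C sketch. This is not GCC code: the helper names are made up for illustration, and it assumes that "overflow" on an unsigned type means the mathematical result does not fit in the type (i.e. the operation wraps). Under that assumption, a - b with a >= b > 0 does not overflow, while the rewritten form a + (-b) overflows in both the negate and the add, even though the wrapped result is the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not part of GCC.  Assumption: an unsigned
   operation "overflows" when its mathematical result is outside
   [0, UINT32_MAX], i.e. when the hardware result has wrapped.  */

/* a - b is negative exactly when a < b.  */
static bool sub_overflows(uint32_t a, uint32_t b)
{
  return a < b;
}

/* a + b exceeds UINT32_MAX exactly when the wrapped sum is below a.  */
static bool add_overflows(uint32_t a, uint32_t b)
{
  return (uint32_t)(a + b) < a;
}

/* -x is negative, hence unrepresentable, for every nonzero x.  */
static bool neg_overflows(uint32_t x)
{
  return x != 0;
}

/* Returns true when a - b is overflow-free but rewriting it as
   a + (-b) introduces an overflow in the negate or in the add.  */
static bool rewrite_loses_no_overflow(uint32_t a, uint32_t b)
{
  uint32_t neg_b = (uint32_t)(0u - b);   /* wrapped value of -b */

  /* Both forms always agree on the wrapped result.  */
  assert((uint32_t)(a - b) == (uint32_t)(a + neg_b));

  return !sub_overflows(a, b)
         && (neg_overflows(b) || add_overflows(a, neg_b));
}
```

With a = 5 and b = 3, the subtraction is fine, but -3 wraps to 4294967293 and 5 + 4294967293 wraps back to 2, so a "no overflow" guarantee on the MINUS form would not carry over to a PLUS form of the same expression.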