http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58742
--- Comment #25 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Marc Glisse from comment #24)
> Thank you.
> Sadly, for the example in comment #15, this is not quite enough, I need to
> add forwprop+ccp right before the VRP1 pass (and then the range check is
> eliminated, the vectorizer works and perfs are the same as without range
> checking).

VERSION=0 and VERSION=1 are the same speed for me now, VERSION=2 is a
lot slower still.

> Indeed, we learn that size is (start+4000000)-start quite late
> (need to inline, look through mem_refs, etc -> FRE2) so the previous
> forwprop pass is too early.

Yeah, the issue is that while FRE does some expression simplification it
doesn't wire into a common gimple pattern matcher (something I'd like to
fix for 4.10).  That is, the simplification forwprop performs should be
done by FRE already.  See tree-ssa-sccvn.c:simplify_binary_expression.

> VRP2 is too late if we hope to vectorize, and in
> any case it fails to remove the range checks, because it is confused by the
> new shape of the loops (possibly related to PR 25643, or not). The VRP2
> failure looks funny with these consecutive lines:
>
>   # ivtmp.80_92 = PHI <ivtmp.80_53(9), ivtmp.80_83(8)>
>   # RANGE [10101, 989898] NONZERO 0x000000000000fffff
>   _23 = ivtmp.80_92;
>   if (ivtmp.80_92 > 999999)
>
> Really, we don't know that the comparison returns false?

Well, _23 is simply dead at this point and VRP computed _92 to be
varying.

> For the overflow in sizeof(*p) * sz, would it make sense to have the
> front-end generate, when it sees p+sz: if((long)sz>LONG_MAX/sizeof(*p))
> __builtin_unreachable() (or abort or a sanitizer call depending on options),
> and a similar check for large negative values? It feels very heavy for such
> a common operation, but if the FE is the only one with the information, I am
> not sure how else to pass it down to gimple.

From the no-undefined-overflow branch I'd take the idea of adding op
variants with known no overflow.
That is, add MULTNV_EXPR, PLUSNV_EXPR, MINUSNV_EXPR that can be used on
unsigned types, too (you'd of course have to define what overflow means
there - if a - b does not overflow then a + (-b) will - negate of x will
always overflow if x is not zero).  The idea of the no-undefined-overflow
branch was to make all ops wrapping by default (even signed type
arithmetic) and make frontends explicitly use non-overflowing ops when
language semantics say they are not overflowing.

> I might file a low priority enhancement PR about extending reassoc to
> pointers, that would still cover some cases (and it wouldn't make the
> forwprop transformation useless because of single-use restrictions).
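
To illustrate the unsigned remark above (a - b not overflowing while
a + (-b) does), here is a minimal source-level sketch.  It is not GCC
internals code: the helper names sub_no_overflow, add_neg_no_overflow
and offset_fits are hypothetical and only model the idea, using the
__builtin_*_overflow builtins available in recent GCC and Clang.  The
last helper is one possible conservative reading of the range check
proposed for p + sz, assuming a nonzero element size.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if a - b does not wrap in unsigned arithmetic. */
static bool sub_no_overflow(unsigned a, unsigned b)
{
    unsigned r;
    return !__builtin_sub_overflow(a, b, &r);
}

/* Hypothetical helper: true if a + (-b) does not wrap, where -b is the
   unsigned (modular) negation of b.  The wrapped result equals a - b,
   but the overflow status differs whenever b != 0 and a >= b. */
static bool add_neg_no_overflow(unsigned a, unsigned b)
{
    unsigned r;
    unsigned nb = 0u - b;   /* modular negate; nonzero b yields a huge value */
    return !__builtin_add_overflow(a, nb, &r);
}

/* Hypothetical helper: conservative check that sz * elem_size fits in a
   long, in the spirit of the check suggested for p + sz.  Assumes
   elem_size != 0, which holds for sizeof of any complete object type. */
static bool offset_fits(long sz, size_t elem_size)
{
    long limit = (long)(LONG_MAX / elem_size);
    return sz <= limit && sz >= -limit;
}
```

With a = 5 and b = 3, sub_no_overflow reports no overflow while
add_neg_no_overflow reports one, which is why a MINUSNV_EXPR on
unsigned types could not simply be rewritten as a PLUSNV_EXPR of the
negation.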