http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58742

--- Comment #25 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Marc Glisse from comment #24)
> Thank you.
> Sadly, for the example in comment #15, this is not quite enough, I need to
> add forwprop+ccp right before the VRP1 pass (and then the range check is
> eliminated, the vectorizer works and perfs are the same as without range
> checking).

VERSION=0 and VERSION=1 are the same speed for me now, VERSION=2 is a lot
slower still.

> Indeed, we learn that size is (start+4000000)-start quite late
> (need to inline, look through mem_refs, etc -> FRE2) so the previous
> forwprop pass is too early.

Yeah, the issue is that while FRE does some expression simplification it
doesn't wire into a common gimple pattern matcher (something I'd like to
fix for 4.10).  That is, the simplification forwprop performs should be
done by FRE already.  See tree-ssa-sccvn.c:simplify_binary_expression.
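
To make the pattern concrete, here is a minimal C sketch of the kind of code
under discussion. It is not the testcase from comment #15; the names
`storage`, `range_size` and `checked_get` are made up for illustration. After
inlining, the size is literally (start + 4000000) - start, and the range check
can only be removed once some pass folds that expression to the constant.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 4000000 };          /* element count taken from the discussion */
static int storage[N];         /* hypothetical backing store */

/* Size computed as a pointer difference, as a container would do it. */
static size_t range_size(int *start)
{
    int *end = start + N;
    return (size_t)(end - start);   /* folds to N once simplified */
}

/* Range-checked access; the check is redundant for i < N. */
static int checked_get(int *start, size_t i)
{
    if (i >= range_size(start))
        return -1;                  /* out of range */
    return start[i];
}

/* Small usage example. */
static void range_demo(void)
{
    assert(range_size(storage) == N);
    assert(checked_get(storage, 0) == 0);
    assert(checked_get(storage, N) == -1);
}
```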

> VRP2 is too late if we hope to vectorize, and in
> any case it fails to remove the range checks, because it is confused by the
> new shape of the loops (possibly related to PR 25643, or not). The VRP2
> failure looks funny with these consecutive lines:
> 
>   # ivtmp.80_92 = PHI <ivtmp.80_53(9), ivtmp.80_83(8)>
>   # RANGE [10101, 989898] NONZERO 0x000000000000fffff
>   _23 = ivtmp.80_92;
>   if (ivtmp.80_92 > 999999)
> 
> Really, we don't know that the comparison returns false?

Well, _23 is simply dead at this point and VRP computed _92 to be
varying.

> 
> For the overflow in sizeof(*p) * sz, would it make sense to have the
> front-end generate, when it sees p+sz: if((long)sz>LONG_MAX/sizeof(*p))
> __builtin_unreachable() (or abort or a sanitizer call depending on options),
> and a similar check for large negative values? It feels very heavy for such
> a common operation, but if the FE is the only one with the information, I am
> not sure how else to pass it down to gimple.

From the no-undefined-overflow branch I'd take the idea of adding op
variants with known no overflow.  That is, add MULTNV_EXPR, PLUSNV_EXPR,
MINUSNV_EXPR that can be used on unsigned types, too (you'd of course
have to define what overflow means there: if a - b does not overflow
then a + (-b) will, since negating x always overflows if x is not zero).
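
The unsigned case can be checked with a small C sketch. It assumes the
type-generic `__builtin_sub_overflow` / `__builtin_add_overflow` builtins
(available in GCC 5 and later, and in recent Clang); the helper names are
illustrative only and are not part of any proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a - b wraps in 32-bit unsigned arithmetic. */
static bool sub_overflows(uint32_t a, uint32_t b)
{
    uint32_t r;
    return __builtin_sub_overflow(a, b, &r);
}

/* True if a + (-b) wraps, with -b computed modulo 2^32. */
static bool add_negated_overflows(uint32_t a, uint32_t b)
{
    uint32_t neg_b = 0u - b;   /* wraps for every nonzero b */
    uint32_t r;
    return __builtin_add_overflow(a, neg_b, &r);
}

/* Small usage example. */
static void overflow_demo(void)
{
    assert(!sub_overflows(5, 3));          /* 5 - 3 fits ...            */
    assert(add_negated_overflows(5, 3));   /* ... but 5 + (-3) wraps    */
    assert(sub_overflows(3, 5));           /* 3 - 5 wraps ...           */
    assert(!add_negated_overflows(3, 5));  /* ... and 3 + (-5) does not */
}
```

So for nonzero b the two forms disagree about overflow, which is why a
MINUSNV_EXPR on unsigned types could not simply be rewritten as a PLUSNV_EXPR
of the negated operand.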

The idea of no-undefined-overflow branch was to make all ops wrapping
by default (even signed type arithmetic) and make frontends explicitly
use non-overflowing ops when language semantics says they are not
overflowing.
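
As a rough source-level emulation of that contract (the real thing would be a
tree code, not a C function), one could write something like the sketch below.
`mult_nv` and `byte_offset` are hypothetical names, and the sketch assumes the
GCC/Clang builtins `__builtin_mul_overflow` and `__builtin_unreachable`.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply under a "known not to overflow" contract: the overflowing
   path is declared unreachable, so the optimizer may assume it away. */
static inline size_t mult_nv(size_t a, size_t b)
{
    size_t r;
    if (__builtin_mul_overflow(a, b, &r))
        __builtin_unreachable();   /* caller promises this never happens */
    return r;
}

/* Byte offset for p + sz when p points to int. */
static inline size_t byte_offset(size_t sz)
{
    return mult_nv(sz, sizeof(int));
}

/* Small usage example. */
static void mult_nv_demo(void)
{
    assert(mult_nv(6, 7) == 42);
    assert(byte_offset(10) == 10 * sizeof(int));
}
```

Calling `mult_nv` with operands that do overflow is undefined behavior, which
is the point: the information lives in the operation instead of in a separate
check that the frontend has to emit.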

> I might file a low priority enhancement PR about extending reassoc to
> pointers, that would still cover some cases (and it wouldn't make the
> forwprop transformation useless because of single-use restrictions).
