Sorry for the late reply, but:

Richard Biener <rguent...@suse.de> writes:
> On Mon, 7 Nov 2016, Richard Biener wrote:
>
>>
>> Currently we force peeling for gaps whenever element overrun can occur
>> but for aligned accesses we know that the loads won't trap and thus
>> we can avoid this.
>>
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu (I expect
>> some testsuite fallout here so didn't bother to invent a new testcase).
>>
>> Just in case somebody thinks the overrun is a bad idea in general
>> (even when not trapping).  Like for ASAN or valgrind.
>
> This is what I applied.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> Richard.
[...]
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 15aec21..c29e73d 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1789,6 +1794,10 @@ get_group_load_store_type (gimple *stmt, tree vectype, bool slp,
>        /* If there is a gap at the end of the group then these optimizations
>          would access excess elements in the last iteration.  */
>        bool would_overrun_p = (gap != 0);
> +      /* If the access is aligned an overrun is fine.  */
> +      if (would_overrun_p
> +         && aligned_access_p (STMT_VINFO_DATA_REF (stmt_info)))
> +       would_overrun_p = false;
>        if (!STMT_VINFO_STRIDED_P (stmt_info)
>           && (can_overrun_p || !would_overrun_p)
>           && compare_step_with_zero (stmt) > 0)

...is this right for all cases?  I think it only looks for
single-vector alignment, but the gap can in principle be vector-sized
or larger, at least for load-lanes.

E.g. say we have a 128-bit vector of doubles in a group of size 4
and a gap of 2 or 3.  Even if the access itself is aligned, the group
spans two vectors and we have no guarantee that the second one
is mapped.

I haven't been able to come up with a testcase though.  We seem to be
overly conservative when computing alignments.

Thanks,
Richard
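
P.S. To make the arithmetic in the example above concrete, here is a rough
sketch in plain C.  It is not GCC code: vectors_wholly_in_gap is a
hypothetical helper, and it assumes a simplified model in which the group
starts on a vector boundary and the page size is a multiple of the vector
size.  Under those assumptions, the vector that holds the last real element
cannot cross into a new page, but any vector lying entirely inside the
trailing gap could sit on an unmapped page.

```c
#include <assert.h>

/* Hypothetical helper (not part of GCC): count the vectors of one group
   that lie entirely inside the trailing gap.  Assumes the group starts
   on a vector boundary.  A nonzero result means an overrunning load
   could touch a vector that holds no real element at all.  */
static unsigned
vectors_wholly_in_gap (unsigned vec_bytes, unsigned elt_bytes,
                       unsigned group_size, unsigned gap)
{
  assert (vec_bytes > 0 && gap < group_size);

  /* Bytes spanned by the whole group, and by the elements actually used.  */
  unsigned group_bytes = group_size * elt_bytes;
  unsigned used_bytes = (group_size - gap) * elt_bytes;

  /* Round both up to whole vectors.  */
  unsigned total_vecs = (group_bytes + vec_bytes - 1) / vec_bytes;
  unsigned used_vecs = (used_bytes + vec_bytes - 1) / vec_bytes;

  return total_vecs - used_vecs;
}
```

With a 128-bit vector (16 bytes), 8-byte doubles and a group of size 4,
a gap of 1 gives 0 here, while a gap of 2 or 3 gives 1, i.e. the second
vector of the group contains nothing but gap.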