On Fri, Feb 24, 2017 at 11:53 AM, Bin Cheng <bin.ch...@arm.com> wrote:
> Hi,
> As analyzed in PR69564, inefficient code for runtime alias check is generated 
> in benchmark
> scimark2.  It is suspected vectorized loop doesn't run enough iterations to 
> cover the loss
> of the inefficient code.  This patch improves runtime alias check for 
> (unsigned) wrapping
> types.
>
> Originally, vectorizer checks if below two ranges overlap with each other
>     range a: [min_a, max_a]  ; abs(length_a) = max_a - min_a
>     range_b: [min_b, max_b]  ; abs(length_b) = max_b - min_b
> with condition like:
>     max_a <= min_b || max_b <= min_a
>
> With knowledge of length of the two ranges, this patch checks overlap with 
> condition like:
>     (min_b - min_a) >= abs(len_a) && (min_b - min_a) <= (- abs(len_b))
> It's better because common sub expressions in b/a can be folded.
>
> Note there is an implicit condition for above statements that length of range 
> needs to be
> no larger than middle value of the unsigned type.  This is always true since 
> object never
> spans over half address space in GCC.
>
> As mentioned, this is only done for case with wrap type and also constant 
> segment length
> for both a and b, which is the most common case.  It is possible to improve 
> signed cases
> or compilation time unknown segment length cases too, but may not worth the 
> effort.
>
> Unfortunately, test case is hard to create for such code optimization.
> Bootstrap and test on x86)64 and AArch64.  Is it OK?

The patch mixes the above together with better folding (cancelling
equal offset, decomposing
pointer-plus).  Please separate those.

Note that seg_len is always positive and the abs(len_N) in the
comments doesn't make
much sense to me.

Meanwhile the simplification (ignoring overflow issues) can go,
assuming equal lenghts

    mina + len <= minb || minb + len <= mina

-> mina - minb <= -len || minb - mina <= -len
-> minb - mina >= len || mina - minb >= len

and as we know len >= 0 this is the same as

   abs (minb - mina) >= len

and thus even simpler than your variant.  But maybe I'm missing something?

As for the overflow issues, you do all computation in sizetype and
thus guarantee
unsigned compares, but I think to do these kind of "simplifications" you have to
use signed quantities which opens up the possibility of objects crossing the
signed-address-space wrapping point and the simplification to break apart.

That is, your

+  tree type = TREE_TYPE (addr_base_a);
+  if (POINTER_TYPE_P (type))
+    type = sizetype;
...
+  if (!TYPE_UNSIGNED (type)
+      || !TYPE_OVERFLOW_WRAPS (type)

is useless (always false, you always get type == sizetype).

Btw, the same overflow issues plague "removal" of common DR_OFFSET and friends
iff the original condition, max_a <= min_b || max_b >= min_a is
correct - we are using
relational compares on pointers to possibly different objects here...
(and all pointers
are TYPE_UNSIGNED) - and you later interpret those as signed comparisons to be
able to shift offsets from one side to the other.

So for the "real" check you have to use

  abs ((ssizetype)(minb - mina)) >= len

Thanks,
Richard.

> Thanks,
> bin
>
> 2017-02-23  Bin Cheng  <bin.ch...@arm.com>
>
>         PR tree-optimization/69564
>         * tree-vect-loop-manip.c (tree-affine.h): Include.
>         (create_intersect_range_checks): For types with wrap behavior, check
>         range overlap for [min_a, max_a]/[min_b, max_b] with condition like:
>         "(min_b - min_a) >= abs(len_a) && (min_b - min_a) <= (-abs(len_b))".
>

Reply via email to