On Fri, Feb 24, 2017 at 12:34 PM, Richard Biener <richard.guent...@gmail.com> wrote: > On Fri, Feb 24, 2017 at 11:53 AM, Bin Cheng <bin.ch...@arm.com> wrote: >> Hi, >> As analyzed in PR69564, inefficient code for runtime alias check is >> generated in benchmark >> scimark2. It is suspected vectorized loop doesn't run enough iterations to >> cover the loss >> of the inefficient code. This patch improves runtime alias check for >> (unsigned) wrapping >> types. >> >> Originally, vectorizer checks if below two ranges overlap with each other >> range a: [min_a, max_a] ; abs(length_a) = max_a - min_a >> range_b: [min_b, max_b] ; abs(length_b) = max_b - min_b >> with condition like: >> max_a <= min_b || max_b <= min_a >> >> With knowledge of length of the two ranges, this patch checks overlap with >> condition like: >> (min_b - min_a) >= abs(len_a) && (min_b - min_a) <= (- abs(len_b)) >> It's better because common sub expressions in b/a can be folded. >> >> Note there is an implicit condition for above statements that length of >> range needs to be >> no larger than middle value of the unsigned type. This is always true since >> object never >> spans over half address space in GCC. >> >> As mentioned, this is only done for case with wrap type and also constant >> segment length >> for both a and b, which is the most common case. It is possible to improve >> signed cases >> or compilation time unknown segment length cases too, but may not worth the >> effort. >> >> Unfortunately, test case is hard to create for such code optimization. >> Bootstrap and test on x86)64 and AArch64. Is it OK? > > The patch mixes the above together with better folding (cancelling > equal offset, decomposing > pointer-plus). Please separate those. > > Note that seg_len is always positive and the abs(len_N) in the > comments doesn't make > much sense to me. > > Meanwhile the simplification (ignoring overflow issues) can go, > assuming equal lenghts > > mina + len <= minb || minb + len <= mina > > -> mina - minb <= -len || minb - mina <= -len > -> minb - mina >= len || mina - minb >= len > > and as we know len >= 0 this is the same as > > abs (minb - mina) >= len > > and thus even simpler than your variant. But maybe I'm missing something? > > As for the overflow issues, you do all computation in sizetype and > thus guarantee > unsigned compares, but I think to do these kind of "simplifications" you have > to > use signed quantities which opens up the possibility of objects crossing the > signed-address-space wrapping point and the simplification to break apart.
Hi Richard, I might understand your concern about overflow now. The validity of simplification is based on unsigned wrapping behavior, rather than "mis-use" of infinite precision for unsigned type. Take example of two simple ranges: range_a [6, 12] and range_b [2, 8] and below simplified condition: (min_b - min_a) >= len_a && (min_b - min_a) <= (0 - len_b) For signed type, we have 2 - 6 == -4 > 2 - 8 == -6; For unsigned wrapping type, we also have 2 - 6 == 0x80000000 + -4 > 2 - 8 == 0x80000000 + -6; So actually I think the simplification stands for both signed type and unsigned wrapping type. I rewrite below comment in other to better explain that simplified condition equals to the original one: /* Intersection between ranges [min_a, max_a] and [min_b, max_b] can be checked by below condition: max_a <= min_b || max_b <= min_a ;cond_A Or (min_a + len_a) <= min_b || (min_b + len_b) <= min_a For type which wraps on overflow, it equals to: (min_b - min_a) >= len_a && (min_b - min_a) <= (0 - len_b) ;cond_B Proof: 1) In case min_b >= min_a, i.e, (min_b - min_a) doesn't underflow. 1.1) When min_b >= max_a (min_b - min_a) >= len_a is true. (min_b - min_a) <= (0 - len_b) is true because (0 - len_b) wraps to a large number. Note (min_b - min_a) can only take the maxmum vaue (0 - len_b) if (min_a == 0 && max_b == MAX). Thus both cond_A and cond_B evaluate to TRUE. 1.2) when min_b >= min_a && min_b < max_a (min_b - min_a) >= len_a is false. Thus both cond_A and cond_B evaluate to FALSE. 2) In case min_a > min_b, i.e, (min_b - min_a) underflow. 2.1) when min_a > min_b && min_a < max_b (min_b - min_a) <= (0 - len_b) == (min_b - max_b) is false. Thus both cond_A and cond_B evaluate to FALSE. 2.2) When min_a >= max_b (min_b - min_a) >= len_a is true because (min_b - min_a) can only take the maximum value when (min_b == 0 && max_a == MAX). (min_b - min_a) <= (0 - len_b) == (min_b - max_b) is true. Thus both cond_A and cond_B evaluate to TRUE. We iterate all possible cases for two ranges and for each case cond_A and cond_B take the same value. So the two conditions equal to each other. */ Do I understand the issue correctly? Thanks, bin