Richard Biener <richard.guent...@gmail.com> writes:
> On Tue, Feb 4, 2020 at 6:44 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Richard Sandiford <richard.sandif...@arm.com> writes:
>> > [...]
>> >> I'm not sure given the issues you've introduced if I could actually
>> >> fill out the matrix of answers without more underlying information.
>> >> ie, when can we get symbols without source level decls,
>> >> anchors+interposition issues, etc.
>> >
>> > OK.  In that case, I wonder whether it would be safer to have a
>> > fourth state on top of the three above:
>> >
>> >   - known distance apart
>> >   - independent
>> >   - known distance apart or independent
>> >   - don't know
>> >
>> > with "don't know" being anything that involves bare symbols?
>> >
>> > Richard
>>
>> How does this look?  Tested on aarch64-linux-gnu and
>> x86_64-linux-gnu.
>>
>> Full description from scratch:
>>
>> memrefs_conflict_p has a slightly odd structure.  It first checks
>> whether two addresses based on SYMBOL_REFs refer to the same object,
>> with a tristate result:
>>
>>       int cmp = compare_base_symbol_refs (x,y);
>>
>> If the addresses do refer to the same object, we can use offset-based checks:
>>
>>       /* If both decls are the same, decide by offsets.  */
>>       if (cmp == 1)
>>         return offset_overlap_p (c, xsize, ysize);
>>
>> But then, apart from the special case of forced address alignment,
>> we use an offset-based check even if we don't know whether the
>> addresses refer to the same object:
>>
>>       /* Assume a potential overlap for symbolic addresses that went
>>          through alignment adjustments (i.e., that have negative
>>          sizes), because we can't know how far they are from each
>>          other.  */
>>       if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>>         return -1;
>>       /* If decls are different or we know by offsets that there is no 
>> overlap,
>>          we win.  */
>>       if (!cmp || !offset_overlap_p (c, xsize, ysize))
>>         return 0;
>>
>> This somewhat contradicts:
>>
>>   /* In general we assume that memory locations pointed to by different 
>> labels
>>      may overlap in undefined ways.  */
>
> Sorry for not chiming in earlier but isn't the bug that
>
>
>
>> at the end of compare_base_symbol_refs.  In other words, we're taking -1
>> to mean that either (a) the symbols are equal (via aliasing) or (b) the
>> references access non-overlapping objects.
>>
>> But even assuming that's true for normal symbols, it doesn't cope
>> correctly with section anchors.  If a symbol X at ANCHOR+OFFSET
>> is preemptible, either (a) X = ANDHOR+OFFSET or (b) X and ANCHOR
>> reference non-overlapping objects.
>>
>> And an offset-based comparison makes no sense for an anchor symbol
>> vs. a bare symbol with no decl.  If the bare symbol is allowed to
>> alias other symbols then it can surely alias any symbol in the
>> anchor's block, so there are multiple anchor offsets that might
>> induce an alias.
>
> But then isn't the simple fix to honor the -1 and do
>
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 3794f9b6a9e..bf13d37c0f7 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -2490,9 +2490,8 @@ memrefs_conflict_p (poly_int64 xsize, rtx x,
> poly_int64 ysize, rtx y,
>          other.  */
>        if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
>         return -1;
> -      /* If decls are different or we know by offsets that there is no 
> overlap,
> -        we win.  */
> -      if (!cmp || !offset_overlap_p (c, xsize, ysize))
> +      /* If decls are different, we win.  */
> +      if (cmp == 0)
>         return 0;
>        /* Decls may or may not be different and offsets overlap....*/
>        return -1;
>
> ?

The code was deliberately written this way from the ouset though
(rather than it ending up like this through many cuts).  It was
added in g:54363f8a92920f5559c83ddd53e480a27205e6b7:

2015-12-08  Jan Hubicka  <hubi...@ucw.cz>

        PR ipa/61886
        PR middle-end/25140
        * tree-ssa-alias.c (ptr_deref_may_alias_decl_p): Use compare_base_decls
        (nonoverlapping_component_refs_of_decl_p): Update sanity check.
        (decl_refs_may_alias_p): Use compare_base_decls.
        * alias.c: Include cgraph.h
        (get_alias_set): Add cut-off for recursion.
        (rtx_equal_for_memref_p): Use rtx_equal_for_memref_p.
        (compare_base_decls): New function.
        (base_alias_check): Likewise.
        (memrefs_conflict_p): Likewise.
        (nonoverlapping_memrefs_p): Likewise.
        * alias.h (compare_base_decls): Declare.

which included:

-  if (rtx_equal_for_memref_p (x, y))
+  if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
+    {
+      tree x_decl = SYMBOL_REF_DECL (x);
+      tree y_decl = SYMBOL_REF_DECL (y);
+      int cmp;
+
+      if (!x_decl || !y_decl)
+       {
+         /* Label and normal symbol are never the same. */
+         if (x_decl != y_decl)
+           return 0;
+         return offset_overlap_p (c, xsize, ysize);
+       }
+      else
+        cmp = compare_base_decls (x_decl, y_decl);
+
+      /* If both decls are the same, decide by offsets.  */
+      if (cmp == 1)
+        return offset_overlap_p (c, xsize, ysize);
+      /* If decls are different or we know by offsets that there is no overlap,
+        we win.  */
+      if (!cmp || !offset_overlap_p (c, xsize, ysize))
+       return 0;
+      /* Decls may or may not be different and offsets overlap....*/
+      return -1;
+    }
+  else if (rtx_equal_for_memref_p (x, y))

The only significant change since them was to add compare_base_symbol_refs
(g:73e48cb322152bf504ced8694fa748544ecaa6eb):

   if (GET_CODE (x) == SYMBOL_REF && GET_CODE (y) == SYMBOL_REF)
     {
-      tree x_decl = SYMBOL_REF_DECL (x);
-      tree y_decl = SYMBOL_REF_DECL (y);
-      int cmp;
-
-      if (!x_decl || !y_decl)
-       {
-         /* Label and normal symbol are never the same. */
-         if (x_decl != y_decl)
-           return 0;
-         return offset_overlap_p (c, xsize, ysize);
-       }
-      else
-        cmp = compare_base_decls (x_decl, y_decl);
+      int cmp = compare_base_symbol_refs (x,y);

>> This patch therefore replaces the current tristate:
>>
>>   - known equal
>>   - known independent (two accesses can't alias)
>>   - equal or independent
>>
>> with:
>>
>>   - known distance apart
>>   - known independent (two accesses can't alias)
>>   - known distance apart or independent
>>   - don't know
>>
>> For safety, the patch puts all bare symbols in the "don't know"
>> category.  If that turns out to be too conservative, we at least
>> need that behaviour for combinations involving a bare symbol
>> and a section anchor.
>
> This sounds complicated (for this stage).  Do you have any statistics as
> to how it affects actual alias queries (thus outcome in {true,...}_dependence)
> when you do the "simple" fix?

For a stage3 build of gcc on aarch64-linux-gnu I get:

   ("positive" == conflict, "negative" == no conflict)

   16191  cmp ==  1,    disjoint offsets :  true negative
   19698  cmp ==  1, overlapping offsets :  true positive
   79363  cmp == -1,    disjoint offsets :  true negative
    6965  cmp == -1,    disjoint offsets : false negative <== bug
      48  cmp == -1, overlapping offsets :  true positive
     928  cmp == -1, overlapping offsets : false positive <== missed opt
  123193  total queries

where the cmp == -1 cases are divided into true and false results
according to whether we get the same answer when taking the (hopefully)
correct offset into account.  cmp == 0 and what would be cmp == -2 never
occured.

So it looks like we're relying on the cmp == -1 offset_overlap_p
check to get true "no conflict" results for ~64% of all queries
(or ~83% of all true "no conflict" results).  I expect this is
very dependent on the fact that the port uses section anchors.

The number of buggy cases seems surprisingly high, but I've tried
to re-check it a couple of times and it looks to be accurate.
They come from cases where we compare things like:

x: (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
y: (symbol_ref:DI ("_ZL23yy_last_accepting_state") [flags 0x82] <var_decl ... 
yy_last_accepting_state>)
c: -48

where yy_last_accepting_state is at offset 48 from the anchor.
I haven't checked where the buggy queries are coming from though;
might be debug-related and so hard to spot.

Thanks,
Richard

Reply via email to