On March 8, 2019 5:57:35 PM GMT+01:00, Martin Jambor <mjam...@suse.cz> wrote:
>Hi,
>
>BTW, we dropped gcc-patches from the CC quite a few emails ago
>:-/
>
>On Fri, Mar 08 2019, Richard Biener wrote:
>> On Thu, 7 Mar 2019, Martin Jambor wrote:
>>
>>> Hello again,
>>> 
>>> On Thu, Mar 07 2019, Martin Jambor wrote:
>>> > Hi,
>>> >
>>> > On Thu, Mar 07 2019, Richard Biener wrote:
>>> >>
>>> >> So possibly we should restrict total scalarization to the case
>>> >> where all accesses are layout compatible (that is,
>>> >> types_compatible_p)?
>>> >>
>>> >
>>> > That would be the patch below (currently undergoing testing).  It is
>>> > slightly more strict than mine but on the three testcases in question,
>>> > it gives the same results too.  People memcpying their aggregates
>>> > around, even when not doing anything silly like storing stuff in
>>> > padding, will find their code slower but arguably closer to what they
>>> > wanted.
>>> 
>>> So, testing revealed that (unlike my previous version) this breaks two
>>> guality tests where we depend on total scalarization to convey to the
>>> debugger what is in bits of an aggregate.
>>> 
>>> In gcc.dg/guality/pr54970.c, the array a is no longer totally scalarized
>>> because it is memcpied into and gdb no longer knows what constant lurks
>>> in a[0], which is otherwise unused.  Probably not a big deal.
>>> 
>>> In gcc.dg/guality/pr59776.c, where the main type and function are:
>>> 
>>>      struct S { float f, g; };
>>>     
>>>      __attribute__((noinline, noclone)) void
>>>      foo (struct S *p)
>>>      {
>>>        struct S s1, s2;                      /* { dg-final { gdb-test pr59776.c:17 "s1.f" "5.0" } } */
>>>        s1 = *p;                              /* { dg-final { gdb-test pr59776.c:17 "s1.g" "6.0" } } */
>>>        s2 = s1;                              /* { dg-final { gdb-test pr59776.c:17 "s2.f" "0.0" } } */
>>>        *(int *) &s2.f = 0;                   /* { dg-final { gdb-test pr59776.c:17 "s2.g" "6.0" } } */
>>>        asm volatile (NOP : : : "memory");    /* { dg-final { gdb-test pr59776.c:20 "s1.f" "5.0" } } */
>>>        asm volatile (NOP : : : "memory");    /* { dg-final { gdb-test pr59776.c:20 "s1.g" "6.0" } } */
>>>        s2 = s1;                              /* { dg-final { gdb-test pr59776.c:20 "s2.f" "5.0" } } */
>>>        asm volatile (NOP : : : "memory");    /* { dg-final { gdb-test pr59776.c:20 "s2.g" "6.0" } } */
>>>        asm volatile (NOP : : : "memory");
>>>      }
>>>   
>>> the zeroing through an integer type means that the s2 variable is no
>>> longer totally scalarized and is thus somewhat opaque to the debugger.
>>> I don't think this particular test is a big concern either, but such
>>> scalar accesses through a cast type were exactly the reason why I
>>> initially decided against completely disabling total scalarization
>>> when I see a type-incompatible access.
>>> 
>>> In any event, although I would still slightly prefer my patch from
>>> yesterday (or at least tolerating scalar type-incompatible accesses), I
>>> am also happy with the one from today, provided I have permission to
>>> XFAIL the guality tests.
>>
>> I guess I'd play it safe at this point and make fewer adjustments.  The
>> second patch looks nicer, with
>>
>>  static bool
>> -contains_vce_or_bfcref_p (const_tree ref)
>> +contains_vce_or_bfcref_p (const_tree ref, bool *type_changing_p = NULL)
>
>OK
>
>>
>> (a default for type_changing_p) there would be even fewer hunks.  Is
>> it functionally equivalent to the first variant when you change
>> the types_compatible_p back to the TYPE_MAIN_VARIANT compare?
>> That is, is this the only real difference?
>
>I probably don't understand the question because I'd say that the second
>guality failure shows there is a difference between being careful that
>we just don't lose any data in padding and disabling total scalarization
>altogether.
>
>To give a non-guality example, the following will regress with
>yesterday's variant compared to current trunk:
>
>  struct S {unsigned a,b;};
>
>  void foo (struct S *in, struct S *out)
>  {
>    struct S s;
>    s = *in;
>    *(int *)&s.b = -1;
>    *out = s;
>  }
>
>Currently s is totally scalarized away, but if we "restrict total
>scalarization to the case where all accesses are layout compatible," it
>of course isn't, and s survives all the way to the optimized dump,
>whereas currently we eliminate it even in early SRA.
>
>Even the assembly output changes from:
>
>       movl    (%rdi), %eax
>       movl    %eax, (%rsi)
>       movl    $-1, 4(%rsi)
>       ret
>
>to a bit unexpected:
>
>       movabsq $-4294967296, %rax
>       orq     (%rdi), %rax
>       movq    %rax, (%rsi)
>       ret
>
>So I guess now I am even more convinced about what I wrote yesterday,
>i.e. that we should at least allow scalar accesses with incompatible
>types.
>The reason why this is safe is that when a scalar access is not a leaf
>of an access tree, scalarization of that portion of the aggregate fails
>(and thus total scalarization of the entire aggregate fails, which we
>notice because that flag is &= propagated up the access tree).
>
>Therefore, for total scalarization to proceed, a scalar access in a
>weird type must either exactly match a field/element of the declared
>type of the variable or entirely fit into some padding.  And in the
>latter case it is recorded appropriately and data will be copied in/out
>carefully.
>
>A patch doing that is below (currently undergoing bootstrap/testing).
>Or we can just focus on the padding, which is the only thing known to
>cause problems, and go with the Wednesday version of the patch.

The patch below is OK. 

Richard. 

>Martin
>
>
>2019-03-08  Martin Jambor  <mjam...@suse.cz>
>
>       PR tree-optimization/85762
>       PR tree-optimization/87008
>       PR tree-optimization/85459
>       * tree-sra.c (contains_vce_or_bfcref_p): New parameter, set the bool
>       it points to if there is a type changing MEM_REF.  Adjust all callers.
>       (build_accesses_from_assign): Disable total scalarization if
>       contains_vce_or_bfcref_p returns true through the new parameter, for
>       both rhs and lhs.
>
>       testsuite/
>       * g++.dg/tree-ssa/pr87008.C: New test.
>       * gcc.dg/guality/pr54970.c: Xfail tests querying a[0] everywhere.
>---
> gcc/testsuite/g++.dg/tree-ssa/pr87008.C | 17 ++++++++++++
> gcc/testsuite/gcc.dg/guality/pr54970.c  |  6 ++---
> gcc/tree-sra.c                          | 36 ++++++++++++++++++-------
> 3 files changed, 47 insertions(+), 12 deletions(-)
> create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr87008.C
>
>diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr87008.C b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
>new file mode 100644
>index 00000000000..eef521f9ad5
>--- /dev/null
>+++ b/gcc/testsuite/g++.dg/tree-ssa/pr87008.C
>@@ -0,0 +1,17 @@
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -fdump-tree-optimized" } */
>+
>+extern void dontcallthis();
>+
>+struct A { long a, b; };
>+struct B : A {};
>+template<class T>void cp(T&a,T const&b){a=b;}
>+long f(B x){
>+  B y; cp<A>(y,x);
>+  B z; cp<A>(z,x);
>+  if (y.a - z.a)
>+    dontcallthis ();
>+  return 0;
>+}
>+
>+/* { dg-final { scan-tree-dump-not "dontcallthis" "optimized" } } */
>diff --git a/gcc/testsuite/gcc.dg/guality/pr54970.c b/gcc/testsuite/gcc.dg/guality/pr54970.c
>index 5d32af07c32..2e0bc5784a9 100644
>--- a/gcc/testsuite/gcc.dg/guality/pr54970.c
>+++ b/gcc/testsuite/gcc.dg/guality/pr54970.c
>@@ -8,17 +8,17 @@
> int
> main ()
> {
>-  int a[] = { 1, 2, 3 };      /* { dg-final { gdb-test .+4 "a\[0\]" "1" } } */
>+  int a[] = { 1, 2, 3 };      /* { dg-final { gdb-test .+4 "a\[0\]" "1" { xfail { *-*-* } } } } */
>   int *p = a + 2;             /* { dg-final { gdb-test .+3 "a\[1\]" "2" } } */
>   int *q = a + 1;             /* { dg-final { gdb-test .+2 "a\[2\]" "3" } } */
>                               /* { dg-final { gdb-test .+1 "*p" "3" } } */
>   asm volatile (NOP);         /* { dg-final { gdb-test . "*q" "2" } } */
>-  *p += 10;                   /* { dg-final { gdb-test .+4 "a\[0\]" "1" } } */
>+  *p += 10;                   /* { dg-final { gdb-test .+4 "a\[0\]" "1" { xfail { *-*-* } } } } */
>                               /* { dg-final { gdb-test .+3 "a\[1\]" "2" } } */
>                               /* { dg-final { gdb-test .+2 "a\[2\]" "13" } } */
>                               /* { dg-final { gdb-test .+1 "*p" "13" } } */
>   asm volatile (NOP);         /* { dg-final { gdb-test . "*q" "2" } } */
>-  *q += 10;                   /* { dg-final { gdb-test .+4 "a\[0\]" "1" } } */
>+  *q += 10;                   /* { dg-final { gdb-test .+4 "a\[0\]" "1" { xfail { *-*-* } } } } */
>                               /* { dg-final { gdb-test .+3 "a\[1\]" "12" } } */
>                               /* { dg-final { gdb-test .+2 "a\[2\]" "13" } } */
>                               /* { dg-final { gdb-test .+1 "*p" "13" } } */
>diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>index eeef31ba496..ca3858d5fc7 100644
>--- a/gcc/tree-sra.c
>+++ b/gcc/tree-sra.c
>@@ -1150,29 +1150,36 @@ contains_view_convert_expr_p (const_tree ref)
>   return false;
> }
> 
>-/* Return true if REF contains a VIEW_CONVERT_EXPR or a MEM_REF that performs
>-   type conversion or a COMPONENT_REF with a bit-field field declaration.  */
>+/* Return true if REF contains a VIEW_CONVERT_EXPR or a COMPONENT_REF with a
>+   bit-field field declaration.  If TYPE_CHANGING_P is non-NULL, the bool it
>+   points to will be set if REF contains any of the above or a MEM_REF
>+   expression that effectively performs type conversion.  */
> 
> static bool
>-contains_vce_or_bfcref_p (const_tree ref)
>+contains_vce_or_bfcref_p (const_tree ref, bool *type_changing_p = NULL)
> {
>   while (handled_component_p (ref))
>     {
>       if (TREE_CODE (ref) == VIEW_CONVERT_EXPR
>         || (TREE_CODE (ref) == COMPONENT_REF
>             && DECL_BIT_FIELD (TREE_OPERAND (ref, 1))))
>-      return true;
>+      {
>+        if (type_changing_p)
>+          *type_changing_p = true;
>+        return true;
>+      }
>       ref = TREE_OPERAND (ref, 0);
>     }
> 
>-  if (TREE_CODE (ref) != MEM_REF
>+  if (!type_changing_p
>+      || TREE_CODE (ref) != MEM_REF
>       || TREE_CODE (TREE_OPERAND (ref, 0)) != ADDR_EXPR)
>     return false;
> 
>   tree mem = TREE_OPERAND (TREE_OPERAND (ref, 0), 0);
>   if (TYPE_MAIN_VARIANT (TREE_TYPE (ref))
>       != TYPE_MAIN_VARIANT (TREE_TYPE (mem)))
>-    return true;
>+    *type_changing_p = true;
> 
>   return false;
> }
>@@ -1368,15 +1375,26 @@ build_accesses_from_assign (gimple *stmt)
>       lacc->grp_assignment_write = 1;
>       if (storage_order_barrier_p (rhs))
>       lacc->grp_unscalarizable_region = 1;
>+
>+      if (should_scalarize_away_bitmap && !is_gimple_reg_type (lacc->type))
>+      {
>+        bool type_changing_p = false;
>+        contains_vce_or_bfcref_p (lhs, &type_changing_p);
>+        if (type_changing_p)
>+          bitmap_set_bit (cannot_scalarize_away_bitmap,
>+                          DECL_UID (lacc->base));
>+      }
>     }
> 
>   if (racc)
>     {
>       racc->grp_assignment_read = 1;
>-      if (should_scalarize_away_bitmap && !gimple_has_volatile_ops (stmt)
>-        && !is_gimple_reg_type (racc->type))
>+      if (should_scalarize_away_bitmap && !is_gimple_reg_type (racc->type))
>       {
>-        if (contains_vce_or_bfcref_p (rhs))
>+        bool type_changing_p = false;
>+        contains_vce_or_bfcref_p (rhs, &type_changing_p);
>+
>+        if (type_changing_p || gimple_has_volatile_ops (stmt))
>           bitmap_set_bit (cannot_scalarize_away_bitmap,
>                           DECL_UID (racc->base));
>         else
