On Sun, 21 Jul 2024, Richard Biener wrote:

> As in other places we have to be careful to use FP modes to represent
> the underlying bit representation of an object.  With x87 floating-point
> types there are no load or store instructions that preserve this and
> XFmode can have padding.
> 
> When SRA faces the situation that a field is accessed with multiple
> effective types as happens for example for unions it generally
> choses an integer type if available.  But in the case in the PR
> there's an aggregate type or a floating-point type only and we end
> up chosing the register type.
> 
> SRA deals with similar situations for bit-precision integer types
> and adjusts the replacement type to one covering the size of the
> object.  The following patch makes sure we do the same when the
> replacement has float mode and there were possibly two ways the
> object was accessed.  I've chosen to use bitwise_type_for_mode
> in this case as done for example by memcpy folding to avoid
> creating a unsigned:96 replacement type on i?86 where
> sizeof(long double) is 12.  This means we can fail to find an
> integer type for a replacement which slightly complicates the patch
> and it causes the testcase to no longer be SRAed on i?86.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, there is some fallout in
> the testsuite I need to compare to a clean run.  Comments welcome.

So it seems that "failing" in analyze_access_subtree doesn't work,
but the other place that gets to see multiple accesses is just
sort_and_splice_var_accesses but a lot of accesses are created
after that, like by subaccess propagation and total scalarization.

It seems like SRA has no full view of the access trees it uses for
replacement?  I have an updated patch - see below, but that now
relies on bitwise_type_for_mode working but it will fail for those
late created accesses eventually.  An incremental

diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 07ad23f3595..1f6c93ed3c6 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3042,6 +3042,7 @@ create_artificial_child_access (struct access 
*parent, struct access *model,
   access->grp_read = set_grp_read;
   access->grp_write = set_grp_write;
   access->reverse = model->reverse;
+  access->grp_same_access_path = model->grp_same_access_path;
 
   child = &parent->first_child;
   while (*child && (*child)->offset < new_offset)
@@ -3465,6 +3466,13 @@ create_total_access_and_reshape (struct access 
*parent, HOST_WIDE_INT pos,
 {
   struct access **p = ptr;
 
+  /* For aggregate copying we are possibly using an integer type for 
float
+     typed subfields.  Make sure there is one.  */
+  if (FLOAT_MODE_P (TYPE_MODE (type))
+      && !path_comparable_for_same_access (expr)
+      && !bitwise_type_for_mode (TYPE_MODE (type)))
+    return NULL;
+
   while (*p && (*p)->offset < pos + size)
     {
       if ((*p)->offset + (*p)->size > pos + size)

might fix this but then this does even more changes.

We can possibly do a "smaller" fix by adding yet another flag,
avoid_fp_type and only set that in sort_and_splice_var_accesses
where we can also fail and check that later instead of
grp_same_access_path.  That leaves some FP field uses around
but it's not that SRA has another load of issues with x87 FP
loads and stores.

Any preference?  Any suggestions?

Thanks,
Richard.



>From 41a119580adb8db28a03324f198ec7907d0c0db6 Mon Sep 17 00:00:00 2001
From: Richard Biener <rguent...@suse.de>
Date: Sun, 21 Jul 2024 10:45:10 +0200
Subject: [PATCH] tree-optimization/58416 - SRA wrt FP type replacements
To: gcc-patches@gcc.gnu.org

As in other places we have to be careful to use FP modes to represent
the underlying bit representation of an object.  With x87 floating-point
types there are no load or store instructions that preserve this and
XFmode can have padding.

When SRA faces the situation that a field is accessed with multiple
effective types as happens for example for unions it generally
choses an integer type if available.  But in the case in the PR
there's an aggregate type or a floating-point type only and we end
up chosing the register type.

SRA deals with similar situations for bit-precision integer types
and adjusts the replacement type to one covering the size of the
object.  The following patch makes sure we do the same when the
replacement has float mode and there were possibly two ways the
object was accessed.  I've chosen to use bitwise_type_for_mode
in this case as done for example by memcpy folding to avoid
creating a unsigned:96 replacement type on i?86 where
sizeof(long double) is 12.  This means we can fail to find an
integer type for a replacement which means we have to fail for
this case in sort_and_splice_var_accesses and it causes the testcase
to no longer be SRAed on i?86.

For gcc.dg/tree-ssa/sra-6.c there are now integer type accesses
created for the FP fields.

        PR tree-optimization/58416
        * tree-sra.cc (analyze_access_subtree): For FP mode replacements
        with multiple access paths use a bitwise type instead or fail
        if not available.
        (sort_and_splice_var_accesses): When for a FP mode we cannot
        find an integer replacement fail.

        * gcc.dg/torture/pr58416.c: New testcase.
        * gcc.dg/tree-ssa/sra-6.c: Adjust.
---
 gcc/testsuite/gcc.dg/torture/pr58416.c | 32 ++++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/sra-6.c  |  4 +--
 gcc/tree-sra.cc                        | 41 +++++++++++++++++++++-----
 3 files changed, 68 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr58416.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr58416.c 
b/gcc/testsuite/gcc.dg/torture/pr58416.c
new file mode 100644
index 00000000000..0922b0e7089
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr58416.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+
+struct s {
+  char s[sizeof(long double)];
+};
+
+union u {
+  long double d;
+  struct s s;
+};
+
+int main()
+{
+  union u x = {0};
+#if __SIZEOF_LONG_DOUBLE__ == 16
+  x.s = (struct s){"xxxxxxxxxxxxxxxx"};
+#elif __SIZEOF_LONG_DOUBLE__ == 12
+  x.s = (struct s){"xxxxxxxxxxxx"};
+#elif __SIZEOF_LONG_DOUBLE__ == 8
+  x.s = (struct s){"xxxxxxxx"};
+#elif __SIZEOF_LONG_DOUBLE__ == 4
+  x.s = (struct s){"xxxx"};
+#endif
+
+  union u y = x;
+
+  for (unsigned char *p = (unsigned char *)&y + sizeof y;
+       p-- > (unsigned char *)&y;)
+    if (*p != (unsigned char)'x')
+      __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c 
b/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c
index 0dc8b3233c3..8e3c0f66d2f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c
@@ -30,9 +30,9 @@ void cow (int i)
 
 
 /* Suaccesses of b and c should have been created.  */
-/* { dg-final { scan-tree-dump "expr = b.d"  "esra"} } */
+/* { dg-final { scan-tree-dump "expr = MEM \[^\r\n\]*&b\\\]"  "esra"} } */
 /* { dg-final { scan-tree-dump "expr = b.i1"  "esra"} } */
-/* { dg-final { scan-tree-dump "expr = c.d"  "esra"} } */
+/* { dg-final { scan-tree-dump "expr = MEM \[^\r\n\]*&c\\\]"  "esra"} } */
 /* { dg-final { scan-tree-dump "expr = c.i1"  "esra"} } */
 
 /* There should be no reference to link_error.  */
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 8040b0c5645..07ad23f3595 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -2470,6 +2470,26 @@ sort_and_splice_var_accesses (tree var)
          j++;
        }
 
+      /* Avoid a floating-point replacement when there's multiple
+        ways this field is accessed.   When a fitting integer type
+        is not available we have to fail here.  */
+      if (!grp_same_access_path
+         && FLOAT_MODE_P (TYPE_MODE (access->type))
+         && !bitwise_type_for_mode (TYPE_MODE (access->type)))
+       {
+         if (dump_file && (dump_flags & TDF_DETAILS))
+           {
+             fprintf (dump_file, "Failed to change the type of a "
+                      "replacement for ");
+             print_generic_expr (dump_file, access->base);
+             fprintf (dump_file, " offset: %u, size: %u ",
+                      (unsigned) access->offset,
+                      (unsigned) access->size);
+             fprintf (dump_file, " to an integer.\n");
+           }
+         unscalarizable_region = true;
+       }
+
       i = j;
 
       access->group_representative = access;
@@ -2868,19 +2888,26 @@ analyze_access_subtree (struct access *root, struct 
access *parent,
       /* Always create access replacements that cover the whole access.
          For integral types this means the precision has to match.
         Avoid assumptions based on the integral type kind, too.  */
-      if (INTEGRAL_TYPE_P (root->type)
-         && ((TREE_CODE (root->type) != INTEGER_TYPE
-              && TREE_CODE (root->type) != BITINT_TYPE)
-             || TYPE_PRECISION (root->type) != root->size)
-         /* But leave bitfield accesses alone.  */
-         && (TREE_CODE (root->expr) != COMPONENT_REF
-             || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
+      if ((INTEGRAL_TYPE_P (root->type)
+          && ((TREE_CODE (root->type) != INTEGER_TYPE
+               && TREE_CODE (root->type) != BITINT_TYPE)
+              || TYPE_PRECISION (root->type) != root->size)
+          /* But leave bitfield accesses alone.  */
+          && (TREE_CODE (root->expr) != COMPONENT_REF
+              || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
+         /* Avoid a floating-point replacement when there's multiple
+            ways this field is accessed.   On some targets this can
+            cause correctness issues, see PR58416.  */
+         || (FLOAT_MODE_P (TYPE_MODE (root->type))
+             && !root->grp_same_access_path))
        {
          tree rt = root->type;
          gcc_assert ((root->offset % BITS_PER_UNIT) == 0
                      && (root->size % BITS_PER_UNIT) == 0);
          if (TREE_CODE (root->type) == BITINT_TYPE)
            root->type = build_bitint_type (root->size, TYPE_UNSIGNED (rt));
+         else if (FLOAT_MODE_P (TYPE_MODE (root->type)))
+           root->type = bitwise_type_for_mode (TYPE_MODE (root->type));
          else
            root->type = build_nonstandard_integer_type (root->size,
                                                         TYPE_UNSIGNED (rt));
-- 
2.35.3


> Richard.
> 
>       PR tree-optimization/58416
>       * tree-sra.cc (analyze_access_subtree): For FP mode replacements
>       with multiple access paths use a bitwise type instead or fail
>       if not available.
> 
>       * gcc.dg/torture/pr58416.c: New testcase.
> ---
>  gcc/testsuite/gcc.dg/torture/pr58416.c | 32 ++++++++++++
>  gcc/tree-sra.cc                        | 72 ++++++++++++++++++--------
>  2 files changed, 83 insertions(+), 21 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr58416.c
> 
> diff --git a/gcc/testsuite/gcc.dg/torture/pr58416.c 
> b/gcc/testsuite/gcc.dg/torture/pr58416.c
> new file mode 100644
> index 00000000000..0922b0e7089
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr58416.c
> @@ -0,0 +1,32 @@
> +/* { dg-do run } */
> +
> +struct s {
> +  char s[sizeof(long double)];
> +};
> +
> +union u {
> +  long double d;
> +  struct s s;
> +};
> +
> +int main()
> +{
> +  union u x = {0};
> +#if __SIZEOF_LONG_DOUBLE__ == 16
> +  x.s = (struct s){"xxxxxxxxxxxxxxxx"};
> +#elif __SIZEOF_LONG_DOUBLE__ == 12
> +  x.s = (struct s){"xxxxxxxxxxxx"};
> +#elif __SIZEOF_LONG_DOUBLE__ == 8
> +  x.s = (struct s){"xxxxxxxx"};
> +#elif __SIZEOF_LONG_DOUBLE__ == 4
> +  x.s = (struct s){"xxxx"};
> +#endif
> +
> +  union u y = x;
> +
> +  for (unsigned char *p = (unsigned char *)&y + sizeof y;
> +       p-- > (unsigned char *)&y;)
> +    if (*p != (unsigned char)'x')
> +      __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
> index 8040b0c5645..bc9a7b3ee04 100644
> --- a/gcc/tree-sra.cc
> +++ b/gcc/tree-sra.cc
> @@ -2868,40 +2868,70 @@ analyze_access_subtree (struct access *root, struct 
> access *parent,
>        /* Always create access replacements that cover the whole access.
>           For integral types this means the precision has to match.
>        Avoid assumptions based on the integral type kind, too.  */
> -      if (INTEGRAL_TYPE_P (root->type)
> -       && ((TREE_CODE (root->type) != INTEGER_TYPE
> -            && TREE_CODE (root->type) != BITINT_TYPE)
> -           || TYPE_PRECISION (root->type) != root->size)
> -       /* But leave bitfield accesses alone.  */
> -       && (TREE_CODE (root->expr) != COMPONENT_REF
> -           || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
> +      if ((INTEGRAL_TYPE_P (root->type)
> +        && ((TREE_CODE (root->type) != INTEGER_TYPE
> +             && TREE_CODE (root->type) != BITINT_TYPE)
> +            || TYPE_PRECISION (root->type) != root->size)
> +        /* But leave bitfield accesses alone.  */
> +        && (TREE_CODE (root->expr) != COMPONENT_REF
> +            || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
> +       /* Avoid a floating-point replacement when there's multiple
> +          ways this field is accessed.   On some targets this can
> +          cause correctness issues, see PR58416.  */
> +       || (FLOAT_MODE_P (TYPE_MODE (root->type))
> +           && !root->grp_same_access_path))
>       {
>         tree rt = root->type;
>         gcc_assert ((root->offset % BITS_PER_UNIT) == 0
>                     && (root->size % BITS_PER_UNIT) == 0);
>         if (TREE_CODE (root->type) == BITINT_TYPE)
>           root->type = build_bitint_type (root->size, TYPE_UNSIGNED (rt));
> +       else if (FLOAT_MODE_P (TYPE_MODE (root->type)))
> +         {
> +           tree bt = bitwise_type_for_mode (TYPE_MODE (root->type));
> +           if (!bt)
> +             {
> +               if (dump_file && (dump_flags & TDF_DETAILS))
> +                 {
> +                   fprintf (dump_file, "Failed to change the type of a "
> +                            "replacement for ");
> +                   print_generic_expr (dump_file, root->base);
> +                   fprintf (dump_file, " offset: %u, size: %u ",
> +                            (unsigned) root->offset, (unsigned) root->size);
> +                   fprintf (dump_file, " to an integer.\n");
> +                 }
> +               allow_replacements = false;
> +             }
> +           else
> +             root->type = bt;
> +         }
>         else
>           root->type = build_nonstandard_integer_type (root->size,
>                                                        TYPE_UNSIGNED (rt));
> -       root->expr = build_ref_for_offset (UNKNOWN_LOCATION, root->base,
> -                                          root->offset, root->reverse,
> -                                          root->type, NULL, false);
> -
> -       if (dump_file && (dump_flags & TDF_DETAILS))
> +       if (allow_replacements)
>           {
> -           fprintf (dump_file, "Changing the type of a replacement for ");
> -           print_generic_expr (dump_file, root->base);
> -           fprintf (dump_file, " offset: %u, size: %u ",
> -                    (unsigned) root->offset, (unsigned) root->size);
> -           fprintf (dump_file, " to an integer.\n");
> +           root->expr = build_ref_for_offset (UNKNOWN_LOCATION, root->base,
> +                                              root->offset, root->reverse,
> +                                              root->type, NULL, false);
> +
> +           if (dump_file && (dump_flags & TDF_DETAILS))
> +             {
> +               fprintf (dump_file, "Changing the type of a replacement for 
> ");
> +               print_generic_expr (dump_file, root->base);
> +               fprintf (dump_file, " offset: %u, size: %u ",
> +                        (unsigned) root->offset, (unsigned) root->size);
> +               fprintf (dump_file, " to an integer.\n");
> +             }
>           }
>       }
>  
> -      root->grp_to_be_replaced = 1;
> -      root->replacement_decl = create_access_replacement (root);
> -      sth_created = true;
> -      hole = false;
> +      if (allow_replacements)
> +     {
> +       root->grp_to_be_replaced = 1;
> +       root->replacement_decl = create_access_replacement (root);
> +       sth_created = true;
> +       hole = false;
> +     }
>      }
>    else
>      {
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to