On Sun, 21 Jul 2024, Richard Biener wrote: > As in other places we have to be careful to use FP modes to represent > the underlying bit representation of an object. With x87 floating-point > types there are no load or store instructions that preserve this and > XFmode can have padding. > > When SRA faces the situation that a field is accessed with multiple > effective types as happens for example for unions it generally > choses an integer type if available. But in the case in the PR > there's an aggregate type or a floating-point type only and we end > up chosing the register type. > > SRA deals with similar situations for bit-precision integer types > and adjusts the replacement type to one covering the size of the > object. The following patch makes sure we do the same when the > replacement has float mode and there were possibly two ways the > object was accessed. I've chosen to use bitwise_type_for_mode > in this case as done for example by memcpy folding to avoid > creating a unsigned:96 replacement type on i?86 where > sizeof(long double) is 12. This means we can fail to find an > integer type for a replacement which slightly complicates the patch > and it causes the testcase to no longer be SRAed on i?86. > > Bootstrapped on x86_64-unknown-linux-gnu, there is some fallout in > the testsuite I need to compare to a clean run. Comments welcome.
So it seems that "failing" in analyze_access_subtree doesn't work, but the other place that gets to see multiple accesses is just sort_and_splice_var_accesses but a lot of accesses are created after that, like by subaccess propagation and total scalarization. It seems like SRA has no full view of the access trees it uses for replacement? I have an updated patch - see below, but that now relies on bitwise_type_for_mode working but it will fail for those late created accesses eventually. An incremental diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 07ad23f3595..1f6c93ed3c6 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -3042,6 +3042,7 @@ create_artificial_child_access (struct access *parent, struct access *model, access->grp_read = set_grp_read; access->grp_write = set_grp_write; access->reverse = model->reverse; + access->grp_same_access_path = model->grp_same_access_path; child = &parent->first_child; while (*child && (*child)->offset < new_offset) @@ -3465,6 +3466,13 @@ create_total_access_and_reshape (struct access *parent, HOST_WIDE_INT pos, { struct access **p = ptr; + /* For aggregate copying we are possibly using an integer type for float + typed subfields. Make sure there is one. */ + if (FLOAT_MODE_P (TYPE_MODE (type)) + && !path_comparable_for_same_access (expr) + && !bitwise_type_for_mode (TYPE_MODE (type))) + return NULL; + while (*p && (*p)->offset < pos + size) { if ((*p)->offset + (*p)->size > pos + size) might fix this but then this does even more changes. We can possibly do a "smaller" fix by adding yet another flag, avoid_fp_type and only set that in sort_and_splice_var_accesses where we can also fail and check that later instead of grp_same_access_path. That leaves some FP field uses around but it's not that SRA has another load of issues with x87 FP loads and stores. Any preference? Any suggestions? Thanks, Richard. >From 41a119580adb8db28a03324f198ec7907d0c0db6 Mon Sep 17 00:00:00 2001 From: Richard Biener <rguent...@suse.de> Date: Sun, 21 Jul 2024 10:45:10 +0200 Subject: [PATCH] tree-optimization/58416 - SRA wrt FP type replacements To: gcc-patches@gcc.gnu.org As in other places we have to be careful to use FP modes to represent the underlying bit representation of an object. With x87 floating-point types there are no load or store instructions that preserve this and XFmode can have padding. When SRA faces the situation that a field is accessed with multiple effective types as happens for example for unions it generally choses an integer type if available. But in the case in the PR there's an aggregate type or a floating-point type only and we end up chosing the register type. SRA deals with similar situations for bit-precision integer types and adjusts the replacement type to one covering the size of the object. The following patch makes sure we do the same when the replacement has float mode and there were possibly two ways the object was accessed. I've chosen to use bitwise_type_for_mode in this case as done for example by memcpy folding to avoid creating a unsigned:96 replacement type on i?86 where sizeof(long double) is 12. This means we can fail to find an integer type for a replacement which means we have to fail for this case in sort_and_splice_var_accesses and it causes the testcase to no longer be SRAed on i?86. For gcc.dg/tree-ssa/sra-6.c there are now integer type accesses created for the FP fields. PR tree-optimization/58416 * tree-sra.cc (analyze_access_subtree): For FP mode replacements with multiple access paths use a bitwise type instead or fail if not available. (sort_and_splice_var_accesses): When for a FP mode we cannot find an integer replacement fail. * gcc.dg/torture/pr58416.c: New testcase. * gcc.dg/tree-ssa/sra-6.c: Adjust. --- gcc/testsuite/gcc.dg/torture/pr58416.c | 32 ++++++++++++++++++++ gcc/testsuite/gcc.dg/tree-ssa/sra-6.c | 4 +-- gcc/tree-sra.cc | 41 +++++++++++++++++++++----- 3 files changed, 68 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr58416.c diff --git a/gcc/testsuite/gcc.dg/torture/pr58416.c b/gcc/testsuite/gcc.dg/torture/pr58416.c new file mode 100644 index 00000000000..0922b0e7089 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr58416.c @@ -0,0 +1,32 @@ +/* { dg-do run } */ + +struct s { + char s[sizeof(long double)]; +}; + +union u { + long double d; + struct s s; +}; + +int main() +{ + union u x = {0}; +#if __SIZEOF_LONG_DOUBLE__ == 16 + x.s = (struct s){"xxxxxxxxxxxxxxxx"}; +#elif __SIZEOF_LONG_DOUBLE__ == 12 + x.s = (struct s){"xxxxxxxxxxxx"}; +#elif __SIZEOF_LONG_DOUBLE__ == 8 + x.s = (struct s){"xxxxxxxx"}; +#elif __SIZEOF_LONG_DOUBLE__ == 4 + x.s = (struct s){"xxxx"}; +#endif + + union u y = x; + + for (unsigned char *p = (unsigned char *)&y + sizeof y; + p-- > (unsigned char *)&y;) + if (*p != (unsigned char)'x') + __builtin_abort (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c index 0dc8b3233c3..8e3c0f66d2f 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-6.c @@ -30,9 +30,9 @@ void cow (int i) /* Suaccesses of b and c should have been created. */ -/* { dg-final { scan-tree-dump "expr = b.d" "esra"} } */ +/* { dg-final { scan-tree-dump "expr = MEM \[^\r\n\]*&b\\\]" "esra"} } */ /* { dg-final { scan-tree-dump "expr = b.i1" "esra"} } */ -/* { dg-final { scan-tree-dump "expr = c.d" "esra"} } */ +/* { dg-final { scan-tree-dump "expr = MEM \[^\r\n\]*&c\\\]" "esra"} } */ /* { dg-final { scan-tree-dump "expr = c.i1" "esra"} } */ /* There should be no reference to link_error. */ diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 8040b0c5645..07ad23f3595 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -2470,6 +2470,26 @@ sort_and_splice_var_accesses (tree var) j++; } + /* Avoid a floating-point replacement when there's multiple + ways this field is accessed. When a fitting integer type + is not available we have to fail here. */ + if (!grp_same_access_path + && FLOAT_MODE_P (TYPE_MODE (access->type)) + && !bitwise_type_for_mode (TYPE_MODE (access->type))) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Failed to change the type of a " + "replacement for "); + print_generic_expr (dump_file, access->base); + fprintf (dump_file, " offset: %u, size: %u ", + (unsigned) access->offset, + (unsigned) access->size); + fprintf (dump_file, " to an integer.\n"); + } + unscalarizable_region = true; + } + i = j; access->group_representative = access; @@ -2868,19 +2888,26 @@ analyze_access_subtree (struct access *root, struct access *parent, /* Always create access replacements that cover the whole access. For integral types this means the precision has to match. Avoid assumptions based on the integral type kind, too. */ - if (INTEGRAL_TYPE_P (root->type) - && ((TREE_CODE (root->type) != INTEGER_TYPE - && TREE_CODE (root->type) != BITINT_TYPE) - || TYPE_PRECISION (root->type) != root->size) - /* But leave bitfield accesses alone. */ - && (TREE_CODE (root->expr) != COMPONENT_REF - || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1)))) + if ((INTEGRAL_TYPE_P (root->type) + && ((TREE_CODE (root->type) != INTEGER_TYPE + && TREE_CODE (root->type) != BITINT_TYPE) + || TYPE_PRECISION (root->type) != root->size) + /* But leave bitfield accesses alone. */ + && (TREE_CODE (root->expr) != COMPONENT_REF + || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1)))) + /* Avoid a floating-point replacement when there's multiple + ways this field is accessed. On some targets this can + cause correctness issues, see PR58416. */ + || (FLOAT_MODE_P (TYPE_MODE (root->type)) + && !root->grp_same_access_path)) { tree rt = root->type; gcc_assert ((root->offset % BITS_PER_UNIT) == 0 && (root->size % BITS_PER_UNIT) == 0); if (TREE_CODE (root->type) == BITINT_TYPE) root->type = build_bitint_type (root->size, TYPE_UNSIGNED (rt)); + else if (FLOAT_MODE_P (TYPE_MODE (root->type))) + root->type = bitwise_type_for_mode (TYPE_MODE (root->type)); else root->type = build_nonstandard_integer_type (root->size, TYPE_UNSIGNED (rt)); -- 2.35.3 > Richard. > > PR tree-optimization/58416 > * tree-sra.cc (analyze_access_subtree): For FP mode replacements > with multiple access paths use a bitwise type instead or fail > if not available. > > * gcc.dg/torture/pr58416.c: New testcase. > --- > gcc/testsuite/gcc.dg/torture/pr58416.c | 32 ++++++++++++ > gcc/tree-sra.cc | 72 ++++++++++++++++++-------- > 2 files changed, 83 insertions(+), 21 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/pr58416.c > > diff --git a/gcc/testsuite/gcc.dg/torture/pr58416.c > b/gcc/testsuite/gcc.dg/torture/pr58416.c > new file mode 100644 > index 00000000000..0922b0e7089 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/pr58416.c > @@ -0,0 +1,32 @@ > +/* { dg-do run } */ > + > +struct s { > + char s[sizeof(long double)]; > +}; > + > +union u { > + long double d; > + struct s s; > +}; > + > +int main() > +{ > + union u x = {0}; > +#if __SIZEOF_LONG_DOUBLE__ == 16 > + x.s = (struct s){"xxxxxxxxxxxxxxxx"}; > +#elif __SIZEOF_LONG_DOUBLE__ == 12 > + x.s = (struct s){"xxxxxxxxxxxx"}; > +#elif __SIZEOF_LONG_DOUBLE__ == 8 > + x.s = (struct s){"xxxxxxxx"}; > +#elif __SIZEOF_LONG_DOUBLE__ == 4 > + x.s = (struct s){"xxxx"}; > +#endif > + > + union u y = x; > + > + for (unsigned char *p = (unsigned char *)&y + sizeof y; > + p-- > (unsigned char *)&y;) > + if (*p != (unsigned char)'x') > + __builtin_abort (); > + return 0; > +} > diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc > index 8040b0c5645..bc9a7b3ee04 100644 > --- a/gcc/tree-sra.cc > +++ b/gcc/tree-sra.cc > @@ -2868,40 +2868,70 @@ analyze_access_subtree (struct access *root, struct > access *parent, > /* Always create access replacements that cover the whole access. > For integral types this means the precision has to match. > Avoid assumptions based on the integral type kind, too. */ > - if (INTEGRAL_TYPE_P (root->type) > - && ((TREE_CODE (root->type) != INTEGER_TYPE > - && TREE_CODE (root->type) != BITINT_TYPE) > - || TYPE_PRECISION (root->type) != root->size) > - /* But leave bitfield accesses alone. */ > - && (TREE_CODE (root->expr) != COMPONENT_REF > - || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1)))) > + if ((INTEGRAL_TYPE_P (root->type) > + && ((TREE_CODE (root->type) != INTEGER_TYPE > + && TREE_CODE (root->type) != BITINT_TYPE) > + || TYPE_PRECISION (root->type) != root->size) > + /* But leave bitfield accesses alone. */ > + && (TREE_CODE (root->expr) != COMPONENT_REF > + || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1)))) > + /* Avoid a floating-point replacement when there's multiple > + ways this field is accessed. On some targets this can > + cause correctness issues, see PR58416. */ > + || (FLOAT_MODE_P (TYPE_MODE (root->type)) > + && !root->grp_same_access_path)) > { > tree rt = root->type; > gcc_assert ((root->offset % BITS_PER_UNIT) == 0 > && (root->size % BITS_PER_UNIT) == 0); > if (TREE_CODE (root->type) == BITINT_TYPE) > root->type = build_bitint_type (root->size, TYPE_UNSIGNED (rt)); > + else if (FLOAT_MODE_P (TYPE_MODE (root->type))) > + { > + tree bt = bitwise_type_for_mode (TYPE_MODE (root->type)); > + if (!bt) > + { > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + fprintf (dump_file, "Failed to change the type of a " > + "replacement for "); > + print_generic_expr (dump_file, root->base); > + fprintf (dump_file, " offset: %u, size: %u ", > + (unsigned) root->offset, (unsigned) root->size); > + fprintf (dump_file, " to an integer.\n"); > + } > + allow_replacements = false; > + } > + else > + root->type = bt; > + } > else > root->type = build_nonstandard_integer_type (root->size, > TYPE_UNSIGNED (rt)); > - root->expr = build_ref_for_offset (UNKNOWN_LOCATION, root->base, > - root->offset, root->reverse, > - root->type, NULL, false); > - > - if (dump_file && (dump_flags & TDF_DETAILS)) > + if (allow_replacements) > { > - fprintf (dump_file, "Changing the type of a replacement for "); > - print_generic_expr (dump_file, root->base); > - fprintf (dump_file, " offset: %u, size: %u ", > - (unsigned) root->offset, (unsigned) root->size); > - fprintf (dump_file, " to an integer.\n"); > + root->expr = build_ref_for_offset (UNKNOWN_LOCATION, root->base, > + root->offset, root->reverse, > + root->type, NULL, false); > + > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + fprintf (dump_file, "Changing the type of a replacement for > "); > + print_generic_expr (dump_file, root->base); > + fprintf (dump_file, " offset: %u, size: %u ", > + (unsigned) root->offset, (unsigned) root->size); > + fprintf (dump_file, " to an integer.\n"); > + } > } > } > > - root->grp_to_be_replaced = 1; > - root->replacement_decl = create_access_replacement (root); > - sth_created = true; > - hole = false; > + if (allow_replacements) > + { > + root->grp_to_be_replaced = 1; > + root->replacement_decl = create_access_replacement (root); > + sth_created = true; > + hole = false; > + } > } > else > { > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)