Hi,

Currently SRA uses UNITS_PER_WORD to define the maximum scalarization size, but targets like x86 support moves wider than a word, so components such as vector variables in an aggregate can be larger than UNITS_PER_WORD. Use MOVE_MAX instead of UNITS_PER_WORD to allow SRA for aggregates with vector components.
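For illustration only, here is a minimal sketch of the arithmetic behind the default limit. The numbers are assumptions, not values taken from any target header: a move ratio of 4, UNITS_PER_WORD of 8 bytes, and MOVE_MAX of 64 bytes (as one might expect with 512-bit vector moves enabled). The helper names are hypothetical, and the real sra_get_max_scalarization_size also honors the user-set parameter mentioned in its comment.

```c
#include <assert.h>

/* Hypothetical stand-in for the target default computed in
   sra_get_max_scalarization_size: move ratio times a unit size in bytes.  */
unsigned long
default_max_scalarization_size (unsigned move_ratio, unsigned unit_bytes)
{
  return (unsigned long) move_ratio * unit_bytes;
}

/* Return 1 if an aggregate of AGGREGATE_BYTES is within LIMIT bytes.  */
int
within_limit (unsigned long aggregate_bytes, unsigned long limit)
{
  return aggregate_bytes <= limit;
}

/* Worked example under the assumed values: an aggregate holding two
   64-byte vectors (128 bytes in total).  */
void
check_example (void)
{
  unsigned long word_limit = default_max_scalarization_size (4, 8);  /* 32 */
  unsigned long move_limit = default_max_scalarization_size (4, 64); /* 256 */

  assert (!within_limit (128, word_limit)); /* too large with UNITS_PER_WORD */
  assert (within_limit (128, move_limit));  /* accepted with MOVE_MAX */
}
```

Under these assumed values, the word-based limit is 32 bytes and rejects the 128-byte aggregate, while the MOVE_MAX-based limit is 256 bytes and accepts it.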
Bootstrapped/regtested on x86_64-pc-linux-gnu.
OK for trunk?

gcc/ChangeLog:

	PR target/112824
	* tree-sra.cc (sra_get_max_scalarization_size): Use MOVE_MAX
	instead of UNITS_PER_WORD to define max_scalarization_size.

gcc/testsuite/ChangeLog:

	PR target/112824
	* g++.target/i386/pr112824-2.C: New test.
---
 gcc/testsuite/g++.target/i386/pr112824-2.C | 10 ++++++++++
 gcc/tree-sra.cc                            |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr112824-2.C

diff --git a/gcc/testsuite/g++.target/i386/pr112824-2.C b/gcc/testsuite/g++.target/i386/pr112824-2.C
new file mode 100644
index 00000000000..036a47b7280
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr112824-2.C
@@ -0,0 +1,10 @@
+/* PR target/112824 */
+/* { dg-do compile } */
+/* { dg-options "-std=c++23 -O3 -march=skylake-avx512 -mprefer-vector-width=512" } */
+/* { dg-final { scan-assembler-not "vmov.*\[ \\t\]+\[^\n\]*%rsp" } } */
+
+#include "pr112824-1.C"
+
+void prod(Dual<Dual<double,8>,2> &c, const Dual<Dual<double,8>,2> &a, const Dual<Dual<double,8>,2>&b){
+    c = a*b;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index 4b6daf77284..23236fc6537 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -3760,7 +3760,7 @@ sra_get_max_scalarization_size (void)
   /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_<...>,
      fall back to a target default.  */
   unsigned HOST_WIDE_INT max_scalarization_size
-    = get_move_ratio (optimize_speed_p) * UNITS_PER_WORD;
+    = get_move_ratio (optimize_speed_p) * MOVE_MAX;
 
   if (optimize_speed_p)
     {
-- 
2.31.1