On 24/03/2021 13:46, Richard Biener wrote:
On Wed, 24 Mar 2021, Stam Markianos-Wright wrote:
Hi all,
This patch resolves bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974
This is achieved by forcing a re-calculation of *stmt_vectype_out if an
incompatible combination of TYPE_VECTOR_SUBPARTS is detected, but with an
extra introduced max_nunits ceiling.
I am not 100% sure if this is the best way to go about fixing this, because
this is my first look at the vectorizer and I lack knowledge of the wider
context, so do let me know if you see a better way to do this!
I have added the previously ICE-ing reproducer as a new test.
This is compiled as "g++ -Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" for
GCC11 and "g++ -Ofast -march=armv8.2-a+sve" for GCC10.
(the non-fdisable-tree-fre4 version has gone latent on GCC11)
Bootstrapped and reg-tested on aarch64-linux-gnu.
Also reg-tested on aarch64-none-elf.
I don't think this is going to work well given uses will expect
a vector type that's consistent here.
I think giving up is for the moment the best choice, thus replacing
the assert with vectorization failure.
In the end we shouldn't require those nunits vectypes to be
separately computed - we compute the vector type of the defs
anyway and in case they're invariant the vectorizable_* function
either can deal with the type mix or not anyway.
Yea good point! I agree and after all we are very close to releases now ;)
I've attached the patch that just do the graceful vectorization failure
and add a slightly better test now. Re-tested as previously with no
issues ofc.
gcc-10.patch is what I'd backport to GCC10 (the only difference between
that and gcc-11.patch is that one compiles with `-fdisable-tree-fre4`
and the other without it).
Ok to push this to the GCC11 branch and backport to the GCC10 branch?
Cheers :D
Stam
That said, the goal should be to simplify things here.
Richard.
gcc/ChangeLog:
* tree-vect-stmts.c (get_vectype_for_scalar_type): Add new
parameter to core function and add new function overload.
(vect_get_vector_types_for_stmt): Add re-calculation logic.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/sve/pr96974.C: New test.
diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
new file mode 100644
index 00000000000..363241d18df
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4 -fdump-tree-slp-details" } */
+
+float a;
+int
+b ()
+{ return __builtin_lrintf(a); }
+
+struct c {
+ float d;
+ c() {
+ for (int e = 0; e < 9; e++)
+ coeffs[e] = d ? b() : 0;
+ }
+ int coeffs[10];
+} f;
+
+/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of vector subparts between" "slp1" } } */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index d791d3a4720..4c01e82ff39 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -12148,8 +12148,12 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
}
}
- gcc_assert (multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
- TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)));
+ if (!multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
+ TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)))
+ return opt_result::failure_at (stmt,
+ "Not vectorized: Incompatible number "
+ "of vector subparts between %T and %T\n",
+ nunits_vectype, *stmt_vectype_out);
if (dump_enabled_p ())
{
diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
new file mode 100644
index 00000000000..2023c55e3e6
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -march=armv8.2-a+sve -fdump-tree-slp-details" } */
+
+float a;
+int
+b ()
+{ return __builtin_lrintf(a); }
+
+struct c {
+ float d;
+ c() {
+ for (int e = 0; e < 9; e++)
+ coeffs[e] = d ? b() : 0;
+ }
+ int coeffs[10];
+} f;
+
+/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of vector subparts between" "slp1" } } */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index c2d1f39fe0f..6418edb5204 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -12249,8 +12249,12 @@ vect_get_vector_types_for_stmt (stmt_vec_info stmt_info,
}
}
- gcc_assert (multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
- TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)));
+ if (!multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
+ TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)))
+ return opt_result::failure_at (stmt,
+ "Not vectorized: Incompatible number "
+ "of vector subparts between %T and %T\n",
+ nunits_vectype, *stmt_vectype_out);
if (dump_enabled_p ())
{