On 24/03/2021 13:46, Richard Biener wrote:
On Wed, 24 Mar 2021, Stam Markianos-Wright wrote:

Hi all,

This patch resolves bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974

This is achieved by forcing a re-calculation of *stmt_vectype_out if an
incompatible combination of TYPE_VECTOR_SUBPARTS is detected, but with an
extra introduced max_nunits ceiling.

I am not 100% sure if this is the best way to go about fixing this, because
this is my first look at the vectorizer and I lack knowledge of the wider
context, so do let me know if you see a better way to do this!

I have added the previously ICE-ing reproducer as a new test.

This is compiled as "g++ -Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" for
GCC11 and "g++ -Ofast -march=armv8.2-a+sve" for GCC10.

(the non-fdisable-tree-fre4 version has gone latent on GCC11)

Bootstrapped and reg-tested on aarch64-linux-gnu.
Also reg-tested on aarch64-none-elf.

I don't think this is going to work well given uses will expect
a vector type that's consistent here.

I think giving up is for the moment the best choice, thus replacing
the assert with vectorization failure.

In the end we shouldn't require those nunits vectypes to be
separately computed - we compute the vector type of the defs
anyway and in case they're invariant the vectorizable_* function
either can deal with the type mix or not anyway.


Yea good point! I agree and after all we are very close to releases now ;)

I've attached the patch that just do the graceful vectorization failure and add a slightly better test now. Re-tested as previously with no issues ofc.

gcc-10.patch is what I'd backport to GCC10 (the only difference between that and gcc-11.patch is that one compiles with `-fdisable-tree-fre4` and the other without it).

Ok to push this to the GCC11 branch and backport to the GCC10 branch?

Cheers :D
Stam

That said, the goal should be to simplify things here.

Richard.


gcc/ChangeLog:

         * tree-vect-stmts.c (get_vectype_for_scalar_type): Add new
         parameter to core function and add new function overload.
         (vect_get_vector_types_for_stmt): Add re-calculation logic.

gcc/testsuite/ChangeLog:

         * g++.target/aarch64/sve/pr96974.C: New test.



diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
new file mode 100644
index 00000000000..363241d18df
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4 -fdump-tree-slp-details" } */
+
+float a;
+int
+b ()
+{ return __builtin_lrintf(a); }
+
+struct c {
+  float d;
+    c() {
+      for (int e = 0; e < 9; e++)
+	coeffs[e] = d ? b() : 0;
+    }
+    int coeffs[10];
+} f;
+
+/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of vector subparts between" "slp1" } } */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index d791d3a4720..4c01e82ff39 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -12148,8 +12148,12 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
 	}
     }
 
-  gcc_assert (multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
-			  TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)));
+  if (!multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
+		   TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)))
+    return opt_result::failure_at (stmt,
+				   "Not vectorized: Incompatible number "
+				   "of vector subparts between %T and %T\n",
+				   nunits_vectype, *stmt_vectype_out);
 
   if (dump_enabled_p ())
     {
diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
new file mode 100644
index 00000000000..2023c55e3e6
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -march=armv8.2-a+sve -fdump-tree-slp-details" } */
+
+float a;
+int
+b ()
+{ return __builtin_lrintf(a); }
+
+struct c {
+  float d;
+    c() {
+      for (int e = 0; e < 9; e++)
+	coeffs[e] = d ? b() : 0;
+    }
+    int coeffs[10];
+} f;
+
+/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of vector subparts between" "slp1" } } */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index c2d1f39fe0f..6418edb5204 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -12249,8 +12249,12 @@ vect_get_vector_types_for_stmt (stmt_vec_info stmt_info,
 	}
     }
 
-  gcc_assert (multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
-			  TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)));
+  if (!multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
+		   TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)))
+    return opt_result::failure_at (stmt,
+				   "Not vectorized: Incompatible number "
+				   "of vector subparts between %T and %T\n",
+				   nunits_vectype, *stmt_vectype_out);
 
   if (dump_enabled_p ())
     {

Reply via email to