Hi,
PR112694 shows that we try to create sub-vectors of single-element
vectors because can_duplicate_and_interleave_p returns true.
The problem resurfaced in PR116611.
This patch makes can_duplicate_and_interleave_p return false
if count / nvectors == 0, i.e. when nvectors > count, and removes the
corresponding check in the riscv backend.
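To illustrate the new early exit, here is a minimal standalone sketch of
the search loop. It is not the GCC code: model_nvectors is a hypothetical
helper, plain integers stand in for poly_int64, and max_int_bytes is an
assumed stand-in for "the target has a usable integer mode of this size"
(the real function also checks vector types and permutes).

```cpp
// Simplified, hypothetical model of the loop in
// can_duplicate_and_interleave_p.  Returns the number of vectors
// needed, or 0 if the duplicate-and-interleave scheme is not possible.
static unsigned int
model_nvectors (unsigned int count, unsigned int unit_bytes,
                unsigned int max_int_bytes)
{
  // Size of all COUNT elements fused into one wide element.
  unsigned int elt_bytes = count * unit_bytes;
  unsigned int nvectors = 1;
  for (;;)
    {
      // The check added by the patch: each vector fuses
      // COUNT / NVECTORS elements, so give up once that would be zero.
      if (nvectors > count)
        return 0;

      // Assumed stand-in for int_mode_for_size: accept power-of-two
      // sizes up to MAX_INT_BYTES.
      bool pow2 = elt_bytes != 0 && (elt_bytes & (elt_bytes - 1)) == 0;
      if (pow2 && elt_bytes <= max_int_bytes)
        return nvectors;

      // Otherwise halve the fused element and double the vector count.
      if (elt_bytes % 2 != 0)
        return 0;
      elt_bytes /= 2;
      nvectors *= 2;
    }
}
```

In this model, model_nvectors (4, 4, 8) succeeds with two vectors of two
fused elements each, while model_nvectors (1, 4, 2) returns 0. Without the
nvectors > count check the second call would go on to split the single
4-byte element into 2-byte pieces, which corresponds to the
sub-vector-of-a-single-element situation described above.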
This partially gets rid of the FAIL in slp-19a.c: when built with the
cost model we no longer use LOAD_LANES. Without the cost model, as in
the test suite, we choose a different path and still end up with
LOAD_LANES.
Bootstrapped and regtested on x86 and power10, regtested on
rv64gcv_zvfh_zvbb. Still waiting for the aarch64 results.
Regards
Robin
gcc/ChangeLog:
PR target/112694
PR target/116611
* config/riscv/riscv-v.cc (expand_vec_perm_const): Remove early
return.
* tree-vect-slp.cc (can_duplicate_and_interleave_p): Return
false when we cannot create sub-elements.
---
gcc/config/riscv/riscv-v.cc | 9 ---------
gcc/tree-vect-slp.cc | 4 ++++
2 files changed, 4 insertions(+), 9 deletions(-)
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 9b6c3a21e2d..5c5ed63d22e 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3709,15 +3709,6 @@ expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target,
mask to do the iteration loop control. Just disable it directly. */
if (GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL)
return false;
- /* FIXME: Explicitly disable VLA interleave SLP vectorization when we
- may encounter ICE for poly size (1, 1) vectors in loop vectorizer.
- Ideally, middle-end loop vectorizer should be able to disable it
- itself, We can remove the codes here when middle-end code is able
- to disable VLA SLP vectorization for poly size (1, 1) VF. */
- if (!BYTES_PER_RISCV_VECTOR.is_constant ()
- && maybe_lt (BYTES_PER_RISCV_VECTOR * TARGET_MAX_LMUL,
- poly_int64 (16, 16)))
- return false;
struct expand_vec_perm_d d;
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 3d2973698e2..17b59870c69 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -434,6 +434,10 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
unsigned int nvectors = 1;
for (;;)
{
+ /* We need to be able to fuse COUNT / NVECTORS elements together,
+ so no point in continuing if there are none. */
+ if (nvectors > count)
+ return false;
scalar_int_mode int_mode;
poly_int64 elt_bits = elt_bytes * BITS_PER_UNIT;
if (int_mode_for_size (elt_bits, 1).exists (&int_mode))
--
2.46.0