https://gcc.gnu.org/g:99eef0cfa56573c32b9c0a1e43519ee4300ac63f

commit r15-6279-g99eef0cfa56573c32b9c0a1e43519ee4300ac63f
Author: Robin Dapp <rd...@ventanamicro.com>
Date:   Fri Sep 6 16:04:03 2024 +0200

    vect: Do not try to duplicate_and_interleave one-element mode.
    
    PR112694 shows that we try to create sub-vectors of single-element
    vectors because can_duplicate_and_interleave_p returns true.
    The problem resurfaced in PR116611.
    
    This patch makes can_duplicate_and_interleave_p return false
    if count is not a multiple of nvectors and removes the corresponding
    check in the riscv backend.
    
    This partially gets rid of the FAIL in slp-19a.c.  At least when built
    with cost model we don't have LOAD_LANES anymore.  Without cost model,
    as in the test suite, we choose a different path and still end up with
    LOAD_LANES.
    
    Bootstrapped and regtested on x86 and power10, regtested on
    rv64gcv_zvfh_zvbb.  Still waiting for the aarch64 results.
    
    Regards
     Robin
    
    gcc/ChangeLog:
    
            PR target/112694
            PR target/116611
    
            * config/riscv/riscv-v.cc (expand_vec_perm_const): Remove early
            return.
            * tree-vect-slp.cc (can_duplicate_and_interleave_p): Return
            false when we cannot create sub-elements.

Diff:
---
 gcc/config/riscv/riscv-v.cc | 9 ---------
 gcc/tree-vect-slp.cc        | 3 +++
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 417c36a7587c..b0de4c52b83c 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4011,15 +4011,6 @@ expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target,
      mask to do the iteration loop control. Just disable it directly.  */
   if (GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL)
     return false;
-  /* FIXME: Explicitly disable VLA interleave SLP vectorization when we
-     may encounter ICE for poly size (1, 1) vectors in loop vectorizer.
-     Ideally, middle-end loop vectorizer should be able to disable it
-     itself, We can remove the codes here when middle-end code is able
-     to disable VLA SLP vectorization for poly size (1, 1) VF.  */
-  if (!BYTES_PER_RISCV_VECTOR.is_constant ()
-      && maybe_lt (BYTES_PER_RISCV_VECTOR * TARGET_MAX_LMUL,
-                  poly_int64 (16, 16)))
-    return false;
 
   struct expand_vec_perm_d d;
 
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9ad95104ec7d..7bad268d406a 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -490,6 +490,9 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
       if (!multiple_p (elt_bytes, 2, &elt_bytes))
        return false;
       nvectors *= 2;
+      /* We need to be able to fuse COUNT / NVECTORS elements together.  */
+      if (!multiple_p (count, nvectors))
+       return false;
     }
 }
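
The effect of the added check can be illustrated outside of GCC with plain
integers. The following is only a simplified sketch, not the actual
implementation: vectors_needed and max_int_bytes are hypothetical names, the
real function works on poly_int values and queries the target for suitable
integer and vector modes, and here that query is replaced by the assumption
that any fused element of at most 8 bytes is usable.

```cpp
#include <cassert>

// Assumption for illustration only: the "target" can handle fused integer
// elements of up to this many bytes.  GCC asks the target instead.
constexpr unsigned max_int_bytes = 8;

// Hypothetical model of the loop in can_duplicate_and_interleave_p.
// COUNT scalars of SCALAR_BYTES bytes each are fused into one wide element;
// while that element is too wide, it is halved and the number of vectors
// is doubled.  Returns the number of vectors needed, or 0 on failure.
constexpr unsigned vectors_needed (unsigned count, unsigned scalar_bytes)
{
  unsigned elt_bytes = count * scalar_bytes;
  unsigned nvectors = 1;
  for (;;)
    {
      // Stand-in for the mode lookup: small enough means usable.
      if (elt_bytes <= max_int_bytes)
        return nvectors;
      // The fused element must split evenly into two halves.
      if (elt_bytes % 2 != 0)
        return 0;
      elt_bytes /= 2;
      nvectors *= 2;
      // The check added by the patch: each vector must be able to fuse
      // COUNT / NVECTORS whole elements.
      if (count % nvectors != 0)
        return 0;
    }
}

// Four 4-byte elements: two vectors, each fusing two elements.
static_assert (vectors_needed (4, 4) == 2, "splits evenly");
// A single 16-byte element: splitting would cut inside the element.
static_assert (vectors_needed (1, 16) == 0, "rejected by the new check");
```

Without the final check, the single-element case in this model would report
two vectors, which corresponds to the sub-vectors of single-element vectors
described in the commit message.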
