https://gcc.gnu.org/g:99eef0cfa56573c32b9c0a1e43519ee4300ac63f

commit r15-6279-g99eef0cfa56573c32b9c0a1e43519ee4300ac63f
Author: Robin Dapp <rd...@ventanamicro.com>
Date:   Fri Sep 6 16:04:03 2024 +0200

    vect: Do not try to duplicate_and_interleave one-element mode.

    PR112694 shows that we try to create sub-vectors of single-element
    vectors because can_duplicate_and_interleave_p returns true.
    The problem resurfaced in PR116611.

    This patch makes can_duplicate_and_interleave_p return false
    if count / nvectors > 0 and removes the corresponding check in the riscv
    backend.

    This partially gets rid of the FAIL in slp-19a.c. At least when built
    with cost model we don't have LOAD_LANES anymore. Without cost model,
    as in the test suite, we choose a different path and still end up with
    LOAD_LANES.

    Bootstrapped and regtested on x86 and power10, regtested on
    rv64gcv_zvfh_zvbb. Still waiting for the aarch64 results.

    Regards
     Robin

    gcc/ChangeLog:

            PR target/112694
            PR target/116611.

            * config/riscv/riscv-v.cc (expand_vec_perm_const): Remove early
            return.
            * tree-vect-slp.cc (can_duplicate_and_interleave_p): Return
            false when we cannot create sub-elements.

Diff:
---
 gcc/config/riscv/riscv-v.cc | 9 ---------
 gcc/tree-vect-slp.cc        | 3 +++
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 417c36a7587c..b0de4c52b83c 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4011,15 +4011,6 @@ expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target,
      mask to do the iteration loop control. Just disable it directly. */
   if (GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL)
     return false;
-  /* FIXME: Explicitly disable VLA interleave SLP vectorization when we
-     may encounter ICE for poly size (1, 1) vectors in loop vectorizer.
-     Ideally, middle-end loop vectorizer should be able to disable it
-     itself, We can remove the codes here when middle-end code is able
-     to disable VLA SLP vectorization for poly size (1, 1) VF. */
-  if (!BYTES_PER_RISCV_VECTOR.is_constant ()
-      && maybe_lt (BYTES_PER_RISCV_VECTOR * TARGET_MAX_LMUL,
-                   poly_int64 (16, 16)))
-    return false;

   struct expand_vec_perm_d d;

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9ad95104ec7d..7bad268d406a 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -490,6 +490,9 @@ can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
       if (!multiple_p (elt_bytes, 2, &elt_bytes))
         return false;
       nvectors *= 2;
+      /* We need to be able to fuse COUNT / NVECTORS elements together. */
+      if (!multiple_p (count, nvectors))
+        return false;
     }
 }
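
For readers without the surrounding source at hand, the effect of the added check can be illustrated with a small stand-alone model. This is only a sketch under simplifying assumptions, not GCC code: the name can_split_sketch and the parameter max_fused_bytes are hypothetical, plain unsigned integers replace poly_int64, and a single size threshold stands in for the integer-mode and permute checks that the real function performs before it returns true.

```cpp
#include <cassert>

// Hypothetical, integer-only model of the halving loop touched by the
// patch.  MAX_FUSED_BYTES is an assumed stand-in for "the target can
// handle a fused element of this size"; the real function asks the
// target instead.
bool can_split_sketch(unsigned count, unsigned unit_bytes,
                      unsigned max_fused_bytes, unsigned *nvectors_out)
{
  assert(nvectors_out != nullptr);
  if (count == 0 || unit_bytes == 0)
    return false;

  unsigned elt_bytes = count * unit_bytes;  // all COUNT elements fused
  unsigned nvectors = 1;
  for (;;)
    {
      if (elt_bytes <= max_fused_bytes)
        {
          *nvectors_out = nvectors;
          return true;
        }
      // Mirrors the existing "!multiple_p (elt_bytes, 2, &elt_bytes)".
      if (elt_bytes % 2 != 0)
        return false;
      elt_bytes /= 2;
      nvectors *= 2;
      // Mirrors the added check: COUNT must split evenly across
      // NVECTORS, otherwise there are no COUNT / NVECTORS elements
      // left to fuse.
      if (count % nvectors != 0)
        return false;
    }
}
```

In this model, a single 8-byte element with a 4-byte limit (count = 1, unit_bytes = 8, max_fused_bytes = 4) is rejected as soon as nvectors reaches 2, which is the one-element situation the commit message describes. Four 2-byte elements with the same limit are accepted with nvectors = 2.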