https://gcc.gnu.org/g:0b27a7dd050262a7d64d87863201e4ebbde88386

commit r15-5117-g0b27a7dd050262a7d64d87863201e4ebbde88386
Author: Richard Biener <rguent...@suse.de>
Date:   Fri Nov 8 13:06:07 2024 +0100

    tree-optimization/117502 - VMAT_STRIDED_SLP vs VMAT_ELEMENTWISE when considering gather
    
    The following treats both the same when deciding whether to use gather
    or scatter for single-element interleaving accesses.
    
    This will cause
    
    FAIL: gcc.target/aarch64/sve/sve_iters_low_2.c scan-tree-dump-not vect "LOOP VECTORIZED"
    
    where we now vectorize the loop with VNx4QI.  I'll leave it to ARM folks
    to investigate whether that's OK and to adjust the testcase, or to see
    where to adjust things to make the testcase not vectorized again.  The
    original fix for which the testcase was introduced is still effective.
    
            PR tree-optimization/117502
            * tree-vect-stmts.cc (get_group_load_store_type): Also consider
            VMAT_STRIDED_SLP when checking to use gather/scatter for
            single-element interleaving access.
            * tree-vect-loop.cc (update_epilogue_loop_vinfo): STMT_VINFO_STRIDED_P
            can be classified as VMAT_GATHER_SCATTER, so update DR_REF for
            those as well.
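
For readers unfamiliar with the term, the sketch below shows the kind of
loop that has a single-element interleaving access: each iteration reads
only the first element of a fixed-size group and skips the rest.  The
function name sum_strided and the STRIDE value are made up for
illustration and do not come from the PR or the testsuite.  Whether GCC
actually uses a gather for such a loop depends on the target and the cost
model, so this is only an assumed example of the access pattern.

```c
#define STRIDE 4  /* hypothetical group size, chosen for illustration */

/* Reads a[0], a[STRIDE], a[2*STRIDE], ...: one element is used out of
   each group of STRIDE elements, and the remaining ones form a gap.  */
int
sum_strided (const int *a, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += a[i * STRIDE];
  return sum;
}
```

A vectorizer can handle such a load element by element, as a strided
access, or with a gather whose offsets step by STRIDE; the change in this
commit concerns which of those classifications are considered for the
gather/scatter path.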

Diff:
---
 gcc/tree-vect-loop.cc  | 1 +
 gcc/tree-vect-stmts.cc | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 6cfce5aa7e1e..f50ee2e958ef 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -12295,6 +12295,7 @@ update_epilogue_loop_vinfo (class loop *epilogue, tree advance)
         refs that get_load_store_type classified as VMAT_GATHER_SCATTER.  */
       auto vstmt_vinfo = vect_stmt_to_vectorize (stmt_vinfo);
       if (STMT_VINFO_MEMORY_ACCESS_TYPE (vstmt_vinfo) == VMAT_GATHER_SCATTER
+         || STMT_VINFO_STRIDED_P (vstmt_vinfo)
          || STMT_VINFO_GATHER_SCATTER_P (vstmt_vinfo))
        {
          /* ???  As we copy epilogues from the main loop incremental
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 666e0491a9e8..f77a223b0c4f 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2274,7 +2274,8 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
      on nearby locations.  Or, even if it's a win over scalar code,
      it might not be a win over vectorizing at a lower VF, if that
      allows us to use contiguous accesses.  */
-  if (*memory_access_type == VMAT_ELEMENTWISE
+  if ((*memory_access_type == VMAT_ELEMENTWISE
+       || *memory_access_type == VMAT_STRIDED_SLP)
       && single_element_p
       && loop_vinfo
       && vect_use_strided_gather_scatters_p (stmt_info, loop_vinfo,
