https://gcc.gnu.org/g:cd8db107b9bef73fd822ffb420f96ed2bc622a19

commit r15-5651-gcd8db107b9bef73fd822ffb420f96ed2bc622a19
Author: Richard Biener <rguent...@suse.de>
Date:   Mon Nov 25 13:32:15 2024 +0100

    target/116760 - 416.gamess slowdown with SLP
    
    For the TWOTFF loop vectorization the backend scales constructor
    and vector extract cost to make higher VFs less profitable.  This
    heuristic currently fails to consider VMAT_STRIDED_SLP which we
    now get with single-lane SLP, causing a huge regression in SPEC 2k6
    416.gamess for the respective loop nest.
    
    The following fixes this, matching behavior to that of GCC 14 by
    treating single-lane VMAT_STRIDED_SLP the same as VMAT_ELEMENTWISE.
    
            PR target/116760
            * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
            Scale vec_construct for single-lane VMAT_STRIDED_SLP the
            same as VMAT_ELEMENTWISE.
            * tree-vect-stmts.cc (vectorizable_store): Pass SLP node
            down to costing for vec_to_scalar for VMAT_STRIDED_SLP.

Diff:
---
 gcc/config/i386/i386.cc | 8 +++++---
 gcc/tree-vect-stmts.cc  | 3 ++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 8ab91202f63b..fda2112e4d60 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -25340,9 +25340,11 @@ ix86_vector_costs::add_stmt_cost (int count, 
vect_cost_for_stmt kind,
               || (STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)
                   == VMAT_GATHER_SCATTER)))
          || (node
-             && ((SLP_TREE_MEMORY_ACCESS_TYPE (node) == VMAT_ELEMENTWISE
-                 && (TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF
-                                           (SLP_TREE_REPRESENTATIVE (node))))
+             && (((SLP_TREE_MEMORY_ACCESS_TYPE (node) == VMAT_ELEMENTWISE
+                   || (SLP_TREE_MEMORY_ACCESS_TYPE (node) == VMAT_STRIDED_SLP
+                       && SLP_TREE_LANES (node) == 1))
+                  && (TREE_CODE (DR_STEP (STMT_VINFO_DATA_REF
+                                            (SLP_TREE_REPRESENTATIVE (node))))
                      != INTEGER_CST))
                  || (SLP_TREE_MEMORY_ACCESS_TYPE (node)
                      == VMAT_GATHER_SCATTER)))))
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 82cd3890e5b9..c2d5818b2786 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8832,7 +8832,8 @@ vectorizable_store (vec_info *vinfo,
                      if (nstores > 1)
                        inside_cost
                          += record_stmt_cost (cost_vec, 1, vec_to_scalar,
-                                              stmt_info, 0, vect_body);
+                                              stmt_info, slp_node,
+                                              0, vect_body);
                      /* Take a single lane vector type store as scalar
                         store to avoid ICE like 110776.  */
                      if (VECTOR_TYPE_P (ltype)

Reply via email to