vect_model_store_cost had:
  /* Costs of the stores. */
  if (STMT_VINFO_STRIDED_P (stmt_info)
      && !STMT_VINFO_GROUPED_ACCESS (stmt_info))
    {
      /* N scalar stores plus extracting the elements. */
      inside_cost += record_stmt_cost (body_cost_vec,
                                       ncopies * TYPE_VECTOR_SUBPARTS (vectype),
                                       scalar_store, stmt_info, 0, vect_body);
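
In other words, the strided case is costed as one scalar_store per element,
ncopies * TYPE_VECTOR_SUBPARTS (vectype) in total; with ncopies = 2 and a
4-element vectype, for instance, that is 8 scalar stores rather than
2 vector stores.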
But non-SLP strided groups also use individual scalar stores rather than
vector stores, so I think we should skip this only for SLP groups.
The same applies to vect_model_load_cost.
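
For illustration (a sketch of my own, not something from the patch), a
strided group can arise from a loop like the one below: the stride between
iterations is a runtime value, and the two stores per iteration form an
interleaving group. Vectorized without SLP, each element is still written
by an individual scalar store, which is what the scalar_store costing models.

/* Hypothetical example: the store group a[i * stride], a[i * stride + 1]
   has a runtime stride, so it is strided (STMT_VINFO_STRIDED_P) as well
   as grouped (STMT_VINFO_GROUPED_ACCESS).  Without SLP, the vectorizer
   emits one scalar store per element here.  */
void
store_pairs (int *a, const int *b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    {
      a[i * stride] = b[i];
      a[i * stride + 1] = b[i] + 1;
    }
}
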
Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install?
Thanks,
Richard
gcc/
	* tree-vect-stmts.c (vect_model_store_cost): For non-SLP
	strided groups, use the cost of N scalar accesses instead
	of ncopies vector accesses.
	(vect_model_load_cost): Likewise.
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index e90eeda..f883580 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -926,8 +926,7 @@ vect_model_store_cost (stmt_vec_info stmt_info, int ncopies,
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
 
   /* Costs of the stores. */
-  if (STMT_VINFO_STRIDED_P (stmt_info)
-      && !STMT_VINFO_GROUPED_ACCESS (stmt_info))
+  if (STMT_VINFO_STRIDED_P (stmt_info) && !(slp_node && grouped_access_p))
     {
       /* N scalar stores plus extracting the elements. */
       inside_cost += record_stmt_cost (body_cost_vec,
@@ -1059,8 +1058,7 @@ vect_model_load_cost (stmt_vec_info stmt_info, int ncopies,
     }
 
   /* The loads themselves. */
-  if (STMT_VINFO_STRIDED_P (stmt_info)
-      && !STMT_VINFO_GROUPED_ACCESS (stmt_info))
+  if (STMT_VINFO_STRIDED_P (stmt_info) && !(slp_node && grouped_access_p))
     {
       /* N scalar loads plus gathering them into a vector. */
       tree vectype = STMT_VINFO_VECTYPE (stmt_info);