Hi,

in VMAT_STRIDED_SLP we're likely to select a different vectype with fewer elements for vector construction. After loading, the result is reinterpreted as the proper vectype. When costing, however, we still use the original vectype with more elements, which leads to wrong costs whenever the cost of vector construction depends on the number of elements.
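To illustrate the mismatch described above, here is a minimal sketch with a toy cost model. The names `toy_vectype` and `toy_vec_construct_cost` are hypothetical and not part of GCC, and the "one unit per element" formula is only an assumption for illustration; real targets compute construction costs in their own ways.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type; only the lane count matters here.
struct toy_vectype
{
  unsigned nunits; // number of elements in the vector
};

// Hypothetical cost hook: building a vector from scalar pieces is assumed
// to cost one unit per element.  Real targets use their own formulas.
constexpr unsigned
toy_vec_construct_cost (toy_vectype vt)
{
  return vt.nunits;
}

// Example: a 16 x 8-bit result vector assembled from four 32-bit loads.
constexpr toy_vectype vectype = { 16 };  // type of the final result
constexpr toy_vectype lvectype = { 4 };  // type actually used for construction

// Costing with the result type overestimates the construction in this model.
static_assert (toy_vec_construct_cost (vectype) == 16, "costed with vectype");
static_assert (toy_vec_construct_cost (lvectype) == 4, "costed with lvectype");
```

In this toy model the construction is charged four times too much when the result type is used instead of the construction type.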

This patch makes a temporary copy of stmt_info and slp_node, changes their vectype and passes them to record_stmt_cost in case we chose a different load/construction vectype.

Bootstrapped and regtested on x86, aarch64 and power10.
Regtested on rv64gcv.

Regards
 Robin

	PR target/118019

gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_load): Use construction/load
	vectype for costing.

---
 gcc/tree-vect-stmts.cc | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index be1139a423c..6ac1e97c4c1 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10810,9 +10810,26 @@ vectorizable_load (vec_info *vinfo,
               if (nloads > 1)
                 {
                   if (costing_p)
-                    inside_cost += record_stmt_cost (cost_vec, 1, vec_construct,
-                                                     stmt_info, slp_node, 0,
-                                                     vect_body);
+                    {
+                      if (lvectype != vectype)
+                        {
+                          /* If we chose a different vectype for vector
+                             construction make sure to use it for costing.  */
+                          stmt_vec_info stmt_info_copy = stmt_info;
+                          stmt_info_copy->vectype = lvectype;
+                          slp_tree slp_node_copy = slp_node;
+                          slp_node_copy->vectype = lvectype;
+                          inside_cost
+                            += record_stmt_cost (cost_vec, 1, vec_construct,
+                                                 stmt_info_copy, slp_node_copy,
+                                                 0, vect_body);
+                        }
+
+                      else
+                        inside_cost += record_stmt_cost (cost_vec, 1, vec_construct,
+                                                         stmt_info, slp_node, 0,
+                                                         vect_body);
+                    }
                   else
                     {
                       tree vec_inv = build_constructor (lvectype, v);
-- 
2.47.1
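
The temporary-copy idea described above could be sketched in isolation as follows. All names here (`toy_type`, `toy_stmt_info`, `toy_record_cost`, `toy_cost_with_vectype`) are hypothetical simplifications and not GCC's data structures; the sketch assumes the goal is to cost with the construction type while leaving the original statement info unchanged.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for a vector type and statement info.
struct toy_type { unsigned nunits; };
struct toy_stmt_info { const toy_type *vectype; };

// Hypothetical cost function: one unit per element of the statement's vectype.
inline unsigned
toy_record_cost (const toy_stmt_info *info)
{
  return info->vectype->nunits;
}

// Cost a vector construction with LVECTYPE without modifying *INFO.
inline unsigned
toy_cost_with_vectype (const toy_stmt_info *info, const toy_type *lvectype)
{
  toy_stmt_info tmp = *info;   // copy the object, not just the pointer
  tmp.vectype = lvectype;      // override the type only in the copy
  return toy_record_cost (&tmp);
}
```

The sketch dereferences before copying on purpose: in GCC, `stmt_vec_info` and `slp_tree` are pointer typedefs, so assigning one to a new variable shares the underlying object rather than duplicating it.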