When an external SLP node is permuted, the scalar stmts recorded in
the permute node do not mean that the scalar computation can be
removed.  We remove those stmts from vectorized_scalar_stmts for this
reason, but we fail to check this set when we cost scalar stmts.
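To illustrate the costing issue outside of GCC, here is a minimal
sketch.  ToyStmt and toy_scalar_cost are hypothetical stand-ins, not
GCC types or functions, and the costs are made up.  It only shows why
a stmt missing from the vectorized set must not be counted as scalar
cost saved:

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for a scalar stmt of an SLP node; not a GCC type.
struct ToyStmt {
  int id;     // identity of the scalar stmt
  int cost;   // assumed cost of executing the scalar stmt
  bool live;  // true if the scalar def is still used by scalar code
};

// Sum the scalar cost that vectorization would save.  A stmt is skipped
// when it is live or when it is not in the vectorized set, because in
// both cases the scalar computation has to stay.
int toy_scalar_cost(const std::vector<ToyStmt>& stmts,
                    const std::set<int>& vectorized) {
  int total = 0;
  for (const ToyStmt& s : stmts) {
    if (s.live || vectorized.count(s.id) == 0)
      continue;
    total += s.cost;
  }
  return total;
}
```

With two non-live stmts of cost 4 each and only the first one in the
set, the sketch reports a saving of 4.  Without the set lookup it
would report 8 and overstate the benefit of vectorizing.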
The following patch fixes this.  This shows in PR115777 when we avoid
vectorizing the load, but on its own the fix doesn't help the PR yet.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

	PR tree-optimization/115777
	* tree-vect-slp.cc (vect_bb_slp_scalar_cost): Do not
	cost a scalar stmt that needs to be preserved.
---
 gcc/tree-vect-slp.cc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 337506419d9..152ca433b0e 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -8687,7 +8687,12 @@ vect_bb_slp_scalar_cost (vec_info *vinfo,
       ssa_op_iter op_iter;
       def_operand_p def_p;
 
-      if (!stmt_info || (*life)[i])
+      if (!stmt_info
+          || (*life)[i]
+          /* Defs also used in external nodes are not in the
+             vectorized_scalar_stmts set as they need to be preserved.
+             Honor that.  */
+          || !vectorized_scalar_stmts.contains (stmt_info))
         continue;
 
       stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
-- 
2.43.0