When vectorizing an early break loop with LENs (are we missing a check to disallow this?), SLP scheduling can end up deciding to insert stmts after a GIMPLE_COND when it tries to be conservative about placing stmts that depend only on the implicit loop mask/len. The following avoids this. I guess it's not perfect, but it does the job of fixing an observed RISC-V regression.
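
For illustration only, here is a toy model of the iterator decision described above. It is a sketch, not GCC code: ToyCode and toy_insert_index are made-up names, a plain vector stands in for a basic block, and it assumes new stmts are inserted before the position the iterator ends up at.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for GIMPLE statement codes; these are not GCC's real types.
enum class ToyCode { Assign, Cond };

// Given the index of the last already-scheduled stmt in a toy basic block,
// return the index before which new stmts would be inserted.
std::size_t toy_insert_index (const std::vector<ToyCode> &block,
                              std::size_t last)
{
  assert (last < block.size ());
  // A condition terminates the block, so do not advance past it;
  // new stmts then land in front of the condition.
  if (block[last] == ToyCode::Cond)
    return last;
  // Otherwise advance so new stmts land right after the last stmt.
  return last + 1;
}
```

With a block whose only stmt is a condition, the model returns index 0, so the new stmts go in front of the condition rather than after it.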

	* tree-vect-slp.cc (vect_schedule_slp_node): For mask/len
	loops make sure to not advance the insertion iterator
	beyond a GIMPLE_COND.
---
 gcc/tree-vect-slp.cc | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index bf1f467f53f..11ec82086fc 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -9650,7 +9650,12 @@ vect_schedule_slp_node (vec_info *vinfo,
           else
             {
               si = gsi_for_stmt (last_stmt);
-              gsi_next (&si);
+              /* When we're getting gsi_after_labels from the starting
+                 condition of a fully masked/len loop avoid insertion
+                 after a GIMPLE_COND that can appear as the only header
+                 stmt with early break vectorization.  */
+              if (gimple_code (last_stmt) != GIMPLE_COND)
+                gsi_next (&si);
             }
         }

-- 
2.35.3