The following avoids creating unsupported VEC_COND_EXPRs as part of
SIMD clone call mask argument setup during vectorization, which
results in inefficient decomposition of the operation during vector
lowering.
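
For illustration only, source of roughly the following shape reaches the in-branch clone mask setup that the patch guards. This is a hedged sketch, not the testcase from the PR: the names scale_if_positive and apply are hypothetical, and whether the unsupported VEC_COND_EXPR case actually triggers depends on the target, the selected clone and the options in use.

```c
/* Hypothetical example: the inbranch clause requests only masked
   SIMD clones of this function.  Without OpenMP SIMD support enabled
   the pragma is ignored and the code still behaves the same.  */
#pragma omp declare simd inbranch
int
scale_if_positive (int x)
{
  return x * 3;
}

/* The call is conditional, so a vectorized version of this loop has
   to pass the loop condition as the mask argument of the clone.  The
   condition is computed on doubles while the clone operates on ints,
   so the mask has to be converted into the clone's argument type.  */
void
apply (int *restrict out, const double *restrict in, int n)
{
  for (int i = 0; i < n; i++)
    if (in[i] > 0.0)
      out[i] = scale_if_positive (out[i]);
}
```

Compiling such a file with -O3 -fopenmp-simd -fopt-info-vec-missed should show whether the vectorizer accepted or rejected the call. With the patch applied, a rejection for this reason would be reported with the "cannot compute mask argument for in-branch vector clones" message.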

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Will push on Monday when arm CI is happy.

Richard.

	PR tree-optimization/114164
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Fail if
	the code generated for mask argument setup is not supported.
---
 gcc/tree-vect-stmts.cc | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index be0e1a9c69d..14a3ffb5f02 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4210,6 +4210,16 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
                                              " supported for mismatched vector sizes.\n");
                           return false;
                         }
+                      if (!expand_vec_cond_expr_p (clone_arg_vectype,
+                                                   arginfo[i].vectype, ERROR_MARK))
+                        {
+                          if (dump_enabled_p ())
+                            dump_printf_loc (MSG_MISSED_OPTIMIZATION,
+                                             vect_location,
+                                             "cannot compute mask argument for"
+                                             " in-branch vector clones.\n");
+                          return false;
+                        }
                     }
                   else if (SCALAR_INT_MODE_P (bestn->simdclone->mask_mode))
                     {
-- 
2.35.3