The following avoids creating unsupported VEC_COND_EXPRs as part of
SIMD clone call mask argument setup during vectorization; such
expressions are later decomposed inefficiently during vector lowering.
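
For context, here is a minimal sketch of the kind of source that reaches
mask argument setup for an in-branch SIMD clone.  It is not the testcase
from the PR; the names clamp_add and apply_where_positive are made up for
illustration, and whether the mask computation is actually unsupported
depends on the target and the vector types chosen.  Built with
-fopenmp-simd, the call sits under a condition, so the vectorizer has to
turn that condition into the clone's mask argument.

```c
/* Hypothetical example, not taken from the PR.  The "inbranch" clause
   asks for SIMD clones that take an extra mask argument.  */
#pragma omp declare simd inbranch
int
clamp_add (int x, int y)
{
  int s = x + y;
  return s > 100 ? 100 : s;
}

/* The call is conditional, so a vectorized loop must pass a mask
   derived from a[i] > 0 to the in-branch clone.  Lanes where the
   condition is false leave out[i] untouched.  */
void
apply_where_positive (int *restrict out, const int *restrict a,
                      const int *restrict b, int n)
{
#pragma omp simd
  for (int i = 0; i < n; i++)
    if (a[i] > 0)
      out[i] = clamp_add (a[i], b[i]);
}
```

With the check added below, a loop like this is simply not vectorized
through the clone when the target cannot compute the mask directly.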

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Will push on Monday when arm CI is happy.

Richard.

        PR tree-optimization/114164
        * tree-vect-stmts.cc (vectorizable_simd_clone_call): Fail if
        the code generated for mask argument setup is not supported.
---
 gcc/tree-vect-stmts.cc | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index be0e1a9c69d..14a3ffb5f02 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4210,6 +4210,16 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
                                     " supported for mismatched vector sizes.\n");
                  return false;
                }
+             if (!expand_vec_cond_expr_p (clone_arg_vectype,
+                                          arginfo[i].vectype, ERROR_MARK))
+               {
+                 if (dump_enabled_p ())
+                   dump_printf_loc (MSG_MISSED_OPTIMIZATION,
+                                    vect_location,
+                                    "cannot compute mask argument for"
+                                    " in-branch vector clones.\n");
+                 return false;
+               }
            }
          else if (SCALAR_INT_MODE_P (bestn->simdclone->mask_mode))
            {
-- 
2.35.3
