On Wed, 6 Dec 2023, Tamar Christina wrote:
> > > > +
> > > > + tree truth_type = truth_type_for (vectype_op);
> > > > + machine_mode mode = TYPE_MODE (truth_type);
> > > > + int ncopies;
> > > > +
> >
> > more line break issues ... (also below, check yourself)
> >
> > shouldn't STMT_VINFO_VECTYPE already match truth_type here? If not
> > it looks to be set wrongly (or shouldn't be set at all)
> >
>
> Fixed, I now leverage the existing vect_recog_bool_pattern to update the types
> if needed and determine the initial type in vect_get_vector_types_for_stmt.
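>
> Concretely, for an early exit like
>
>   if (a[i] != 0) break;
>
> the initial vectype is now determined from the type of gimple_cond_lhs
> (e.g. V4SI for int a[]), and once known it is converted with
> truth_type_for into the vector boolean type the comparison produces.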
>
> > > > + if (slp_node)
> > > > + ncopies = 1;
> > > > + else
> > > > + ncopies = vect_get_num_copies (loop_vinfo, truth_type);
> > > > +
> > > > + vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> > > > + bool masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
> > > > +
> >
> > what about with_len?
>
> Should be easy to add, but I don't know how it works.
>
> >
> > > > + /* Analyze only. */
> > > > + if (!vec_stmt)
> > > > + {
> > > > + if (direct_optab_handler (cbranch_optab, mode) == CODE_FOR_nothing)
> > > > + {
> > > > + if (dump_enabled_p ())
> > > > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > > + "can't vectorize early exit because the "
> > > > + "target doesn't support flag setting
> > > > vector "
> > > > + "comparisons.\n");
> > > > + return false;
> > > > + }
> > > > +
> > > > + if (!expand_vec_cmp_expr_p (vectype_op, truth_type, NE_EXPR))
> >
> > Why NE_EXPR? This looks wrong. Or vectype_op is wrong if you're
> > emitting
> >
> > mask = op0 CMP op1;
> > if (mask != 0)
> >
> > I think you need to check for CMP, not NE_EXPR.
>
> Well, CMP is checked by vectorizable_comparison_1, but I realized this
> check wasn't checking what I wanted and the cbranch requirements already
> cover it.  So removed.
>
> >
> > > > + {
> > > > + if (dump_enabled_p ())
> > > > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > > + "can't vectorize early exit because the "
> > > > + "target does not support boolean vector "
> > > > + "comparisons for type %T.\n",
> > > > truth_type);
> > > > + return false;
> > > > + }
> > > > +
> > > > + if (ncopies > 1
> > > > + && direct_optab_handler (ior_optab, mode) == CODE_FOR_nothing)
> > > > + {
> > > > + if (dump_enabled_p ())
> > > > + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > > + "can't vectorize early exit because the "
> > > > + "target does not support boolean vector
> > > > OR for "
> > > > + "type %T.\n", truth_type);
> > > > + return false;
> > > > + }
> > > > +
> > > > + if (!vectorizable_comparison_1 (vinfo, truth_type, stmt_info, code, gsi,
> > > > + vec_stmt, slp_node, cost_vec))
> > > > + return false;
> >
> > I suppose vectorizable_comparison_1 will check this again, so the above
> > is redundant?
> >
>
> The IOR?  No, vectorizable_comparison_1 doesn't reduce, so it may not
> check it depending on the condition.
>
> > > > + /* Determine if we need to reduce the final value. */
> > > > + if (stmts.length () > 1)
> > > > + {
> > > > + /* We build the reductions in a way to maintain as much parallelism as
> > > > + possible. */
> > > > + auto_vec<tree> workset (stmts.length ());
> > > > + workset.splice (stmts);
> > > > + while (workset.length () > 1)
> > > > + {
> > > > + new_temp = make_temp_ssa_name (truth_type, NULL, "vexit_reduc");
> > > > + tree arg0 = workset.pop ();
> > > > + tree arg1 = workset.pop ();
> > > > + new_stmt = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0, arg1);
> > > > + vect_finish_stmt_generation (loop_vinfo, stmt_info, new_stmt,
> > > > + &cond_gsi);
> > > > + if (slp_node)
> > > > + slp_node->push_vec_def (new_stmt);
> > > > + else
> > > > + STMT_VINFO_VEC_STMTS (stmt_info).safe_push (new_stmt);
> > > > + workset.quick_insert (0, new_temp);
> >
> > Reduction epilogue handling has similar code to reduce a set of vectors
> > to a single one with an operation. I think we want to share that code.
> >
>
> I've taken a look but that code isn't suitable here since they have different
> constraints.  I don't require an in-order reduction since for the comparison
> all we care about is whether any bit in a lane is set or not.  This means:
>
> 1. we can reduce using a fast operation like IOR.
> 2. we can reduce with as much parallelism as possible.
>
> The comparison is on the critical path for the loop now, unlike live
> reductions which are always at the end, so using the live reduction code
> resulted in a slowdown since it creates a longer dependency chain.
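>
> E.g. with four vector masks m1..m4 the workset loop emits roughly
>
>   vexit_reduc_1 = m4 | m3;
>   vexit_reduc_2 = m2 | m1;
>   vexit_reduc_3 = vexit_reduc_1 | vexit_reduc_2;
>
> i.e. a dependency chain of ceil(log2 (n)) IORs rather than the n-1 IORs
> of an in-order chain.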
OK.
> > > > + }
> > > > + }
> > > > + else
> > > > + new_temp = stmts[0];
> > > > +
> > > > + gcc_assert (new_temp);
> > > > +
> > > > + tree cond = new_temp;
> > > > + if (masked_loop_p)
> > > > + {
> > > > + tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies,
> > > > + truth_type, 0);
> > > > + cond = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond,
> > > > + &cond_gsi);
> >
> > I don't think this is correct when 'stmts' had more than one vector?
> >
>
> It is, because even when VLA, partial vectors are disabled since we only
> support counted loops.  And it looks like --param vect-partial-vector-usage=1
> cannot force it on.
--param vect-partial-vector-usage=2 would, no?
> In principle I suppose I could mask the individual stmts; that should handle
> the future case when this is relaxed to support non-fixed length buffers?
Well, it looks wrong - either put in an assert that we start with a
single stmt or assert !masked_loop_p instead? Better ICE than
generate wrong code.
That said, I think you need to apply the masking on the original
stmts[], before reducing them, no?
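
Something like this completely untested sketch, masking each vector
before the IOR reduction:

  /* Mask each vector comparison result with its loop mask so that
     inactive lanes cannot trigger the early exit.  */
  for (unsigned i = 0; i < stmts.length (); i++)
    {
      tree stmt_mask = vect_get_loop_mask (loop_vinfo, gsi, masks,
                                           stmts.length (), vectype, i);
      stmts[i] = prepare_vec_mask (loop_vinfo, TREE_TYPE (stmt_mask),
                                   stmt_mask, stmts[i], &cond_gsi);
    }
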
Thanks,
Richard.
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-vect-patterns.cc (vect_init_pattern_stmt): Support gconds.
> (check_bool_pattern, adjust_bool_pattern, adjust_bool_stmts,
> vect_recog_bool_pattern): Support gconds type analysis.
> * tree-vect-stmts.cc (vectorizable_comparison_1): Support stmts without
> lhs.
> (vectorizable_early_exit): New.
> (vect_analyze_stmt, vect_transform_stmt): Use it.
> (vect_is_simple_use, vect_get_vector_types_for_stmt): Support gcond.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 7debe7f0731673cd1bf25cd39d55e23990a73d0e..c6cedf4fe7c1f1e1126ce166a059a4b2a2b49cbd 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -132,6 +132,7 @@ vect_init_pattern_stmt (vec_info *vinfo, gimple *pattern_stmt,
> if (!STMT_VINFO_VECTYPE (pattern_stmt_info))
> {
> gcc_assert (!vectype
> + || is_a <gcond *> (pattern_stmt)
> || (VECTOR_BOOLEAN_TYPE_P (vectype)
> == vect_use_mask_type_p (orig_stmt_info)));
> STMT_VINFO_VECTYPE (pattern_stmt_info) = vectype;
> @@ -5210,19 +5211,27 @@ vect_recog_mixed_size_cond_pattern (vec_info *vinfo,
> true if bool VAR can and should be optimized that way. Assume it shouldn't
> in case it's a result of a comparison which can be directly vectorized into
> a vector comparison. Fills in STMTS with all stmts visited during the
> - walk. */
> + walk. If COND then a gcond is being inspected instead of a normal COND. */
>
> static bool
> -check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts)
> +check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts,
> + gcond *cond)
> {
> tree rhs1;
> enum tree_code rhs_code;
> + gassign *def_stmt = NULL;
>
> stmt_vec_info def_stmt_info = vect_get_internal_def (vinfo, var);
> - if (!def_stmt_info)
> + if (!def_stmt_info && !cond)
> return false;
> + else if (!def_stmt_info)
> + /* If we're a gcond we won't be codegen-ing the statements and are only
> + interested in whether the types match. In that case we can accept loop
> + invariant values. */
> + def_stmt = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (var));
> + else
> + def_stmt = dyn_cast <gassign *> (def_stmt_info->stmt);
>
> - gassign *def_stmt = dyn_cast <gassign *> (def_stmt_info->stmt);
> if (!def_stmt)
> return false;
>
> @@ -5234,27 +5243,28 @@ check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts)
> switch (rhs_code)
> {
> case SSA_NAME:
> - if (! check_bool_pattern (rhs1, vinfo, stmts))
> + if (! check_bool_pattern (rhs1, vinfo, stmts, cond))
> return false;
> break;
>
> CASE_CONVERT:
> if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (rhs1)))
> return false;
> - if (! check_bool_pattern (rhs1, vinfo, stmts))
> + if (! check_bool_pattern (rhs1, vinfo, stmts, cond))
> return false;
> break;
>
> case BIT_NOT_EXPR:
> - if (! check_bool_pattern (rhs1, vinfo, stmts))
> + if (! check_bool_pattern (rhs1, vinfo, stmts, cond))
> return false;
> break;
>
> case BIT_AND_EXPR:
> case BIT_IOR_EXPR:
> case BIT_XOR_EXPR:
> - if (! check_bool_pattern (rhs1, vinfo, stmts)
> - || ! check_bool_pattern (gimple_assign_rhs2 (def_stmt), vinfo, stmts))
> + if (! check_bool_pattern (rhs1, vinfo, stmts, cond)
> + || ! check_bool_pattern (gimple_assign_rhs2 (def_stmt), vinfo, stmts,
> + cond))
> return false;
> break;
>
> @@ -5275,6 +5285,7 @@ check_bool_pattern (tree var, vec_info *vinfo, hash_set<gimple *> &stmts)
> tree mask_type = get_mask_type_for_scalar_type (vinfo,
> TREE_TYPE (rhs1));
> if (mask_type
> + && !cond
> && expand_vec_cmp_expr_p (comp_vectype, mask_type, rhs_code))
> return false;
>
> @@ -5324,11 +5335,13 @@ adjust_bool_pattern_cast (vec_info *vinfo,
> VAR is an SSA_NAME that should be transformed from bool to a wider integer
> type, OUT_TYPE is the desired final integer type of the whole pattern.
> STMT_INFO is the info of the pattern root and is where pattern stmts should
> - be associated with. DEFS is a map of pattern defs. */
> + be associated with. DEFS is a map of pattern defs. If TYPE_ONLY then don't
> + create new pattern statements and instead only fill LAST_STMT and DEFS. */
>
> static void
> adjust_bool_pattern (vec_info *vinfo, tree var, tree out_type,
> - stmt_vec_info stmt_info, hash_map <tree, tree> &defs)
> + stmt_vec_info stmt_info, hash_map <tree, tree> &defs,
> + gimple *&last_stmt, bool type_only)
> {
> gimple *stmt = SSA_NAME_DEF_STMT (var);
> enum tree_code rhs_code, def_rhs_code;
> @@ -5492,8 +5505,10 @@ adjust_bool_pattern (vec_info *vinfo, tree var, tree out_type,
> }
>
> gimple_set_location (pattern_stmt, loc);
> - append_pattern_def_seq (vinfo, stmt_info, pattern_stmt,
> - get_vectype_for_scalar_type (vinfo, itype));
> + if (!type_only)
> + append_pattern_def_seq (vinfo, stmt_info, pattern_stmt,
> + get_vectype_for_scalar_type (vinfo, itype));
> + last_stmt = pattern_stmt;
> defs.put (var, gimple_assign_lhs (pattern_stmt));
> }
>
> @@ -5509,11 +5524,14 @@ sort_after_uid (const void *p1, const void *p2)
>
> /* Create pattern stmts for all stmts participating in the bool pattern
> specified by BOOL_STMT_SET and its root STMT_INFO with the desired type
> - OUT_TYPE. Return the def of the pattern root. */
> + OUT_TYPE. Return the def of the pattern root. If TYPE_ONLY the new
> + statements are not emitted as pattern statements and the tree returned is
> + only useful for type queries. */
>
> static tree
> adjust_bool_stmts (vec_info *vinfo, hash_set <gimple *> &bool_stmt_set,
> - tree out_type, stmt_vec_info stmt_info)
> + tree out_type, stmt_vec_info stmt_info,
> + bool type_only = false)
> {
> /* Gather original stmts in the bool pattern in their order of appearance
> in the IL. */
> @@ -5523,16 +5541,16 @@ adjust_bool_stmts (vec_info *vinfo, hash_set <gimple *> &bool_stmt_set,
> bool_stmts.quick_push (*i);
> bool_stmts.qsort (sort_after_uid);
>
> + gimple *last_stmt = NULL;
> +
> /* Now process them in that order, producing pattern stmts. */
> hash_map <tree, tree> defs;
> for (unsigned i = 0; i < bool_stmts.length (); ++i)
> adjust_bool_pattern (vinfo, gimple_assign_lhs (bool_stmts[i]),
> - out_type, stmt_info, defs);
> + out_type, stmt_info, defs, last_stmt, type_only);
>
> /* Pop the last pattern seq stmt and install it as pattern root for STMT. */
> - gimple *pattern_stmt
> - = gimple_seq_last_stmt (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
> - return gimple_assign_lhs (pattern_stmt);
> + return gimple_assign_lhs (last_stmt);
> }
>
> /* Return the proper type for converting bool VAR into
> @@ -5608,13 +5626,22 @@ vect_recog_bool_pattern (vec_info *vinfo,
> enum tree_code rhs_code;
> tree var, lhs, rhs, vectype;
> gimple *pattern_stmt;
> -
> - if (!is_gimple_assign (last_stmt))
> + gcond* cond = NULL;
> + if (!is_gimple_assign (last_stmt)
> + && !(cond = dyn_cast <gcond *> (last_stmt)))
> return NULL;
>
> - var = gimple_assign_rhs1 (last_stmt);
> - lhs = gimple_assign_lhs (last_stmt);
> - rhs_code = gimple_assign_rhs_code (last_stmt);
> + if (is_gimple_assign (last_stmt))
> + {
> + var = gimple_assign_rhs1 (last_stmt);
> + lhs = gimple_assign_lhs (last_stmt);
> + rhs_code = gimple_assign_rhs_code (last_stmt);
> + }
> + else
> + {
> + lhs = var = gimple_cond_lhs (last_stmt);
> + rhs_code = gimple_cond_code (last_stmt);
> + }
>
> if (rhs_code == VIEW_CONVERT_EXPR)
> var = TREE_OPERAND (var, 0);
> @@ -5632,7 +5659,7 @@ vect_recog_bool_pattern (vec_info *vinfo,
> return NULL;
> vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (lhs));
>
> - if (check_bool_pattern (var, vinfo, bool_stmts))
> + if (check_bool_pattern (var, vinfo, bool_stmts, cond))
> {
> rhs = adjust_bool_stmts (vinfo, bool_stmts,
> TREE_TYPE (lhs), stmt_vinfo);
> @@ -5680,7 +5707,7 @@ vect_recog_bool_pattern (vec_info *vinfo,
>
> return pattern_stmt;
> }
> - else if (rhs_code == COND_EXPR
> + else if ((rhs_code == COND_EXPR || cond)
> && TREE_CODE (var) == SSA_NAME)
> {
> vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (lhs));
> @@ -5700,18 +5727,31 @@ vect_recog_bool_pattern (vec_info *vinfo,
> if (get_vectype_for_scalar_type (vinfo, type) == NULL_TREE)
> return NULL;
>
> - if (check_bool_pattern (var, vinfo, bool_stmts))
> + if (check_bool_pattern (var, vinfo, bool_stmts, cond))
> var = adjust_bool_stmts (vinfo, bool_stmts, type, stmt_vinfo);
> else if (integer_type_for_mask (var, vinfo))
> return NULL;
>
> - lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
> - pattern_stmt
> - = gimple_build_assign (lhs, COND_EXPR,
> - build2 (NE_EXPR, boolean_type_node,
> - var, build_int_cst (TREE_TYPE (var), 0)),
> - gimple_assign_rhs2 (last_stmt),
> - gimple_assign_rhs3 (last_stmt));
> + if (!cond)
> + {
> + lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
> + pattern_stmt
> + = gimple_build_assign (lhs, COND_EXPR,
> + build2 (NE_EXPR, boolean_type_node, var,
> + build_int_cst (TREE_TYPE (var), 0)),
> + gimple_assign_rhs2 (last_stmt),
> + gimple_assign_rhs3 (last_stmt));
> + }
> + else
> + {
> + pattern_stmt
> + = gimple_build_cond (gimple_cond_code (cond), gimple_cond_lhs (cond),
> + gimple_cond_rhs (cond),
> + gimple_cond_true_label (cond),
> + gimple_cond_false_label (cond));
> + vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (var));
> + vectype = truth_type_for (vectype);
> + }
> *type_out = vectype;
> vect_pattern_detected ("vect_recog_bool_pattern", last_stmt);
>
> @@ -5725,7 +5765,7 @@ vect_recog_bool_pattern (vec_info *vinfo,
> if (!vectype || !VECTOR_MODE_P (TYPE_MODE (vectype)))
> return NULL;
>
> - if (check_bool_pattern (var, vinfo, bool_stmts))
> + if (check_bool_pattern (var, vinfo, bool_stmts, cond))
> rhs = adjust_bool_stmts (vinfo, bool_stmts,
> TREE_TYPE (vectype), stmt_vinfo);
> else
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 582c5e678fad802d6e76300fe3c939b9f2978f17..d801b72a149ebe6aa4d1f2942324b042d07be530 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -12489,7 +12489,7 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
> vec<tree> vec_oprnds0 = vNULL;
> vec<tree> vec_oprnds1 = vNULL;
> tree mask_type;
> - tree mask;
> + tree mask = NULL_TREE;
>
> if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
> return false;
> @@ -12629,8 +12629,9 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
> /* Transform. */
>
> /* Handle def. */
> - lhs = gimple_assign_lhs (STMT_VINFO_STMT (stmt_info));
> - mask = vect_create_destination_var (lhs, mask_type);
> + lhs = gimple_get_lhs (STMT_VINFO_STMT (stmt_info));
> + if (lhs)
> + mask = vect_create_destination_var (lhs, mask_type);
>
> vect_get_vec_defs (vinfo, stmt_info, slp_node, ncopies,
> rhs1, &vec_oprnds0, vectype,
> @@ -12644,7 +12645,10 @@ vectorizable_comparison_1 (vec_info *vinfo, tree vectype,
> gimple *new_stmt;
> vec_rhs2 = vec_oprnds1[i];
>
> - new_temp = make_ssa_name (mask);
> + if (lhs)
> + new_temp = make_ssa_name (mask);
> + else
> + new_temp = make_temp_ssa_name (mask_type, NULL, "cmp");
> if (bitop1 == NOP_EXPR)
> {
> new_stmt = gimple_build_assign (new_temp, code,
> @@ -12723,6 +12727,176 @@ vectorizable_comparison (vec_info *vinfo,
> return true;
> }
>
> +/* Check to see if the current early break given in STMT_INFO is valid for
> + vectorization. */
> +
> +static bool
> +vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
> + gimple_stmt_iterator *gsi, gimple **vec_stmt,
> + slp_tree slp_node, stmt_vector_for_cost *cost_vec)
> +{
> + loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
> + if (!loop_vinfo
> + || !is_a <gcond *> (STMT_VINFO_STMT (stmt_info)))
> + return false;
> +
> + if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_condition_def)
> + return false;
> +
> + if (!STMT_VINFO_RELEVANT_P (stmt_info))
> + return false;
> +
> + auto code = gimple_cond_code (STMT_VINFO_STMT (stmt_info));
> + tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> + gcc_assert (vectype);
> +
> + tree vectype_op0 = NULL_TREE;
> + slp_tree slp_op0;
> + tree op0;
> + enum vect_def_type dt0;
> + if (!vect_is_simple_use (vinfo, stmt_info, slp_node, 0, &op0, &slp_op0, &dt0,
> + &vectype_op0))
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "use not simple.\n");
> + return false;
> + }
> +
> + machine_mode mode = TYPE_MODE (vectype);
> + int ncopies;
> +
> + if (slp_node)
> + ncopies = 1;
> + else
> + ncopies = vect_get_num_copies (loop_vinfo, vectype);
> +
> + vec_loop_masks *masks = &LOOP_VINFO_MASKS (loop_vinfo);
> + bool masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
> +
> + /* Analyze only. */
> + if (!vec_stmt)
> + {
> + if (direct_optab_handler (cbranch_optab, mode) == CODE_FOR_nothing)
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "can't vectorize early exit because the "
> + "target doesn't support flag setting vector "
> + "comparisons.\n");
> + return false;
> + }
> +
> + if (ncopies > 1
> + && direct_optab_handler (ior_optab, mode) == CODE_FOR_nothing)
> + {
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> + "can't vectorize early exit because the "
> + "target does not support boolean vector OR for "
> + "type %T.\n", vectype);
> + return false;
> + }
> +
> + if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
> + vec_stmt, slp_node, cost_vec))
> + return false;
> +
> + if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
> + {
> + if (direct_internal_fn_supported_p (IFN_VCOND_MASK_LEN, vectype,
> + OPTIMIZE_FOR_SPEED))
> + return false;
> + else
> + vect_record_loop_mask (loop_vinfo, masks, ncopies, vectype, NULL);
> + }
> +
> +
> + return true;
> + }
> +
> + /* Transform. */
> +
> + tree new_temp = NULL_TREE;
> + gimple *new_stmt = NULL;
> +
> + if (dump_enabled_p ())
> + dump_printf_loc (MSG_NOTE, vect_location, "transform early-exit.\n");
> +
> + if (!vectorizable_comparison_1 (vinfo, vectype, stmt_info, code, gsi,
> + vec_stmt, slp_node, cost_vec))
> + gcc_unreachable ();
> +
> + gimple *stmt = STMT_VINFO_STMT (stmt_info);
> + basic_block cond_bb = gimple_bb (stmt);
> + gimple_stmt_iterator cond_gsi = gsi_last_bb (cond_bb);
> +
> + auto_vec<tree> stmts;
> +
> + if (slp_node)
> + stmts.safe_splice (SLP_TREE_VEC_DEFS (slp_node));
> + else
> + {
> + auto vec_stmts = STMT_VINFO_VEC_STMTS (stmt_info);
> + stmts.reserve_exact (vec_stmts.length ());
> + for (auto stmt : vec_stmts)
> + stmts.quick_push (gimple_assign_lhs (stmt));
> + }
> +
> + /* Determine if we need to reduce the final value. */
> + if (stmts.length () > 1)
> + {
> + /* We build the reductions in a way to maintain as much parallelism as
> + possible. */
> + auto_vec<tree> workset (stmts.length ());
> + workset.splice (stmts);
> + while (workset.length () > 1)
> + {
> + new_temp = make_temp_ssa_name (vectype, NULL, "vexit_reduc");
> + tree arg0 = workset.pop ();
> + tree arg1 = workset.pop ();
> + new_stmt = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0, arg1);
> + vect_finish_stmt_generation (loop_vinfo, stmt_info, new_stmt,
> + &cond_gsi);
> + workset.quick_insert (0, new_temp);
> + }
> + }
> + else
> + new_temp = stmts[0];
> +
> + gcc_assert (new_temp);
> +
> + tree cond = new_temp;
> + /* If we have multiple statements after reduction we should check all the
> + lanes and treat it as a full vector. */
> + if (masked_loop_p)
> + {
> + tree mask = vect_get_loop_mask (loop_vinfo, gsi, masks, ncopies,
> + vectype, 0);
> + cond = prepare_vec_mask (loop_vinfo, TREE_TYPE (mask), mask, cond,
> + &cond_gsi);
> + }
> +
> + /* Now build the new conditional. Pattern gimple_conds get dropped during
> + codegen so we must replace the original insn. */
> + stmt = STMT_VINFO_STMT (vect_orig_stmt (stmt_info));
> + gcond *cond_stmt = as_a <gcond *>(stmt);
> + gimple_cond_set_condition (cond_stmt, NE_EXPR, cond,
> + build_zero_cst (vectype));
> + update_stmt (stmt);
> +
> + if (slp_node)
> + SLP_TREE_VEC_DEFS (slp_node).truncate (0);
> + else
> + STMT_VINFO_VEC_STMTS (stmt_info).truncate (0);
> +
> +
> + if (!slp_node)
> + *vec_stmt = stmt;
> +
> + return true;
> +}
> +
> /* If SLP_NODE is nonnull, return true if vectorizable_live_operation
> can handle all live statements in the node. Otherwise return true
> if STMT_INFO is not live or if vectorizable_live_operation can handle it.
> @@ -12949,7 +13123,9 @@ vect_analyze_stmt (vec_info *vinfo,
> || vectorizable_lc_phi (as_a <loop_vec_info> (vinfo),
> stmt_info, NULL, node)
> || vectorizable_recurr (as_a <loop_vec_info> (vinfo),
> - stmt_info, NULL, node, cost_vec));
> + stmt_info, NULL, node, cost_vec)
> + || vectorizable_early_exit (vinfo, stmt_info, NULL, NULL, node,
> + cost_vec));
> else
> {
> if (bb_vinfo)
> @@ -12972,7 +13148,10 @@ vect_analyze_stmt (vec_info *vinfo,
> NULL, NULL, node, cost_vec)
> || vectorizable_comparison (vinfo, stmt_info, NULL, NULL, node,
> cost_vec)
> - || vectorizable_phi (vinfo, stmt_info, NULL, node, cost_vec));
> + || vectorizable_phi (vinfo, stmt_info, NULL, node, cost_vec)
> + || vectorizable_early_exit (vinfo, stmt_info, NULL, NULL, node,
> + cost_vec));
> +
> }
>
> if (node)
> @@ -13131,6 +13310,12 @@ vect_transform_stmt (vec_info *vinfo,
> gcc_assert (done);
> break;
>
> + case loop_exit_ctrl_vec_info_type:
> + done = vectorizable_early_exit (vinfo, stmt_info, gsi, &vec_stmt,
> + slp_node, NULL);
> + gcc_assert (done);
> + break;
> +
> default:
> if (!STMT_VINFO_LIVE_P (stmt_info))
> {
> @@ -14321,10 +14506,19 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
> }
> else
> {
> + gcond *cond = NULL;
> if (data_reference *dr = STMT_VINFO_DATA_REF (stmt_info))
> scalar_type = TREE_TYPE (DR_REF (dr));
> else if (gimple_call_internal_p (stmt, IFN_MASK_STORE))
> scalar_type = TREE_TYPE (gimple_call_arg (stmt, 3));
> + else if ((cond = dyn_cast <gcond *> (stmt)))
> + {
> + /* We can't convert the scalar type to boolean yet, since booleans have a
> + single bit precision and we need the vector boolean to be a
> + representation of the integer mask. So set the correct integer type and
> + convert to boolean vector once we have a vectype. */
> + scalar_type = TREE_TYPE (gimple_cond_lhs (cond));
> + }
> else
> scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
>
> @@ -14339,12 +14533,18 @@ vect_get_vector_types_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info,
> "get vectype for scalar type: %T\n", scalar_type);
> }
> vectype = get_vectype_for_scalar_type (vinfo, scalar_type, group_size);
> +
> if (!vectype)
> return opt_result::failure_at (stmt,
> "not vectorized:"
> " unsupported data-type %T\n",
> scalar_type);
>
> + /* If we were a gcond, convert the resulting type to a vector boolean type
> + now that we have the correct integer mask type. */
> + if (cond)
> + vectype = truth_type_for (vectype);
> +
> if (dump_enabled_p ())
> dump_printf_loc (MSG_NOTE, vect_location, "vectype: %T\n", vectype);
> }
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)