> > Ok, so I currently have the following solution.  Let me know if you
> > agree with it and I'll polish it up today and tomorrow and respin things.
> >
> > 1. During vect_update_ivs_after_vectorizer we no longer touch any PHIs
> >    aside from
> >    Just updating IVtemps with the expected remaining iteration count.
>
> OK
>
> > 2. During vect_transform_loop after vectorizing any induction or reduction I
> >    call vectorizable_live_operation
> >    For any phi node that still has any usages in the early exit merge
> >    block.
>
> OK, I suppose you need to amend the vectorizable_live_operation API to tell it
> it works for the early exits or the main exit (and not complain when
> !STMT_VINFO_LIVE_P for the early exit case).
>
> > 3. vectorizable_live_operation is taught to have to materialize the
> >    same PHI in multiple exits
>
> For the main exit you'd get here via STMT_VINFO_LIVE_P handling and
> vect_update_ivs_after_vectorizer would handle the rest.  For the early exits I
> think you only have to materialize once (in the merge block)?
>
> > 4. vectorizable_reduction or maybe vect_create_epilog_for_reduction need
> >    to be modified to for early exits materialize
> >    The previous iteration value.
>
> I think you need to only touch vect_create_epilog_for_reduction, the early
> exit merge block needs another reduction epilog.  Well, in theory just another
> vector to reduce but not sure if the control flow supports having the same
> actual epilog for both the main and the early exits.
>
> Richard.

Good morning,

Here's the much cleaner respun patch:

This changes the PHI node updates to support early breaks.  It has to support
both the case where the loop's exit matches the normal loop exit and one where
the early exit is "inverted", i.e. it's an early exit edge.

In the latter case we must always restart the loop for VF iterations.  For an
early exit the reason is obvious, but there are cases where the "normal" exit
is located before the early one.  This exit then does a check on ivtmp, resulting
in us leaving the loop since it thinks we're done.

In these cases we may still have side-effects to perform, so we also go to the
scalar loop.

For the "normal" exit niters has already been adjusted for peeling; for the
early exits we must find out how many iterations we actually did.  So we have
to recalculate the new position for each exit.

For the "inverse" case I know what to do, but I wanted to ask where you wanted
it.

For inverted cases like ./gcc/testsuite/gcc.dg/vect/vect-early-break_70.c the
requirement is that any PHI value aside from the IV needs to be the value of
the early exit, i.e. the value of the incomplete exit, as there's no iteration
that is "complete".

The IV should become:  niters - (((niters / vf) - 1) * vf)

So e.g. on a loop with niters = 17 and VF 4 it becomes
17 - (((17 / 4) - 1) * 4) = 5.

This addresses the odd +step you had commented on before.

To do these two I can either modify vect_update_ivs_after_vectorizer, or add a
smaller utility function that patches up this case if we want to keep
vect_update_ivs_after_vectorizer simple.

Which do you prefer?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-loop-manip.cc (vect_set_loop_condition_normal): Hide unused.
	(vect_update_ivs_after_vectorizer): Support early break.
	(vect_do_peeling): Use it.
	(vect_is_loop_exit_latch_pred): New.
	* tree-vectorizer.h (vect_is_loop_exit_latch_pred): New.
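
As a quick sanity check of the IV formula quoted above, here is a minimal
standalone C sketch.  It is not part of the patch: inverted_exit_iv is a
hypothetical helper name, and the sketch assumes unsigned integer division and
niters >= vf (so that at least one vector iteration exists and the subtraction
cannot wrap).

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function.  Evaluates
     niters - (((niters / vf) - 1) * vf)
   i.e. the iteration count left for the scalar loop when the last
   vector iteration has to be redone.  Assumes vf > 0 and niters >= vf.  */
static unsigned long
inverted_exit_iv (unsigned long niters, unsigned long vf)
{
  assert (vf > 0 && niters >= vf);
  return niters - (((niters / vf) - 1) * vf);
}
```

With niters = 17 and vf = 4 this gives 17 - ((4 - 1) * 4) = 5, matching the
number in the mail; when niters is an exact multiple of vf the result is vf
itself, i.e. exactly one vector iteration's worth of scalar work.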

--- inline copy of patch ---

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 5ab883fdeebf1917979fe44eb16356aaef637df7..5751aa6295ca052534cef1984a26c65994a57389 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1200,7 +1200,7 @@ vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
    loop handles exactly VF scalars per iteration.  */
 
 static gcond *
-vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge exit_edge,
+vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge exit_edge,
                                 class loop *loop, tree niters, tree step,
                                 tree final_iv, bool niters_maybe_zero,
                                 gimple_stmt_iterator loop_cond_gsi)
@@ -1407,6 +1407,17 @@ vect_set_loop_condition (class loop *loop, edge loop_e, loop_vec_info loop_vinfo
                           (gimple *) cond_stmt);
 }
 
+/* Determine if the exit choosen by the loop vectorizer differs from the
+   natural loop exit.  i.e. if the exit leads to the loop patch or not.
+   When this happens we need to flip the understanding of main and other
+   exits by peeling and IV updates.  */
+
+bool
+vect_is_loop_exit_latch_pred (edge loop_exit, class loop *loop)
+{
+  return single_pred (loop->latch) == loop_exit->src;
+}
+
 /* Given LOOP this function generates a new copy of it and puts it
    on E which is either the entry or exit of LOOP.  If SCALAR_LOOP is
    non-NULL, assume LOOP and SCALAR_LOOP are equivalent and copy the
@@ -2134,6 +2145,10 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
                   The phi args associated with the edge UPDATE_E in the bb
                   UPDATE_E->dest are updated accordingly.
 
+     - MULTIPLE_EXIT - Indicates whether the scalar loop needs to restart the
+                       iteration count where the vector loop began.
+     - EXIT_BB - The basic block to insert any new statement for UPDATE_E into.
+
      Assumption 1: Like the rest of the vectorizer, this function assumes
      a single loop exit that has a single predecessor.
 
@@ -2152,17 +2167,14 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
 
 static void
 vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
-                                  tree niters, edge update_e)
+                                  tree niters, edge update_e,
+                                  bool multiple_exit, basic_block exit_bb)
 {
   gphi_iterator gsi, gsi1;
   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block update_bb = update_e->dest;
-
-  basic_block exit_bb = LOOP_VINFO_IV_EXIT (loop_vinfo)->dest;
-
-  /* Make sure there exists a single-predecessor exit bb:  */
-  gcc_assert (single_pred_p (exit_bb));
-  gcc_assert (single_succ_edge (exit_bb) == update_e);
+  gcond *cond = get_loop_exit_condition (LOOP_VINFO_IV_EXIT (loop_vinfo));
+  gimple_stmt_iterator last_gsi = gsi_last_bb (exit_bb);
 
   for (gsi = gsi_start_phis (loop->header), gsi1 = gsi_start_phis (update_bb);
        !gsi_end_p (gsi) && !gsi_end_p (gsi1);
@@ -2172,7 +2184,6 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
       tree step_expr, off;
       tree type;
       tree var, ni, ni_name;
-      gimple_stmt_iterator last_gsi;
 
       gphi *phi = gsi.phi ();
       gphi *phi1 = gsi1.phi ();
@@ -2204,11 +2215,27 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
       enum vect_induction_op_type induction_type
         = STMT_VINFO_LOOP_PHI_EVOLUTION_TYPE (phi_info);
 
-      if (induction_type == vect_step_op_add)
+      tree iv_var = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop));
+      /* create_iv always places it on the LHS.  Alternatively we can set a
+         property during create_iv to identify it.  */
+      bool ivtemp = gimple_cond_lhs (cond) == iv_var;
+      if (multiple_exit && ivtemp)
+        {
+          type = TREE_TYPE (gimple_phi_result (phi));
+          ni = build_int_cst (type, LOOP_VINFO_VECT_FACTOR (loop_vinfo));
+        }
+      else if (induction_type == vect_step_op_add)
         {
+
           tree stype = TREE_TYPE (step_expr);
-          off = fold_build2 (MULT_EXPR, stype,
-                             fold_convert (stype, niters), step_expr);
+
+          /* Early exits always use last iter value not niters.  */
+          if (multiple_exit)
+            continue;
+          else
+            off = fold_build2 (MULT_EXPR, stype,
+                               fold_convert (stype, niters), step_expr);
+
           if (POINTER_TYPE_P (type))
             ni = fold_build_pointer_plus (init_expr, off);
           else
@@ -2227,9 +2254,9 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
 
       var = create_tmp_var (type, "tmp");
 
-      last_gsi = gsi_last_bb (exit_bb);
       gimple_seq new_stmts = NULL;
       ni_name = force_gimple_operand (ni, &new_stmts, false, var);
+
       /* Exit_bb shouldn't be empty.  */
       if (!gsi_end_p (last_gsi))
         {
@@ -3324,8 +3351,31 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
          niters_vector_mult_vf steps.  */
       gcc_checking_assert (vect_can_advance_ivs_p (loop_vinfo));
       update_e = skip_vector ? e : loop_preheader_edge (epilog);
+      edge alt_exit;
+      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
+        {
+          for (auto exit : get_loop_exit_edges (loop))
+            if (exit != LOOP_VINFO_IV_EXIT (loop_vinfo))
+              {
+                alt_exit = single_succ_edge (exit->dest);
+                break;
+              }
+          update_e = single_succ_edge (e->dest);
+        }
+      bool inversed_iv
+        = !vect_is_loop_exit_latch_pred (LOOP_VINFO_IV_EXIT (loop_vinfo),
+                                         LOOP_VINFO_LOOP (loop_vinfo));
+
+      /* Update the main exit first.  */
       vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
-                                        update_e);
+                                        update_e, inversed_iv,
+                                        LOOP_VINFO_IV_EXIT (loop_vinfo)->dest);
+
+      /* And then update the early exits, we only need to update the alt exit
+         merge edge, but have to find it first.  */
+      if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
+        vect_update_ivs_after_vectorizer (loop_vinfo, niters_vector_mult_vf,
+                                          alt_exit, true, alt_exit->src);
 
       if (skip_epilog)
         {
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 39aa4d1250efe308acccf484d370f8adfd1ba843..22a8c3d384d7ae1ca93079b64f2d40821b4a3c56 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2224,6 +2224,7 @@ extern dump_user_location_t find_loop_location (class loop *);
 extern bool vect_can_advance_ivs_p (loop_vec_info);
 extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code);
 extern edge vec_init_loop_exit_info (class loop *);
+extern bool vect_is_loop_exit_latch_pred (edge, class loop *);
 
 /* In tree-vect-stmts.cc.  */
 extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
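
To make the per-PHI decision in the vect_update_ivs_after_vectorizer hunk
easier to follow, here is a toy model in plain C.  It is only a sketch of my
reading of the patch: struct toy_iv and toy_iv_after_vector_loop are
hypothetical names, plain longs stand in for trees, and only the
vect_step_op_add case is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one add-style induction PHI; not a GCC type.  */
struct toy_iv
{
  long init;       /* value on loop entry */
  long step;       /* increment per scalar iteration */
  bool is_ivtemp;  /* true if this is the counter tested by the exit cond */
};

/* Returns true and stores in *OUT the value the PHI in the update block
   should receive, or returns false when the PHI is left untouched
   (the "continue" in the patch).  */
static bool
toy_iv_after_vector_loop (const struct toy_iv *iv, long niters, long vf,
                          bool multiple_exit, long *out)
{
  assert (iv && out);

  /* Early/inverted exit: the counter is reset so that the scalar loop
     redoes VF iterations.  */
  if (multiple_exit && iv->is_ivtemp)
    {
      *out = vf;
      return true;
    }

  /* Early/inverted exit, other inductions: keep the last-iteration value,
     so nothing is computed here.  */
  if (multiple_exit)
    return false;

  /* Normal exit: advance by the number of scalar iterations done.  */
  *out = iv->init + niters * iv->step;
  return true;
}
```

In this model the main exit is handled with multiple_exit set to the
"inverted" flag, and the early exit merge edge is always handled with
multiple_exit set to true, which mirrors the two calls in vect_do_peeling.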
rb17967 (1).patch