On Mon, 6 Nov 2023, Tamar Christina wrote:
> Hi All,
>
> As requested, the vectorizer is now free to pick its own exit, which can be
> different from what the loop CFG infrastructure uses.  The vectorizer makes use
> of this to vectorize loops that it previously could not.
>
> But this means that loop control must be materialized in the block that needs
> it, lest we corrupt the SSA chain.  This makes it so we use the vectorizer's
> main IV block instead of the loop infrastructure's.
>
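> Concretely, each affected call site moves from the default
>
>   standard_iv_increment_position (loop, &incr_gsi, &insert_after);
>
> to something along the lines of
>
>   edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
>   standard_iv_increment_position (loop, &incr_gsi, &insert_after, exit_e->src);
>
> as the hunks below show.
>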
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
> * tree-ssa-loop-manip.cc (standard_iv_increment_position): Conditionally
> take dest BB.
> * tree-ssa-loop-manip.h (standard_iv_increment_position): Likewise.
> * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Use it.
> (vect_set_loop_condition_partial_vectors_avx512): Likewise.
> (vect_set_loop_condition_normal): Likewise.
>
> --- inline copy of patch --
> diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h
> index bda09f51d5619420331c513a9906831c779fd2b4..5938588c8882d842b00301423df111cbe7bf7ba8 100644
> --- a/gcc/tree-ssa-loop-manip.h
> +++ b/gcc/tree-ssa-loop-manip.h
> @@ -38,7 +38,8 @@ extern basic_block split_loop_exit_edge (edge, bool = false);
> extern basic_block ip_end_pos (class loop *);
> extern basic_block ip_normal_pos (class loop *);
> extern void standard_iv_increment_position (class loop *,
> - gimple_stmt_iterator *, bool *);
> + gimple_stmt_iterator *, bool *,
> + basic_block = NULL);
> extern bool
> gimple_duplicate_loop_body_to_header_edge (class loop *, edge, unsigned int,
> sbitmap, edge, vec<edge> *, int);
> diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
> index e7436915e01297e7af2a3bcf1afd01e014de6f32..bdc7a3d74a788f450ca5dde6c29492ce4d4e4550 100644
> --- a/gcc/tree-ssa-loop-manip.cc
> +++ b/gcc/tree-ssa-loop-manip.cc
> @@ -792,14 +792,19 @@ ip_normal_pos (class loop *loop)
>
> /* Stores the standard position for induction variable increment in LOOP
> (just before the exit condition if it is available and latch block is empty,
> - end of the latch block otherwise) to BSI.  INSERT_AFTER is set to true if
> - the increment should be inserted after *BSI.  */
> + end of the latch block otherwise) to BSI.  If DEST_BB is specified then that
> + basic block is used as the destination instead of the loop latch source
> + block.  INSERT_AFTER is set to true if the increment should be inserted after
> + *BSI.  */
>
> void
> standard_iv_increment_position (class loop *loop, gimple_stmt_iterator *bsi,
> - bool *insert_after)
> + bool *insert_after, basic_block dest_bb)
> {
> - basic_block bb = ip_normal_pos (loop), latch = ip_end_pos (loop);
> + basic_block bb = dest_bb;
> + if (!bb)
> + bb = ip_normal_pos (loop);
> + basic_block latch = ip_end_pos (loop);
I don't think that's a good API extension.  Given that we don't support
an early exit after the main IV exit, doesn't this code already work
fine as-is?  It chooses the last exit.  The position is also not
semantically relevant; we just try to keep the latch empty here
(that is, it's a bit of a "bad" API).
So, do you really need this change?
Maybe we're really using standard_iv_increment_position wrong here;
the result is supposed to _only_ feed the PHI latch argument.
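(For reference, the pattern at these call sites is roughly:

   /* Pick where the IV increment will be inserted (kept out of the latch
      when possible).  */
   gimple_stmt_iterator incr_gsi;
   bool insert_after;
   standard_iv_increment_position (loop, &incr_gsi, &insert_after);
   /* create_iv emits the increment at incr_gsi and builds the header PHI:
      indx_before_incr is the PHI result, indx_after_incr is the incremented
      value that becomes the PHI's latch argument.  */
   create_iv (init, PLUS_EXPR, step, NULL_TREE, loop,
              &incr_gsi, insert_after, &indx_before_incr, &indx_after_incr);

names as in the hunks quoted in this mail; only a sketch of the existing
pattern for illustration.)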
Richard.
> gimple *last = last_nondebug_stmt (latch);
>
> if (!bb
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index 6fbb5b80986fd657814b48eb009b52b094f331e6..3d59119787d6afdc5a6465a547d1ea2d3d940373 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -531,7 +531,8 @@ vect_set_loop_controls_directly (class loop *loop, loop_vec_info loop_vinfo,
> tree index_before_incr, index_after_incr;
> gimple_stmt_iterator incr_gsi;
> bool insert_after;
> - standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> + edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> + standard_iv_increment_position (loop, &incr_gsi, &insert_after, exit_e->src);
> if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> {
> /* Create an IV that counts down from niters_total and whose step
> @@ -1017,7 +1018,8 @@ vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
> tree index_before_incr, index_after_incr;
> gimple_stmt_iterator incr_gsi;
> bool insert_after;
> - standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> + standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> + exit_edge->src);
> create_iv (niters_adj, MINUS_EXPR, iv_step, NULL_TREE, loop,
> &incr_gsi, insert_after, &index_before_incr,
> &index_after_incr);
> @@ -1185,7 +1187,7 @@ vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
> loop handles exactly VF scalars per iteration. */
>
> static gcond *
> -vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge exit_edge,
> +vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge exit_edge,
> class loop *loop, tree niters, tree step,
> tree final_iv, bool niters_maybe_zero,
> gimple_stmt_iterator loop_cond_gsi)
> @@ -1278,7 +1280,8 @@ vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge exit_edge,
> }
> }
>
> - standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> + standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> + exit_edge->src);
> create_iv (init, PLUS_EXPR, step, NULL_TREE, loop,
> &incr_gsi, insert_after, &indx_before_incr, &indx_after_incr);
> indx_after_incr = force_gimple_operand_gsi (&loop_cond_gsi, indx_after_incr,
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)