On Mon, 13 Nov 2023, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: Tuesday, November 7, 2023 3:04 PM
> > To: Tamar Christina <[email protected]>
> > Cc: [email protected]; nd <[email protected]>; [email protected]
> > Subject: Re: [PATCH 5/21]middle-end: update vectorizer's control update to
> > support picking an exit other than loop latch
> >
> > On Mon, 6 Nov 2023, Tamar Christina wrote:
> >
> > > Hi All,
> > >
> > > As requested, the vectorizer is now free to pick it's own exit which
> > > can be different than what the loop CFG infrastucture uses. The
> > > vectorizer makes use of this to vectorize loops that it previously could
> > > not.
> > >
> > > But this means that loop control must be materialized in the block
> > > that needs it less we corrupt the SSA chain. This makes it so we use
> > > the vectorizer's main IV block instead of the loop infra.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > > * tree-ssa-loop-manip.cc (standard_iv_increment_position):
> > Conditionally
> > > take dest BB.
> > > * tree-ssa-loop-manip.h (standard_iv_increment_position): Likewise.
> > > * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Use it.
> > > (vect_set_loop_condition_partial_vectors_avx512): Likewise.
> > > (vect_set_loop_condition_normal): Likewise.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h
> > > index
> > >
> > bda09f51d5619420331c513a9906831c779fd2b4..5938588c8882d842b00
> > 301423df1
> > > 11cbe7bf7ba8 100644
> > > --- a/gcc/tree-ssa-loop-manip.h
> > > +++ b/gcc/tree-ssa-loop-manip.h
> > > @@ -38,7 +38,8 @@ extern basic_block split_loop_exit_edge (edge, bool
> > > = false); extern basic_block ip_end_pos (class loop *); extern
> > > basic_block ip_normal_pos (class loop *); extern void
> > > standard_iv_increment_position (class loop *,
> > > - gimple_stmt_iterator *, bool *);
> > > + gimple_stmt_iterator *, bool *,
> > > + basic_block = NULL);
> > > extern bool
> > > gimple_duplicate_loop_body_to_header_edge (class loop *, edge, unsigned
> > int,
> > > sbitmap, edge, vec<edge> *, int);
> > > diff
> > --git
> > > a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc index
> > >
> > e7436915e01297e7af2a3bcf1afd01e014de6f32..bdc7a3d74a788f450ca5d
> > de6c294
> > > 92ce4d4e4550 100644
> > > --- a/gcc/tree-ssa-loop-manip.cc
> > > +++ b/gcc/tree-ssa-loop-manip.cc
> > > @@ -792,14 +792,19 @@ ip_normal_pos (class loop *loop)
> > >
> > > /* Stores the standard position for induction variable increment in LOOP
> > > (just before the exit condition if it is available and latch block is
> > > empty,
> > > - end of the latch block otherwise) to BSI. INSERT_AFTER is set to
> > > true if
> > > - the increment should be inserted after *BSI. */
> > > + end of the latch block otherwise) to BSI. If DEST_BB is specified
> > > then that
> > > + basic block is used as the destination instead of the loop latch
> > > source
> > > + block. INSERT_AFTER is set to true if the increment should be
> > > inserted
> > after
> > > + *BSI. */
> > >
> > > void
> > > standard_iv_increment_position (class loop *loop, gimple_stmt_iterator
> > *bsi,
> > > - bool *insert_after)
> > > + bool *insert_after, basic_block dest_bb)
> > > {
> > > - basic_block bb = ip_normal_pos (loop), latch = ip_end_pos (loop);
> > > + basic_block bb = dest_bb;
> > > + if (!bb)
> > > + bb = ip_normal_pos (loop);
> > > + basic_block latch = ip_end_pos (loop);
> >
> > I don't think that's a good API extension. Given that we don't support an
> > early
> > exit after the main IV exit doesn't this code already work fine as-is? It
> > chooses
> > the last exit. The position is also not semantically relevant, we just try
> > to keep
> > the latch empty here (that is, it's a bit of a "bad" API).
> >
> > So, do you really need this change?
>
> Yes I do, If you look at these kinds of loops
> https://gist.github.com/Mistuke/66f14fe5c1be32b91ce149bd9b8bb35f
>
> You'll see that the main exit, i.e. the one attached to the latch block is
> the early break.
> Because SCEV can't analyze it picks the main exit to be the one in BB4.
>
> This means that the loop control must be placed in BB4. If we place ivtmp_10
> = ivtmp_9 - 1
> In BB 3 then that's broken SSA. If we use `ivtmp_9` in BB4 then we'll have
> an off by one issue.
OK, but then I think the fix is to not use standard_iv_increment_position
(it's a weird API anyway). Instead insert before the main exit condition.
Btw, I assumed this order of main / early exit cannot happen. But I
didn't re-review the main exit identification code yet.
Richard.
> You could have reached the end of the valid range for the loop when you
> re-enter BB4, since
> loads are still allowed you'll then read out of bounds before checking that
> you exit.
>
> This is also annoyingly hard to get correct, which Is what took me a long
> time. Such loops mean
> You need to restart the scalar loop at i_7 if you take the main exit.
>
> Regards,
> Tamar
>
> >
> > Maybe we're really using standard_iv_increment_position wrong here, the
> > result is supposed to _only_ feed the PHI latch argument.
> > Richard.
> >
> > > gimple *last = last_nondebug_stmt (latch);
> > >
> > > if (!bb
> > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > index
> > >
> > 6fbb5b80986fd657814b48eb009b52b094f331e6..3d59119787d6afdc5a64
> > 65a547d1
> > > ea2d3d940373 100644
> > > --- a/gcc/tree-vect-loop-manip.cc
> > > +++ b/gcc/tree-vect-loop-manip.cc
> > > @@ -531,7 +531,8 @@ vect_set_loop_controls_directly (class loop *loop,
> > loop_vec_info loop_vinfo,
> > > tree index_before_incr, index_after_incr;
> > > gimple_stmt_iterator incr_gsi;
> > > bool insert_after;
> > > - standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > > + edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > + standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > > + exit_e->src);
> > > if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> > > {
> > > /* Create an IV that counts down from niters_total and whose
> > > step @@ -1017,7 +1018,8 @@
> > vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
> > > tree index_before_incr, index_after_incr;
> > > gimple_stmt_iterator incr_gsi;
> > > bool insert_after;
> > > - standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > > + standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > > + exit_edge->src);
> > > create_iv (niters_adj, MINUS_EXPR, iv_step, NULL_TREE, loop,
> > > &incr_gsi, insert_after, &index_before_incr,
> > > &index_after_incr);
> > > @@ -1185,7 +1187,7 @@ vect_set_loop_condition_partial_vectors_avx512
> > (class loop *loop,
> > > loop handles exactly VF scalars per iteration. */
> > >
> > > static gcond *
> > > -vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge
> > > exit_edge,
> > > +vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge
> > > +exit_edge,
> > > class loop *loop, tree niters, tree step,
> > > tree final_iv, bool niters_maybe_zero,
> > > gimple_stmt_iterator loop_cond_gsi) @@ -
> > 1278,7 +1280,8 @@
> > > vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge
> > exit_edge,
> > > }
> > > }
> > >
> > > - standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > > + standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > > + exit_edge->src);
> > > create_iv (init, PLUS_EXPR, step, NULL_TREE, loop,
> > > &incr_gsi, insert_after, &indx_before_incr,
> > > &indx_after_incr);
> > > indx_after_incr = force_gimple_operand_gsi (&loop_cond_gsi,
> > > indx_after_incr,
> > >
> > >
> > >
> > >
> > >
> >
> > --
> > Richard Biener <[email protected]>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > Nuernberg)
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)