On Mon, 22 Sep 2025, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: 22 September 2025 12:34
> > To: Pengfei Li <[email protected]>
> > Cc: [email protected]; [email protected]; Tamar Christina
> > <[email protected]>
> > Subject: Re: [PATCH] vect: Support early break with gswitch statements
> >
> > On Thu, Sep 18, 2025 at 2:34 PM Pengfei Li <[email protected]> wrote:
> > >
> > > This patch adds vectorization support to early-break loops with gswitch
> > > statements. Such gswitches may come from original switch-case constructs
> > > in the source or the iftoswitch pass which rewrites if conditions with a
> > > chain of comparisons, like below, to gswitch statements.
> > >
> > > if (a[i] == c1 || a[i] == c2 || a[i] == c3)
> > >
> > > In this patch, the loop exit condition type is generalized from gcond to
> > > gimple, so that gswitch statements can be treated as valid loop exits. A
> > > new vectorizer pattern recognization function is introduced to check if
> > > all non-default labels of the gswitch have the same destination.
> >
> > So the labels should have the same label and the default label should
> > be the fallthru? Like the only handled case is
> >
> > for (;;)
> > {
> > ...
> > switch (x)
> > {
> > case A:
> > case B:
> > case C:
> > /* any code here? single BB? */
> > goto exit;
> > default:;
> > }
> > ...
> > }
> > exit:
> >
> > ?
> >
> > > If yes,
> > > it lowers the gswitch to an equivalent gcond.
> >
> > If so I think it might be simpler to implement this in if-conversion?
> > In fact I think
> > that already does this?
>
> Switch lowering in icvt is really limited. In fact, so limited it's not
> worth extending I think.
> There's a couple of reason this approach was taken:
>
> 1. Today switch lowering if icvt can only handle cases where the return is
> invariant. i.e. it can
> only handle simple data dependencies:
>
> int find (int *a, int n)
> {
> for (int i = 0; i < n; i++)
> {
> if (a[i] == 1 || a[i] == 3 || a[i] == 4 || a[i] == 5)
> return 1;
> }
> return -1;
> }
>
> As soon as you want to know the index of the element you found it doesn't work
>
> int find (int *a, int n)
> {
> for (int i = 0; i < n; i++)
> {
> if (a[i] == 1 || a[i] == 3 || a[i] == 4 || a[i] == 5)
> return i;
> }
> return -1;
> }
>
> Because this loop now has multiple exits where the value of I requires a
> reduction,
> and ifcvt does not handle this:
>
> "Can not ifcvt due to multiple exits"
Yeah, but that's another thing that needs fixing anyway - IIRC we
currently do not handle regular if-conversion in multi-exit loops
because of this.
> While perhaps this could be extended this would require ifcvt to select a
> main exit.
> This would also limit us in future expansion.
Why does ifcvt need to select a main edge?
> For instance in this case
>
> int find2 (int * restrict a, int *b, int n)
> {
> for (int i = 0; i < n; i++)
> {
> switch (a[i])
> {
> case 1:
> case 2:
> case 5:
> switch (b[i])
> {
> case 1:
> case 2:
> case 5:
> return 1;
> default:
> break;
> }
> default:
> break;
> }
> }
> return -1;
> }
>
> Is vectorizable, but ifcvt doesn't handle it. But once support for &&
> blocks are
> added to the vectorizer, we should be able to handle it fine.
Sure.
> The reason we support only one destination today is that as you say, the
> switch is
> equivalent to a gcond. This limitation is there because we can't introduce
> new
> control flow to the vectorizer yet. But this is something we should be able
> to do in
> the future.
Sure.
> I suggested doing this in the vectorizer for two other reason:
>
> 1. You've been adamant in the past about not introducing control dependencies
> into icvt.
> Which will limit the usefulness of ifcvt going forward for control flow.
> 2. I believe the plan is still to move ifcvt into the vectorizer? If so it
> didn't seem worth while
> Extending the very limited switch lowering in ifcvt. It just makes more
> work to more.
I think it belongs in the vectorizer, yes. But I don't see this as a
reason to stop improving it - in fact I think being able to drop it in
full requires to first code generate vector code without prior copying
of the scalar (if-converted) loop.
> Lastly, in cases as the above where we do an NxM compare, AArch64 has
> instructions for this.
> Historically ifcvt has only done conditionalization of expression, but not
> instruction selection.
>
> Ifcvt would also have a hard time handling cases such as:
>
> int find2 (int * restrict a, int *b, int n)
> {
> for (int i = 0; i < n; i++)
> {
> switch (a[i])
> {
> case 1:
> case 2:
> case 5:
> a[i] = i * b[i];
> return i;
> default:
> break;
> }
> }
> return -1;
> }
>
> Which we should be handle in the future.
I think give we have a switch lowering pass (aka switch-to-if), it
should be reasonably "easy" to perform switch-to-if during ifcvt
(on the loop copy for the vectorizer).
But then first handling multi-exit loops in ifcvt at all would be nice
(before trying to handle exit-from-switch).
> I agree that it's slightly not nice needing the change in
> vectorizable_early_exit and we can
> Hopefully get rid of it in the future. But I believe having this inside the
> vectorizer will allow
> us to handle more cases and have a more general solution.
Note I didn't look into the patch details but tried to understand
what exactly the proposed limitation is and compare that to the
existing switch handling in ifcvt.
Richard.
>
> Thanks,
> Tamar
>
> >
> > Richard.
> >
> > > To avoid excessive compile
> > > time, a new parameter vect-max-compare-in-switch-lowering is introduced
> > > to limit the maximum number of comparisons in the gswitch lowering. In
> > > addition, the CRC loop optimization is adjusted a bit to make sure CRC
> > > loops still have gcond exits.
> > >
> > > This patch is bootstrapped and regression-tested on x86_64-linux-gnu,
> > > arm-linux-gnueabihf and aarch64-linux-gnu.
> > >
> > > gcc/ChangeLog:
> > >
> > > * doc/invoke.texi: Document vect-max-compare-in-switch-lowering.
> > > * gimple-crc-optimization.cc
> > (crc_optimization::loop_may_calculate_crc):
> > > Ensure CRC loop exits are still gconds.
> > > (crc_optimization::optimize_crc_loop): Cast CRC loop exits.
> > > * params.opt: Add option vect-max-compare-in-switch-lowering.
> > > * tree-scalar-evolution.cc (scev_dfs::follow_ssa_edge_expr):
> > > Generalize loop exit conditions type from gcond to gimple.
> > > (get_loop_exit_condition): Likewise.
> > > * tree-scalar-evolution.h (get_loop_exit_condition): Likewise.
> > > * tree-vect-loop-manip.cc (vect_set_loop_condition_normal):
> > > Likewise.
> > > (vect_set_loop_condition): Likewise.
> > > (slpeel_can_duplicate_loop_p): Likewise.
> > > * tree-vect-loop.cc (vect_get_loop_niters): Likewise.
> > > (vect_analyze_loop_form): Likewise.
> > > (vect_create_loop_vinfo): Likewise.
> > > * tree-vect-patterns.cc (vect_recog_gswitch_pattern): New
> > > pattern recognition function for gswitch early exits.
> > > * tree-vect-slp.cc (vect_analyze_slp): Bail out if a gswitch is
> > > not lowered into a gcond.
> > > * tree-vect-stmts.cc (vect_mark_stmts_to_be_vectorized): Allow
> > > gswitch statements as loop exits.
> > > (vectorizable_early_exit): Transform gswitch early exits.
> > > * tree-vectorizer.h (struct vect_loop_form_info): Allow gswitch
> > > statements as loop exits.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > * gcc.dg/vect/vect-switch-search-line-fast.c: Update the test as
> > > the loop is now vectorizable on multiple targets.
> > > * gcc.dg/vect/vect-early-break_139-switch1.c: New test.
> > > * gcc.dg/vect/vect-early-break_139-switch2.c: New test.
> > > * gcc.dg/vect/vect-early-break_139-switch3.c: New test.
> > > * gcc.dg/vect/vect-early-break_139-switch4.c: New test.
> > > ---
> > > gcc/doc/invoke.texi | 4 +
> > > gcc/gimple-crc-optimization.cc | 9 +-
> > > gcc/params.opt | 4 +
> > > .../vect/vect-early-break_139-switch1.c | 19 +++
> > > .../vect/vect-early-break_139-switch2.c | 21 ++++
> > > .../vect/vect-early-break_139-switch3.c | 24 ++++
> > > .../vect/vect-early-break_139-switch4.c | 26 ++++
> > > .../vect/vect-switch-search-line-fast.c | 3 +-
> > > gcc/tree-scalar-evolution.cc | 12 +-
> > > gcc/tree-scalar-evolution.h | 4 +-
> > > gcc/tree-vect-loop-manip.cc | 6 +-
> > > gcc/tree-vect-loop.cc | 10 +-
> > > gcc/tree-vect-patterns.cc | 111 ++++++++++++++++++
> > > gcc/tree-vect-slp.cc | 11 +-
> > > gcc/tree-vect-stmts.cc | 74 ++++++++----
> > > gcc/tree-vectorizer.h | 8 +-
> > > 16 files changed, 295 insertions(+), 51 deletions(-)
> > > create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_139-
> > switch1.c
> > > create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_139-
> > switch2.c
> > > create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_139-
> > switch3.c
> > > create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_139-
> > switch4.c
> > >
> > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > index 3cf326cc22c..def84a6675a 100644
> > > --- a/gcc/doc/invoke.texi
> > > +++ b/gcc/doc/invoke.texi
> > > @@ -16652,6 +16652,10 @@ Complex expressions slow the analyzer.
> > > Maximum number of arguments in a PHI supported by TREE if conversion
> > > unless the loop is marked with simd pragma.
> > >
> > > +@item vect-max-compare-in-switch-lowering
> > > +The maximum number of comparisons that can be performed when
> > lowering switch
> > > +statements in the vectorizer.
> > > +
> > > @item vect-max-layout-candidates
> > > The maximum number of possible vector layouts (such as permutations)
> > > to consider when optimizing to-be-vectorized code.
> > > diff --git a/gcc/gimple-crc-optimization.cc
> > > b/gcc/gimple-crc-optimization.cc
> > > index 1751be9bc97..986aa193893 100644
> > > --- a/gcc/gimple-crc-optimization.cc
> > > +++ b/gcc/gimple-crc-optimization.cc
> > > @@ -937,6 +937,12 @@ crc_optimization::loop_may_calculate_crc (class
> > loop *loop)
> > > return false;
> > >
> > > m_crc_loop = loop;
> > > +
> > > + /* Filter out the loops, which don't have a gcond as single exit. */
> > > + gimple *loop_exit_cond = get_loop_exit_condition (m_crc_loop);
> > > + if (loop_exit_cond == NULL || !is_a <gcond *> (loop_exit_cond))
> > > + return false;
> > > +
> > > basic_block *loop_bbs = get_loop_body_in_dom_order (m_crc_loop);
> > >
> > > /* Filter out the cases, which don't have exactly two conditions in
> > > the loop.
> > > @@ -1274,7 +1280,8 @@ crc_optimization::optimize_crc_loop (gphi
> > *output_crc)
> > > remove_phi_node (&tmp_gsi, false);
> > >
> > > /* Alter the exit condition of the loop to always exit. */
> > > - gcond* loop_exit_cond = get_loop_exit_condition (m_crc_loop);
> > > + gcond *loop_exit_cond
> > > + = safe_dyn_cast <gcond *> (get_loop_exit_condition (m_crc_loop));
> > > gimple_cond_make_false (loop_exit_cond);
> > > update_stmt (loop_exit_cond);
> > > return true;
> > > diff --git a/gcc/params.opt b/gcc/params.opt
> > > index dd53d830895..115400ee8d2 100644
> > > --- a/gcc/params.opt
> > > +++ b/gcc/params.opt
> > > @@ -1229,6 +1229,10 @@ Whether to use canonical types.
> > > Common Joined UInteger Var(param_vect_epilogues_nomask) Init(1)
> > IntegerRange(0, 1) Param Optimization NoOffload
> > > Enable loop epilogue vectorization using smaller vector size.
> > >
> > > +-param=vect-max-compare-in-switch-lowering=
> > > +Common Joined UInteger
> > Var(param_vect_max_compare_in_switch_lowering) Init(16) Param
> > Optimization
> > > +Maximum number of comparisons that can be performed when lowering
> > switch statements in the vectorizer.
> > > +
> > > -param=vect-max-layout-candidates=
> > > Common Joined UInteger Var(param_vect_max_layout_candidates) Init(32)
> > Param Optimization
> > > Maximum number of possible vector layouts (such as permutations) to
> > consider when optimizing to-be-vectorized code.
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch1.c
> > b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch1.c
> > > new file mode 100644
> > > index 00000000000..bbe75d20116
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch1.c
> > > @@ -0,0 +1,19 @@
> > > +/* { dg-add-options vect_early_break } */
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target vect_early_break } */
> > > +/* { dg-require-effective-target vect_int } */
> > > +
> > > +/* { dg-additional-options "-Ofast" } */
> > > +
> > > +int find (int *a, int n)
> > > +{
> > > + for (int i = 0; i < n; i++)
> > > + {
> > > + /* This is converted to a switch statement in the iftoswitch pass.
> > > */
> > > + if (a[i] == 1 || a[i] == 2 || a[i] == 5 || a[i] == 7)
> > > + return i;
> > > + }
> > > + return -1;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch2.c
> > b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch2.c
> > > new file mode 100644
> > > index 00000000000..d200bea4594
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch2.c
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-add-options vect_early_break } */
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target vect_early_break } */
> > > +/* { dg-require-effective-target vect_int } */
> > > +
> > > +/* { dg-additional-options "-Ofast" } */
> > > +
> > > +int find (int *a, int n)
> > > +{
> > > + for (int i = 0; i < n; i++)
> > > + {
> > > + /* This is converted to a switch statement in the iftoswitch pass.
> > > */
> > > + if (a[i] == 1 || a[i] == 3 || a[i] == 4 || a[i] == 5)
> > > + continue;
> > > + else
> > > + return i;
> > > + }
> > > + return -1;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch3.c
> > b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch3.c
> > > new file mode 100644
> > > index 00000000000..f696267a855
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch3.c
> > > @@ -0,0 +1,24 @@
> > > +/* { dg-add-options vect_early_break } */
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target vect_early_break } */
> > > +/* { dg-require-effective-target vect_int } */
> > > +
> > > +/* { dg-additional-options "-Ofast" } */
> > > +
> > > +int find (int *a, int n)
> > > +{
> > > + for (int i = 0; i < n; i++)
> > > + {
> > > + /* An early break in case labels of a switch statement. */
> > > + switch (a[i])
> > > + {
> > > + case -1 ... 1:
> > > + case 10:
> > > + case 20:
> > > + return i;
> > > + }
> > > + }
> > > + return -1;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch4.c
> > b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch4.c
> > > new file mode 100644
> > > index 00000000000..f8e3187843d
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_139-switch4.c
> > > @@ -0,0 +1,26 @@
> > > +/* { dg-add-options vect_early_break } */
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target vect_early_break } */
> > > +/* { dg-require-effective-target vect_int } */
> > > +
> > > +/* { dg-additional-options "-Ofast" } */
> > > +
> > > +int find (int *a, int n)
> > > +{
> > > + for (int i = 0; i < n; i++)
> > > + {
> > > + /* An early break in the default label of a switch statement. */
> > > + switch (a[i])
> > > + {
> > > + case 1 ... 3:
> > > + case 5:
> > > + case 10:
> > > + continue;
> > > + default:
> > > + return i;
> > > + }
> > > + }
> > > + return -1;
> > > +}
> > > +
> > > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> > b/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> > > index 678512db319..17e80330daa 100644
> > > --- a/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> > > @@ -16,5 +16,4 @@ const unsigned char *search_line_fast2 (const
> > unsigned char *s,
> > > return s;
> > > }
> > >
> > > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
> > > target {
> > ilp32 || { amdgcn*-*-* } } } } } */
> > > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" {
> > > target { !
> > { ilp32 || { amdgcn*-*-* } } } } } } */
> > > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> > > diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
> > > index ecdef7529a6..0a5172c3be4 100644
> > > --- a/gcc/tree-scalar-evolution.cc
> > > +++ b/gcc/tree-scalar-evolution.cc
> > > @@ -1302,7 +1302,7 @@ scev_dfs::follow_ssa_edge_expr (gimple
> > *at_stmt, tree expr,
> > > guards the exit edge. If the expression is too difficult to
> > > analyze, then give up. */
> > >
> > > -gcond *
> > > +gimple *
> > > get_loop_exit_condition (const class loop *loop)
> > > {
> > > return get_loop_exit_condition (single_exit (loop));
> > > @@ -1311,16 +1311,20 @@ get_loop_exit_condition (const class loop
> > *loop)
> > > /* If the statement just before the EXIT_EDGE contains a condition then
> > > return the condition, otherwise NULL. */
> > >
> > > -gcond *
> > > +gimple *
> > > get_loop_exit_condition (const_edge exit_edge)
> > > {
> > > - gcond *res = NULL;
> > > + gimple *res = NULL;
> > >
> > > if (dump_file && (dump_flags & TDF_SCEV))
> > > fprintf (dump_file, "(get_loop_exit_condition \n ");
> > >
> > > if (exit_edge)
> > > - res = safe_dyn_cast <gcond *> (*gsi_last_bb (exit_edge->src));
> > > + {
> > > + gimple *g = *gsi_last_bb (exit_edge->src);
> > > + if (is_a <gcond *> (g) || is_a <gswitch *> (g))
> > > + res = g;
> > > + }
> > >
> > > if (dump_file && (dump_flags & TDF_SCEV))
> > > {
> > > diff --git a/gcc/tree-scalar-evolution.h b/gcc/tree-scalar-evolution.h
> > > index c7feeacf1db..54f0bf82c7b 100644
> > > --- a/gcc/tree-scalar-evolution.h
> > > +++ b/gcc/tree-scalar-evolution.h
> > > @@ -22,8 +22,8 @@ along with GCC; see the file COPYING3. If not see
> > > #define GCC_TREE_SCALAR_EVOLUTION_H
> > >
> > > extern tree number_of_latch_executions (class loop *);
> > > -extern gcond *get_loop_exit_condition (const class loop *);
> > > -extern gcond *get_loop_exit_condition (const_edge);
> > > +extern gimple *get_loop_exit_condition (const class loop *);
> > > +extern gimple *get_loop_exit_condition (const_edge);
> > >
> > > extern void scev_initialize (void);
> > > extern bool scev_initialized_p (void);
> > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > index 20141dbc2e5..bd04252da2d 100644
> > > --- a/gcc/tree-vect-loop-manip.cc
> > > +++ b/gcc/tree-vect-loop-manip.cc
> > > @@ -1233,7 +1233,7 @@ vect_set_loop_condition_normal (loop_vec_info
> > /* loop_vinfo */, edge exit_edge,
> > > {
> > > tree indx_before_incr, indx_after_incr;
> > > gcond *cond_stmt;
> > > - gcond *orig_cond;
> > > + gimple *orig_cond;
> > > edge pe = loop_preheader_edge (loop);
> > > gimple_stmt_iterator incr_gsi;
> > > bool insert_after;
> > > @@ -1395,7 +1395,7 @@ vect_set_loop_condition (class loop *loop, edge
> > loop_e, loop_vec_info loop_vinfo
> > > bool niters_maybe_zero)
> > > {
> > > gcond *cond_stmt;
> > > - gcond *orig_cond = get_loop_exit_condition (loop_e);
> > > + gimple *orig_cond = get_loop_exit_condition (loop_e);
> > > gimple_stmt_iterator loop_cond_gsi = gsi_for_stmt (orig_cond);
> > >
> > > if (loop_vinfo && LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo))
> > > @@ -2039,7 +2039,7 @@ slpeel_can_duplicate_loop_p (const class loop
> > *loop, const_edge exit_e,
> > > const_edge e)
> > > {
> > > edge entry_e = loop_preheader_edge (loop);
> > > - gcond *orig_cond = get_loop_exit_condition (exit_e);
> > > + gimple *orig_cond = get_loop_exit_condition (exit_e);
> > > gimple_stmt_iterator loop_exit_gsi = gsi_last_bb (exit_e->src);
> > >
> > > /* All loops have an outer scope; the only case loop->outer is NULL is
> > > for
> > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > > index b58e4355e58..02db9f5a2eb 100644
> > > --- a/gcc/tree-vect-loop.cc
> > > +++ b/gcc/tree-vect-loop.cc
> > > @@ -627,12 +627,12 @@ vect_fixup_scalar_cycles_with_patterns
> > (loop_vec_info loop_vinfo)
> > > Return the loop exit conditions. */
> > >
> > >
> > > -static vec<gcond *>
> > > +static vec<gimple *>
> > > vect_get_loop_niters (class loop *loop, const_edge main_exit, tree
> > *assumptions,
> > > tree *number_of_iterations, tree
> > > *number_of_iterationsm1)
> > > {
> > > auto_vec<edge> exits = get_loop_exit_edges (loop);
> > > - vec<gcond *> conds;
> > > + vec<gimple *> conds;
> > > conds.create (exits.length ());
> > > class tree_niter_desc niter_desc;
> > > tree niter_assumptions, niter, may_be_zero;
> > > @@ -654,7 +654,7 @@ vect_get_loop_niters (class loop *loop, const_edge
> > main_exit, tree *assumptions,
> > > unsigned int i;
> > > FOR_EACH_VEC_ELT (exits, i, exit)
> > > {
> > > - gcond *cond = get_loop_exit_condition (exit);
> > > + gimple *cond = get_loop_exit_condition (exit);
> > > if (cond)
> > > conds.safe_push (cond);
> > >
> > > @@ -1680,7 +1680,7 @@ vect_analyze_loop_form (class loop *loop,
> > gimple *loop_vectorized_call,
> > > /* Determine what the primary and alternate exit conds are. */
> > > for (unsigned i = 0; i < info->conds.length (); i++)
> > > {
> > > - gcond *cond = info->conds[i];
> > > + gimple *cond = info->conds[i];
> > > if (exit_e->src == gimple_bb (cond))
> > > std::swap (info->conds[0], info->conds[i]);
> > > }
> > > @@ -1745,7 +1745,7 @@ vect_create_loop_vinfo (class loop *loop,
> > vec_info_shared *shared,
> > > if (!integer_onep (info->assumptions) && !orig_loop_info)
> > > LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo) = info->assumptions;
> > >
> > > - for (gcond *cond : info->conds)
> > > + for (gimple *cond : info->conds)
> > > {
> > > stmt_vec_info loop_cond_info = loop_vinfo->lookup_stmt (cond);
> > > /* Mark the statement as a condition. */
> > > diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> > > index 70bf768d339..cd39bdfbd0d 100644
> > > --- a/gcc/tree-vect-patterns.cc
> > > +++ b/gcc/tree-vect-patterns.cc
> > > @@ -5529,6 +5529,116 @@ vect_recog_gcond_pattern (vec_info *vinfo,
> > > return pattern_stmt;
> > > }
> > >
> > > +/* Function vect_recog_gswitch_pattern
> > > +
> > > + Try to find pattern like following:
> > > +
> > > + switch (x) <default: <La>, case b1: <Lb>, ... case bN: <Lb>>
> > > +
> > > + where all labels except the default one have the same destination and
> > > + convert it to a boolean pattern
> > > +
> > > + mask = (x == b1) || (x == b2) || ... || (x == bN)
> > > + if (mask != 0)
> > > +
> > > + The type of output TYPE_OUT is set in this function.
> > > +
> > > + Input:
> > > +
> > > + * STMT_VINFO: The stmt at the end from which the pattern
> > > + search begins, i.e., cast of a bool to
> > > + an integer type.
> > > +
> > > + Output:
> > > +
> > > + * TYPE_OUT: The type of the output of this pattern.
> > > +
> > > + * Return value: A new stmt that will be used to replace the pattern.
> > > */
> > > +
> > > +static gimple *
> > > +vect_recog_gswitch_pattern (vec_info *vinfo,
> > > + stmt_vec_info stmt_vinfo, tree *type_out)
> > > +{
> > > + /* Currently we only support this for loop vectorization and when
> > > multiple
> > > + exits. */
> > > + loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
> > > + if (!loop_vinfo || !LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
> > > + return NULL;
> > > +
> > > + gimple *last_stmt = STMT_VINFO_STMT (stmt_vinfo);
> > > + gswitch *gs = NULL;
> > > + if (!(gs = dyn_cast <gswitch *> (last_stmt)))
> > > + return NULL;
> > > +
> > > + tree index = gimple_switch_index (gs);
> > > + tree stype = TREE_TYPE (index);
> > > +
> > > + /* For now, there should be no gswitch with boolean type index. */
> > > + gcc_assert (!VECT_SCALAR_BOOLEAN_TYPE_P (stype));
> > > + tree vectype = get_mask_type_for_scalar_type (vinfo, stype);
> > > + if (vectype == NULL_TREE)
> > > + return NULL;
> > > +
> > > + tree case_label_dest = NULL_TREE;
> > > + tree cond = NULL_TREE;
> > > + unsigned int num_cmps = 0;
> > > + for (unsigned li = 1; li < gimple_switch_num_labels (gs); li++)
> > > + {
> > > + tree label = gimple_switch_label (gs, li);
> > > +
> > > + /* Check if all case labels have the same destination. */
> > > + tree dest = CASE_LABEL (label);
> > > + if (case_label_dest == NULL)
> > > + case_label_dest = dest;
> > > + else if (dest != case_label_dest)
> > > + return NULL;
> > > +
> > > + tree low = CASE_LOW (label);
> > > + gcc_assert (low != NULL_TREE);
> > > + tree high = CASE_HIGH (label);
> > > + gcc_assert (high == NULL_TREE || tree_int_cst_lt (low, high));
> > > +
> > > + wide_int val = wi::to_wide (low);
> > > + wide_int one = wi::one (TYPE_PRECISION (stype));
> > > + while (true)
> > > + {
> > > + /* A large number of comparisons lead to long compile time and
> > > make
> > > + vectorization non-profitable (most cost models will reject
> > > the
> > > + vectorized code). So bail out in this case. */
> > > + if (++num_cmps > (unsigned)
> > param_vect_max_compare_in_switch_lowering)
> > > + return NULL;
> > > +
> > > + tree eq_cond = vect_recog_temp_ssa_var (boolean_type_node,
> > > NULL);
> > > + gimple *eq_stmt = gimple_build_assign (eq_cond, EQ_EXPR, index,
> > > + wide_int_to_tree (stype,
> > > val));
> > > + append_pattern_def_seq (vinfo, stmt_vinfo, eq_stmt, vectype,
> > > stype);
> > > + if (cond == NULL_TREE)
> > > + cond = eq_cond;
> > > + else
> > > + {
> > > + tree new_cond = vect_recog_temp_ssa_var (boolean_type_node,
> > NULL);
> > > + gimple *or_stmt = gimple_build_assign (new_cond,
> > > BIT_IOR_EXPR,
> > > + cond, eq_cond);
> > > + append_pattern_def_seq (vinfo, stmt_vinfo, or_stmt,
> > > + vectype, stype);
> > > + cond = new_cond;
> > > + }
> > > +
> > > + /* Break if the case label has a single value or we've reached
> > > the
> > > + high value. */
> > > + if (high == NULL_TREE || wi::eq_p (val, wi::to_wide (high)))
> > > + break;
> > > +
> > > + val = wi::add (val, one);
> > > + }
> > > + }
> > > +
> > > + vect_pattern_detected ("vect_recog_gswitch_pattern", last_stmt);
> > > + *type_out = vectype;
> > > + return gimple_build_cond (NE_EXPR, cond, build_zero_cst (TREE_TYPE
> > (cond)),
> > > + NULL_TREE, NULL_TREE);
> > > +}
> > > +
> > > /* Function vect_recog_bool_pattern
> > >
> > > Try to find pattern like following:
> > > @@ -6987,6 +7097,7 @@ static vect_recog_func
> > vect_vect_recog_func_ptrs[] = {
> > > { vect_recog_sat_sub_pattern, "sat_sub" },
> > > { vect_recog_sat_trunc_pattern, "sat_trunc" },
> > > { vect_recog_gcond_pattern, "gcond" },
> > > + { vect_recog_gswitch_pattern, "gswitch" },
> > > { vect_recog_bool_pattern, "bool" },
> > > /* This must come before mask conversion, and includes the parts
> > > of mask conversion that are needed for gather and scatter
> > > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > > index 895fb88ab7f..217d1299656 100644
> > > --- a/gcc/tree-vect-slp.cc
> > > +++ b/gcc/tree-vect-slp.cc
> > > @@ -5409,14 +5409,13 @@ vect_analyze_slp (vec_info *vinfo, unsigned
> > max_tree_size,
> > > vec<stmt_vec_info> roots = vNULL;
> > > roots.safe_push (cond_info);
> > > gimple *stmt = STMT_VINFO_STMT (cond_info);
> > > - tree args0 = gimple_cond_lhs (stmt);
> > > - tree args1 = gimple_cond_rhs (stmt);
> > >
> > > /* These should be enforced by cond lowering, but if it failed
> > > bail. */
> > > - if (gimple_cond_code (stmt) != NE_EXPR
> > > - || TREE_TYPE (args0) != boolean_type_node
> > > - || !integer_zerop (args1))
> > > + if (!is_a <gcond *> (stmt)
> > > + || gimple_cond_code (stmt) != NE_EXPR
> > > + || TREE_TYPE (gimple_cond_lhs (stmt)) != boolean_type_node
> > > + || !integer_zerop (gimple_cond_rhs (stmt)))
> > > {
> > > roots.release ();
> > > return opt_result::failure_at (vect_location,
> > > @@ -5428,7 +5427,7 @@ vect_analyze_slp (vec_info *vinfo, unsigned
> > max_tree_size,
> > > from them. It's highly likely that the resulting SLP tree
> > > here if both
> > > arguments have a def will be incompatible, but we rely on it
> > > being split
> > > later on. */
> > > - auto varg = loop_vinfo->lookup_def (args0);
> > > + auto varg = loop_vinfo->lookup_def (gimple_cond_lhs (stmt));
> > > vec<stmt_vec_info> stmts;
> > > vec<tree> remain = vNULL;
> > > stmts.create (1);
> > > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > > index 6274956e2a5..436a4f4fc44 100644
> > > --- a/gcc/tree-vect-stmts.cc
> > > +++ b/gcc/tree-vect-stmts.cc
> > > @@ -743,6 +743,7 @@ vect_mark_stmts_to_be_vectorized (loop_vec_info
> > loop_vinfo, bool *fatal)
> > >
> > > if (gimple_get_lhs (stmt) == NULL_TREE
> > > && !is_a <gcond *> (stmt)
> > > + && !is_a <gswitch *> (stmt)
> > > && !is_a <gcall *> (stmt))
> > > return opt_result::failure_at
> > > (stmt, "not vectorized: irregular stmt: %G", stmt);
> > > @@ -12339,28 +12340,6 @@ vectorizable_early_exit (loop_vec_info
> > loop_vinfo, stmt_vec_info stmt_info,
> > > bool masked_loop_p = LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
> > > bool len_loop_p = LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);
> > >
> > > - /* Now build the new conditional. Pattern gimple_conds get dropped
> > during
> > > - codegen so we must replace the original insn. */
> > > - gimple *orig_stmt = STMT_VINFO_STMT (vect_orig_stmt (stmt_info));
> > > - gcond *cond_stmt = as_a <gcond *>(orig_stmt);
> > > -
> > > - tree vectype_out = vectype;
> > > - auto bb = gimple_bb (cond_stmt);
> > > - edge exit_true_edge = EDGE_SUCC (bb, 0);
> > > - if (exit_true_edge->flags & EDGE_FALSE_VALUE)
> > > - exit_true_edge = EDGE_SUCC (bb, 1);
> > > - gcc_assert (exit_true_edge->flags & EDGE_TRUE_VALUE);
> > > -
> > > - /* When vectorizing we assume that if the branch edge is taken that
> > > we're
> > > - exiting the loop. This is not however always the case as the
> > > compiler will
> > > - rewrite conditions to always be a comparison against 0. To do this
> > > it
> > > - sometimes flips the edges. This is fine for scalar, but for
> > > vector we
> > > - then have to negate the result of the test, as we're still assuming
> > > that if
> > > - you take the branch edge that we found the exit condition. i.e. we
> > > need
> > to
> > > - know whether we are generating a `forall` or an `exist` condition.
> > > */
> > > - bool flipped = flow_bb_inside_loop_p (LOOP_VINFO_LOOP (loop_vinfo),
> > > - exit_true_edge->dest);
> > > -
> > > /* See if we support ADDHN and use that for the reduction. */
> > > internal_fn ifn = IFN_VEC_TRUNC_ADD_HIGH;
> > > bool addhn_supported_p
> > > @@ -12439,6 +12418,53 @@ vectorizable_early_exit (loop_vec_info
> > loop_vinfo, stmt_vec_info stmt_info,
> > > auto_vec<tree> stmts;
> > > stmts.safe_splice (SLP_TREE_VEC_DEFS (slp_node));
> > >
> > > + /* Now build the new conditional. Pattern gimple_conds get dropped
> > during
> > > + codegen so we must replace the original insn. */
> > > + gimple *orig_stmt = STMT_VINFO_STMT (vect_orig_stmt (stmt_info));
> > > + tree vectype_out = vectype;
> > > + auto bb = gimple_bb (orig_stmt);
> > > +
> > > + /* Find the edge which indicates the branch is taken. If original
> > > statement
> > > + is a gcond, we find the edge with flag EDGE_TRUE_VALUE. For
> > > gswitch,
> > we
> > > + find the edge from the non-default label, and set flags
> > EDGE_TRUE_VALUE,
> > > + EDGE_FALSE_VALUE for the two edges. */
> > > + edge exit_true_edge = EDGE_SUCC (bb, 0);
> > > + if (is_a <gcond *> (orig_stmt))
> > > + {
> > > + if (exit_true_edge->flags & EDGE_FALSE_VALUE)
> > > + exit_true_edge = EDGE_SUCC (bb, 1);
> > > + gcc_assert (exit_true_edge->flags & EDGE_TRUE_VALUE);
> > > + }
> > > + else
> > > + {
> > > + gcc_assert (is_a <gswitch *> (orig_stmt));
> > > + gcc_assert (bb->succs->length () == 2);
> > > + tree dft_lab = gimple_switch_default_label (as_a <gswitch *>
> > (orig_stmt));
> > > + basic_block dft_bb = label_to_block (cfun, CASE_LABEL (dft_lab));
> > > + if (exit_true_edge->dest != dft_bb)
> > > + {
> > > + EDGE_SUCC (bb, 0)->flags |= EDGE_TRUE_VALUE;
> > > + EDGE_SUCC (bb, 1)->flags |= EDGE_FALSE_VALUE;
> > > + }
> > > + else
> > > + {
> > > + exit_true_edge = EDGE_SUCC (bb, 1);
> > > + EDGE_SUCC (bb, 0)->flags |= EDGE_FALSE_VALUE;
> > > + EDGE_SUCC (bb, 1)->flags |= EDGE_TRUE_VALUE;
> > > + }
> > > + gcc_assert (exit_true_edge->dest != dft_bb);
> > > + }
> > > +
> > > + /* When vectorizing we assume that if the branch edge is taken that
> > > we're
> > > + exiting the loop. This is not however always the case as the
> > > compiler will
> > > + rewrite conditions to always be a comparison against 0. To do this
> > > it
> > > + sometimes flips the edges. This is fine for scalar, but for
> > > vector we
> > > + then have to negate the result of the test, as we're still assuming
> > > that if
> > > + you take the branch edge that we found the exit condition. i.e.,
> > > we need
> > > + to know whether we are generating a `forall` or an `exist`
> > > condition. */
> > > + bool flipped = flow_bb_inside_loop_p (LOOP_VINFO_LOOP (loop_vinfo),
> > > + exit_true_edge->dest);
> > > +
> > > /* If we're comparing against a previous forall we need to negate the
> > resullts
> > > before we do the final comparison or reduction. */
> > > if (flipped)
> > > @@ -12536,8 +12562,8 @@ vectorizable_early_exit (loop_vec_info
> > loop_vinfo, stmt_vec_info stmt_info,
> > > gcc_assert (new_temp);
> > >
> > > tree cst = build_zero_cst (vectype_out);
> > > - gimple_cond_set_condition (cond_stmt, NE_EXPR, new_temp, cst);
> > > - update_stmt (orig_stmt);
> > > + new_stmt = gimple_build_cond (NE_EXPR, new_temp, cst, NULL_TREE,
> > NULL_TREE);
> > > + gsi_replace (&cond_gsi, new_stmt, true);
> > >
> > > /* ??? */
> > > SLP_TREE_VEC_DEFS (slp_node).truncate (0);
> > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > index b7c2188ab3d..cb715459c00 100644
> > > --- a/gcc/tree-vectorizer.h
> > > +++ b/gcc/tree-vectorizer.h
> > > @@ -1155,10 +1155,10 @@ public:
> > > bool early_breaks;
> > >
> > > /* List of loop additional IV conditionals found in the loop. */
> > > - auto_vec<gcond *> conds;
> > > + auto_vec<gimple *> conds;
> > >
> > > /* Main loop IV cond. */
> > > - gcond* loop_iv_cond;
> > > + gimple *loop_iv_cond;
> > >
> > > /* True if we have an unroll factor requested by the user through
> > > pragma
> > GCC
> > > unroll. */
> > > @@ -2702,8 +2702,8 @@ struct vect_loop_form_info
> > > tree number_of_iterations;
> > > tree number_of_iterationsm1;
> > > tree assumptions;
> > > - auto_vec<gcond *> conds;
> > > - gcond *inner_loop_cond;
> > > + auto_vec<gimple *> conds;
> > > + gimple *inner_loop_cond;
> > > edge loop_exit;
> > > };
> > > extern opt_result vect_analyze_loop_form (class loop *, gimple *,
> > > --
> > > 2.43.0
> > >
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)