On Wed, Jun 26, 2024 at 4:58 PM Feng Xue OS <f...@os.amperecomputing.com> wrote:
>
> Allow shift-by-induction for slp node, when it is single lane, which is
> aligned with the original loop-based handling.

OK.

Did you try whether we handle multiple lanes correctly?  The simplest
case would be a loop body with, say,

  a[2*i] = x << i;
  a[2*i+1] = x << i;

I'm not sure how we match up multiple (different) inductions in the same
SLP node, but one node could be x << (i + 1).

Note you enable a nested cycle def the same way; I think that could be
treated like an internal def and also generally.  There's probably no
test coverage for that, though.  Something like

  for (m ...)
    {
      i = m;
      j = i + 1;
      for (k ...)
        {
          res1 += k << i;
          res2 += k << j;
          i++;
          j++;
        }
      a[2*m] = res1;
      a[2*m+1] = res2;
    }

Thanks,
Richard.

> Thanks,
> Feng
>
> ---
>  gcc/tree-vect-stmts.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index ca6052662a3..840e162c7f0 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6247,7 +6247,7 @@ vectorizable_shift (vec_info *vinfo,
>    if ((dt[1] == vect_internal_def
>         || dt[1] == vect_induction_def
>         || dt[1] == vect_nested_cycle)
> -      && !slp_node)
> +      && (!slp_node || SLP_TREE_LANES (slp_node) == 1))
>      scalar_shift_arg = false;
>    else if (dt[1] == vect_constant_def
>             || dt[1] == vect_external_def
> --
> 2.17.1
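
For reference, the loop shapes discussed in the reply can be spelled out as a small standalone C file. This is only a sketch and not part of the patch or the GCC testsuite: the function names, the sizes N, M and K, the unsigned element type, and the per-iteration reset of res1/res2 are all assumptions chosen so that the shifts stay in range and the results are easy to check by hand. Whether a given GCC version actually vectorizes these loops via SLP is exactly the open question, so nothing here asserts that.

```c
#include <assert.h>
#include <stdint.h>

#define N 16   /* hypothetical trip count; keeps shift amounts below 32 */
#define M 4    /* hypothetical outer trip count for the nested case */
#define K 8    /* hypothetical inner trip count for the nested case */

/* Two lanes that use the same induction variable as the shift amount.  */
void
shift_same_induction (uint32_t *a, uint32_t x)
{
  for (int i = 0; i < N; i++)
    {
      a[2 * i] = x << i;
      a[2 * i + 1] = x << i;
    }
}

/* Two lanes whose shift amounts are different inductions: i and i + 1.  */
void
shift_different_inductions (uint32_t *a, uint32_t x)
{
  for (int i = 0; i < N; i++)
    {
      a[2 * i] = x << i;
      a[2 * i + 1] = x << (i + 1);
    }
}

/* Shift amounts that are inductions of the inner loop (i, j), feeding
   two reductions.  Resetting res1/res2 per outer iteration is an
   assumption; the sketch in the mail leaves their initialization open.  */
void
shift_nested_cycle (uint32_t *a)
{
  for (int m = 0; m < M; m++)
    {
      uint32_t res1 = 0, res2 = 0;
      int i = m;
      int j = i + 1;
      for (uint32_t k = 0; k < K; k++)
        {
          res1 += k << i;
          res2 += k << j;
          i++;
          j++;
        }
      a[2 * m] = res1;
      a[2 * m + 1] = res2;
    }
}
```

Each function can be compiled with and without vectorization and the output arrays compared; the vectorizer dump would then show which SLP nodes were built for the shift.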