On Wed, Jun 26, 2024 at 4:58 PM Feng Xue OS <f...@os.amperecomputing.com> wrote:
>
> Allow shift-by-induction for slp node, when it is single lane, which is
> aligned with the original loop-based handling.

OK.

Did you try whether we handle multiple lanes correctly?  The simplest
case would be a loop body with, say,

  a[2*i] = x << i;
  a[2*i+1] = x << i;

I'm not sure how we match up multiple (different) inductions in the
same SLP node, but one node could be x << (i + 1).
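
For reference, a self-contained sketch of that two-lane case.  The
function names, N, the element type and the check helper are all made
up here for illustration; none of it comes from the patch or from the
testsuite.

```c
#include <assert.h>

#define N 16  /* hypothetical trip count; keeps every shift below 32 */

unsigned int a[2 * N];

/* Two lanes, both shifted by the same induction i.  */
void
shift_same_induction (unsigned int x)
{
  for (int i = 0; i < N; i++)
    {
      a[2 * i] = x << i;
      a[2 * i + 1] = x << i;
    }
}

/* Two lanes with different inductions: i and i + 1.  */
void
shift_different_inductions (unsigned int x)
{
  for (int i = 0; i < N; i++)
    {
      a[2 * i] = x << i;
      a[2 * i + 1] = x << (i + 1);
    }
}

/* Scalar check: the odd lane is expected to be shifted by i + offset.  */
void
check_lanes (unsigned int x, int offset)
{
  for (int i = 0; i < N; i++)
    {
      assert (a[2 * i] == (x << i));
      assert (a[2 * i + 1] == (x << (i + offset)));
    }
}
```

Calling check_lanes (x, 0) after the first function and
check_lanes (x, 1) after the second compares the stored values against
scalar shifts.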

Note that you enable a nested cycle def the same way.  I think that
could be treated like an internal def, and also generally.  There's
probably no test coverage for that though.  Something like

for (m ...)
  {
    i = m;
    j = i + 1;
    for (k ...)
      {
        res1 += k << i;
        res2 += k << j;
        i++;
        j++;
      }
    a[2*m] = res1;
    a[2*m+1] = res2;
  }
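
A compilable sketch of the nested-cycle loop above, in case it helps
with writing a test.  The bounds M and K, the unsigned types, the
array name b and resetting res1/res2 on every outer iteration are
assumptions made here; the outline above leaves them open.

```c
#include <assert.h>

#define M 8  /* hypothetical outer trip count */
#define K 4  /* hypothetical inner trip count; shifts stay below 32 */

unsigned int b[2 * M];

void
nested_shift (void)
{
  for (int m = 0; m < M; m++)
    {
      /* Assumption: the reductions restart for each m.  */
      unsigned int res1 = 0, res2 = 0;
      int i = m;
      int j = i + 1;
      for (unsigned int k = 0; k < K; k++)
        {
          res1 += k << i;
          res2 += k << j;
          i++;
          j++;
        }
      b[2 * m] = res1;
      b[2 * m + 1] = res2;
    }
}

/* Since j is always i + 1, each odd element must be twice the even one
   (in unsigned arithmetic).  */
void
check_nested_shift (void)
{
  for (int m = 0; m < M; m++)
    assert (b[2 * m + 1] == 2 * b[2 * m]);
}
```

Whether the outer loop actually gets vectorized for this depends on
the target and options, so the sketch only pins down the scalar
semantics.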

Thanks,
Richard.

> Thanks,
> Feng
>
> ---
>  gcc/tree-vect-stmts.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index ca6052662a3..840e162c7f0 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6247,7 +6247,7 @@ vectorizable_shift (vec_info *vinfo,
>    if ((dt[1] == vect_internal_def
>         || dt[1] == vect_induction_def
>         || dt[1] == vect_nested_cycle)
> -      && !slp_node)
> +      && (!slp_node || SLP_TREE_LANES (slp_node) == 1))
>      scalar_shift_arg = false;
>    else if (dt[1] == vect_constant_def
>            || dt[1] == vect_external_def
> --
> 2.17.1
