On Fri, 29 Dec 2023, Tamar Christina wrote:

> Hi All,
> 
> We can't support nonlinear inductions other than neg when vectorizing
> early breaks with a known iteration count.
> 
> For early break we currently require a peeled epilog but in these cases
> we can't compute the remaining values.
> 
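
For readers unfamiliar with the term, here is a minimal sketch of what a
nonlinear induction looks like next to a linear one, modelled on the loop
in the new test. The helper names count_digits and shr_iv_after are
hypothetical and not part of the patch or of GCC; the step of 6 is taken
from the test. The sketch only illustrates that the value of the shifted
variable after k iterations is init >> (6 * k) rather than init + k * step.

```c
#include <assert.h>

/* Hypothetical helper: same loop shape as pr113163.c, returning the
   number of iterations executed before the early break (at most 6).  */
static int
count_digits (unsigned long tmp)
{
  int i;
  for (i = 0; i < 6; ++i)	/* i: linear induction, step +1.  */
    {
      if (tmp == 0)		/* Early break.  */
	break;
      tmp >>= 6;		/* tmp: nonlinear (shift-right) induction.  */
    }
  return i;
}

/* Hypothetical helper: value of the shift-right induction after K
   iterations.  The shift amount is clamped because shifting by the
   full width of the type or more is undefined in C.  */
static unsigned long
shr_iv_after (unsigned long init, int k)
{
  assert (k >= 0);
  unsigned int bits = sizeof (unsigned long) * 8;
  unsigned int amount = 6u * (unsigned int) k;
  return amount >= bits ? 0 : init >> amount;
}
```

For example, count_digits (64) runs two iterations before tmp reaches
zero, and shr_iv_after (64, 1) gives the value seen at the start of the
second one.
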
> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> Tested with a cross cc1 for amdgcn-amdhsa; the issue is fixed.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>       PR middle-end/113163
>       * tree-vect-loop-manip.cc (vect_can_peel_nonlinear_iv_p):

This entry is missing its description.

> gcc/testsuite/ChangeLog:
> 
>       PR middle-end/113163
>       * gcc.target/gcn/pr113163.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gcc.target/gcn/pr113163.c b/gcc/testsuite/gcc.target/gcn/pr113163.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..99b0fdbaf3a3152ca008b5109abf6e80d8cb3d6a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/gcn/pr113163.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -ftree-vectorize" } */ 
> +
> +struct _reent { union { struct { char _l64a_buf[8]; } _reent; } _new; };
> +static const char R64_ARRAY[] = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
> +char *
> +_l64a_r (struct _reent *rptr,
> +     long value)
> +{
> +  char *ptr;
> +  char *result;
> +  int i, index;
> +  unsigned long tmp = (unsigned long)value & 0xffffffff;
> +  result = 
> +          ((
> +          rptr
> +          )->_new._reent._l64a_buf)
> +                               ;
> +  ptr = result;
> +  for (i = 0; i < 6; ++i)
> +    {
> +      if (tmp == 0)
> + {
> +   *ptr = '\0';
> +   break;
> + }
> +      *ptr++ = R64_ARRAY[index];
> +      tmp >>= 6;
> +    }
> +}
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index 3810983a80c8b989be9fd9a9993642069fd39b99..f1bf43b3731868e7b053c186302fbeaf515be8cf 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -2075,6 +2075,22 @@ vect_can_peel_nonlinear_iv_p (loop_vec_info loop_vinfo,
>        return false;
>      }
>  
> +  /* We can't support partial vectors and early breaks with an induction
> +     type other than add or neg since we require the epilog and can't
> +     perform the peeling.  PR113163.  */
> +  if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
> +      && LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()

But why's that only for constant VF?  We might never end up here
with variable VF but the check looks odd ...

OK with that clarified and/or the test removed.

Thanks,
Richard.

> +      && LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo)
> +      && induction_type != vect_step_op_neg)
> +    {
> +      if (dump_enabled_p ())
> +     dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +                      "Peeling for epilogue is not supported"
> +                      " for nonlinear induction except neg"
> +                      " when iteration count is known and early breaks.\n");
> +      return false;
> +    }
> +
>    return true;
>  }
>  
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
