On Thu, 25 Aug 2022, Richard Sandiford wrote:

> Builds of glibc with SVE enabled have been failing since V1DI was added
> to the aarch64 port.  The problem is that BB SLP starts the (hopeless)
> attempt to use variable-length modes to vectorise a single-element
> vector, and that now gets further than it did before.
> 
> Initially we tried getting a vector mode with 1 + 1X DI elements
> (i.e. 1 DI per 128-bit vector chunk).  We don't provide such a mode --
> it would be VNx1DI -- because it isn't a native SVE format.  We then
> try just 1 DI, which previously failed but now succeeds.
> 
> There are numerous ways we could fix this.  Perhaps the most obvious
> would be to skip variable-length modes for BB SLP.  However, I think
> that'd just be kicking the can down the road, since eventually we want
> to support BB SLP and VLA vectors using predication.
> 
> However, if we do use VLA vectors for BB SLP, the vector modes
> we use should actually be variable length.  We don't want to use
> variable-length vectors for some element types/group sizes and
> fixed-length vectors for others, since it would be difficult
> to handle the seams.
> 
> The same principle applies during loop vectorisation.  We can't
> use a mixture of variable-length and fixed-length vectors for
> the same loop because the relative unroll/vectorisation factors
> would not be constant (compile-time) multiples of each other.
> 
> This patch therefore makes get_related_vectype_for_scalar_type
> check that the provided number of units is interoperable with
> the provided prevailing mode.  The function is generally quite
> forgiving -- it does basic things like checking for scalarness
> itself rather than expecting callers to do them -- so the new
> check feels in keeping with that.
> 
> This seems to subsume the fix for PR96974.  I'm not sure it's
> worth reverting that code to an assert though, so the patch just
> drops the scan for the associated message.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
> 
> 
> gcc/
>       * tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Check
>       that the requested number of units is interoperable with the requested
>       prevailing mode.
> 
> gcc/testsuite/
>       * gcc.target/aarch64/sve/slp_15.c: New test.
>       * g++.target/aarch64/sve/pr96974.C: Remove scan test.
> ---
>  gcc/testsuite/g++.target/aarch64/sve/pr96974.C |  4 +---
>  gcc/testsuite/gcc.target/aarch64/sve/slp_15.c  | 17 +++++++++++++++++
>  gcc/tree-vect-stmts.cc                         | 10 ++++++++++
>  3 files changed, 28 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
> 
> diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C 
> b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
> index 54000f568ab..2f6ebd6ce3d 100644
> --- a/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
> +++ b/gcc/testsuite/g++.target/aarch64/sve/pr96974.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4 
> -fdump-tree-slp-details" } */
> +/* { dg-options "-Ofast -march=armv8.2-a+sve -fdisable-tree-fre4" } */
>  
>  float a;
>  int
> @@ -14,5 +14,3 @@ struct c {
>      }
>      int coeffs[10];
>  } f;
> -
> -/* { dg-final { scan-tree-dump "Not vectorized: Incompatible number of 
> vector subparts between" "slp1" { target lp64 } } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
> new file mode 100644
> index 00000000000..23f6d567cc5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_15.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +
> +struct foo
> +{
> +  void *handle;
> +  void *arg;
> +};
> +
> +void
> +dlinfo_doit (struct foo *args)
> +{
> +  __UINTPTR_TYPE__ **l = args->handle;
> +
> +  *(__UINTPTR_TYPE__ *) args->arg = 0;
> +  *(__UINTPTR_TYPE__ *) args->arg = **l;
> +}
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index c9dab217f05..7748c42c70f 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -11486,6 +11486,16 @@ get_related_vectype_for_scalar_type (machine_mode 
> prevailing_mode,
>  
>    unsigned int nbytes = GET_MODE_SIZE (inner_mode);
>  
> +  /* Interoperability between modes requires one to be a constant multiple
> +     of the other, so that the number of vectors required for each operation
> +     is a compile-time constant.  */
> +  if (prevailing_mode != VOIDmode
> +      && !constant_multiple_p (nunits * nbytes,
> +                            GET_MODE_SIZE (prevailing_mode))
> +      && !constant_multiple_p (GET_MODE_SIZE (prevailing_mode),
> +                            nunits * nbytes))
> +    return NULL_TREE;
> +
>    /* For vector types of elements whose mode precision doesn't
>       match their types precision we use a element type of mode
>       precision.  The vectorization routines will have to make sure
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Reply via email to