On Thu, Jul 6, 2023 at 11:37 PM Maciej W. Rozycki <ma...@embecosm.com> wrote:
>
> The bb-slp-pr95839.c test assumes quad-single float vector support, but
> some targets only support pairs of floats, causing this test to fail
> with such targets.  Limit this test to targets that support at least
> 128-bit vectors then, and add a complementing test that can be run with
> targets that have support for 64-bit vectors only.  There is no need to
> adjust bb-slp-pr95839-2.c as 128 bits are needed even for the smallest
> vector of doubles, so support is implied by the presence of vectors of
> doubles.

I wonder why you see the testcase FAIL.  On x86-64, when compiling

typedef float __attribute__((vector_size(32))) v8f32;

v8f32 f(v8f32 a, v8f32 b)
{
  /* Check that we vectorize this CTOR without any loads.  */
  return (v8f32){a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3],
                 a[4] + b[4], a[5] + b[5], a[6] + b[6], a[7] + b[7]};
}

I see that we vectorize the add and the "store".  We fail to perform
the extraction from the incoming vectors (unless you enable AVX);
that's a missed optimization.

So with paired floats I would expect something similar?  Maybe
x86 is saved by the kind-of-presence (but disabled) of V8SFmode vectors.
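
For reference, here is a minimal sketch of the paired-float case, assuming
the GCC vector extensions.  It only checks that the element-wise CTOR from
the proposed bb-slp-pr95839-v8.c test computes the same value as a
whole-vector add, which is the form one would hope BB SLP arrives at.  The
names add_ctor, add_vec and forms_agree are made up for illustration and
are not part of the patch; nothing here shows what a given target actually
emits.

```c
#include <assert.h>
#include <string.h>

typedef float __attribute__((vector_size(8))) v2f32;

/* Element-wise CTOR, written as in the proposed v2f32 test.  */
static v2f32 add_ctor(v2f32 a, v2f32 b)
{
  return (v2f32){a[0] + b[0], a[1] + b[1]};
}

/* Whole-vector add via the GCC vector extension.  */
static v2f32 add_vec(v2f32 a, v2f32 b)
{
  return a + b;
}

/* Hypothetical helper: returns 1 if both forms agree bit-for-bit.  */
static int forms_agree(float a0, float a1, float b0, float b1)
{
  /* A pair of floats should occupy exactly 8 bytes, with no padding.  */
  assert(sizeof(v2f32) == 2 * sizeof(float));

  v2f32 a = {a0, a1};
  v2f32 b = {b0, b1};
  v2f32 x = add_ctor(a, b);
  v2f32 y = add_vec(a, b);
  return memcmp(&x, &y, sizeof x) == 0;
}
```

Whether add_ctor is actually turned into a single vector add without
loads is what the slp2 dump scanned by the test would tell you.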

That said, we should handle this better, so can you file an
enhancement bug report for this?

Thanks,
Richard.

>         gcc/testsuite/
>         * gcc.dg/vect/bb-slp-pr95839.c: Limit to `vect128' targets.
>         * gcc.dg/vect/bb-slp-pr95839-v8.c: New test.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-v8.c |   14 ++++++++++++++
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr95839.c    |    1 +
>  2 files changed, 15 insertions(+)
>
> gcc-test-bb-slp-pr95839-vect128.diff
> Index: gcc/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-v8.c
> ===================================================================
> --- /dev/null
> +++ gcc/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-v8.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_float } */
> +/* { dg-require-effective-target vect64 } */
> +/* { dg-additional-options "-w -Wno-psabi" } */
> +
> +typedef float __attribute__((vector_size(8))) v2f32;
> +
> +v2f32 f(v2f32 a, v2f32 b)
> +{
> +  /* Check that we vectorize this CTOR without any loads.  */
> +  return (v2f32){a[0] + b[0], a[1] + b[1]};
> +}
> +
> +/* { dg-final { scan-tree-dump "optimized: basic block" "slp2" } } */
> Index: gcc/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839.c
> ===================================================================
> --- gcc.orig/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839.c
> +++ gcc/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-require-effective-target vect_float } */
> +/* { dg-require-effective-target vect128 } */
>  /* { dg-additional-options "-w -Wno-psabi" } */
>
>  typedef float __attribute__((vector_size(16))) v4f32;
