Re: [PATCH v2 3/3] aarch64: Add more vector permute tests for the FMOV optimization [PR100165]

Richard Sandiford Mon, 12 May 2025 14:33:55 -0700

Pengxuan Zheng <[email protected]> writes:
> diff --git a/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c b/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c
> new file mode 100644
> index 00000000000..adbf87243f6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/fmov-3-le.c
> @@ -0,0 +1,130 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mlittle-endian" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +typedef short v4hi __attribute__ ((vector_size (8)));
> +typedef char v8qi __attribute__ ((vector_size (8)));
> +typedef int v4si __attribute__ ((vector_size (16)));
> +typedef float v4sf __attribute__ ((vector_size (16)));
> +typedef short v8hi __attribute__ ((vector_size (16)));
> +typedef char v16qi __attribute__ ((vector_size (16)));
> +
> +/*
> +** f_v4hi:
> +**   fmov    s0, s0
> +**   ret
> +*/
> +v4hi
> +f_v4hi (v4hi x)
> +{
> +  return __builtin_shuffle (x, (v4hi){ 0, 0, 0, 0 }, (v4hi){ 0, 1, 4, 5 });
> +}
> +
> +/*
> +** g_v4hi:
> +**   uzp1    v([0-9]+).2d, v0.2d, v0.2d
> +**   adrp    x([0-9]+), .LC0
> +**   ldr     d([0-9]+), \[x\2, #:lo12:.LC0\]
> +**   tbl     v0.8b, {v\1.16b}, v\3.8b
> +**   ret


The important thing here is that we don't generate FMOV; the exact
sequence above doesn't matter.  The test could therefore be:

**      (?:(?!fmov).)*
**      ret

However, the test might be run with a newer architecture that has fp16
enabled, so it would be safer to add:

  #pragma GCC target "arch=armv8-a"

before the typedefs at the top of the file.
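
As a rough sketch only (not part of the patch), the top of the file could
then look something like the following, shown with just the f_v4hi case
quoted above.  The pragma uses the "arch=" spelling documented for AArch64
target attributes.  The #ifdef guard and the low_half_kept helper are
assumptions added purely so that the snippet builds and can be checked on
any GCC host; the real test would not need either.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -mlittle-endian" } */
/* { dg-final { check-function-bodies "**" "" "" } } */

/* Pin the baseline architecture so that extensions enabled by the test
   environment (e.g. fp16) cannot change the expected code.  The guard is
   an assumption for this sketch, so it also compiles on other hosts.  */
#ifdef __aarch64__
#pragma GCC target "arch=armv8-a"
#endif

typedef short v4hi __attribute__ ((vector_size (8)));

/*
** f_v4hi:
**   fmov    s0, s0
**   ret
*/
v4hi
f_v4hi (v4hi x)
{
  /* Mask indices 0-3 select from x, 4-7 from the zero vector, so the
     result is { x[0], x[1], 0, 0 }: the low 32 bits of x, rest zeroed.  */
  return __builtin_shuffle (x, (v4hi){ 0, 0, 0, 0 }, (v4hi){ 0, 1, 4, 5 });
}

/* Hypothetical helper (not in the patch): returns nonzero if f_v4hi keeps
   lanes 0-1 and zeroes lanes 2-3.  */
int
low_half_kept (void)
{
  v4hi r = f_v4hi ((v4hi){ 11, 22, 33, 44 });
  return r[0] == 11 && r[1] == 22 && r[2] == 0 && r[3] == 0;
}
```

The shuffle keeps the low two halfwords and clears the upper two, which on
little-endian is a copy of the low 32 bits with the rest of the register
zeroed, hence the expected FMOV of an S register.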

LGTM otherwise, thanks.  As with the other patches, please leave 24 hours
for others to comment, but otherwise the patch is ok with the changes above.

Richard
