Prathamesh Kulkarni <[email protected]> writes:
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c
> index 4604365fbef..cedc5b7c549 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c
> @@ -56,7 +56,11 @@ TEST_ALL (DEF_LOOP)
> we're relying on combine to merge a SEL and an arithmetic operation,
> and the SEL doesn't allow the "false" value to be zero when the "true"
> value is a register. */
> -/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+, z[0-9]+\n} 14 } } */
> +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+, z[0-9]+\n} 7 } } */
> +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.b, p[0-9]/z, z[0-9]+\.b} 1 } } */
> +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.h, p[0-9]/z, z[0-9]+\.h} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-9]/z, z[0-9]+\.s} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-9]/z, z[0-9]+\.d} 2 } } */
Very minor, but: p[0-7] is more accurate than p[0-9], since the governing
predicate of a predicated movprfx can only be one of p0-p7.
OK with that change, thanks.
Richard
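
To illustrate the p[0-7] vs. p[0-9] point, here is a minimal sketch in C
using POSIX regcomp/regexec. It is only an approximation of what the
testsuite does: DejaGnu matches with Tcl regular expressions, not POSIX
ones, so the \t escapes are written here as literal tabs. The names
line_matches, loose_pat and strict_pat are hypothetical helpers, not part
of the patch.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if `line` matches the POSIX extended
   regular expression `pattern`, 0 otherwise (including on a compile
   error).  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  int rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* The .b pattern as posted: accepts any single-digit predicate,
   including p8 and p9.  */
const char loose_pat[] = "\tmovprfx\tz[0-9]+\\.b, p[0-9]/z, z[0-9]+\\.b";

/* The suggested form: accepts only p0-p7.  */
const char strict_pat[] = "\tmovprfx\tz[0-9]+\\.b, p[0-7]/z, z[0-9]+\\.b";
```

With these definitions, a line such as "\tmovprfx\tz0.b, p1/z, z1.b"
matches both patterns, while a made-up line using p9 as the governing
predicate matches only loose_pat. The compiler should never emit the
latter, so the two forms count the same lines in practice; the stricter
pattern simply documents the constraint.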
>
> /* { dg-final { scan-assembler-not {\tmov\tz[^\n]*z} } } */
> /* { dg-final { scan-assembler-not {\tsel\t} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr93183.c b/gcc/testsuite/gcc.target/aarch64/sve/pr93183.c
> new file mode 100644
> index 00000000000..2f92224cecb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr93183.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -mcpu=generic+sve" } */
> +
> +typedef unsigned char uint8_t;
> +
> +static inline uint8_t
> +x264_clip_uint8(uint8_t x)
> +{
> + uint8_t t = -x;
> + uint8_t t1 = x & ~63;
> + return (t1 != 0) ? t : x;
> +}
> +
> +void
> +mc_weight(uint8_t *restrict dst, uint8_t *restrict src, int n)
> +{
> + for (int x = 0; x < n*16; x++)
> + dst[x] = x264_clip_uint8(src[x]);
> +}
> +
> +/* { dg-final { scan-assembler-not {\tsel} } } */
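
For reference, the per-element operation in pr93183.c can be read as a
conditional negate: lanes whose top two bits are nonzero are negated and
all other lanes pass the input through unchanged. The sketch below spells
that out in scalar C and checks it against the expression from the
testcase. The names clip_as_in_test, neg_if_high_bits and count_mismatches
are hypothetical and not part of the patch, and the sketch says nothing
about the instructions the compiler actually selects.

```c
#include <stdint.h>

/* Same expression as x264_clip_uint8 in the new testcase.  */
uint8_t
clip_as_in_test (uint8_t x)
{
  uint8_t t = (uint8_t) -x;
  uint8_t t1 = x & ~63;
  return (t1 != 0) ? t : x;
}

/* Predicate-and-merge formulation: compute a per-element predicate,
   start from the input value, and negate only where the predicate is
   true.  For a uint8_t value, x & ~63 equals x & 0xC0.  */
uint8_t
neg_if_high_bits (uint8_t x)
{
  int pred = (x & 0xC0) != 0;
  uint8_t r = x;            /* inactive element keeps the input */
  if (pred)
    r = (uint8_t) -r;       /* active element is negated */
  return r;
}

/* Count the inputs on which the two formulations disagree.  */
int
count_mismatches (void)
{
  int bad = 0;
  for (int v = 0; v < 256; v++)
    if (clip_as_in_test ((uint8_t) v) != neg_if_high_bits ((uint8_t) v))
      bad++;
  return bad;
}
```

The two functions agree on all 256 inputs, which is why the test can
reasonably expect the select to fold into the negate and checks that no
sel instruction is left in the output.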