Andrew Pinski <quic_apin...@quicinc.com> writes:
> For bar1 and bar2, we currently expect the bsl instruction, but with
> slightly different register allocation inside the loop (which happens
> after the removal of the vcond{,u,eq} patterns), we get the bit
> instruction.  The pattern that outputs the bsl instruction will also
> output bit or bif, depending on register allocation.
>
> So let's check for the bsl, bit or bif instruction instead of just bsl.
>
> Tested on aarch64, both with an unmodified compiler and with one that has
> the patch to disable these optabs.
>
> gcc/testsuite/ChangeLog:
>
>       PR testsuite/116041
>       * gcc.target/aarch64/if-compare_2.c: Support bit and bif for
>       both bar1 and bar2; add comment on why too.

OK, thanks.

Richard

>
> Signed-off-by: Andrew Pinski <quic_apin...@quicinc.com>
> ---
>  gcc/testsuite/gcc.target/aarch64/if-compare_2.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/if-compare_2.c b/gcc/testsuite/gcc.target/aarch64/if-compare_2.c
> index 14988abac45..f5a2b1956e3 100644
> --- a/gcc/testsuite/gcc.target/aarch64/if-compare_2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/if-compare_2.c
> @@ -8,6 +8,7 @@
>  
>  typedef int v4si __attribute__ ((vector_size (16)));
>  
> +
>  /*
>  **foo1:
>  **   cmgt    v0.4s, v1.4s, v0.4s
> @@ -29,11 +30,13 @@ v4si foo2 (v4si a, v4si b, v4si c, v4si d) {
>  }
>  
>  
> +/* The bsl could be bit or bif depending on register
> +   allocator inside the loop. */
>  /**
>  **bar1:
>  **...
>  **   cmge    v[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s
> -**   bsl     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> +**   (bsl|bit|bif)   v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
>  **   and     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
>  **...
>  */
> @@ -44,11 +47,13 @@ void bar1 (int * restrict a, int * restrict b, int * restrict c,
>      res[i] = ((a[i] < b[i]) & c[i]) | ((a[i] >= b[i]) & d[i]);
>  }
>  
> +/* The bsl could be bit or bif depending on register
> +   allocator inside the loop. */
>  /**
>  **bar2:
>  **...
>  **   cmge    v[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s
> -**   bsl     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
> +**   (bsl|bit|bif)   v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b
>  **...
>  */
>  void bar2 (int * restrict a, int * restrict b, int * restrict c,

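For readers wondering why the test can accept any of the three mnemonics:
bsl, bit and bif all compute the same bitwise select and differ only in
which input shares a register with the destination.  The following is a
minimal scalar sketch of that, modelling a single 64-bit lane.  The helper
names (model_bsl, model_bit, model_bif, select_ref,
check_select_equivalence) are hypothetical and exist only for this
illustration; they are not part of GCC or of the test.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar models of the three AdvSIMD bitwise-select instructions,
   applied to one 64-bit lane.  Each returns the new value of the
   destination register Vd.  Names are illustrative only.  */

/* BSL Vd, Vn, Vm: Vd holds the mask on entry.  */
static uint64_t model_bsl (uint64_t vd, uint64_t vn, uint64_t vm)
{
  return (vd & vn) | (~vd & vm);
}

/* BIT Vd, Vn, Vm: insert bits of Vn into Vd where Vm is 1.  */
static uint64_t model_bit (uint64_t vd, uint64_t vn, uint64_t vm)
{
  return (vd & ~vm) | (vn & vm);
}

/* BIF Vd, Vn, Vm: insert bits of Vn into Vd where Vm is 0.  */
static uint64_t model_bif (uint64_t vd, uint64_t vn, uint64_t vm)
{
  return (vd & vm) | (vn & ~vm);
}

/* Reference: take bits of x where mask is set, bits of y elsewhere.  */
static uint64_t select_ref (uint64_t mask, uint64_t x, uint64_t y)
{
  return (mask & x) | (~mask & y);
}

/* All three forms yield the same select; only the operand that is
   tied to the destination register differs.  */
static void check_select_equivalence (uint64_t mask, uint64_t x, uint64_t y)
{
  uint64_t want = select_ref (mask, x, y);
  assert (model_bsl (mask, x, y) == want);  /* destination tied to mask */
  assert (model_bit (y, x, mask) == want);  /* destination tied to y */
  assert (model_bif (x, y, mask) == want);  /* destination tied to x */
}
```

Which value ends up tied to the destination depends on the register
allocator, which is consistent with the patch matching (bsl|bit|bif)
rather than bsl alone.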