> -----Original Message-----
> From: Christopher Bazley <[email protected]>
> Sent: 03 June 2026 16:19
> To: [email protected]
> Cc: Tamar Christina <[email protected]>;
> [email protected]; [email protected]; Chris Bazley
> <[email protected]>
> Subject: [PATCH v11 07/12] AArch64/SVE: Relax the expectations of the
> popcnt-sve test
> 
> When predicated tails are enabled for basic block SLP vectorization,
> the assembly language generated by GCC when compiling popcnt-sve.c
> will change. Relax the regular expressions used by this test in
> preparation.
> 
> Currently, analysis of f_v8hi succeeds with vector mode V16QI and the
> following GIMPLE is produced:
> 
> vector(8) short unsigned intD.19 vect__1.18D.4648;
> ...
> vect__1.18_69 = MEM <vector(8) short unsigned intD.19>
>   [(short unsigned intD.19 *)vectp.17_68 clique 1 base 1];
> vect_patt_60.19_70 = .POPCOUNT (vect__1.18_69);
> 
> With predicated tails, analysis instead succeeds with a variable-length
> vector mode and the following GIMPLE is produced:
> 
> vector([8,8]) short unsigned intD.19 vect__1.18D.4649;
> ...
> slp_mask_45 = .WHILE_ULT (0, 8, { 0, ... }); # VUSE <.MEM_25(D)>
> vect__1.18_46 = .MASK_LOAD (vectp.17_44, 16B, slp_mask_45, { 0, ... });
> vect_patt_36.19_47 = .POPCOUNT (vect__1.18_46);
> 
> When lowered to RTL, the WHILE_ULT is replaced by
> reinterpretation of a V16QI as VNx8HI:
> 
> (insn 7 4 8 2 (
>   set (reg:V16QI 107) (mem:V16QI (reg/v/f:DI 103 [ b ]) [1 S16 A16])
> ) "gcc.target/aarch64/popcnt-sve.c":33:8 discrim 1 -1 (nil))
> 
> (insn 8 7 9 2 (
>   set (reg:VNx8HI 106) (subreg:VNx8HI (reg:V16QI 107) 0))
>   "gcc.target/aarch64/popcnt-sve.c":33:8 discrim 1 -1 (nil))
> 
> A mask is still required to lower POPCOUNT, so an all-ones mask
> is synthesized:
> 
> (insn 9 8 10 2 (set (reg:VNx16BI 108)
>   (const_vector:VNx16BI repeat [(const_int 1 [0x1])
>   ])) "gcc.target/aarch64/popcnt-sve.c":69:8 discrim 1 -1
> (nil))
> 
> (insn 10 9 11 2 (set (reg:VNx4SI 105)
>   (unspec:VNx4SI [
>     (subreg:VNx4BI (reg:VNx16BI 108) 0)
>     (popcount:VNx4SI (reg:VNx4SI 106))
>   ] UNSPEC_PRED_X))
>   "gcc.target/aarch64/popcnt-sve.c":69:8 discrim 1 -1
> (nil))
> 
> However, this all-ones mask (ptrue ... all) is not the same as the
> fixed-width masks (vl8 or vl16) currently expected by the tests.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/aarch64/popcnt-sve.c: Update test expectations
>       to allow both current and alternative valid mask
>       specifications.

OK.

Thanks,
Tamar

> ---
>  gcc/testsuite/gcc.target/aarch64/popcnt-sve.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c b/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c
> index c3b4c69b4b4..117a5ca8f1b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c
> +++ b/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c
> @@ -4,7 +4,7 @@
> 
>  /*
>  ** f_v4hi:
> -**   ptrue   (p[0-7]).b, vl8
> +**   ptrue   (p[0-7]).b, (?:vl8|all)
>  **   ldr     d([0-9]+), \[x0\]
>  **   cnt     z\2.h, \1/m, z\2.h
>  **   str     d\2, \[x1\]
> @@ -21,7 +21,7 @@ f_v4hi (unsigned short *__restrict b, unsigned short *__restrict d)
> 
>  /*
>  ** f_v8hi:
> -**   ptrue   (p[0-7]).b, vl16
> +**   ptrue   (p[0-7]).b, (?:vl16|all)
>  **   ldr     q([0-9]+), \[x0\]
>  **   cnt     z\2.h, \1/m, z\2.h
>  **   str     q\2, \[x1\]
> @@ -42,7 +42,7 @@ f_v8hi (unsigned short *__restrict b, unsigned short *__restrict d)
> 
>  /*
>  ** f_v2si:
> -**   ptrue   (p[0-7]).b, vl8
> +**   ptrue   (p[0-7]).b, (?:vl8|all)
>  **   ldr     d([0-9]+), \[x0\]
>  **   cnt     z\2.s, \1/m, z\2.s
>  **   str     d\2, \[x1\]
> @@ -57,7 +57,7 @@ f_v2si (unsigned int *__restrict b, unsigned int *__restrict d)
> 
>  /*
>  ** f_v4si:
> -**   ptrue   (p[0-7]).b, vl16
> +**   ptrue   (p[0-7]).b, (?:vl16|all)
>  **   ldr     q([0-9]+), \[x0\]
>  **   cnt     z\2.s, \1/m, z\2.s
>  **   str     q\2, \[x1\]
> @@ -74,7 +74,7 @@ f_v4si (unsigned int *__restrict b, unsigned int *__restrict d)
> 
>  /*
>  ** f_v2di:
> -**   ptrue   (p[0-7]).b, vl16
> +**   ptrue   (p[0-7]).b, (?:vl16|all)
>  **   ldr     q([0-9]+), \[x0\]
>  **   cnt     z\2.d, \1/m, z\2.d
>  **   str     q\2, \[x1\]
> --
> 2.43.0
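
For readers following along, here is a minimal sketch of why the patch uses a
non-capturing group, (?:vl16|all), rather than a plain (vl16|all). It is
written in Python purely for illustration; the real harness is DejaGnu/Tcl,
and the sketch assumes both regex engines treat these constructs the same way.
The helper body_regex, the register numbers p7 and q31, and the two sample
assembly strings are hypothetical and do not come from the patch.

```python
import re


def body_regex(ptrue_operand):
    """Build a regex for the f_v8hi body with the given ptrue pattern operand.

    The "**" markers are dropped and whitespace is matched with \\s, which is
    a simplification made for this sketch.
    """
    lines = [
        r"ptrue\s+(p[0-7]).b, " + ptrue_operand,
        r"ldr\s+q([0-9]+), \[x0\]",
        r"cnt\s+z\2.h, \1/m, z\2.h",   # \1 = predicate, \2 = vector register
        r"str\s+q\2, \[x1\]",
    ]
    return re.compile(r"\s*" + r"\n\s*".join(lines))


# Hypothetical compiler output: fixed-width mask versus all-ones mask.
ASM_VL16 = "\tptrue\tp7.b, vl16\n\tldr\tq31, [x0]\n\tcnt\tz31.h, p7/m, z31.h\n\tstr\tq31, [x1]"
ASM_ALL = ASM_VL16.replace("vl16", "all")

old = body_regex("vl16")                # expectation before the patch
relaxed = body_regex("(?:vl16|all)")    # expectation after the patch
capturing = body_regex("(vl16|all)")    # naive alternative, shown for contrast

# The old pattern only accepts the fixed-width mask.
old_accepts_all = old.match(ASM_ALL) is not None

# The relaxed pattern accepts both forms, and \2 still names the ldr register.
relaxed_accepts_both = all(relaxed.match(a) for a in (ASM_VL16, ASM_ALL))

# A capturing group would become group 2, so \2 would stop referring to the
# vector register and the cnt line would no longer match.
capturing_accepts_any = any(capturing.match(a) for a in (ASM_VL16, ASM_ALL))
```

Under those assumptions, only the relaxed pattern matches both outputs, which
is consistent with the existing \1 and \2 back-references being left untouched
by the patch.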
