> -----Original Message-----
> From: Christopher Bazley <[email protected]>
> Sent: 03 June 2026 16:19
> To: [email protected]
> Cc: Tamar Christina <[email protected]>;
> [email protected]; [email protected]; Chris Bazley
> <[email protected]>
> Subject: [PATCH v11 07/12] AArch64/SVE: Relax the expectations of the
> popcnt-sve test
>
> When predicated tails are enabled for basic block SLP vectorization,
> the assembly language generated by GCC when compiling popcnt-sve.c
> will change. Relax the regular expressions used by this test in
> preparation.
>
> Currently, analysis of f_v8hi succeeds with vector mode V16QI and the
> following GIMPLE is produced:
>
>   vector(8) short unsigned intD.19 vect__1.18D.4648;
>   ...
>   vect__1.18_69 = MEM <vector(8) short unsigned intD.19>
>     [(short unsigned intD.19 *)vectp.17_68 clique 1 base 1];
>   vect_patt_60.19_70 = .POPCOUNT (vect__1.18_69);
>
> With predicated tails, analysis instead succeeds with a variable-length
> vector mode and the following GIMPLE is produced:
>
>   vector([8,8]) short unsigned intD.19 vect__1.18D.4649;
>   ...
>   slp_mask_45 = .WHILE_ULT (0, 8, { 0, ... }); # VUSE <.MEM_25(D)>
>   vect__1.18_46 = .MASK_LOAD (vectp.17_44, 16B, slp_mask_45, { 0, ... });
>   vect_patt_36.19_47 = .POPCOUNT (vect__1.18_46);
>
> When lowered to RTL, the WHILE_ULT is replaced by
> reinterpretation of a V16QI as VNx8HI:
>
> (insn 7 4 8 2 (
>   set (reg:V16QI 107) (mem:V16QI (reg/v/f:DI 103 [ b ]) [1 S16 A16])
>   ) "gcc.target/aarch64/popcnt-sve.c":33:8 discrim 1 -1 (nil))
>
> (insn 8 7 9 2 (
>   set (reg:VNx8HI 106) (subreg:VNx8HI (reg:V16QI 107) 0))
>   "gcc.target/aarch64/popcnt-sve.c":33:8 discrim 1 -1 (nil))
>
> A mask is still required to lower POPCOUNT, so an all-ones mask
> is synthesized:
>
> (insn 9 8 10 2 (set (reg:VNx16BI 108)
>   (const_vector:VNx16BI repeat [(const_int 1 [0x1])
>   ])) "gcc.target/aarch64/popcnt-sve.c":69:8 discrim 1 -1
>   (nil))
>
> (insn 10 9 11 2 (set (reg:VNx4SI 105)
>   (unspec:VNx4SI [
>     (subreg:VNx4BI (reg:VNx16BI 108) 0)
>     (popcount:VNx4SI (reg:VNx4SI 106))
>   ] UNSPEC_PRED_X))
>   "gcc.target/aarch64/popcnt-sve.c":69:8 discrim 1 -1
>   (nil))
>
> However, this mask is not the same as the specific-width mask
> currently expected by the tests.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/popcnt-sve.c: Update test expectations
> 	to allow both current and alternative valid mask
> 	specifications.

OK.

Thanks,
Tamar

> ---
>  gcc/testsuite/gcc.target/aarch64/popcnt-sve.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c b/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c
> index c3b4c69b4b4..117a5ca8f1b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c
> +++ b/gcc/testsuite/gcc.target/aarch64/popcnt-sve.c
> @@ -4,7 +4,7 @@
>
>  /*
>  ** f_v4hi:
> -** ptrue (p[0-7]).b, vl8
> +** ptrue (p[0-7]).b, (?:vl8|all)
>  ** ldr d([0-9]+), \[x0\]
>  ** cnt z\2.h, \1/m, z\2.h
>  ** str d\2, \[x1\]
> @@ -21,7 +21,7 @@ f_v4hi (unsigned short *__restrict b, unsigned short *__restrict d)
>
>  /*
>  ** f_v8hi:
> -** ptrue (p[0-7]).b, vl16
> +** ptrue (p[0-7]).b, (?:vl16|all)
>  ** ldr q([0-9]+), \[x0\]
>  ** cnt z\2.h, \1/m, z\2.h
>  ** str q\2, \[x1\]
> @@ -42,7 +42,7 @@ f_v8hi (unsigned short *__restrict b, unsigned short *__restrict d)
>
>  /*
>  ** f_v2si:
> -** ptrue (p[0-7]).b, vl8
> +** ptrue (p[0-7]).b, (?:vl8|all)
>  ** ldr d([0-9]+), \[x0\]
>  ** cnt z\2.s, \1/m, z\2.s
>  ** str d\2, \[x1\]
> @@ -57,7 +57,7 @@ f_v2si (unsigned int *__restrict b, unsigned int *__restrict d)
>
>  /*
>  ** f_v4si:
> -** ptrue (p[0-7]).b, vl16
> +** ptrue (p[0-7]).b, (?:vl16|all)
>  ** ldr q([0-9]+), \[x0\]
>  ** cnt z\2.s, \1/m, z\2.s
>  ** str q\2, \[x1\]
> @@ -74,7 +74,7 @@ f_v4si (unsigned int *__restrict b, unsigned int *__restrict d)
>
>  /*
>  ** f_v2di:
> -** ptrue (p[0-7]).b, vl16
> +** ptrue (p[0-7]).b, (?:vl16|all)
>  ** ldr q([0-9]+), \[x0\]
>  ** cnt z\2.d, \1/m, z\2.d
>  ** str q\2, \[x1\]
> --
> 2.43.0
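
For illustration only, and not part of the patch: a minimal Python sketch of
why the relaxed expectations use a non-capturing group. The later lines of
each expected body refer back to the predicate register as \1 and the vector
register as \2, so the new alternation must not introduce a capture of its
own. Python's re module stands in here for the regexp engine the testsuite
actually uses, whitespace matching is simplified, and the sample assembly
strings and the matches helper are made up for the example.

```python
import re

# Relaxed expectation for f_v4hi, taken from the patch. Whitespace matching is
# simplified; the real testsuite harness handles it differently.
RELAXED = (
    r"ptrue\s+(p[0-7]).b, (?:vl8|all)\n"
    r"\s*ldr\s+d([0-9]+), \[x0\]\n"
    r"\s*cnt\s+z\2.h, \1/m, z\2.h\n"
    r"\s*str\s+d\2, \[x1\]"
)

# Same pattern with a capturing group: the group numbering shifts, so \2 now
# refers to "vl8" or "all" instead of the register number.
CAPTURING = RELAXED.replace("(?:vl8|all)", "(vl8|all)")

# Made-up sample output; the register numbers are illustrative only.
OLD_ASM = (
    "ptrue\tp7.b, vl8\n"
    "\tldr\td31, [x0]\n"
    "\tcnt\tz31.h, p7/m, z31.h\n"
    "\tstr\td31, [x1]"
)
NEW_ASM = OLD_ASM.replace("vl8", "all")


def matches(pattern, asm):
    """Hypothetical helper: True if the expectation matches the assembly."""
    return re.search(pattern, asm) is not None


if __name__ == "__main__":
    print(matches(RELAXED, OLD_ASM))    # specific-width mask
    print(matches(RELAXED, NEW_ASM))    # all-ones mask
    print(matches(CAPTURING, OLD_ASM))  # back-references now misaligned
```

In this sketch the relaxed pattern accepts both the vl8 and the all form,
while the capturing variant accepts neither, because z\2 would then have to
match the mask specifier rather than the register number.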
