> -----Original Message-----
> From: Richard Sandiford <richard.sandif...@arm.com>
> Sent: Thursday, March 6, 2025 10:40 AM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; ktkac...@gcc.gnu.org
> Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last
> extractions [PR118464]
>
> Tamar Christina <tamar.christ...@arm.com> writes:
> >> -----Original Message-----
> >> From: Richard Sandiford <richard.sandif...@arm.com>
> >> Sent: Wednesday, March 5, 2025 11:27 AM
> >> To: Tamar Christina <tamar.christ...@arm.com>
> >> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Earnshaw
> >> <richard.earns...@arm.com>; ktkac...@gcc.gnu.org
> >> Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last
> >> extractions [PR118464]
> >>
> >> Tamar Christina <tamar.christ...@arm.com> writes:
> >> >> > diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
> >> >> > index e975286a01904bec0b283b7ba4afde6f0fd60bf1..6c0be3c1a51449274720175b5e6e7d7535928de6 100644
> >> >> > --- a/gcc/config/aarch64/aarch64-sve.md
> >> >> > +++ b/gcc/config/aarch64/aarch64-sve.md
> >> >> > @@ -3107,7 +3107,7 @@ (define_insn "@extract_<last_op>_<mode>"
> >> >> >   [(set (match_operand:<VEL> 0 "register_operand")
> >> >> >     (unspec:<VEL>
> >> >> >       [(match_operand:<VPRED> 1 "register_operand")
> >> >> > -      (match_operand:SVE_FULL 2 "register_operand")]
> >> >> > +      (match_operand:SVE_ALL 2 "register_operand")]
> >> >> >       LAST))]
> >> >> >   "TARGET_SVE"
> >> >> >   {@ [ cons: =0 , 1   , 2 ]
> >> >>
> >> >> It looks like this will use (say):
> >> >>
> >> >>   lasta b<n>, pg, z<m>.b
> >> >>
> >> >> for VNx4QI, is that right?  I don't think that's safe, since the .b form
> >> >> treats all bits of the pg input as significant, whereas only one in every
> >> >> four bits of pg is defined for VNx4BI (the predicate associated with
> >> >> VNx4QI).
> >> >>
> >> >> I think converting these patterns to partial vectors means operating
> >> >> on containers rather than elements.  E.g. the VNx4QI case should use
> >> >> .s rather than .b.  That should just be a case of changing vwcore to
> >> >> vccore and Vetype to Vctype, but I haven't looked too deeply.
> >> >
> >> > Ah I see, so for partial types, the values are not expected to be packed
> >> > in the lower part of the vector, but instead are "padded"?
> >>
> >> Right.
> >>
> >> > That explains some of the other patterns I was confused about.
> >> >
> >> > Any ideas how to test these? It's hard to control what modes the
> >> > vectorizer picks..
> >>
> >> Yeah, agreed.  I think it'd be difficult to trigger it reliably from the
> >> vectoriser given its current limited use of the ifns.
> >>
> >> A gimple frontend test might work though, with a predicate/mask
> >> generated from (say) 16-bit elements, then bitcast to a predicate/mask
> >> for 32-bit elements and used as an input to an explicit ifn on 32-bit
> >> elements.  If the 16-bit predicate contains 0, 1 for some even-aligned
> >> pair, after the last 1, 0 aligned pair, then the code would likely have
> >> picked the wrong element.
> >
> > *insert chaos emoji* I realized the testcases in the testsuite early break
> > vect tests needed this since that's how I noticed it anyway.. so it's enough
> > to run that with SVE enabled (as there are tests depending on partial
> > vectors anyway)
> >
> > So...
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
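To make the container point above concrete, here is a rough Python model of the LASTB-style extraction on a 128-bit vector holding VNx4QI. This is purely illustrative, not GCC or architecture code: the helper name `lastb`, the element values, and the stray "undefined" predicate bit are invented for the example.

```python
# Hedged sketch: why a VNx4QI extraction must scan 32-bit containers (.s)
# rather than byte lanes (.b).  A VNx4BI predicate only defines one bit per
# 32-bit container; the other bits may hold junk.

def lastb(bytes_, pred_bits, esize):
    """Return the last active element, scanning lanes of `esize` bytes.
    If no lane is active, fall back to the final lane."""
    nlanes = len(bytes_) // esize
    last = nlanes - 1
    for lane in range(nlanes):
        if pred_bits[lane * esize]:   # governing bit = lowest bit of the lane
            last = lane
    return bytes_[last * esize]       # QI payload sits in the container's low byte

# VNx4QI on a 128-bit vector: four char elements, each padded to 32 bits.
vec = [10, 0, 0, 0, 20, 0, 0, 0, 30, 0, 0, 0, 40, 0, 0, 0]

# VNx4BI predicate: lanes 0 and 1 active.  Only bits 0, 4, 8, 12 are
# defined; bit 6 (an undefined position for this mode) happens to be set.
pred = [0] * 16
pred[0] = pred[4] = 1
pred[6] = 1                           # junk in an undefined bit

print(lastb(vec, pred, 4))   # .s view, container-based: correct, prints 20
print(lastb(vec, pred, 1))   # .b view treats bit 6 as live: padding byte 0
```

The `.b` form takes the junk bit at face value and returns a padding byte, which is exactly the failure mode the review describes.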
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	PR tree-optimization/118464
> > 	PR tree-optimization/116855
> > 	* config/aarch64/aarch64-sve.md (@extract_<last_op>_<mode>,
> > 	@fold_extract_<last_op>_<mode>,
> > 	@aarch64_fold_extract_vector_<last_op>_<mode>): Change SVE_FULL to
> > 	SVE_ALL.
> >
> > -- inline copy of patch --
> >
> > diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
> > index a93bc463a909ea28460cc7877275fce16e05f7e6..205eeec2e35544de848e0dbb48e3f5ae59391a88 100644
> > --- a/gcc/config/aarch64/aarch64-sve.md
> > +++ b/gcc/config/aarch64/aarch64-sve.md
> > @@ -3107,12 +3107,12 @@ (define_insn "@extract_<last_op>_<mode>"
> >   [(set (match_operand:<VEL> 0 "register_operand")
> >     (unspec:<VEL>
> >       [(match_operand:<VPRED> 1 "register_operand")
> > -      (match_operand:SVE_FULL 2 "register_operand")]
> > +      (match_operand:SVE_ALL 2 "register_operand")]
> >       LAST))]
> >   "TARGET_SVE"
> >   {@ [ cons: =0 , 1   , 2 ]
> > -    [ ?r       , Upl , w ] last<ab>\t%<vwcore>0, %1, %2.<Vetype>
> > -    [ w        , Upl , w ] last<ab>\t%<Vetype>0, %1, %2.<Vetype>
> > +    [ ?r       , Upl , w ] last<ab>\t%<vwcore>0, %1, %2.<Vctype>
>
> This looks good for the Vetype->Vctype part, but it still needs the
> vwcore->vccore change too.  If we don't do that, we could end up pairing
> a w destination with a .d source, which would trigger an assembler error.
The sad part is, I went and checked the instruction before posting this and
missed that that wasn't allowed.. arggg..

Tamar

> Thanks,
> Richard
>
> > +    [ w        , Upl , w ] last<ab>\t%<Vctype>0, %1, %2.<Vctype>
> >   }
> > )
> >
> > @@ -8899,26 +8899,26 @@ (define_insn "@fold_extract_<last_op>_<mode>"
> >     (unspec:<VEL>
> >       [(match_operand:<VEL> 1 "register_operand")
> >        (match_operand:<VPRED> 2 "register_operand")
> > -      (match_operand:SVE_FULL 3 "register_operand")]
> > +      (match_operand:SVE_ALL 3 "register_operand")]
> >       CLAST))]
> >   "TARGET_SVE"
> >   {@ [ cons: =0 , 1 , 2   , 3 ]
> > -    [ ?r       , 0 , Upl , w ] clast<ab>\t%<vwcore>0, %2, %<vwcore>0, %3.<Vetype>
> > -    [ w        , 0 , Upl , w ] clast<ab>\t%<Vetype>0, %2, %<Vetype>0, %3.<Vetype>
> > +    [ ?r       , 0 , Upl , w ] clast<ab>\t%<vwcore>0, %2, %<vwcore>0, %3.<Vctype>
> > +    [ w        , 0 , Upl , w ] clast<ab>\t%<Vctype>0, %2, %<Vctype>0, %3.<Vctype>
> >   }
> > )
> >
> > (define_insn "@aarch64_fold_extract_vector_<last_op>_<mode>"
> > -  [(set (match_operand:SVE_FULL 0 "register_operand")
> > -    (unspec:SVE_FULL
> > -      [(match_operand:SVE_FULL 1 "register_operand")
> > +  [(set (match_operand:SVE_ALL 0 "register_operand")
> > +    (unspec:SVE_ALL
> > +      [(match_operand:SVE_ALL 1 "register_operand")
> >        (match_operand:<VPRED> 2 "register_operand")
> > -      (match_operand:SVE_FULL 3 "register_operand")]
> > +      (match_operand:SVE_ALL 3 "register_operand")]
> >       CLAST))]
> >   "TARGET_SVE"
> >   {@ [ cons: =0 , 1 , 2   , 3 ]
> > -    [ w        , 0 , Upl , w ] clast<ab>\t%0.<Vetype>, %2, %0.<Vetype>, %3.<Vetype>
> > -    [ ?&w      , w , Upl , w ] movprfx\t%0, %1\;clast<ab>\t%0.<Vetype>, %2, %0.<Vetype>, %3.<Vetype>
> > +    [ w        , 0 , Upl , w ] clast<ab>\t%0.<Vctype>, %2, %0.<Vctype>, %3.<Vctype>
> > +    [ ?&w      , w , Upl , w ] movprfx\t%0, %1\;clast<ab>\t%0.<Vctype>, %2, %0.<Vctype>, %3.<Vctype>
> >   }
> > )
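For reference, the CLASTB-style fold behind the @fold_extract patterns above can be sketched the same way: keep a scalar fallback when no governing bit is set, otherwise take the last active element, again scanning containers for partial modes. The helper name `fold_extract_lastb` and all values below are invented for illustration; this is a rough model, not GCC's or the architecture's definition.

```python
# Hedged sketch of a CLASTB-style fold on 32-bit containers: if any lane is
# active, take the last active element; otherwise keep the scalar fallback
# (the role played by the tied operand 1 in the pattern).

def fold_extract_lastb(fallback, pred_bits, bytes_, esize):
    """Scan lanes of `esize` bytes; a lane's governing predicate bit is the
    lowest bit of its container (one defined bit per container)."""
    result = fallback
    for lane in range(len(bytes_) // esize):
        if pred_bits[lane * esize]:
            result = bytes_[lane * esize]
    return result

# Four char elements padded to 32-bit containers, as in a VNx4QI register.
vec = [10, 0, 0, 0, 20, 0, 0, 0, 30, 0, 0, 0, 40, 0, 0, 0]

none_active = [0] * 16
print(fold_extract_lastb(99, none_active, vec, 4))   # fallback survives: 99

two_active = [0] * 16
two_active[0] = two_active[8] = 1                    # lanes 0 and 2 active
print(fold_extract_lastb(99, two_active, vec, 4))    # last active lane: 30
```

Scanning byte lanes instead of containers here would have the same failure mode as in the LASTB case: an undefined predicate bit could redirect the fold to a padding byte.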