RE: [PATCH] aarch64: Use SVE ASRD with vector division using division operator

Tamar Christina Wed, 22 Oct 2025 06:22:03 -0700

> -----Original Message-----
> From: Richard Biener <[email protected]>
> Sent: 22 October 2025 14:05
> To: Tamar Christina <[email protected]>
> Cc: Soumya AR <[email protected]>; [email protected]; Kyrylo
> Tkachov <[email protected]>
> Subject: RE: [PATCH] aarch64: Use SVE ASRD with vector division using
> division operator
> 
> On Wed, 22 Oct 2025, Tamar Christina wrote:
> 
> > Hi Soumya,
> >
> > > -----Original Message-----
> > > From: Soumya AR <[email protected]>
> > > Sent: 13 October 2025 13:42
> > > To: Tamar Christina <[email protected]>
> > > Cc: [email protected]; Kyrylo Tkachov <[email protected]>;
> > > Richard Biener <[email protected]>
> > > Subject: Re: [PATCH] aarch64: Use SVE ASRD with vector division using
> > > division operator
> > > Importance: High
> > >
> > >
> > >
> > > > On 13 Oct 2025, at 1:48 PM, Tamar Christina
> <[email protected]>
> > > wrote:
> > > >
> > > > External email: Use caution opening links or attachments
> > > >
> > > >
> > > > Hi Soumya,
> > > >
> > > > The 10/13/2025 07:05, Soumya AR wrote:
> > > >> The ASRD instruction on SVE performs an arithmetic shift right by an
> > > immediate
> > > >> for divide. This patch enables ASRD when dividing vectors using the GNU
> C
> > > >> division operator.
> > > >>
> > > >> For example:
> > > >>
> > > >> int32x4_t
> > > >> foo (int32x4_t x)
> > > >> {
> > > >>      return x / 4;
> > > >> }
> > > >>
> > > >> svint32_t
> > > >> bar (svint32_t x)
> > > >> {
> > > >>      return x / 4;
> > > >> }
> > > >>
> > > >> currently generates a DIV, but can be done using ASRD.
> > > >>
> > > >> The patch was bootstrapped and regtested on aarch64-linux-gnu, no
> > > regression.
> > > >> OK for GCC16?
> > > >>
> > > >> Signed-off-by: Soumya AR <[email protected]>
> > > >>
> > > >> gcc/ChangeLog:
> > > >>
> > > >>      * expmed.cc (expand_divmod): Expand to sdiv_pow2 optab for
> vectors.
> > > >>
> > > >> gcc/testsuite/ChangeLog:
> > > >>
> > > >>      * gcc.target/aarch64/sve/sve-asrd-2.c: New test.
> > > >>
> > > >
> > >
> > > Hi Tamar,
> > >
> > > Thanks for the quick review!
> > >
> > > > The test is OK, but you'll need a middle-end maintainer for the 
> > > > expmed.cc
> > > changes.
> > >
> > > My bad, CC'ing Richi here as well.
> > >
> > > > But that said...
> > > >
> > > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > > index df09cbccd08..a35aa229d44 100644
> > > > --- a/gcc/expmed.cc
> > > > +++ b/gcc/expmed.cc
> > > > @@ -4465,6 +4465,24 @@ expand_divmod (int rem_flag, enum
> tree_code
> > > code, machine_mode mode,
> > > >       {
> > > >       case TRUNC_MOD_EXPR:
> > > >       case TRUNC_DIV_EXPR:
> > > > +       if (CONST_VECTOR_P (op1)
> > > > +           && optab_handler (sdiv_pow2_optab, mode) !=
> CODE_FOR_nothing)
> > > > +         {
> > > > +           rtx scalar_op1 = unwrap_const_vec_duplicate (op1);
> > > > +           if (scalar_op1 != NULL_RTX && CONST_INT_P (scalar_op1))
> > > >
> > > > You can simplify this using const_vec_duplicate_p.
> > > >
> > > > +             {
> > > > +               HOST_WIDE_INT d = INTVAL (scalar_op1);
> > > > +               if (d > 0 && pow2p_hwi (d))
> > > > +                 {
> > > > +                   rtx shift_amount
> > > > +                     = gen_const_vec_duplicate (mode,
> > > > +                                                GEN_INT (floor_log2 
> > > > (d)));
> > > > +                   return expand_binop (mode, sdiv_pow2_optab, op0,
> > > > +                                        shift_amount, target, 
> > > > unsignedp,
> > > > +                                        methods);
> > > >
> > > > This expansion can fail, when e.g. the shift amount is larger than the
> bitsize
> > > of the element,
> > > > so you should check the result of it before exiting from expand_divmod.
> > > >
> > >
> > > Makes sense, attaching an updated patch with both changes.
> > >
> >
> > The new patch looks OK to me (it uses the old interface but that's what most
> of
> > expmed uses).
> >
> > I think your cover letter needs to mention that we can't do this in the
> backend because
> > Adv. SIMD lacks a division instruction and that the decomposition is harder
> to match
> > correctly in all cases.
> >
> > That said you still need a middle-end maintainer approval.
> 
> So the backend advertises a vector division?  Othewise I would have
> expected vector lowering to lower this (if it does that at all, for
> VLA vectors?!).
> 
> So this is an alternative better expansion - but there should already
> be generic code to deal with divisions by a power of two, why does
> that not trigger?  And what's this sdiv_pow2 thing again?


Yes, expand intercepts in expand_divmod division by a power of 2 and decomposes
It to shifts.   We could re-recognize the instruction in combine of course, but 
we already
have an optab the vectorizer would have used in this case [us]div_pow2, see
vect_recog_divmod_pattern.  So this is just teaching expand about the special 
division of
a power of 2 optab that the vectorizer uses for normal vector operations.

Thanks,
Tamar

> 
> Richard.
> 
> > Thanks,
> > Tamar
> >
> > > Thanks,
> > > Soumya
> > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > +                 }
> > > > +             }
> > > > +         }
> > > >        if (op1_is_constant)
> > > >          {
> > > >            scalar_int_mode int_mode = as_a <scalar_int_mode>
> > > (compute_mode);
> > >
> >
> >
> 
> --
> Richard Biener <[email protected]>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)

RE: [PATCH] aarch64: Use SVE ASRD with vector division using division operator

Reply via email to