Hi Richard,

I have renamed the optabs and associated identifiers as per your suggestion. Thanks.

Regards
Yuliang


gcc/ChangeLog:

2019-09-27  Yuliang Wang  <yuliang.w...@arm.com>

    * config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3): New pattern for ASRD.
    * config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
    * internal-fn.def (IFN_DIV_POW2): New internal function.
    * optabs.def (sdiv_pow2_optab): New optab.
    * tree-vect-patterns.c (vect_recog_divmod_pattern): Modify pattern
    to support new operation.
    * doc/md.texi (sdiv_pow2$var{m3}): Documentation for the above.
    * doc/sourcebuild.texi (vect_sdivpow2_si): Document new target selector.

gcc/testsuite/ChangeLog:

2019-09-27  Yuliang Wang  <yuliang.w...@arm.com>

    * gcc.dg/vect/vect-sdivpow2-1.c: New test.
    * gcc.target/aarch64/sve/asrdiv_1.c: As above.
    * lib/target-support.exp (check_effective_target_vect_sdivpow2_si):
    Return true for AArch64 with SVE.

-----Original Message-----
From: Richard Sandiford <richard.sandif...@arm.com>
Sent: 24 September 2019 17:12
To: Yuliang Wang <yuliang.w...@arm.com>
Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>
Subject: Re: [AArch64][SVE] Utilize ASRD instruction for division and remainder

Yuliang Wang <yuliang.w...@arm.com> writes:
> Hi,
>
> The C snippets below (signed division/modulo by a power-of-2 immediate
> value):
>
> #define P ...
>
> void foo_div (int *a, int *b, int N)
> {
>     for (int i = 0; i < N; i++)
>         a[i] = b[i] / (1 << P);
> }
> void foo_mod (int *a, int *b, int N)
> {
>     for (int i = 0; i < N; i++)
>         a[i] = b[i] % (1 << P);
> }
>
> Vectorize to the following on AArch64 + SVE:
>
> foo_div:
>     mov     x0, 0
>     mov     w2, N
>     ptrue   p1.b, all
>     whilelo p0.s, wzr, w2
>     .p2align 3,,7
> .L2:
>     ld1w    z1.s, p0/z, [x3, x0, lsl 2]
>     cmplt   p2.s, p1/z, z1.s, #0        //
>     mov     z0.s, p2/z, #7              //
>     add     z0.s, z0.s, z1.s            //
>     asr     z0.s, z0.s, #3              //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> foo_mod:
>     ...
> .L2:
>     ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>     cmplt   p2.s, p1/z, z0.s, #0        //
>     mov     z1.s, p2/z, #-1             //
>     lsr     z1.s, z1.s, #29             //
>     add     z0.s, z0.s, z1.s            //
>     and     z0.s, z0.s, #{2^P-1}        //
>     sub     z0.s, z0.s, z1.s            //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> This patch utilizes the special-purpose ASRD (arithmetic shift-right for
> divide by immediate) instruction:
>
> foo_div:
>     ...
> .L2:
>     ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>     asrd    z0.s, p1/m, z0.s, #{P}      //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> foo_mod:
>     ...
> .L2:
>     ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>     movprfx z1, z0                      //
>     asrd    z1.s, p1/m, z1.s, #{P}      //
>     lsl     z1.s, z1.s, #{P}            //
>     sub     z0.s, z0.s, z1.s            //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> Added new tests. Built and regression tested on aarch64-none-elf.
>
> Best Regards,
> Yuliang Wang
>
>
> gcc/ChangeLog:
>
> 2019-09-23  Yuliang Wang  <yuliang.w...@arm.com>
>
>     * config/aarch64/aarch64-sve.md (asrd<mode>3): New pattern for ASRD.
>     * config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
>     (ASRDIV): New int iterator.
>     * internal-fn.def (IFN_ASHR_DIV): New internal function.
>     * optabs.def (ashr_div_optab): New optab.
>     * tree-vect-patterns.c (vect_recog_divmod_pattern):
>     Modify pattern to support new operation.
>     * doc/md.texi (asrd$var{m3}): Documentation for the above.
>     * doc/sourcebuild.texi (vect_asrdiv_si): Document new target selector.

This looks good to me.  My only real question is about naming:
maybe IFN_DIV_POW2 would be a better name for the internal function
and sdiv_pow2_optab/"div_pow2$a3" for the optab?  But I'm useless at
naming things, so maybe others would prefer your names.

Thanks,
Richard

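For reference, the arithmetic behind the quoted sequences can be modelled in scalar C. This is only a rough sketch: the function names (div_pow2_bias, mod_pow2_bias, mod_pow2_via_div) are made up for illustration and are not part of the patch. It assumes 32-bit elements, 0 < p < 31, and that >> on a negative int32_t is an arithmetic shift (implementation-defined in C, but what GCC does).

```c
#include <stdint.h>

/* Hypothetical scalar model of the division sequence (cmplt/mov/add/asr):
   add 2^p - 1 to negative inputs so the shift rounds toward zero.
   A single ASRD by #p is understood to produce the same value.  */
static int32_t div_pow2_bias(int32_t x, unsigned p)
{
    int32_t bias = (x < 0) ? (int32_t)((1u << p) - 1u) : 0;
    /* Assumes arithmetic right shift for negative values.  */
    return (x + bias) >> p;
}

/* Hypothetical scalar model of the old remainder sequence
   (cmplt/mov/lsr/add/and/sub).  */
static int32_t mod_pow2_bias(int32_t x, unsigned p)
{
    uint32_t mask = (1u << p) - 1u;
    uint32_t bias = (x < 0) ? mask : 0u;          /* all-ones >> (32 - p) */
    uint32_t low  = ((uint32_t)x + bias) & mask;  /* add, and */
    return (int32_t)low - (int32_t)bias;          /* sub */
}

/* Hypothetical scalar model of the new remainder sequence (asrd/lsl/sub):
   remainder = x - (quotient << p).  The shift is done on an unsigned
   value because left-shifting a negative int is undefined in C.  */
static int32_t mod_pow2_via_div(int32_t x, unsigned p)
{
    int32_t q = div_pow2_bias(x, p);
    int32_t scaled = (int32_t)((uint32_t)q << p);
    return x - scaled;
}
```

With x = -9 and p = 3, the division model gives -1 and both remainder models give -1, matching C's truncating / and %.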
rb11863.patch