Tamar Christina <tamar.christ...@arm.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandif...@arm.com>
>> Sent: Monday, May 10, 2021 5:49 PM
>> To: Tamar Christina <tamar.christ...@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Richard Earnshaw
>> <richard.earns...@arm.com>; Marcus Shawcroft
>> <marcus.shawcr...@arm.com>; Kyrylo Tkachov <kyrylo.tkac...@arm.com>
>> Subject: Re: [PATCH 2/4]AArch64: Add support for sign differing dot-product
>> usdot for NEON and SVE.
>>
>> Tamar Christina <tamar.christ...@arm.com> writes:
>> > diff --git a/gcc/config/aarch64/aarch64-simd.md
>> > b/gcc/config/aarch64/aarch64-simd.md
>> > index
>> >
>> 4edee99051c4e2112b546becca47da32aae21df2..c9fb8e702732dd311fb10de1
>> 7126
>> > 432e2a63a32b 100644
>> > --- a/gcc/config/aarch64/aarch64-simd.md
>> > +++ b/gcc/config/aarch64/aarch64-simd.md
>> > @@ -648,6 +648,22 @@ (define_expand "<sur>dot_prod<vsi2qi>"
>> >    DONE;
>> >  })
>> >
>> > +;; Auto-vectorizer pattern for usdot
>> > +(define_expand "usdot_prod<vsi2qi>"
>> > +  [(set (match_operand:VS 0 "register_operand")
>> > +   (plus:VS (unspec:VS [(match_operand:<VSI2QI> 1
>> "register_operand")
>> > +                       (match_operand:<VSI2QI> 2 "register_operand")]
>> > +            UNSPEC_USDOT)
>> > +           (match_operand:VS 3 "register_operand")))]
>> > +  "TARGET_I8MM"
>> > +{
>> > +  emit_insn (
>> > +    gen_aarch64_usdot<vsi2qi> (operands[3], operands[3], operands[1],
>> > +                          operands[2]));
>> > +  emit_move_insn (operands[0], operands[3]);
>> > +  DONE;
>> > +})
>>
>> We can't modify operands[3] here; it's an input rather than an output.
>
> Sorry, I should have noticed this.. I had blindly copied the existing pattern 
> for dot-product and that looks like it's wrong.
> I'll send a different patch to fix that one.
>
>>
>> It looks like this would work with just the {…} removed though.
>> The pattern will match aarch64_usdot<vsi2qi> on its own accord.
>>
>> Even better would be to rename __builtin_aarch64_usdot… to
>> __builtin_usdot_prod…, change its arguments so that they line up with the
>> optabs, and change arm_neon.h to match.
>>
>> > diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
>> > b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
>> > new file mode 100644
>> > index
>> >
>> 0000000000000000000000000000000000000000..b99a945903c043c7410becaf6f
>> 09
>> > 496dd038410d
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/aarch64/simd/vusdot-autovec.c
>> > @@ -0,0 +1,38 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-O3 -march=armv8.2-a+i8mm" } */
>> > +
>> > +#define N 480
>> > +#define SIGNEDNESS_1 unsigned
>> > +#define SIGNEDNESS_2 signed
>> > +#define SIGNEDNESS_3 signed
>> > +#define SIGNEDNESS_4 unsigned
>> > +
>> > +SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
>> > +SIGNEDNESS_3 char *restrict a,
>> > +   SIGNEDNESS_4 char *restrict b)
>> > +{
>> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
>> > +    {
>> > +      int av = a[i];
>> > +      int bv = b[i];
>> > +      SIGNEDNESS_2 short mult = av * bv;
>> > +      res += mult;
>> > +    }
>> > +  return res;
>> > +}
>> > +
>> > +SIGNEDNESS_1 int __attribute__ ((noipa)) g (SIGNEDNESS_1 int res,
>> > +SIGNEDNESS_3 char *restrict b,
>> > +   SIGNEDNESS_4 char *restrict a)
>> > +{
>> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
>> > +    {
>> > +      int av = a[i];
>> > +      int bv = b[i];
>> > +      SIGNEDNESS_2 short mult = av * bv;
>> > +      res += mult;
>> > +    }
>> > +  return res;
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
>> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
>> > b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
>> > new file mode 100644
>> > index
>> >
>> 0000000000000000000000000000000000000000..094dd51cea62e0ba05ec35056
>> 57b
>> > f05320e5fdbb
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/vusdot-autovec.c
>> > @@ -0,0 +1,38 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-O3 -march=armv8.2-a+i8mm+sve" } */
>> > +
>> > +#define N 480
>> > +#define SIGNEDNESS_1 unsigned
>> > +#define SIGNEDNESS_2 signed
>> > +#define SIGNEDNESS_3 signed
>> > +#define SIGNEDNESS_4 unsigned
>> > +
>> > +SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
>> > +SIGNEDNESS_3 char *restrict a,
>> > +   SIGNEDNESS_4 char *restrict b)
>> > +{
>> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
>> > +    {
>> > +      int av = a[i];
>> > +      int bv = b[i];
>> > +      SIGNEDNESS_2 short mult = av * bv;
>> > +      res += mult;
>> > +    }
>> > +  return res;
>> > +}
>> > +
>> > +SIGNEDNESS_1 int __attribute__ ((noipa)) g (SIGNEDNESS_1 int res,
>> > +SIGNEDNESS_3 char *restrict b,
>> > +   SIGNEDNESS_4 char *restrict a)
>> > +{
>> > +  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
>> > +    {
>> > +      int av = a[i];
>> > +      int bv = b[i];
>> > +      SIGNEDNESS_2 short mult = av * bv;
>> > +      res += mult;
>> > +    }
>> > +  return res;
>> > +}
>> > +
>> > +/* { dg-final { scan-assembler-times {\tusdot\t} 2 } } */
>>
>> Guess this is personal preference, but I don't think the SIGNEDNESS_*
>> macros add anything when used like this.  I remember doing something
>> similar in the past when including .c files from other .c files(!) in order 
>> to
>> avoid cut-&-paste, but there doesn't seem much benefit for standalone files
>> like these.
>
> If it's the same to you, I do prefer this version, since it's identical to 
> the mid-end tests,
> It does allow when familiar with the  tests to just quickly see what it's 
> testing.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64-simd.md (aarch64_usdot<vsi2qi>): Rename to...
>         (usdot_prod<vsi2qi>): ... This.
>         * config/aarch64/aarch64-simd-builtins.def (usdot): Rename to...
>         (usdot_prod): ...This.
>         * config/aarch64/arm_neon.h (vusdot_s32, vusdotq_s32): Likewise.
>         * config/aarch64/aarch64-sve.md (@aarch64_<sur>dot_prod<vsi2qi>):
>         Rename to...
>         (@<sur>dot_prod<vsi2qi>): ...This.
>         * config/aarch64/aarch64-sve-builtins-base.cc
>         (svusdot_impl::expand): Use it.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/aarch64/simd/vusdot-autovec.c: New test.
>         * gcc.target/aarch64/sve/vusdot-autovec.c: New test.

OK, thanks.

Richard

Reply via email to