On Fri, Sep 01, 2017 at 02:22:17PM +0100, Tamar Christina wrote:
> Hi All,
> 
> This patch adds the instructions for Dot Product to AArch64 along
> with the intrinsics and vectorizer pattern.
> 
> Armv8.2-a dot product supports 8-bit element values both
> signed and unsigned.
> 
> Dot product is available from Arm8.2-a and onwards.
> 
> Regtested and bootstrapped on aarch64-none-elf and no issues.
> 
> Ok for trunk?
> 
> gcc/
> 2017-09-01  Tamar Christina  <tamar.christ...@arm.com>
> 
>       * config/aarch64/aarch64-builtins.c
>       (aarch64_types_quadopu_lane_qualifiers): New.
>       (TYPES_QUADOPU_LANE): New.
>       * config/aarch64/aarch64-simd.md (aarch64_<sur>dot<dot_mode>): New.
>       (<sur>dot_prod<dot_mode>, aarch64_<sur>dot_lane<dot_mode>): New.
>       (aarch64_<sur>dot_laneq<dot_mode>): New.
>       * config/aarch64/aarch64-simd-builtins.def (sdot, udot): New.
>       (sdot_lane, udot_lane, sdot_laneq, udot_laneq): New.
>       * config/aarch64/iterators.md (UNSPEC_SDOT, UNSPEC_UDOT): New.
>       (DOT_MODE, dot_mode, Vdottype, DOTPROD): New.
>       (sur): Add SDOT and UDOT.
> 
> -- 

> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> f3e084f8778d70c82823b92fa80ff96021ad26db..21d46c84ab317c2d62afdf8c48117886aaf483b0
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -386,6 +386,87 @@
>  }
>  )
>  
> +;; These instructions map to the __builtins for the Dot Product operations.
> +(define_insn "aarch64_<sur>dot<dot_mode>"
> +  [(set (match_operand:VS 0 "register_operand" "=w")
> +     (unspec:VS [(match_operand:VS 1 "register_operand" "0")
> +                 (match_operand:<DOT_MODE> 2 "register_operand" "w")
> +                 (match_operand:<DOT_MODE> 3 "register_operand" "w")]
> +             DOTPROD))]
> +  "TARGET_DOTPROD"
> +  "<sur>dot\\t%0.<Vtype>, %2.<Vdottype>, %3.<Vdottype>"
> +  [(set_attr "type" "neon_dot")]

Would there be a small benefit in modelling this as:

  [(set (match_operand:VS 0 "register_operand" "=w")
        (add:VS ((match_operand:VS 1 "register_operand" "0")
                 (unsepc:VS [(match_operand:<DOT_MODE> 2 "register_operand" "w")
                    (match_operand:<DOT_MODE> 3 "register_operand" "w")]
                DOTPROD)))]


> +)
> +
> +;; These expands map to the Dot Product optab the vectorizer checks for.
> +;; The auto-vectorizer expects a dot product builtin that also does an
> +;; accumulation into the provided register.
> +;; Given the following pattern
> +;;
> +;; for (i=0; i<len; i++) {
> +;;     c = a[i] * b[i];
> +;;     r += c;
> +;; }
> +;; return result;
> +;;
> +;; This can be auto-vectorized to
> +;; r  = a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3];
> +;;
> +;; given enough iterations.  However the vectorizer can keep unrolling the 
> loop
> +;; r += a[4]*b[4] + a[5]*b[5] + a[6]*b[6] + a[7]*b[7];
> +;; r += a[8]*b[8] + a[9]*b[9] + a[10]*b[10] + a[11]*b[11];
> +;; ...
> +;;
> +;; and so the vectorizer provides r, in which the result has to be 
> accumulated.
> +(define_expand "<sur>dot_prod<dot_mode>"
> +  [(set (match_operand:VS 0 "register_operand")
> +     (unspec:VS [(match_operand:<DOT_MODE> 1 "register_operand")
> +                 (match_operand:<DOT_MODE> 2 "register_operand")
> +                 (match_operand:VS 3 "register_operand")]
> +             DOTPROD))]

This is just an expand that always ends in a DONE, so doesn't need the
full description here, just:

  [(match_operand:VS 0 "register_operand)
   (match_operand:<DOT_MODE> 1 "register_operand")
   (match_operand:<DOT_MODE> 2 "register_operand")
   (match_operand:VS 3 "register_operand")]

> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index 
> cceb57525c7aa44933419bd317b1f03a7b76f4c4..533c12cca916669195e9b094527ee0de31542b12
>  100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -354,6 +354,8 @@
>      UNSPEC_SQRDMLSH     ; Used in aarch64-simd.md.
>      UNSPEC_FMAXNM       ; Used in aarch64-simd.md.
>      UNSPEC_FMINNM       ; Used in aarch64-simd.md.
> +    UNSPEC_SDOT              ; Used in aarch64-simd.md.
> +    UNSPEC_UDOT              ; Used in aarch64-simd.md.
>  ])
>  
>  ;; ------------------------------------------------------------------
> @@ -810,6 +812,13 @@
>  (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")])
>  (define_mode_attr VSI2QI [(V2SI "V8QI") (V4SI "V16QI")])
>  
> +;; Mapping attribute for Dot Product input modes based on result mode.
> +(define_mode_attr DOT_MODE [(V2SI "V8QI") (V4SI "V16QI")])
> +(define_mode_attr dot_mode [(V2SI "v8qi") (V4SI "v16qi")])

Are these not identical to the two lines above in the context?

>  (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi")])
>  (define_mode_attr VSI2QI [(V2SI "V8QI") (V4SI "V16QI")])

Thanks,
James

Reply via email to