On Mon, Jul 07, 2025 at 08:46:15AM +0000, Alfie Richards wrote:
> Hello all,
> 
> This patch implements the couple of amin/amax instructions that are part of
> SME2 + faminmax.
> 
> Regression tested and bootstrapped for AArch64.
> 
> Thanks,
> Alfie
> 
> -- >8 --
> 
> Implements the sme2+faminmax svamin and svamax intrinsics.
> 
> gcc/ChangeLog:
> 
>       * config/aarch64/aarch64-sme.md (@aarch64_sme_<faminmax_uns_op><mode>):
>       New patterns.
>       * config/aarch64/aarch64-sve-builtins-sme.def (svamin): New intrinsics.
>       (svamax): New intrinsics.
>       * config/aarch64/aarch64-sve-builtins-sve2.cc (class faminmaximpl): New
>       class.
>       (svamin): New function.
>       (svamax): New function.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c: New test.
>       * gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c: New test.
> ---
>  gcc/config/aarch64/aarch64-sme.md             |  18 +++
>  .../aarch64/aarch64-sve-builtins-sme.def      |   5 +
>  .../aarch64/aarch64-sve-builtins-sve2.cc      |  44 +++++-
>  .../aarch64/sme2/acle-asm/amax_f16_x2.c       |  97 +++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f16_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f32_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f32_x4.c       | 129 ++++++++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f64_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f64_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f16_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f16_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f32_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f32_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f64_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f64_x4.c       | 128 +++++++++++++++++
>  15 files changed, 1409 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c
> 
> diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
> index b8bb4cc14b6..bfe368e80b5 100644
> --- a/gcc/config/aarch64/aarch64-sme.md
> +++ b/gcc/config/aarch64/aarch64-sme.md
> @@ -38,6 +38,7 @@
>  ;; ---- Binary arithmetic on ZA tile
>  ;; ---- Binary arithmetic on ZA slice
>  ;; ---- Binary arithmetic, writing to ZA slice
> +;; ---- Absolute minimum/maximum
>  ;;
>  ;; == Ternary arithmetic
>  ;; ---- [INT] Dot product
> @@ -1264,6 +1265,23 @@ (define_insn "*aarch64_sme_single_<optab><mode>_plus"
>    "<sme_int_op>\tza.<Vetype>[%w0, %1, vgx<vector_count>], %2, %3.<Vetype>"
>  )
>  

Sorry for picking up on this after it was committed, but:

> +;; -------------------------------------------------------------------------
> +;; ---- Absolute minimum/maximum
> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - svamin (SME2+faminmax)
> +;; - svamin (SME2+faminmax)

Even though these are currently used exclusively by the ACLE, I think we should
be listing the names of the instructions here.  It looks like the convention
would be to capitalise "+faminmax" too.
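
Something along these lines, perhaps (an untested sketch: I'm assuming FAMAX
and FAMIN are the mnemonics to list, and the exact form of the feature
annotation is only a guess at the convention):

  ;; Includes:
  ;; - FAMAX (SME2+FAMINMAX)
  ;; - FAMIN (SME2+FAMINMAX)

That would also avoid listing svamin twice, as the hunk above does.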

> +;; -------------------------------------------------------------------------
> +
> +(define_insn "@aarch64_sme_<faminmax_uns_op><mode>"
> +  [(set (match_operand:SVE_Fx24 0 "register_operand" "=Uw<vector_count>")
> +     (unspec:SVE_Fx24 [(match_operand:SVE_Fx24 1 "register_operand" "%0")
> +                       (match_operand:SVE_Fx24 2 "register_operand" "Uw<vector_count>")]
> +      FAMINMAX_UNS))]
> +  "TARGET_SME2 && TARGET_FAMINMAX"

The frontend is rightly ensuring that the associated intrinsics can only be
used within streaming-mode functions; IIUC, the insn condition should be:

  "TARGET_STREAMING_SME2 && TARGET_FAMINMAX"

> +  "<faminmax_uns_op>\t%0, %1, %2"
> +)
> +
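
For clarity, with that change applied the pattern would read roughly as
follows (an untested sketch; only the condition string differs from the
committed version, and the indentation is mine):

  (define_insn "@aarch64_sme_<faminmax_uns_op><mode>"
    [(set (match_operand:SVE_Fx24 0 "register_operand" "=Uw<vector_count>")
          (unspec:SVE_Fx24
            [(match_operand:SVE_Fx24 1 "register_operand" "%0")
             (match_operand:SVE_Fx24 2 "register_operand" "Uw<vector_count>")]
            FAMINMAX_UNS))]
    ;; Only the condition changes: require streaming mode as well as SME2.
    "TARGET_STREAMING_SME2 && TARGET_FAMINMAX"
    "<faminmax_uns_op>\t%0, %1, %2"
  )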

Thanks,
Spencer



