On 14/07/2025 10:35, Spencer Abson wrote:
On Mon, Jul 07, 2025 at 08:46:15AM +0000, Alfie Richards wrote:
Hello all,

This patch implements the couple of amin/amax instructions that are part of
SME2 + faminmax.

Regression tested and bootstrapped on AArch64.

Thanks,
Alfie

-- >8 --

Implements the sme2+faminmax svamin and svamax intrinsics.

gcc/ChangeLog:

        * config/aarch64/aarch64-sme.md (@aarch64_sme_<faminmax_uns_op><mode>):
        New patterns.
        * config/aarch64/aarch64-sve-builtins-sme.def (svamin): New intrinsics.
        (svamax): New intrinsics.
        * config/aarch64/aarch64-sve-builtins-sve2.cc (class faminmaximpl): New
        class.
        (svamin): New function.
        (svamax): New function.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c: New test.
        * gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c: New test.
---
  gcc/config/aarch64/aarch64-sme.md             |  18 +++
  .../aarch64/aarch64-sve-builtins-sme.def      |   5 +
  .../aarch64/aarch64-sve-builtins-sve2.cc      |  44 +++++-
  .../aarch64/sme2/acle-asm/amax_f16_x2.c       |  97 +++++++++++++
  .../aarch64/sme2/acle-asm/amax_f16_x4.c       | 128 +++++++++++++++++
  .../aarch64/sme2/acle-asm/amax_f32_x2.c       |  96 +++++++++++++
  .../aarch64/sme2/acle-asm/amax_f32_x4.c       | 129 ++++++++++++++++++
  .../aarch64/sme2/acle-asm/amax_f64_x2.c       |  96 +++++++++++++
  .../aarch64/sme2/acle-asm/amax_f64_x4.c       | 128 +++++++++++++++++
  .../aarch64/sme2/acle-asm/amin_f16_x2.c       |  96 +++++++++++++
  .../aarch64/sme2/acle-asm/amin_f16_x4.c       | 128 +++++++++++++++++
  .../aarch64/sme2/acle-asm/amin_f32_x2.c       |  96 +++++++++++++
  .../aarch64/sme2/acle-asm/amin_f32_x4.c       | 128 +++++++++++++++++
  .../aarch64/sme2/acle-asm/amin_f64_x2.c       |  96 +++++++++++++
  .../aarch64/sme2/acle-asm/amin_f64_x4.c       | 128 +++++++++++++++++
  15 files changed, 1409 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c
  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c

diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
index b8bb4cc14b6..bfe368e80b5 100644
--- a/gcc/config/aarch64/aarch64-sme.md
+++ b/gcc/config/aarch64/aarch64-sme.md
@@ -38,6 +38,7 @@
  ;; ---- Binary arithmetic on ZA tile
  ;; ---- Binary arithmetic on ZA slice
  ;; ---- Binary arithmetic, writing to ZA slice
+;; ---- Absolute minimum/maximum
  ;;
  ;; == Ternary arithmetic
  ;; ---- [INT] Dot product
@@ -1264,6 +1265,23 @@ (define_insn "*aarch64_sme_single_<optab><mode>_plus"
    "<sme_int_op>\tza.<Vetype>[%w0, %1, vgx<vector_count>], %2, %3.<Vetype>"
  )

Sorry for picking up on this after it was committed, but:

+;; -------------------------------------------------------------------------
+;; ---- Absolute minimum/maximum
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - svamin (SME2+faminmax)
+;; - svamax (SME2+faminmax)

Even though these are currently exclusively used by the ACLE, I think we should
be listing the names of the instructions here.  It looks like the convention
would be to capitalise "+faminmax" too.

Sure, that makes sense.

+;; -------------------------------------------------------------------------
+
+(define_insn "@aarch64_sme_<faminmax_uns_op><mode>"
+  [(set (match_operand:SVE_Fx24 0 "register_operand" "=Uw<vector_count>")
+       (unspec:SVE_Fx24 [(match_operand:SVE_Fx24 1 "register_operand" "%0")
+                         (match_operand:SVE_Fx24 2 "register_operand" "Uw<vector_count>")]
+        FAMINMAX_UNS))]
+  "TARGET_SME2 && TARGET_FAMINMAX"

The frontend rightly ensures that the associated intrinsics can only be
used within streaming-mode functions; IIUC, the insn condition should be:

   "TARGET_STREAMING_SME2 && TARGET_FAMINMAX"

Agh, yeah, you're right. I will clean this up with a new patch.


+  "<faminmax_uns_op>\t%0, %1, %2"
+)
+
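
For reference, a sketch of how the pattern might look with both review
comments applied.  This is not the follow-up patch itself: the FAMAX/FAMIN
instruction names and the exact wording of the "Includes" entries are
assumptions, and the rest is copied from the hunk above with only the insn
condition swapped.

```
;; -------------------------------------------------------------------------
;; ---- Absolute minimum/maximum
;; -------------------------------------------------------------------------
;; Includes (entry wording assumed, following the review comment):
;; - FAMAX (SME2+FAMINMAX)
;; - FAMIN (SME2+FAMINMAX)
;; -------------------------------------------------------------------------

(define_insn "@aarch64_sme_<faminmax_uns_op><mode>"
  [(set (match_operand:SVE_Fx24 0 "register_operand" "=Uw<vector_count>")
        (unspec:SVE_Fx24 [(match_operand:SVE_Fx24 1 "register_operand" "%0")
                          (match_operand:SVE_Fx24 2 "register_operand" "Uw<vector_count>")]
         FAMINMAX_UNS))]
  ;; Condition changed from TARGET_SME2, as suggested in the review, so that
  ;; the pattern is only available in streaming mode.
  "TARGET_STREAMING_SME2 && TARGET_FAMINMAX"
  "<faminmax_uns_op>\t%0, %1, %2"
)
```

With the streaming condition on the pattern itself, the backend would agree
with the frontend check rather than relying on it.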

Thanks,
Spencer




