On Mon, Jul 07, 2025 at 08:46:15AM +0000, Alfie Richards wrote:
> Hello all,
>
> This patch implements the couple of amin/amax instructions that are part of
> SME2 + faminmax.
>
> Regression tested and bootstrapped for Aarch64.
>
> Thanks,
> Alfie
>
> -- >8 --
>
> Implements the sme2+faminmax svamin and svamax intrinsics.
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64-sme.md (@aarch64_sme_<faminmax_uns_op><mode>):
> 	New patterns.
> 	* config/aarch64/aarch64-sve-builtins-sme.def (svamin): New intrinsics.
> 	(svamax): New intrinsics.
> 	* config/aarch64/aarch64-sve-builtins-sve2.cc (class faminmaximpl): New
> 	class.
> 	(svamin): New function.
> 	(svamax): New function.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c: New test.
> 	* gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c: New test.
> ---
>  gcc/config/aarch64/aarch64-sme.md             |  18 +++
>  .../aarch64/aarch64-sve-builtins-sme.def      |   5 +
>  .../aarch64/aarch64-sve-builtins-sve2.cc      |  44 +++++-
>  .../aarch64/sme2/acle-asm/amax_f16_x2.c       |  97 +++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f16_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f32_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f32_x4.c       | 129 ++++++++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f64_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amax_f64_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f16_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f16_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f32_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f32_x4.c       | 128 +++++++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f64_x2.c       |  96 +++++++++++++
>  .../aarch64/sme2/acle-asm/amin_f64_x4.c       | 128 +++++++++++++++++
>  15 files changed, 1409 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f16_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f16_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f32_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f32_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f64_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amax_f64_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f16_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f16_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f32_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f32_x4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f64_x2.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/amin_f64_x4.c
>
> diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md
> index b8bb4cc14b6..bfe368e80b5 100644
> --- a/gcc/config/aarch64/aarch64-sme.md
> +++ b/gcc/config/aarch64/aarch64-sme.md
> @@ -38,6 +38,7 @@
>  ;; ---- Binary arithmetic on ZA tile
>  ;; ---- Binary arithmetic on ZA slice
>  ;; ---- Binary arithmetic, writing to ZA slice
> +;; ---- Absolute minimum/maximum
>  ;;
>  ;; == Ternary arithmetic
>  ;; ---- [INT] Dot product
> @@ -1264,6 +1265,23 @@ (define_insn "*aarch64_sme_single_<optab><mode>_plus"
>    "<sme_int_op>\tza.<Vetype>[%w0, %1, vgx<vector_count>], %2, %3.<Vetype>"
>  )
>

Sorry for picking up on this after it was committed, but:

> +;; -------------------------------------------------------------------------
> +;; ---- Absolute minimum/maximum
> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - svamin (SME2+faminmax)
> +;; - svamin (SME2+faminmax)

Even though these are currently used exclusively by the ACLE, I think we
should be listing the names of the instructions here.  It looks like the
convention would be to capitalise "+faminmax" too.

> +;; -------------------------------------------------------------------------
> +
> +(define_insn "@aarch64_sme_<faminmax_uns_op><mode>"
> +  [(set (match_operand:SVE_Fx24 0 "register_operand" "=Uw<vector_count>")
> +	(unspec:SVE_Fx24 [(match_operand:SVE_Fx24 1 "register_operand" "%0")
> +			  (match_operand:SVE_Fx24 2 "register_operand" "Uw<vector_count>")]
> +	 FAMINMAX_UNS))]
> +  "TARGET_SME2 && TARGET_FAMINMAX"

The frontend is rightly ensuring that the associated intrinsics can only be
used within streaming-mode functions; IIUC, this condition should be:

  "TARGET_STREAMING_SME2 && TARGET_FAMINMAX"

> +  "<faminmax_uns_op>\t%0, %1, %2"
> +)
> +

Thanks,
Spencer
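
For reference, a rough sketch of how the hunk might read with both comments
applied.  This is untested and only illustrative: listing FAMAX and FAMIN
(the instruction mnemonics) in the header, and the exact spelling of the
extension suffix, are my assumptions about what the convention calls for;
the pattern body is copied from the patch with only the condition changed.

```
;; -------------------------------------------------------------------------
;; ---- Absolute minimum/maximum
;; -------------------------------------------------------------------------
;; Includes:
;; - FAMAX (SME2+FAMINMAX)
;; - FAMIN (SME2+FAMINMAX)
;; -------------------------------------------------------------------------

(define_insn "@aarch64_sme_<faminmax_uns_op><mode>"
  [(set (match_operand:SVE_Fx24 0 "register_operand" "=Uw<vector_count>")
	(unspec:SVE_Fx24 [(match_operand:SVE_Fx24 1 "register_operand" "%0")
			  (match_operand:SVE_Fx24 2 "register_operand" "Uw<vector_count>")]
	 FAMINMAX_UNS))]
  ;; Restrict the pattern to streaming mode, matching the frontend check.
  "TARGET_STREAMING_SME2 && TARGET_FAMINMAX"
  "<faminmax_uns_op>\t%0, %1, %2"
)
```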