Tamar Christina <[email protected]> writes:
>> -----Original Message-----
>> From: Claudio Bantaloukas <[email protected]>
>> Sent: 24 November 2025 18:01
>> To: Gcc Patches ML <[email protected]>
>> Cc: Alex Coplan <[email protected]>; Alice Carlotti
>> <[email protected]>; Andrew Pinski
>> <[email protected]>; Kyrylo Tkachov
>> <[email protected]>; Richard Earnshaw <[email protected]>;
>> Tamar Christina <[email protected]>; Wilco Dijkstra
>> <[email protected]>; Claudio Bantaloukas
>> <[email protected]>
>> Subject: [PATCH v3 2/9] aarch64: extend sme intrinsics to mfp8
>>
>>
>> This patch extends the following intrinsics to support svmfloat8_t types and
>> adds tests based on the equivalent ones for svuint8_t.
>>
>> SME:
>> - svread_hor_za8[_mf8]_m, svread_hor_za128[_mf8]_m and related ver.
>> - svwrite_hor_za8[_mf8]_m, svwrite_hor_za128[_mf8]_m and related ver.
>>
>> SME2:
>> - svread_hor_za8_mf8_vg2, svread_hor_za8_mf8_vg4 and related ver.
>> - svwrite_hor_za8[_mf8]_vg2, svwrite_hor_za8[_mf8]_vg4 and related ver.
>> - svread_za8[_mf8]_vg1x2, svread_za8[_mf8]_vg1x4.
>> - svwrite_za8[_mf8]_vg1x2, svwrite_za8[_mf8]_vg1x4.
>> - svsel[_mf8_x2], svsel[_mf8_x4].
>> - svzip[_mf8_x2], svzip[_mf8_x4].
>> - svzipq[_mf8_x2], svzipq[_mf8_x4].
>> - svuzp[_mf8_x2], svuzp[_mf8_x4].
>> - svuzpq[_mf8_x2], svuzpq[_mf8_x4].
>> - svld1[_mf8]_x2, svld1[_mf8]_x4.
>> - svld1_vnum[_mf8]_x2, svld1_vnum[_mf8]_x4.
>>
>> SVE2.1/SME2:
>> - svldnt1[_mf8]_x2, svldnt1[_mf8]_x4.
>> - svldnt1_vnum[_mf8]_x2, svldnt1_vnum[_mf8]_x4.
>> - svrevd[_mf8]_m, svrevd[_mf8]_z, svrevd[_mf8]_x.
>> - svst1[_mf8_x2], svst1[_mf8_x4].
>> - svst1_vnum[_mf8_x2], svst1_vnum[_mf8_x4].
>> - svstnt1[_mf8_x2], svstnt1[_mf8_x4].
>> - svstnt1_vnum[_mf8_x2], svstnt1_vnum[_mf8_x4].
>>
>> SME2.1:
>> - svreadz_hor_za8_u8, svreadz_hor_za8_u8_vg2, svreadz_hor_za8_u8_vg4
>> and related ver.
>> - svreadz_hor_za128_u8, svreadz_ver_za128_u8.
>> - svreadz_za8_u8_vg1x2, svreadz_za8_u8_vg1x4.
>>
>> This change follows ACLE 2024Q4.
>
> OK.
>
> For some of these tests I think we could have simplified the regexp by
> disabling instruction scheduling so we don't need all the |, but that
> seems to be convention here so just something for next time.

Late reply, but FWIW, the harness does this for us (i.e. adds
-fno-schedule-insns* for -DCHECK_ASM).  A lot of the residual regexp
flexibility is defensive, to reduce the number of mass updates.
E.g. the move regexps accept so many forms because there was a spell
when we changed move choices quite often.

And at least in the tests I wrote, the multi-insn | sequences were
also defensive.  They were supposed to match all the acceptable
minimum-instruction expansions that I could think of (if GCC did
manage to generate the minimum number), rather than just the
expansion that GCC happened to pick.

Thanks,
Richard

> Thanks,
> Tamar
>
>>
>> gcc/
>> * config/aarch64/aarch64-sve-builtins.cc (TYPES_za_bhsd_data): Add
>> D (za8, mf8) combination to za_bhsd_data.
>>
>> gcc/testsuite/
>> * gcc.target/aarch64/sme/acle-asm/revd_mf8.c: Added test file.
>> * gcc.target/aarch64/sme2/acle-asm/ld1_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/ld1_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/ldnt1_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/ldnt1_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_ver_za128.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/sel_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/sel_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/st1_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/st1_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/stnt1_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/stnt1_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/uzp_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/uzp_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/uzpq_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/uzpq_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/zip_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/zip_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/zipq_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/zipq_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sve2/acle/asm/ld1_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sve2/acle/asm/ld1_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sve2/acle/asm/ldnt1_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sve2/acle/asm/ldnt1_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sve2/acle/asm/revd_mf8.c: Likewise.
>> * gcc.target/aarch64/sve2/acle/asm/stnt1_mf8_x2.c: Likewise.
>> * gcc.target/aarch64/sve2/acle/asm/stnt1_mf8_x4.c: Likewise.
>> * gcc.target/aarch64/sme/acle-asm/read_hor_za128.c: Added mf8 tests.
>> * gcc.target/aarch64/sme/acle-asm/read_hor_za8.c: Likewise.
>> * gcc.target/aarch64/sme/acle-asm/read_ver_za128.c: Likewise.
>> * gcc.target/aarch64/sme/acle-asm/read_ver_za8.c: Likewise.
>> * gcc.target/aarch64/sme/acle-asm/write_hor_za128.c: Likewise.
>> * gcc.target/aarch64/sme/acle-asm/write_hor_za8.c: Likewise.
>> * gcc.target/aarch64/sme/acle-asm/write_ver_za128.c: Likewise.
>> * gcc.target/aarch64/sme/acle-asm/write_ver_za8.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/read_hor_za8_vg2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/read_hor_za8_vg4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/read_ver_za8_vg2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/read_ver_za8_vg4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/read_za8_vg1x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/read_za8_vg1x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_hor_za128.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_hor_za8_vg2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_hor_za8_vg4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_hor_za8.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_ver_za8_vg2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_ver_za8_vg4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_ver_za8.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_za8_vg1x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/readz_za8_vg1x4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/write_hor_za8_vg2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/write_hor_za8_vg4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/write_ver_za8_vg2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/write_ver_za8_vg4.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/write_za8_vg1x2.c: Likewise.
>> * gcc.target/aarch64/sme2/acle-asm/write_za8_vg1x4.c: Likewise.
>> ---
>> gcc/config/aarch64/aarch64-sve-builtins.cc | 4 +-
>> .../aarch64/sme/acle-asm/read_hor_za128.c | 31 ++
>> .../aarch64/sme/acle-asm/read_hor_za8.c | 31 ++
>> .../aarch64/sme/acle-asm/read_ver_za128.c | 31 ++
>> .../aarch64/sme/acle-asm/read_ver_za8.c | 31 ++
>> .../aarch64/sme/acle-asm/revd_mf8.c | 76 ++++
>> .../aarch64/sme/acle-asm/write_hor_za128.c | 10 +
>> .../aarch64/sme/acle-asm/write_hor_za8.c | 10 +
>> .../aarch64/sme/acle-asm/write_ver_za128.c | 10 +
>> .../aarch64/sme/acle-asm/write_ver_za8.c | 10 +
>> .../aarch64/sme2/acle-asm/ld1_mf8_x2.c | 262 +++++++++++++
>> .../aarch64/sme2/acle-asm/ld1_mf8_x4.c | 354 +++++++++++++++++
>> .../aarch64/sme2/acle-asm/ldnt1_mf8_x2.c | 262 +++++++++++++
>> .../aarch64/sme2/acle-asm/ldnt1_mf8_x4.c | 354 +++++++++++++++++
>> .../aarch64/sme2/acle-asm/read_hor_za8_vg2.c | 78 ++++
>> .../aarch64/sme2/acle-asm/read_hor_za8_vg4.c | 91 +++++
>> .../aarch64/sme2/acle-asm/read_ver_za8_vg2.c | 78 ++++
>> .../aarch64/sme2/acle-asm/read_ver_za8_vg4.c | 91 +++++
>> .../aarch64/sme2/acle-asm/read_za8_vg1x2.c | 48 +++
>> .../aarch64/sme2/acle-asm/read_za8_vg1x4.c | 54 +++
>> .../aarch64/sme2/acle-asm/readz_hor_za128.c | 10 +
>> .../aarch64/sme2/acle-asm/readz_hor_za8.c | 10 +
>> .../aarch64/sme2/acle-asm/readz_hor_za8_vg2.c | 78 ++++
>> .../aarch64/sme2/acle-asm/readz_hor_za8_vg4.c | 91 +++++
>> .../aarch64/sme2/acle-asm/readz_ver_za128.c | 197 ++++++++++
>> .../aarch64/sme2/acle-asm/readz_ver_za8.c | 10 +
>> .../aarch64/sme2/acle-asm/readz_ver_za8_vg2.c | 77 ++++
>> .../aarch64/sme2/acle-asm/readz_ver_za8_vg4.c | 90 +++++
>> .../aarch64/sme2/acle-asm/readz_za8_vg1x2.c | 48 +++
>> .../aarch64/sme2/acle-asm/readz_za8_vg1x4.c | 56 +++
>> .../aarch64/sme2/acle-asm/sel_mf8_x2.c | 92 +++++
>> .../aarch64/sme2/acle-asm/sel_mf8_x4.c | 92 +++++
>> .../aarch64/sme2/acle-asm/st1_mf8_x2.c | 262 +++++++++++++
>> .../aarch64/sme2/acle-asm/st1_mf8_x4.c | 354 +++++++++++++++++
>> .../aarch64/sme2/acle-asm/stnt1_mf8_x2.c | 262 +++++++++++++
>> .../aarch64/sme2/acle-asm/stnt1_mf8_x4.c | 354 +++++++++++++++++
>> .../aarch64/sme2/acle-asm/uzp_mf8_x2.c | 77 ++++
>> .../aarch64/sme2/acle-asm/uzp_mf8_x4.c | 73 ++++
>> .../aarch64/sme2/acle-asm/uzpq_mf8_x2.c | 77 ++++
>> .../aarch64/sme2/acle-asm/uzpq_mf8_x4.c | 73 ++++
>> .../aarch64/sme2/acle-asm/write_hor_za8_vg2.c | 78 ++++
>> .../aarch64/sme2/acle-asm/write_hor_za8_vg4.c | 91 +++++
>> .../aarch64/sme2/acle-asm/write_ver_za8_vg2.c | 78 ++++
>> .../aarch64/sme2/acle-asm/write_ver_za8_vg4.c | 91 +++++
>> .../aarch64/sme2/acle-asm/write_za8_vg1x2.c | 48 +++
>> .../aarch64/sme2/acle-asm/write_za8_vg1x4.c | 54 +++
>> .../aarch64/sme2/acle-asm/zip_mf8_x2.c | 77 ++++
>> .../aarch64/sme2/acle-asm/zip_mf8_x4.c | 73 ++++
>> .../aarch64/sme2/acle-asm/zipq_mf8_x2.c | 77 ++++
>> .../aarch64/sme2/acle-asm/zipq_mf8_x4.c | 73 ++++
>> .../aarch64/sve2/acle/asm/ld1_mf8_x2.c | 269 +++++++++++++
>> .../aarch64/sve2/acle/asm/ld1_mf8_x4.c | 361 ++++++++++++++++++
>> .../aarch64/sve2/acle/asm/ldnt1_mf8_x2.c | 269 +++++++++++++
>> .../aarch64/sve2/acle/asm/ldnt1_mf8_x4.c | 361 ++++++++++++++++++
>> .../aarch64/sve2/acle/asm/revd_mf8.c | 80 ++++
>> .../aarch64/sve2/acle/asm/stnt1_mf8_x2.c | 269 +++++++++++++
>> .../aarch64/sve2/acle/asm/stnt1_mf8_x4.c | 361 ++++++++++++++++++
>> 57 files changed, 7007 insertions(+), 2 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/revd_mf8.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/ld1_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/ld1_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/ldnt1_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/ldnt1_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za128.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/sel_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/sel_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/st1_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/st1_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/stnt1_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/stnt1_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/uzp_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/uzp_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/uzpq_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/uzpq_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zip_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zip_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zipq_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zipq_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ld1_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_mf8_x4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/revd_mf8.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_mf8_x2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_mf8_x4.c
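
For anyone reading this later, here is a rough sketch of the "defensive |" idea from the reply above: the expected body accepts every minimum-length expansion the test author considers correct, not only the one the compiler currently emits. Everything in it is an assumption made for illustration. The two instruction sequences are hypothetical (a commutative predicated multiply with an untied destination), the names ACCEPTED and body_matches are invented, and Python's re module stands in for the harness's own matching, which is implemented differently.

```python
import re

# Illustrative only: two equally short expansions of a commutative,
# predicated multiply whose result register differs from both inputs.
# These sequences are assumptions for the sketch, not copied from the patch.
ACCEPTED = re.compile(
    r"(?:"
    r"\tmovprfx\tz0, z1\n"
    r"\tmul\tz0\.b, p0/m, z0\.b, z2\.b\n"
    r"|"
    r"\tmovprfx\tz0, z2\n"
    r"\tmul\tz0\.b, p0/m, z0\.b, z1\.b\n"
    r")"
    r"\tret\n"
)


def body_matches(asm: str) -> bool:
    """Return True if the whole function body is one of the accepted forms."""
    return ACCEPTED.fullmatch(asm) is not None


# Either operand order is accepted, since both use the minimum two instructions.
first = "\tmovprfx\tz0, z1\n\tmul\tz0.b, p0/m, z0.b, z2.b\n\tret\n"
second = "\tmovprfx\tz0, z2\n\tmul\tz0.b, p0/m, z0.b, z1.b\n\tret\n"
# A body with an extra move is longer than the minimum and is rejected.
longer = "\tmov\tz3.d, z1.d\n" + first

print(body_matches(first), body_matches(second), body_matches(longer))
```

The point of the alternation is that a later change in operand choice moves the output from one accepted branch to the other without touching the test, while a body with extra instructions still fails.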
