This series adds support for:

- FEAT_SVE_B16B16 (contains both SVE and SME instructions)
- FEAT_SME_B16B16
- FEAT_SME_F16F16
- FEAT_SME2p1

I've bundled them together because they depend on each other and because
they need some common preparatory patches.  They also depend on
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668717.html
(and so some tests are likely to fail CI).

Tested on aarch64-linux-gnu.  I'll push when the prerequisites are approved
(but no earlier than Monday).

Richard

Richard Sandiford (8):
  aarch64: Rework sme_2mode_function insns
  aarch64: Refactor SVE predicated-to-unpredicated splits
  aarch64: Rename some SME iterators
  aarch64: Fix the choice of unspec in two SME patterns
  aarch64: Add support for SVE_B16B16
  aarch64: Add support for SME_F16F16
  aarch64: Add support for SME_B16B16
  aarch64: Add support for SME2p1

 gcc/config/aarch64/aarch64-c.cc | 9 +
 .../aarch64/aarch64-option-extensions.def | 9 +
 gcc/config/aarch64/aarch64-sme.md | 414 ++++++++++++++----
 .../aarch64/aarch64-sve-builtins-base.cc | 5 +-
 .../aarch64/aarch64-sve-builtins-functions.h | 15 +-
 .../aarch64/aarch64-sve-builtins-shapes.cc | 11 +
 .../aarch64/aarch64-sve-builtins-shapes.h | 1 +
 .../aarch64/aarch64-sve-builtins-sme.cc | 61 ++-
 .../aarch64/aarch64-sve-builtins-sme.def | 41 ++
 gcc/config/aarch64/aarch64-sve-builtins-sme.h | 3 +
 .../aarch64/aarch64-sve-builtins-sve2.cc | 11 +
 .../aarch64/aarch64-sve-builtins-sve2.def | 32 ++
 .../aarch64/aarch64-sve-builtins-sve2.h | 1 +
 gcc/config/aarch64/aarch64-sve-builtins.cc | 27 +-
 gcc/config/aarch64/aarch64-sve.md | 408 +++++++++--------
 gcc/config/aarch64/aarch64-sve2.md | 101 ++++-
 gcc/config/aarch64/aarch64.cc | 6 +-
 gcc/config/aarch64/aarch64.h | 19 +
 gcc/config/aarch64/aarch64.md | 29 +-
 gcc/config/aarch64/iterators.md | 102 ++++-
 gcc/config/aarch64/predicates.md | 1 +
 gcc/doc/invoke.texi | 18 +-
 .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 99 ++++-
 .../sme2/acle-asm/add_za16_bf16_vg1x2.c | 126 ++++++
 .../sme2/acle-asm/add_za16_bf16_vg1x4.c | 141 ++++++
 .../sme2/acle-asm/add_za16_f16_vg1x2.c | 126 ++++++
 .../sme2/acle-asm/add_za16_f16_vg1x4.c | 141 ++++++
 .../aarch64/sme2/acle-asm/clamp_bf16_x2.c | 98 +++++
 .../aarch64/sme2/acle-asm/clamp_bf16_x4.c | 108 +++++
 .../aarch64/sme2/acle-asm/cvt_f32_f16_x2.c | 54 +++
 .../aarch64/sme2/acle-asm/cvtl_f32_f16_x2.c | 54 +++
 .../aarch64/sme2/acle-asm/max_bf16_x2.c | 211 +++++++++
 .../aarch64/sme2/acle-asm/max_bf16_x4.c | 253 +++++++++++
 .../aarch64/sme2/acle-asm/maxnm_bf16_x2.c | 211 +++++++++
 .../aarch64/sme2/acle-asm/maxnm_bf16_x4.c | 253 +++++++++++
 .../aarch64/sme2/acle-asm/min_bf16_x2.c | 211 +++++++++
 .../aarch64/sme2/acle-asm/min_bf16_x4.c | 253 +++++++++++
 .../aarch64/sme2/acle-asm/minnm_bf16_x2.c | 211 +++++++++
 .../aarch64/sme2/acle-asm/minnm_bf16_x4.c | 253 +++++++++++
 .../sme2/acle-asm/mla_lane_za16_bf16_vg1x2.c | 106 +++++
 .../sme2/acle-asm/mla_lane_za16_bf16_vg1x4.c | 112 +++++
 .../sme2/acle-asm/mla_lane_za16_f16_vg1x2.c | 106 +++++
 .../sme2/acle-asm/mla_lane_za16_f16_vg1x4.c | 112 +++++
 .../sme2/acle-asm/mla_za16_bf16_vg1x2.c | 184 ++++++++
 .../sme2/acle-asm/mla_za16_bf16_vg1x4.c | 176 ++++++++
 .../sme2/acle-asm/mla_za16_f16_vg1x2.c | 184 ++++++++
 .../sme2/acle-asm/mla_za16_f16_vg1x4.c | 176 ++++++++
 .../sme2/acle-asm/mls_lane_za16_bf16_vg1x2.c | 106 +++++
 .../sme2/acle-asm/mls_lane_za16_bf16_vg1x4.c | 112 +++++
 .../sme2/acle-asm/mls_lane_za16_f16_vg1x2.c | 106 +++++
 .../sme2/acle-asm/mls_lane_za16_f16_vg1x4.c | 112 +++++
 .../sme2/acle-asm/mls_za16_bf16_vg1x2.c | 184 ++++++++
 .../sme2/acle-asm/mls_za16_bf16_vg1x4.c | 176 ++++++++
 .../sme2/acle-asm/mls_za16_f16_vg1x2.c | 184 ++++++++
 .../sme2/acle-asm/mls_za16_f16_vg1x4.c | 176 ++++++++
 .../aarch64/sme2/acle-asm/mopa_za16_bf16.c | 34 ++
 .../aarch64/sme2/acle-asm/mopa_za16_f16.c | 34 ++
 .../aarch64/sme2/acle-asm/mops_za16_bf16.c | 34 ++
 .../aarch64/sme2/acle-asm/mops_za16_f16.c | 34 ++
 .../aarch64/sme2/acle-asm/readz_hor_za128.c | 187 ++++++++
 .../aarch64/sme2/acle-asm/readz_hor_za16.c | 127 ++++++
 .../sme2/acle-asm/readz_hor_za16_vg2.c | 144 ++++++
 .../sme2/acle-asm/readz_hor_za16_vg4.c | 142 ++++++
 .../aarch64/sme2/acle-asm/readz_hor_za32.c | 137 ++++++
 .../sme2/acle-asm/readz_hor_za32_vg2.c | 116 +++++
 .../sme2/acle-asm/readz_hor_za32_vg4.c | 133 ++++++
 .../aarch64/sme2/acle-asm/readz_hor_za64.c | 127 ++++++
 .../sme2/acle-asm/readz_hor_za64_vg2.c | 117 +++++
 .../sme2/acle-asm/readz_hor_za64_vg4.c | 133 ++++++
 .../aarch64/sme2/acle-asm/readz_hor_za8.c | 87 ++++
 .../aarch64/sme2/acle-asm/readz_hor_za8_vg2.c | 144 ++++++
 .../aarch64/sme2/acle-asm/readz_hor_za8_vg4.c | 160 +++++++
 .../aarch64/sme2/acle-asm/readz_ver_za16.c | 127 ++++++
 .../sme2/acle-asm/readz_ver_za16_vg2.c | 144 ++++++
 .../sme2/acle-asm/readz_ver_za16_vg4.c | 142 ++++++
 .../aarch64/sme2/acle-asm/readz_ver_za32.c | 137 ++++++
 .../sme2/acle-asm/readz_ver_za32_vg2.c | 116 +++++
 .../sme2/acle-asm/readz_ver_za32_vg4.c | 133 ++++++
 .../aarch64/sme2/acle-asm/readz_ver_za64.c | 127 ++++++
 .../sme2/acle-asm/readz_ver_za64_vg2.c | 117 +++++
 .../sme2/acle-asm/readz_ver_za64_vg4.c | 133 ++++++
 .../aarch64/sme2/acle-asm/readz_ver_za8.c | 87 ++++
 .../aarch64/sme2/acle-asm/readz_ver_za8_vg2.c | 144 ++++++
 .../aarch64/sme2/acle-asm/readz_ver_za8_vg4.c | 160 +++++++
 .../aarch64/sme2/acle-asm/readz_za16_vg1x2.c | 126 ++++++
 .../aarch64/sme2/acle-asm/readz_za16_vg1x4.c | 141 ++++++
 .../aarch64/sme2/acle-asm/readz_za32_vg1x2.c | 126 ++++++
 .../aarch64/sme2/acle-asm/readz_za32_vg1x4.c | 141 ++++++
 .../aarch64/sme2/acle-asm/readz_za64_vg1x2.c | 126 ++++++
 .../aarch64/sme2/acle-asm/readz_za64_vg1x4.c | 141 ++++++
 .../aarch64/sme2/acle-asm/readz_za8_vg1x2.c | 126 ++++++
 .../aarch64/sme2/acle-asm/readz_za8_vg1x4.c | 141 ++++++
 .../sme2/acle-asm/sub_za16_bf16_vg1x2.c | 126 ++++++
 .../sme2/acle-asm/sub_za16_bf16_vg1x4.c | 141 ++++++
 .../sme2/acle-asm/sub_za16_f16_vg1x2.c | 126 ++++++
 .../sme2/acle-asm/sub_za16_f16_vg1x4.c | 141 ++++++
 .../aarch64/sme2/acle-asm/zero_za64_vg1x2.c | 97 ++++
 .../aarch64/sme2/acle-asm/zero_za64_vg1x4.c | 97 ++++
 .../aarch64/sme2/acle-asm/zero_za64_vg2x1.c | 117 +++++
 .../aarch64/sme2/acle-asm/zero_za64_vg2x2.c | 97 ++++
 .../aarch64/sme2/acle-asm/zero_za64_vg2x4.c | 97 ++++
 .../aarch64/sme2/acle-asm/zero_za64_vg4x1.c | 127 ++++++
 .../aarch64/sme2/acle-asm/zero_za64_vg4x2.c | 97 ++++
 .../aarch64/sme2/acle-asm/zero_za64_vg4x4.c | 97 ++++
 .../aarch64/sve/acle/asm/test_sve_acle.h | 16 +
 .../gcc.target/aarch64/sve/bf16_arith_1.c | 10 +
 .../gcc.target/aarch64/sve/bf16_arith_1.h | 24 +
 .../gcc.target/aarch64/sve/bf16_arith_2.c | 8 +
 .../gcc.target/aarch64/sve/bf16_arith_3.c | 8 +
 .../gcc.target/aarch64/sve/cond_mla_9.c | 25 ++
 gcc/testsuite/gcc.target/aarch64/sve/fmad_1.c | 9 +-
 gcc/testsuite/gcc.target/aarch64/sve/fmla_1.c | 9 +-
 gcc/testsuite/gcc.target/aarch64/sve/fmls_1.c | 9 +-
 gcc/testsuite/gcc.target/aarch64/sve/fmsb_1.c | 9 +-
 .../aarch64/sve2/acle/asm/add_bf16.c | 315 +++++++++++++
 .../aarch64/sve2/acle/asm/clamp_bf16.c | 49 +++
 .../aarch64/sve2/acle/asm/max_bf16.c | 301 +++++++++++++
 .../aarch64/sve2/acle/asm/maxnm_bf16.c | 301 +++++++++++++
 .../aarch64/sve2/acle/asm/min_bf16.c | 301 +++++++++++++
 .../aarch64/sve2/acle/asm/minnm_bf16.c | 301 +++++++++++++
 .../aarch64/sve2/acle/asm/mla_bf16.c | 341 +++++++++++++++
 .../aarch64/sve2/acle/asm/mla_lane_bf16.c | 135 ++++++
 .../aarch64/sve2/acle/asm/mls_bf16.c | 341 +++++++++++++++
 .../aarch64/sve2/acle/asm/mls_lane_bf16.c | 135 ++++++
 .../aarch64/sve2/acle/asm/mul_bf16.c | 315 +++++++++++++
 .../aarch64/sve2/acle/asm/mul_lane_bf16.c | 121 +++++
 .../aarch64/sve2/acle/asm/sub_bf16.c | 304 +++++++++++++
 gcc/testsuite/lib/target-supports.exp | 3 +-
 128 files changed, 15365 insertions(+), 349 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/add_za16_bf16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/add_za16_bf16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/add_za16_f16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/add_za16_f16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/clamp_bf16_x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/clamp_bf16_x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/cvt_f32_f16_x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/cvtl_f32_f16_x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/max_bf16_x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/max_bf16_x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/maxnm_bf16_x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/maxnm_bf16_x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/min_bf16_x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/min_bf16_x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/minnm_bf16_x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/minnm_bf16_x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_bf16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_bf16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_f16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_f16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za16_bf16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za16_bf16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za16_f16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za16_f16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_lane_za16_bf16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_lane_za16_bf16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_lane_za16_f16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_lane_za16_f16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_za16_bf16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_za16_bf16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_za16_f16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mls_za16_f16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mopa_za16_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mopa_za16_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mops_za16_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mops_za16_f16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za128.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za16_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za16_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za32_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za32_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za64_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za64_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za8_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_hor_za8_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za16_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za16_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za32_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za32_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za64.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za64_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za64_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za8_vg2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_ver_za8_vg4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za32_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za32_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za64_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za64_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za8_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/readz_za8_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/sub_za16_bf16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/sub_za16_bf16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/sub_za16_f16_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/sub_za16_f16_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg1x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg1x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg2x1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg2x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg2x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg4x1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg4x2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/zero_za64_vg4x4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/bf16_arith_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/bf16_arith_1.h
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/bf16_arith_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/bf16_arith_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cond_mla_9.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/add_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/clamp_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/max_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/maxnm_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/min_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/minnm_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mla_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mla_lane_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mls_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mls_lane_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mul_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mul_lane_bf16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sub_bf16.c

--
2.25.1