The ACLE defines a new set of fp8 vector types and intrinsics that operate on them. Some of these intrinsics treat the vectors as if they were bags of bits; others require an additional argument of type fpm_t.

The following patches introduce:
- the types
- intrinsics that operate without the fpm_t type
- foundational changes that will be used to implement intrinsics requiring
  an fpm_t argument at the end
- fp8 conversion intrinsics
- fp8 multiply accumulate intrinsics

Compared to v1 of this series, this version adds:
- A change to fix return of scalar fp8 values
- Added tests for sve<->simd conversions
- Support for svcvt* intrinsics along with supporting shapes

Compared to v2 of this series, this version:
- Removes the first patch to fix return of scalar fp8 (already merged)
- Uses b_data to add mf8 rather than TYPES_all_data directly
- Updated test register matching with regex rather than hardcoded regs
- fixed formatting in aarch64-sve-builtins-base.cc,
  aarch64-sve-builtins-sve2.cc, aarch64-sve-builtins.cc
- removed fpm mode from DEF_SVE_FUNCTION_GS
- added DEF_SVE_FUNCTION_GS_FPM
- renamed unary_convert_narrowxn_fpm to unary_convertxn_narrowt
- renamed unary_convertxn_fpm to unary_convertxn_narrow
- use require_scalar_type rather than require_derived_scalar_type
- moved emit_move_insn for fpmr into function_expander::expand
- simplified instruction patterns
- addressed style request from code review
- Added fp8 multiply accumulate intrinsics

Is this ok for master? I do not have commit rights yet, if ok, can someone commit it on my behalf?

Regression tested on aarch64-unknown-linux-gnu.

Thanks,
Claudio Bantaloukas

Claudio Bantaloukas (4):
  aarch64: Add basic svmfloat8_t support to arm_sve.h
  aarch64: specify fpm mode in function instances and groups
  aarch64: add svcvt* FP8 intrinsics
  aarch64: add SVE2 FP8 multiply accumulate intrinsics

 .../aarch64/aarch64-option-extensions.def | 4 +
 .../aarch64/aarch64-sve-builtins-base.cc | 21 +-
 .../aarch64/aarch64-sve-builtins-functions.h | 16 +-
 .../aarch64/aarch64-sve-builtins-shapes.cc | 164 +++++++++-
 .../aarch64/aarch64-sve-builtins-shapes.h | 4 +
 .../aarch64/aarch64-sve-builtins-sve2.cc | 101 ++++--
 .../aarch64/aarch64-sve-builtins-sve2.def | 28 ++
 .../aarch64/aarch64-sve-builtins-sve2.h | 14 +
 gcc/config/aarch64/aarch64-sve-builtins.cc | 71 ++++-
 gcc/config/aarch64/aarch64-sve-builtins.def | 11 +-
 gcc/config/aarch64/aarch64-sve-builtins.h | 28 +-
 gcc/config/aarch64/aarch64-sve2.md | 101 ++++++
 gcc/config/aarch64/aarch64.h | 13 +
 gcc/config/aarch64/iterators.md | 55 ++++
 gcc/doc/invoke.texi | 4 +
 .../aarch64/sve/acle/general-c++/mangle_1.C | 2 +
 .../aarch64/sve/acle/general-c++/mangle_2.C | 2 +
 .../aarch64/sve/acle/asm/clasta_mf8.c | 52 +++
 .../aarch64/sve/acle/asm/clastb_mf8.c | 52 +++
 .../aarch64/sve/acle/asm/create2_1.c | 15 +
 .../aarch64/sve/acle/asm/create3_1.c | 11 +
 .../aarch64/sve/acle/asm/create4_1.c | 12 +
 .../aarch64/sve/acle/asm/dup_lane_mf8.c | 124 ++++++++
 .../gcc.target/aarch64/sve/acle/asm/dup_mf8.c | 31 ++
 .../aarch64/sve/acle/asm/dup_neonq_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/dupq_lane_mf8.c | 48 +++
 .../gcc.target/aarch64/sve/acle/asm/ext_mf8.c | 73 +++++
 .../aarch64/sve/acle/asm/get2_mf8.c | 55 ++++
 .../aarch64/sve/acle/asm/get3_mf8.c | 108 +++++++
 .../aarch64/sve/acle/asm/get4_mf8.c | 179 +++++++++++
 .../aarch64/sve/acle/asm/get_neonq_mf8.c | 33 ++
 .../aarch64/sve/acle/asm/insr_mf8.c | 22 ++
 .../aarch64/sve/acle/asm/lasta_mf8.c | 12 +
 .../aarch64/sve/acle/asm/lastb_mf8.c | 12 +
 .../gcc.target/aarch64/sve/acle/asm/ld1_mf8.c | 162 ++++++++++
 .../aarch64/sve/acle/asm/ld1ro_mf8.c | 121 +++++++
 .../aarch64/sve/acle/asm/ld1rq_mf8.c | 137 ++++++++
 .../gcc.target/aarch64/sve/acle/asm/ld2_mf8.c | 204 ++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/ld3_mf8.c | 246 +++++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/ld4_mf8.c | 290 +++++++++++++++++
 .../aarch64/sve/acle/asm/ldff1_mf8.c | 91 ++++++
 .../aarch64/sve/acle/asm/ldnf1_mf8.c | 155 +++++++++
 .../aarch64/sve/acle/asm/ldnt1_mf8.c | 162 ++++++++++
 .../gcc.target/aarch64/sve/acle/asm/len_mf8.c | 12 +
 .../aarch64/sve/acle/asm/reinterpret_bf16.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_f16.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_f32.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_f64.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_mf8.c | 297 ++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_s16.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_s32.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_s64.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_s8.c | 17 +
 .../aarch64/sve/acle/asm/reinterpret_u16.c | 28 ++
 .../aarch64/sve/acle/asm/reinterpret_u32.c | 28 ++
 .../aarch64/sve/acle/asm/reinterpret_u64.c | 28 ++
 .../aarch64/sve/acle/asm/reinterpret_u8.c | 28 ++
 .../gcc.target/aarch64/sve/acle/asm/rev_mf8.c | 21 ++
 .../gcc.target/aarch64/sve/acle/asm/sel_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/set2_mf8.c | 41 +++
 .../aarch64/sve/acle/asm/set3_mf8.c | 63 ++++
 .../aarch64/sve/acle/asm/set4_mf8.c | 87 +++++
 .../aarch64/sve/acle/asm/set_neonq_mf8.c | 23 ++
 .../aarch64/sve/acle/asm/splice_mf8.c | 33 ++
 .../gcc.target/aarch64/sve/acle/asm/st1_mf8.c | 162 ++++++++++
 .../gcc.target/aarch64/sve/acle/asm/st2_mf8.c | 204 ++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/st3_mf8.c | 246 +++++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/st4_mf8.c | 290 +++++++++++++++++
 .../aarch64/sve/acle/asm/stnt1_mf8.c | 162 ++++++++++
 .../gcc.target/aarch64/sve/acle/asm/tbl_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/test_sve_acle.h | 8 +-
 .../aarch64/sve/acle/asm/trn1_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/trn1q_mf8.c | 33 ++
 .../aarch64/sve/acle/asm/trn2_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/trn2q_mf8.c | 33 ++
 .../aarch64/sve/acle/asm/undef2_1.c | 7 +
 .../aarch64/sve/acle/asm/undef3_1.c | 7 +
 .../aarch64/sve/acle/asm/undef4_1.c | 7 +
 .../gcc.target/aarch64/sve/acle/asm/undef_1.c | 7 +
 .../aarch64/sve/acle/asm/uzp1_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/uzp1q_mf8.c | 33 ++
 .../aarch64/sve/acle/asm/uzp2_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/uzp2q_mf8.c | 33 ++
 .../aarch64/sve/acle/asm/zip1_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/zip1q_mf8.c | 33 ++
 .../aarch64/sve/acle/asm/zip2_mf8.c | 30 ++
 .../aarch64/sve/acle/asm/zip2q_mf8.c | 33 ++
 .../acle/general-c/ternary_mfloat8_lane_1.c | 83 +++++
 .../acle/general-c/ternary_mfloat8_opt_n_1.c | 60 ++++
 .../acle/general-c/unary_convertxn_narrow_1.c | 60 ++++
 .../general-c/unary_convertxn_narrowt_1.c | 38 +++
 .../gcc.target/aarch64/sve/pcs/annotate_1.c | 8 +
 .../gcc.target/aarch64/sve/pcs/annotate_2.c | 8 +
 .../gcc.target/aarch64/sve/pcs/annotate_3.c | 8 +
 .../gcc.target/aarch64/sve/pcs/annotate_4.c | 12 +
 .../gcc.target/aarch64/sve/pcs/annotate_5.c | 12 +
 .../gcc.target/aarch64/sve/pcs/annotate_6.c | 12 +
 .../gcc.target/aarch64/sve/pcs/annotate_7.c | 8 +
 .../aarch64/sve/pcs/args_5_be_mf8.c | 63 ++++
 .../aarch64/sve/pcs/args_5_le_mf8.c | 58 ++++
 .../aarch64/sve/pcs/args_6_be_mf8.c | 71 +++++
 .../aarch64/sve/pcs/args_6_le_mf8.c | 70 +++++
 .../aarch64/sve/pcs/gnu_vectors_1.c | 12 +-
 .../aarch64/sve/pcs/gnu_vectors_2.c | 10 +-
 .../gcc.target/aarch64/sve/pcs/return_4.c | 21 +-
 .../aarch64/sve/pcs/return_4_1024.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_4_128.c | 21 +-
 .../aarch64/sve/pcs/return_4_2048.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_4_256.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_4_512.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_5.c | 21 +-
 .../aarch64/sve/pcs/return_5_1024.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_5_128.c | 21 +-
 .../aarch64/sve/pcs/return_5_2048.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_5_256.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_5_512.c | 21 +-
 .../gcc.target/aarch64/sve/pcs/return_6.c | 24 ++
 .../aarch64/sve/pcs/return_6_1024.c | 22 ++
 .../gcc.target/aarch64/sve/pcs/return_6_128.c | 19 ++
 .../aarch64/sve/pcs/return_6_2048.c | 22 ++
 .../gcc.target/aarch64/sve/pcs/return_6_256.c | 22 ++
 .../gcc.target/aarch64/sve/pcs/return_6_512.c | 22 ++
 .../gcc.target/aarch64/sve/pcs/return_7.c | 28 ++
 .../gcc.target/aarch64/sve/pcs/return_8.c | 29 ++
 .../gcc.target/aarch64/sve/pcs/return_9.c | 33 ++
 .../aarch64/sve/pcs/varargs_2_mf8.c | 182 +++++++++++
 .../aarch64/sve2/acle/asm/cvt_mf8.c | 46 +++
 .../aarch64/sve2/acle/asm/cvtlt_mf8.c | 47 +++
 .../aarch64/sve2/acle/asm/cvtn_mf8.c | 28 ++
 .../aarch64/sve2/acle/asm/cvtnb_mf8.c | 18 ++
 .../aarch64/sve2/acle/asm/cvtnt_mf8.c | 29 ++
 .../aarch64/sve2/acle/asm/mlalb_lane_mf8.c | 88 ++++++
 .../aarch64/sve2/acle/asm/mlalb_mf8.c | 75 +++++
 .../aarch64/sve2/acle/asm/mlallbb_lane_mf8.c | 88 ++++++
 .../aarch64/sve2/acle/asm/mlallbb_mf8.c | 75 +++++
 .../aarch64/sve2/acle/asm/mlallbt_lane_mf8.c | 88 ++++++
 .../aarch64/sve2/acle/asm/mlallbt_mf8.c | 75 +++++
 .../aarch64/sve2/acle/asm/mlalltb_lane_mf8.c | 88 ++++++
 .../aarch64/sve2/acle/asm/mlalltb_mf8.c | 75 +++++
 .../aarch64/sve2/acle/asm/mlalltt_lane_mf8.c | 88 ++++++
 .../aarch64/sve2/acle/asm/mlalltt_mf8.c | 75 +++++
 .../aarch64/sve2/acle/asm/mlalt_lane_mf8.c | 88 ++++++
 .../aarch64/sve2/acle/asm/mlalt_mf8.c | 75 +++++
 .../aarch64/sve2/acle/asm/tbl2_mf8.c | 31 ++
 .../aarch64/sve2/acle/asm/tbx_mf8.c | 37 +++
 .../aarch64/sve2/acle/asm/whilerw_mf8.c | 50 +++
 .../aarch64/sve2/acle/asm/whilewr_mf8.c | 50 +++
 gcc/testsuite/lib/target-supports.exp | 3 +-
 148 files changed, 7911 insertions(+), 93 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clasta_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clastb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_neonq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dupq_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ext_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get_neonq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/insr_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lasta_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lastb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1rq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnt1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/len_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rev_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sel_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set_neonq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/splice_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/stnt1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_opt_n_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrowt_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtlt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtn_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbl2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbx_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilerw_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilewr_mf8.c

-- 
2.43.0