On 5/2/23 17:28, Kyrylo Tkachov wrote:
-----Original Message----- From: Christophe Lyon <[email protected]> Sent: Tuesday, May 2, 2023 3:05 PM To: Kyrylo Tkachov <[email protected]>; [email protected]; Richard Earnshaw <[email protected]>; Richard Sandiford <[email protected]> Subject: Re: [PATCH 03/22] arm: [MVE intrinsics] Rework vreinterpretq On 5/2/23 12:26, Kyrylo Tkachov wrote: Hi Christophe, -----Original Message----- From: Christophe Lyon <[email protected]> <mailto:[email protected]> Sent: Tuesday, April 18, 2023 2:46 PM To: [email protected] <mailto:gcc- [email protected]> ; Kyrylo Tkachov <[email protected]> <mailto:[email protected]> ; Richard Earnshaw <[email protected]> <mailto:[email protected]> ; Richard Sandiford <[email protected]> <mailto:[email protected]> Cc: Christophe Lyon <[email protected]> <mailto:[email protected]> Subject: [PATCH 03/22] arm: [MVE intrinsics] Rework vreinterpretq This patch implements vreinterpretq using the new MVE intrinsics framework. The old definitions for vreinterpretq are removed as a consequence. 2022-09-08 Murray Steele <[email protected]> <mailto:[email protected]> Christophe Lyon <[email protected]> <mailto:[email protected]> gcc/ * config/arm/arm-mve-builtins-base.cc (vreinterpretq_impl): New class. * config/arm/arm-mve-builtins-base.def: Define vreinterpretq. * config/arm/arm-mve-builtins-base.h (vreinterpretq): New declaration. * config/arm/arm-mve-builtins-shapes.cc (parse_element_type): New function. (parse_type): Likewise. (parse_signature): Likewise. (build_one): Likewise. (build_all): Likewise. (overloaded_base): New struct. (unary_convert_def): Likewise. * config/arm/arm-mve-builtins-shapes.h (unary_convert): Declare. * config/arm/arm-mve-builtins.cc (TYPES_reinterpret_signed1): New macro. (TYPES_reinterpret_unsigned1): Likewise. (TYPES_reinterpret_integer): Likewise. (TYPES_reinterpret_integer1): Likewise. (TYPES_reinterpret_float1): Likewise. (TYPES_reinterpret_float): Likewise. (reinterpret_integer): New. (reinterpret_float): New. (handle_arm_mve_h): Register builtins. * config/arm/arm_mve.h (vreinterpretq_s16): Remove. (vreinterpretq_s32): Likewise. (vreinterpretq_s64): Likewise. (vreinterpretq_s8): Likewise. (vreinterpretq_u16): Likewise. (vreinterpretq_u32): Likewise. (vreinterpretq_u64): Likewise. (vreinterpretq_u8): Likewise. (vreinterpretq_f16): Likewise. (vreinterpretq_f32): Likewise. (vreinterpretq_s16_s32): Likewise. (vreinterpretq_s16_s64): Likewise. (vreinterpretq_s16_s8): Likewise. (vreinterpretq_s16_u16): Likewise. (vreinterpretq_s16_u32): Likewise. (vreinterpretq_s16_u64): Likewise. (vreinterpretq_s16_u8): Likewise. (vreinterpretq_s32_s16): Likewise. (vreinterpretq_s32_s64): Likewise. (vreinterpretq_s32_s8): Likewise. (vreinterpretq_s32_u16): Likewise. (vreinterpretq_s32_u32): Likewise. (vreinterpretq_s32_u64): Likewise. (vreinterpretq_s32_u8): Likewise. (vreinterpretq_s64_s16): Likewise. (vreinterpretq_s64_s32): Likewise. (vreinterpretq_s64_s8): Likewise. (vreinterpretq_s64_u16): Likewise. (vreinterpretq_s64_u32): Likewise. (vreinterpretq_s64_u64): Likewise. (vreinterpretq_s64_u8): Likewise. (vreinterpretq_s8_s16): Likewise. (vreinterpretq_s8_s32): Likewise. (vreinterpretq_s8_s64): Likewise. (vreinterpretq_s8_u16): Likewise. (vreinterpretq_s8_u32): Likewise. (vreinterpretq_s8_u64): Likewise. (vreinterpretq_s8_u8): Likewise. (vreinterpretq_u16_s16): Likewise. (vreinterpretq_u16_s32): Likewise. (vreinterpretq_u16_s64): Likewise. (vreinterpretq_u16_s8): Likewise. (vreinterpretq_u16_u32): Likewise. (vreinterpretq_u16_u64): Likewise. (vreinterpretq_u16_u8): Likewise. (vreinterpretq_u32_s16): Likewise. (vreinterpretq_u32_s32): Likewise. (vreinterpretq_u32_s64): Likewise. (vreinterpretq_u32_s8): Likewise. (vreinterpretq_u32_u16): Likewise. (vreinterpretq_u32_u64): Likewise. (vreinterpretq_u32_u8): Likewise. (vreinterpretq_u64_s16): Likewise. (vreinterpretq_u64_s32): Likewise. (vreinterpretq_u64_s64): Likewise. (vreinterpretq_u64_s8): Likewise. (vreinterpretq_u64_u16): Likewise. (vreinterpretq_u64_u32): Likewise. (vreinterpretq_u64_u8): Likewise. (vreinterpretq_u8_s16): Likewise. (vreinterpretq_u8_s32): Likewise. (vreinterpretq_u8_s64): Likewise. (vreinterpretq_u8_s8): Likewise. (vreinterpretq_u8_u16): Likewise. (vreinterpretq_u8_u32): Likewise. (vreinterpretq_u8_u64): Likewise. (vreinterpretq_s32_f16): Likewise. (vreinterpretq_s32_f32): Likewise. (vreinterpretq_u16_f16): Likewise. (vreinterpretq_u16_f32): Likewise. (vreinterpretq_u32_f16): Likewise. (vreinterpretq_u32_f32): Likewise. (vreinterpretq_u64_f16): Likewise. (vreinterpretq_u64_f32): Likewise. (vreinterpretq_u8_f16): Likewise. (vreinterpretq_u8_f32): Likewise. (vreinterpretq_f16_f32): Likewise. (vreinterpretq_f16_s16): Likewise. (vreinterpretq_f16_s32): Likewise. (vreinterpretq_f16_s64): Likewise. (vreinterpretq_f16_s8): Likewise. (vreinterpretq_f16_u16): Likewise. (vreinterpretq_f16_u32): Likewise. (vreinterpretq_f16_u64): Likewise. (vreinterpretq_f16_u8): Likewise. (vreinterpretq_f32_f16): Likewise. (vreinterpretq_f32_s16): Likewise. (vreinterpretq_f32_s32): Likewise. (vreinterpretq_f32_s64): Likewise. (vreinterpretq_f32_s8): Likewise. (vreinterpretq_f32_u16): Likewise. (vreinterpretq_f32_u32): Likewise. (vreinterpretq_f32_u64): Likewise. (vreinterpretq_f32_u8): Likewise. (vreinterpretq_s16_f16): Likewise. (vreinterpretq_s16_f32): Likewise. (vreinterpretq_s64_f16): Likewise. (vreinterpretq_s64_f32): Likewise. (vreinterpretq_s8_f16): Likewise. (vreinterpretq_s8_f32): Likewise. (__arm_vreinterpretq_f16): Likewise. (__arm_vreinterpretq_f32): Likewise. (__arm_vreinterpretq_s16): Likewise. (__arm_vreinterpretq_s32): Likewise. (__arm_vreinterpretq_s64): Likewise. (__arm_vreinterpretq_s8): Likewise. (__arm_vreinterpretq_u16): Likewise. (__arm_vreinterpretq_u32): Likewise. (__arm_vreinterpretq_u64): Likewise. (__arm_vreinterpretq_u8): Likewise. * config/arm/arm_mve_types.h (__arm_vreinterpretq_s16_s32): Remove. (__arm_vreinterpretq_s16_s64): Likewise. (__arm_vreinterpretq_s16_s8): Likewise. (__arm_vreinterpretq_s16_u16): Likewise. (__arm_vreinterpretq_s16_u32): Likewise. (__arm_vreinterpretq_s16_u64): Likewise. (__arm_vreinterpretq_s16_u8): Likewise. (__arm_vreinterpretq_s32_s16): Likewise. (__arm_vreinterpretq_s32_s64): Likewise. (__arm_vreinterpretq_s32_s8): Likewise. (__arm_vreinterpretq_s32_u16): Likewise. (__arm_vreinterpretq_s32_u32): Likewise. (__arm_vreinterpretq_s32_u64): Likewise. (__arm_vreinterpretq_s32_u8): Likewise. (__arm_vreinterpretq_s64_s16): Likewise. (__arm_vreinterpretq_s64_s32): Likewise. (__arm_vreinterpretq_s64_s8): Likewise. (__arm_vreinterpretq_s64_u16): Likewise. (__arm_vreinterpretq_s64_u32): Likewise. (__arm_vreinterpretq_s64_u64): Likewise. (__arm_vreinterpretq_s64_u8): Likewise. (__arm_vreinterpretq_s8_s16): Likewise. (__arm_vreinterpretq_s8_s32): Likewise. (__arm_vreinterpretq_s8_s64): Likewise. (__arm_vreinterpretq_s8_u16): Likewise. (__arm_vreinterpretq_s8_u32): Likewise. (__arm_vreinterpretq_s8_u64): Likewise. (__arm_vreinterpretq_s8_u8): Likewise. (__arm_vreinterpretq_u16_s16): Likewise. (__arm_vreinterpretq_u16_s32): Likewise. (__arm_vreinterpretq_u16_s64): Likewise. (__arm_vreinterpretq_u16_s8): Likewise. (__arm_vreinterpretq_u16_u32): Likewise. (__arm_vreinterpretq_u16_u64): Likewise. (__arm_vreinterpretq_u16_u8): Likewise. (__arm_vreinterpretq_u32_s16): Likewise. (__arm_vreinterpretq_u32_s32): Likewise. (__arm_vreinterpretq_u32_s64): Likewise. (__arm_vreinterpretq_u32_s8): Likewise. (__arm_vreinterpretq_u32_u16): Likewise. (__arm_vreinterpretq_u32_u64): Likewise. (__arm_vreinterpretq_u32_u8): Likewise. (__arm_vreinterpretq_u64_s16): Likewise. (__arm_vreinterpretq_u64_s32): Likewise. (__arm_vreinterpretq_u64_s64): Likewise. (__arm_vreinterpretq_u64_s8): Likewise. (__arm_vreinterpretq_u64_u16): Likewise. (__arm_vreinterpretq_u64_u32): Likewise. (__arm_vreinterpretq_u64_u8): Likewise. (__arm_vreinterpretq_u8_s16): Likewise. (__arm_vreinterpretq_u8_s32): Likewise. (__arm_vreinterpretq_u8_s64): Likewise. (__arm_vreinterpretq_u8_s8): Likewise. (__arm_vreinterpretq_u8_u16): Likewise. (__arm_vreinterpretq_u8_u32): Likewise. (__arm_vreinterpretq_u8_u64): Likewise. (__arm_vreinterpretq_s32_f16): Likewise. (__arm_vreinterpretq_s32_f32): Likewise. (__arm_vreinterpretq_s16_f16): Likewise. (__arm_vreinterpretq_s16_f32): Likewise. (__arm_vreinterpretq_s64_f16): Likewise. (__arm_vreinterpretq_s64_f32): Likewise. (__arm_vreinterpretq_s8_f16): Likewise. (__arm_vreinterpretq_s8_f32): Likewise. (__arm_vreinterpretq_u16_f16): Likewise. (__arm_vreinterpretq_u16_f32): Likewise. (__arm_vreinterpretq_u32_f16): Likewise. (__arm_vreinterpretq_u32_f32): Likewise. (__arm_vreinterpretq_u64_f16): Likewise. (__arm_vreinterpretq_u64_f32): Likewise. (__arm_vreinterpretq_u8_f16): Likewise. (__arm_vreinterpretq_u8_f32): Likewise. (__arm_vreinterpretq_f16_f32): Likewise. (__arm_vreinterpretq_f16_s16): Likewise. (__arm_vreinterpretq_f16_s32): Likewise. (__arm_vreinterpretq_f16_s64): Likewise. (__arm_vreinterpretq_f16_s8): Likewise. (__arm_vreinterpretq_f16_u16): Likewise. (__arm_vreinterpretq_f16_u32): Likewise. (__arm_vreinterpretq_f16_u64): Likewise. (__arm_vreinterpretq_f16_u8): Likewise. (__arm_vreinterpretq_f32_f16): Likewise. (__arm_vreinterpretq_f32_s16): Likewise. (__arm_vreinterpretq_f32_s32): Likewise. (__arm_vreinterpretq_f32_s64): Likewise. (__arm_vreinterpretq_f32_s8): Likewise. (__arm_vreinterpretq_f32_u16): Likewise. (__arm_vreinterpretq_f32_u32): Likewise. (__arm_vreinterpretq_f32_u64): Likewise. (__arm_vreinterpretq_f32_u8): Likewise. (__arm_vreinterpretq_s16): Likewise. (__arm_vreinterpretq_s32): Likewise. (__arm_vreinterpretq_s64): Likewise. (__arm_vreinterpretq_s8): Likewise. (__arm_vreinterpretq_u16): Likewise. (__arm_vreinterpretq_u32): Likewise. (__arm_vreinterpretq_u64): Likewise. (__arm_vreinterpretq_u8): Likewise. (__arm_vreinterpretq_f16): Likewise. (__arm_vreinterpretq_f32): Likewise. * config/arm/mve.md (@arm_mve_reinterpret<mode>): New pattern. * config/arm/unspecs.md: (REINTERPRET): New unspec. gcc/testsuite/ * g++.target/arm/mve.exp: Add general-c++ and general directories. * g++.target/arm/mve/general-c++/nomve_fp_1.c: New test. * g++.target/arm/mve/general-c++/vreinterpretq_1.C: New test. * gcc.target/arm/mve/general-c/nomve_fp_1.c: New test. * gcc.target/arm/mve/general-c/vreinterpretq_1.c: New test. --- gcc/config/arm/arm-mve-builtins-base.cc | 29 + gcc/config/arm/arm-mve-builtins-base.def | 2 + gcc/config/arm/arm-mve-builtins-base.h | 2 + gcc/config/arm/arm-mve-builtins-shapes.cc | 28 + gcc/config/arm/arm-mve-builtins-shapes.h | 8 + gcc/config/arm/arm-mve-builtins.cc | 60 + gcc/config/arm/arm_mve.h | 300 ---- gcc/config/arm/arm_mve_types.h | 1365 +------------- --- gcc/config/arm/mve.md | 18 + gcc/config/arm/unspecs.md | 1 + gcc/testsuite/g++.target/arm/mve.exp | 8 +- .../arm/mve/general-c++/nomve_fp_1.c | 15 + .../arm/mve/general-c++/vreinterpretq_1.C | 25 + .../gcc.target/arm/mve/general-c/nomve_fp_1.c | 15 + .../arm/mve/general-c/vreinterpretq_1.c | 25 + 15 files changed, 286 insertions(+), 1615 deletions(-) create mode 100644 gcc/testsuite/g++.target/arm/mve/general- c++/nomve_fp_1.c create mode 100644 gcc/testsuite/g++.target/arm/mve/general- c++/vreinterpretq_1.C create mode 100644 gcc/testsuite/gcc.target/arm/mve/general- c/nomve_fp_1.c create mode 100644 gcc/testsuite/gcc.target/arm/mve/general- c/vreinterpretq_1.c diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm- mve-builtins-base.cc index e9f285faf2b..ad8d500afc6 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -38,8 +38,37 @@ using namespace arm_mve; namespace { +/* Implements vreinterpretq_* intrinsics. */ +class vreinterpretq_impl : public quiet<function_base> +{ + gimple * + fold (gimple_folder &f) const override + { + /* Punt to rtl if the effect of the reinterpret on registers does not + conform to GCC's endianness model. */ + if (!targetm.can_change_mode_class (f.vector_mode (0), + f.vector_mode (1), VFP_REGS)) + return NULL; + So we punt to an RTL pattern here if we cannot change mode class... [snip] diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 35eab6c94bf..ab688396f97 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -10561,3 +10561,21 @@ (define_expand "vcond_mask_<mode><MVE_vpred>" } DONE; }) + +;; Reinterpret operand 1 in operand 0's mode, without changing its contents. +(define_expand "@arm_mve_reinterpret<mode>" + [(set (match_operand:MVE_vecs 0 "register_operand") + (unspec:MVE_vecs + [(match_operand 1 "arm_any_register_operand")] + REINTERPRET))] + "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode)) + || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))" + { + machine_mode src_mode = GET_MODE (operands[1]); + if (targetm.can_change_mode_class (<MODE>mode, src_mode, VFP_REGS)) + { + emit_move_insn (operands[0], gen_lowpart (<MODE>mode, operands[1])); + DONE; + } + } +) ... But we still check can_change_mode_class in this pattern and if it's not true we emit the new REINTERPRET unspec without a corresponding define_insn pattern. Won't that ICE? Would this case occur on big-endian targets? Looks like you are right. However, arm_mve.h is protected by: #if __ARM_BIG_ENDIAN #error "MVE intrinsics are not supported in Big-Endian mode." Just tried to hack my arm_mve.h to accept big-endian, and indeed we do ICE. In fact, this pattern and vreinterpretq_impl above are quite similar to the aarch64 implementation. I tried with a sample svint16_t foo(svint8_t value1) { return svreinterpret_s16_s8(value1); } and it seems aarch64-none-elf-gcc -march=armv8.2-a+sve -mbig-endian is OK, although aarch64_can_change_mode_class() has: if (BYTES_BIG_ENDIAN) ... if (from_sve_p && GET_MODE_UNIT_SIZE (from) != GET_MODE_UNIT_SIZE (to)) return false; so it should have a similar problem? I', not sure why it doesn't ICE?I believe that's because there's a pattern in aarch64-sve.md that converts everything into a simple set with the right modes forced in. ;; A pattern for handling type punning on big-endian targets. We use a ;; special predicate for operand 1 to reduce the number of patterns. (define_insn_and_split "*aarch64_sve_reinterpret<mode>" [(set (match_operand:SVE_ALL 0 "register_operand" "=w") (unspec:SVE_ALL [(match_operand 1 "aarch64_any_register_operand" "w")] UNSPEC_REINTERPRET))] "TARGET_SVE" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 1))] { operands[1] = aarch64_replace_reg_mode (operands[1], <MODE>mode); } )
Ha, right, thanks.
I guess since we don't claim to support big-endian MVE for now we probably don't need to handle it, but I wonder whether we should instead be asserting that targetm.can_change_mode_class is true in the folding code and adding a comment that it for future big-endian support it should be handled properly in the .md file as on aarch64?
Sure, I can easily add an assert, will do for v2. Thanks, Christophe
Thanks, KyrillThanks, Christophe Thanks, Kyrill diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index 84384ee798d..dccda283573 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -1255,4 +1255,5 @@ (define_c_enum "unspec" [ SQRSHRL_64 SQRSHRL_48 VSHLCQ_M_ + REINTERPRET ])
