> -----Original Message----- > From: Christophe Lyon <christophe.l...@arm.com> > Sent: Tuesday, May 2, 2023 3:05 PM > To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>; gcc-patches@gcc.gnu.org; > Richard Earnshaw <richard.earns...@arm.com>; Richard Sandiford > <richard.sandif...@arm.com> > Subject: Re: [PATCH 03/22] arm: [MVE intrinsics] Rework vreinterpretq > > > > > On 5/2/23 12:26, Kyrylo Tkachov wrote: > > > Hi Christophe, > > > -----Original Message----- > From: Christophe Lyon <christophe.l...@arm.com> > <mailto:christophe.l...@arm.com> > Sent: Tuesday, April 18, 2023 2:46 PM > To: gcc-patches@gcc.gnu.org <mailto:gcc- > patc...@gcc.gnu.org> ; Kyrylo Tkachov <kyrylo.tkac...@arm.com> > <mailto:kyrylo.tkac...@arm.com> ; > Richard Earnshaw <richard.earns...@arm.com> > <mailto:richard.earns...@arm.com> ; Richard Sandiford > <richard.sandif...@arm.com> > <mailto:richard.sandif...@arm.com> > Cc: Christophe Lyon <christophe.l...@arm.com> > <mailto:christophe.l...@arm.com> > Subject: [PATCH 03/22] arm: [MVE intrinsics] Rework > vreinterpretq > > This patch implements vreinterpretq using the new MVE > intrinsics > framework. > > The old definitions for vreinterpretq are removed as a > consequence. > > 2022-09-08 Murray Steele <murray.ste...@arm.com> > <mailto:murray.ste...@arm.com> > Christophe Lyon <christophe.l...@arm.com> > <mailto:christophe.l...@arm.com> > > gcc/ > * config/arm/arm-mve-builtins-base.cc > (vreinterpretq_impl): New > class. > * config/arm/arm-mve-builtins-base.def: Define > vreinterpretq. > * config/arm/arm-mve-builtins-base.h > (vreinterpretq): New > declaration. > * config/arm/arm-mve-builtins-shapes.cc > (parse_element_type): New > function. > (parse_type): Likewise. > (parse_signature): Likewise. > (build_one): Likewise. > (build_all): Likewise. > (overloaded_base): New struct. > (unary_convert_def): Likewise. > * config/arm/arm-mve-builtins-shapes.h > (unary_convert): Declare. > * config/arm/arm-mve-builtins.cc > (TYPES_reinterpret_signed1): New > macro. > (TYPES_reinterpret_unsigned1): Likewise. > (TYPES_reinterpret_integer): Likewise. > (TYPES_reinterpret_integer1): Likewise. > (TYPES_reinterpret_float1): Likewise. > (TYPES_reinterpret_float): Likewise. > (reinterpret_integer): New. > (reinterpret_float): New. > (handle_arm_mve_h): Register builtins. > * config/arm/arm_mve.h (vreinterpretq_s16): > Remove. > (vreinterpretq_s32): Likewise. > (vreinterpretq_s64): Likewise. > (vreinterpretq_s8): Likewise. > (vreinterpretq_u16): Likewise. > (vreinterpretq_u32): Likewise. > (vreinterpretq_u64): Likewise. > (vreinterpretq_u8): Likewise. > (vreinterpretq_f16): Likewise. > (vreinterpretq_f32): Likewise. > (vreinterpretq_s16_s32): Likewise. > (vreinterpretq_s16_s64): Likewise. > (vreinterpretq_s16_s8): Likewise. > (vreinterpretq_s16_u16): Likewise. > (vreinterpretq_s16_u32): Likewise. > (vreinterpretq_s16_u64): Likewise. > (vreinterpretq_s16_u8): Likewise. > (vreinterpretq_s32_s16): Likewise. > (vreinterpretq_s32_s64): Likewise. > (vreinterpretq_s32_s8): Likewise. > (vreinterpretq_s32_u16): Likewise. > (vreinterpretq_s32_u32): Likewise. > (vreinterpretq_s32_u64): Likewise. > (vreinterpretq_s32_u8): Likewise. > (vreinterpretq_s64_s16): Likewise. > (vreinterpretq_s64_s32): Likewise. > (vreinterpretq_s64_s8): Likewise. > (vreinterpretq_s64_u16): Likewise. > (vreinterpretq_s64_u32): Likewise. > (vreinterpretq_s64_u64): Likewise. > (vreinterpretq_s64_u8): Likewise. > (vreinterpretq_s8_s16): Likewise. > (vreinterpretq_s8_s32): Likewise. > (vreinterpretq_s8_s64): Likewise. > (vreinterpretq_s8_u16): Likewise. > (vreinterpretq_s8_u32): Likewise. > (vreinterpretq_s8_u64): Likewise. > (vreinterpretq_s8_u8): Likewise. > (vreinterpretq_u16_s16): Likewise. > (vreinterpretq_u16_s32): Likewise. > (vreinterpretq_u16_s64): Likewise. > (vreinterpretq_u16_s8): Likewise. > (vreinterpretq_u16_u32): Likewise. > (vreinterpretq_u16_u64): Likewise. > (vreinterpretq_u16_u8): Likewise. > (vreinterpretq_u32_s16): Likewise. > (vreinterpretq_u32_s32): Likewise. > (vreinterpretq_u32_s64): Likewise. > (vreinterpretq_u32_s8): Likewise. > (vreinterpretq_u32_u16): Likewise. > (vreinterpretq_u32_u64): Likewise. > (vreinterpretq_u32_u8): Likewise. > (vreinterpretq_u64_s16): Likewise. > (vreinterpretq_u64_s32): Likewise. > (vreinterpretq_u64_s64): Likewise. > (vreinterpretq_u64_s8): Likewise. > (vreinterpretq_u64_u16): Likewise. > (vreinterpretq_u64_u32): Likewise. > (vreinterpretq_u64_u8): Likewise. > (vreinterpretq_u8_s16): Likewise. > (vreinterpretq_u8_s32): Likewise. > (vreinterpretq_u8_s64): Likewise. > (vreinterpretq_u8_s8): Likewise. > (vreinterpretq_u8_u16): Likewise. > (vreinterpretq_u8_u32): Likewise. > (vreinterpretq_u8_u64): Likewise. > (vreinterpretq_s32_f16): Likewise. > (vreinterpretq_s32_f32): Likewise. > (vreinterpretq_u16_f16): Likewise. > (vreinterpretq_u16_f32): Likewise. > (vreinterpretq_u32_f16): Likewise. > (vreinterpretq_u32_f32): Likewise. > (vreinterpretq_u64_f16): Likewise. > (vreinterpretq_u64_f32): Likewise. > (vreinterpretq_u8_f16): Likewise. > (vreinterpretq_u8_f32): Likewise. > (vreinterpretq_f16_f32): Likewise. > (vreinterpretq_f16_s16): Likewise. > (vreinterpretq_f16_s32): Likewise. > (vreinterpretq_f16_s64): Likewise. > (vreinterpretq_f16_s8): Likewise. > (vreinterpretq_f16_u16): Likewise. > (vreinterpretq_f16_u32): Likewise. > (vreinterpretq_f16_u64): Likewise. > (vreinterpretq_f16_u8): Likewise. > (vreinterpretq_f32_f16): Likewise. > (vreinterpretq_f32_s16): Likewise. > (vreinterpretq_f32_s32): Likewise. > (vreinterpretq_f32_s64): Likewise. > (vreinterpretq_f32_s8): Likewise. > (vreinterpretq_f32_u16): Likewise. > (vreinterpretq_f32_u32): Likewise. > (vreinterpretq_f32_u64): Likewise. > (vreinterpretq_f32_u8): Likewise. > (vreinterpretq_s16_f16): Likewise. > (vreinterpretq_s16_f32): Likewise. > (vreinterpretq_s64_f16): Likewise. > (vreinterpretq_s64_f32): Likewise. > (vreinterpretq_s8_f16): Likewise. > (vreinterpretq_s8_f32): Likewise. > (__arm_vreinterpretq_f16): Likewise. > (__arm_vreinterpretq_f32): Likewise. > (__arm_vreinterpretq_s16): Likewise. > (__arm_vreinterpretq_s32): Likewise. > (__arm_vreinterpretq_s64): Likewise. > (__arm_vreinterpretq_s8): Likewise. > (__arm_vreinterpretq_u16): Likewise. > (__arm_vreinterpretq_u32): Likewise. > (__arm_vreinterpretq_u64): Likewise. > (__arm_vreinterpretq_u8): Likewise. > * config/arm/arm_mve_types.h > (__arm_vreinterpretq_s16_s32): > Remove. > (__arm_vreinterpretq_s16_s64): Likewise. > (__arm_vreinterpretq_s16_s8): Likewise. > (__arm_vreinterpretq_s16_u16): Likewise. > (__arm_vreinterpretq_s16_u32): Likewise. > (__arm_vreinterpretq_s16_u64): Likewise. > (__arm_vreinterpretq_s16_u8): Likewise. > (__arm_vreinterpretq_s32_s16): Likewise. > (__arm_vreinterpretq_s32_s64): Likewise. > (__arm_vreinterpretq_s32_s8): Likewise. > (__arm_vreinterpretq_s32_u16): Likewise. > (__arm_vreinterpretq_s32_u32): Likewise. > (__arm_vreinterpretq_s32_u64): Likewise. > (__arm_vreinterpretq_s32_u8): Likewise. > (__arm_vreinterpretq_s64_s16): Likewise. > (__arm_vreinterpretq_s64_s32): Likewise. > (__arm_vreinterpretq_s64_s8): Likewise. > (__arm_vreinterpretq_s64_u16): Likewise. > (__arm_vreinterpretq_s64_u32): Likewise. > (__arm_vreinterpretq_s64_u64): Likewise. > (__arm_vreinterpretq_s64_u8): Likewise. > (__arm_vreinterpretq_s8_s16): Likewise. > (__arm_vreinterpretq_s8_s32): Likewise. > (__arm_vreinterpretq_s8_s64): Likewise. > (__arm_vreinterpretq_s8_u16): Likewise. > (__arm_vreinterpretq_s8_u32): Likewise. > (__arm_vreinterpretq_s8_u64): Likewise. > (__arm_vreinterpretq_s8_u8): Likewise. > (__arm_vreinterpretq_u16_s16): Likewise. > (__arm_vreinterpretq_u16_s32): Likewise. > (__arm_vreinterpretq_u16_s64): Likewise. > (__arm_vreinterpretq_u16_s8): Likewise. > (__arm_vreinterpretq_u16_u32): Likewise. > (__arm_vreinterpretq_u16_u64): Likewise. > (__arm_vreinterpretq_u16_u8): Likewise. > (__arm_vreinterpretq_u32_s16): Likewise. > (__arm_vreinterpretq_u32_s32): Likewise. > (__arm_vreinterpretq_u32_s64): Likewise. > (__arm_vreinterpretq_u32_s8): Likewise. > (__arm_vreinterpretq_u32_u16): Likewise. > (__arm_vreinterpretq_u32_u64): Likewise. > (__arm_vreinterpretq_u32_u8): Likewise. > (__arm_vreinterpretq_u64_s16): Likewise. > (__arm_vreinterpretq_u64_s32): Likewise. > (__arm_vreinterpretq_u64_s64): Likewise. > (__arm_vreinterpretq_u64_s8): Likewise. > (__arm_vreinterpretq_u64_u16): Likewise. > (__arm_vreinterpretq_u64_u32): Likewise. > (__arm_vreinterpretq_u64_u8): Likewise. > (__arm_vreinterpretq_u8_s16): Likewise. > (__arm_vreinterpretq_u8_s32): Likewise. > (__arm_vreinterpretq_u8_s64): Likewise. > (__arm_vreinterpretq_u8_s8): Likewise. > (__arm_vreinterpretq_u8_u16): Likewise. > (__arm_vreinterpretq_u8_u32): Likewise. > (__arm_vreinterpretq_u8_u64): Likewise. > (__arm_vreinterpretq_s32_f16): Likewise. > (__arm_vreinterpretq_s32_f32): Likewise. > (__arm_vreinterpretq_s16_f16): Likewise. > (__arm_vreinterpretq_s16_f32): Likewise. > (__arm_vreinterpretq_s64_f16): Likewise. > (__arm_vreinterpretq_s64_f32): Likewise. > (__arm_vreinterpretq_s8_f16): Likewise. > (__arm_vreinterpretq_s8_f32): Likewise. > (__arm_vreinterpretq_u16_f16): Likewise. > (__arm_vreinterpretq_u16_f32): Likewise. > (__arm_vreinterpretq_u32_f16): Likewise. > (__arm_vreinterpretq_u32_f32): Likewise. > (__arm_vreinterpretq_u64_f16): Likewise. > (__arm_vreinterpretq_u64_f32): Likewise. > (__arm_vreinterpretq_u8_f16): Likewise. > (__arm_vreinterpretq_u8_f32): Likewise. > (__arm_vreinterpretq_f16_f32): Likewise. > (__arm_vreinterpretq_f16_s16): Likewise. > (__arm_vreinterpretq_f16_s32): Likewise. > (__arm_vreinterpretq_f16_s64): Likewise. > (__arm_vreinterpretq_f16_s8): Likewise. > (__arm_vreinterpretq_f16_u16): Likewise. > (__arm_vreinterpretq_f16_u32): Likewise. > (__arm_vreinterpretq_f16_u64): Likewise. > (__arm_vreinterpretq_f16_u8): Likewise. > (__arm_vreinterpretq_f32_f16): Likewise. > (__arm_vreinterpretq_f32_s16): Likewise. > (__arm_vreinterpretq_f32_s32): Likewise. > (__arm_vreinterpretq_f32_s64): Likewise. > (__arm_vreinterpretq_f32_s8): Likewise. > (__arm_vreinterpretq_f32_u16): Likewise. > (__arm_vreinterpretq_f32_u32): Likewise. > (__arm_vreinterpretq_f32_u64): Likewise. > (__arm_vreinterpretq_f32_u8): Likewise. > (__arm_vreinterpretq_s16): Likewise. > (__arm_vreinterpretq_s32): Likewise. > (__arm_vreinterpretq_s64): Likewise. > (__arm_vreinterpretq_s8): Likewise. > (__arm_vreinterpretq_u16): Likewise. > (__arm_vreinterpretq_u32): Likewise. > (__arm_vreinterpretq_u64): Likewise. > (__arm_vreinterpretq_u8): Likewise. > (__arm_vreinterpretq_f16): Likewise. > (__arm_vreinterpretq_f32): Likewise. > * config/arm/mve.md > (@arm_mve_reinterpret<mode>): New > pattern. > * config/arm/unspecs.md: (REINTERPRET): New > unspec. > > gcc/testsuite/ > * g++.target/arm/mve.exp: Add general-c++ and > general directories. > * g++.target/arm/mve/general-c++/nomve_fp_1.c: > New test. > * g++.target/arm/mve/general-c++/vreinterpretq_1.C: > New test. > * gcc.target/arm/mve/general-c/nomve_fp_1.c: New > test. > * gcc.target/arm/mve/general-c/vreinterpretq_1.c: > New test. > --- > gcc/config/arm/arm-mve-builtins-base.cc | 29 + > gcc/config/arm/arm-mve-builtins-base.def | 2 + > gcc/config/arm/arm-mve-builtins-base.h | 2 + > gcc/config/arm/arm-mve-builtins-shapes.cc | 28 + > gcc/config/arm/arm-mve-builtins-shapes.h | 8 + > gcc/config/arm/arm-mve-builtins.cc | 60 + > gcc/config/arm/arm_mve.h | 300 ---- > gcc/config/arm/arm_mve_types.h | 1365 > +------------- > --- > gcc/config/arm/mve.md | 18 + > gcc/config/arm/unspecs.md | 1 + > gcc/testsuite/g++.target/arm/mve.exp | 8 +- > .../arm/mve/general-c++/nomve_fp_1.c | 15 + > .../arm/mve/general-c++/vreinterpretq_1.C | 25 + > .../gcc.target/arm/mve/general-c/nomve_fp_1.c | 15 + > .../arm/mve/general-c/vreinterpretq_1.c | 25 + > 15 files changed, 286 insertions(+), 1615 deletions(-) > create mode 100644 > gcc/testsuite/g++.target/arm/mve/general- > c++/nomve_fp_1.c > create mode 100644 > gcc/testsuite/g++.target/arm/mve/general- > c++/vreinterpretq_1.C > create mode 100644 > gcc/testsuite/gcc.target/arm/mve/general- > c/nomve_fp_1.c > create mode 100644 > gcc/testsuite/gcc.target/arm/mve/general- > c/vreinterpretq_1.c > > diff --git a/gcc/config/arm/arm-mve-builtins-base.cc > b/gcc/config/arm/arm- > mve-builtins-base.cc > index e9f285faf2b..ad8d500afc6 100644 > --- a/gcc/config/arm/arm-mve-builtins-base.cc > +++ b/gcc/config/arm/arm-mve-builtins-base.cc > @@ -38,8 +38,37 @@ using namespace arm_mve; > > namespace { > > +/* Implements vreinterpretq_* intrinsics. */ > +class vreinterpretq_impl : public quiet<function_base> > +{ > + gimple * > + fold (gimple_folder &f) const override > + { > + /* Punt to rtl if the effect of the reinterpret on > registers > does not > + conform to GCC's endianness model. */ > + if (!targetm.can_change_mode_class (f.vector_mode (0), > + f.vector_mode (1), > VFP_REGS)) > + return NULL; > + > > > So we punt to an RTL pattern here if we cannot change mode class... > > [snip] > > > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md > index 35eab6c94bf..ab688396f97 100644 > --- a/gcc/config/arm/mve.md > +++ b/gcc/config/arm/mve.md > @@ -10561,3 +10561,21 @@ (define_expand > "vcond_mask_<mode><MVE_vpred>" > } > DONE; > }) > + > +;; Reinterpret operand 1 in operand 0's mode, without > changing its contents. > +(define_expand "@arm_mve_reinterpret<mode>" > + [(set (match_operand:MVE_vecs 0 "register_operand") > + (unspec:MVE_vecs > + [(match_operand 1 "arm_any_register_operand")] > + REINTERPRET))] > + "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE > (<MODE>mode)) > + || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE > (<MODE>mode))" > + { > + machine_mode src_mode = GET_MODE (operands[1]); > + if (targetm.can_change_mode_class (<MODE>mode, > src_mode, > VFP_REGS)) > + { > + emit_move_insn (operands[0], gen_lowpart > (<MODE>mode, > operands[1])); > + DONE; > + } > + } > +) > > > ... But we still check can_change_mode_class in this pattern and if it's > not true we emit the new REINTERPRET unspec > without a corresponding define_insn pattern. Won't that ICE? Would > this case occur on big-endian targets? > > > > > Looks like you are right. However, arm_mve.h is protected by: > > #if __ARM_BIG_ENDIAN > > #error "MVE intrinsics are not supported in Big-Endian mode." > > > > > Just tried to hack my arm_mve.h to accept big-endian, and indeed we do ICE. > > > > > In fact, this pattern and vreinterpretq_impl above are quite similar to the > aarch64 implementation. > > I tried with a sample > > svint16_t foo(svint8_t value1) > { > return svreinterpret_s16_s8(value1); > } > and it seems aarch64-none-elf-gcc -march=armv8.2-a+sve -mbig-endian is OK, > although > aarch64_can_change_mode_class() has: > if (BYTES_BIG_ENDIAN) > ... > if (from_sve_p && GET_MODE_UNIT_SIZE (from) != GET_MODE_UNIT_SIZE > (to)) > return false; > so it should have a similar problem? I', not sure why it doesn't ICE?
I believe that's because there's a pattern in aarch64-sve.md that converts everything into a simple set with the right modes forced in. ;; A pattern for handling type punning on big-endian targets. We use a ;; special predicate for operand 1 to reduce the number of patterns. (define_insn_and_split "*aarch64_sve_reinterpret<mode>" [(set (match_operand:SVE_ALL 0 "register_operand" "=w") (unspec:SVE_ALL [(match_operand 1 "aarch64_any_register_operand" "w")] UNSPEC_REINTERPRET))] "TARGET_SVE" "#" "&& reload_completed" [(set (match_dup 0) (match_dup 1))] { operands[1] = aarch64_replace_reg_mode (operands[1], <MODE>mode); } ) I guess since we don't claim to support big-endian MVE for now we probably don't need to handle it, but I wonder whether we should instead be asserting that targetm.can_change_mode_class is true in the folding code and adding a comment that it for future big-endian support it should be handled properly in the .md file as on aarch64? Thanks, Kyrill > Thanks, > Christophe > > > > Thanks, > Kyrill > > > diff --git a/gcc/config/arm/unspecs.md > b/gcc/config/arm/unspecs.md > index 84384ee798d..dccda283573 100644 > --- a/gcc/config/arm/unspecs.md > +++ b/gcc/config/arm/unspecs.md > @@ -1255,4 +1255,5 @@ (define_c_enum "unspec" [ > SQRSHRL_64 > SQRSHRL_48 > VSHLCQ_M_ > + REINTERPRET > ]) > >