> -----Original Message-----
> From: Christophe Lyon <christophe.l...@arm.com>
> Sent: Tuesday, May 2, 2023 3:05 PM
> To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>; gcc-patches@gcc.gnu.org;
> Richard Earnshaw <richard.earns...@arm.com>; Richard Sandiford
> <richard.sandif...@arm.com>
> Subject: Re: [PATCH 03/22] arm: [MVE intrinsics] Rework vreinterpretq
> 
> 
> 
> 
> On 5/2/23 12:26, Kyrylo Tkachov wrote:
> 
> 
>       Hi Christophe,
> 
> 
>               -----Original Message-----
>               From: Christophe Lyon <christophe.l...@arm.com>
> <mailto:christophe.l...@arm.com>
>               Sent: Tuesday, April 18, 2023 2:46 PM
>               To: gcc-patches@gcc.gnu.org <mailto:gcc-
> patc...@gcc.gnu.org> ; Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> <mailto:kyrylo.tkac...@arm.com> ;
>               Richard Earnshaw <richard.earns...@arm.com>
> <mailto:richard.earns...@arm.com> ; Richard Sandiford
>               <richard.sandif...@arm.com>
> <mailto:richard.sandif...@arm.com>
>               Cc: Christophe Lyon <christophe.l...@arm.com>
> <mailto:christophe.l...@arm.com>
>               Subject: [PATCH 03/22] arm: [MVE intrinsics] Rework
> vreinterpretq
> 
>               This patch implements vreinterpretq using the new MVE
> intrinsics
>               framework.
> 
>               The old definitions for vreinterpretq are removed as a
> consequence.
> 
>               2022-09-08  Murray Steele  <murray.ste...@arm.com>
> <mailto:murray.ste...@arm.com>
>                           Christophe Lyon  <christophe.l...@arm.com>
> <mailto:christophe.l...@arm.com>
> 
>                       gcc/
>                       * config/arm/arm-mve-builtins-base.cc
> (vreinterpretq_impl): New
>               class.
>                       * config/arm/arm-mve-builtins-base.def: Define
> vreinterpretq.
>                       * config/arm/arm-mve-builtins-base.h
> (vreinterpretq): New
>               declaration.
>                       * config/arm/arm-mve-builtins-shapes.cc
> (parse_element_type): New
>               function.
>                       (parse_type): Likewise.
>                       (parse_signature): Likewise.
>                       (build_one): Likewise.
>                       (build_all): Likewise.
>                       (overloaded_base): New struct.
>                       (unary_convert_def): Likewise.
>                       * config/arm/arm-mve-builtins-shapes.h
> (unary_convert): Declare.
>                       * config/arm/arm-mve-builtins.cc
> (TYPES_reinterpret_signed1): New
>                       macro.
>                       (TYPES_reinterpret_unsigned1): Likewise.
>                       (TYPES_reinterpret_integer): Likewise.
>                       (TYPES_reinterpret_integer1): Likewise.
>                       (TYPES_reinterpret_float1): Likewise.
>                       (TYPES_reinterpret_float): Likewise.
>                       (reinterpret_integer): New.
>                       (reinterpret_float): New.
>                       (handle_arm_mve_h): Register builtins.
>                       * config/arm/arm_mve.h (vreinterpretq_s16):
> Remove.
>                       (vreinterpretq_s32): Likewise.
>                       (vreinterpretq_s64): Likewise.
>                       (vreinterpretq_s8): Likewise.
>                       (vreinterpretq_u16): Likewise.
>                       (vreinterpretq_u32): Likewise.
>                       (vreinterpretq_u64): Likewise.
>                       (vreinterpretq_u8): Likewise.
>                       (vreinterpretq_f16): Likewise.
>                       (vreinterpretq_f32): Likewise.
>                       (vreinterpretq_s16_s32): Likewise.
>                       (vreinterpretq_s16_s64): Likewise.
>                       (vreinterpretq_s16_s8): Likewise.
>                       (vreinterpretq_s16_u16): Likewise.
>                       (vreinterpretq_s16_u32): Likewise.
>                       (vreinterpretq_s16_u64): Likewise.
>                       (vreinterpretq_s16_u8): Likewise.
>                       (vreinterpretq_s32_s16): Likewise.
>                       (vreinterpretq_s32_s64): Likewise.
>                       (vreinterpretq_s32_s8): Likewise.
>                       (vreinterpretq_s32_u16): Likewise.
>                       (vreinterpretq_s32_u32): Likewise.
>                       (vreinterpretq_s32_u64): Likewise.
>                       (vreinterpretq_s32_u8): Likewise.
>                       (vreinterpretq_s64_s16): Likewise.
>                       (vreinterpretq_s64_s32): Likewise.
>                       (vreinterpretq_s64_s8): Likewise.
>                       (vreinterpretq_s64_u16): Likewise.
>                       (vreinterpretq_s64_u32): Likewise.
>                       (vreinterpretq_s64_u64): Likewise.
>                       (vreinterpretq_s64_u8): Likewise.
>                       (vreinterpretq_s8_s16): Likewise.
>                       (vreinterpretq_s8_s32): Likewise.
>                       (vreinterpretq_s8_s64): Likewise.
>                       (vreinterpretq_s8_u16): Likewise.
>                       (vreinterpretq_s8_u32): Likewise.
>                       (vreinterpretq_s8_u64): Likewise.
>                       (vreinterpretq_s8_u8): Likewise.
>                       (vreinterpretq_u16_s16): Likewise.
>                       (vreinterpretq_u16_s32): Likewise.
>                       (vreinterpretq_u16_s64): Likewise.
>                       (vreinterpretq_u16_s8): Likewise.
>                       (vreinterpretq_u16_u32): Likewise.
>                       (vreinterpretq_u16_u64): Likewise.
>                       (vreinterpretq_u16_u8): Likewise.
>                       (vreinterpretq_u32_s16): Likewise.
>                       (vreinterpretq_u32_s32): Likewise.
>                       (vreinterpretq_u32_s64): Likewise.
>                       (vreinterpretq_u32_s8): Likewise.
>                       (vreinterpretq_u32_u16): Likewise.
>                       (vreinterpretq_u32_u64): Likewise.
>                       (vreinterpretq_u32_u8): Likewise.
>                       (vreinterpretq_u64_s16): Likewise.
>                       (vreinterpretq_u64_s32): Likewise.
>                       (vreinterpretq_u64_s64): Likewise.
>                       (vreinterpretq_u64_s8): Likewise.
>                       (vreinterpretq_u64_u16): Likewise.
>                       (vreinterpretq_u64_u32): Likewise.
>                       (vreinterpretq_u64_u8): Likewise.
>                       (vreinterpretq_u8_s16): Likewise.
>                       (vreinterpretq_u8_s32): Likewise.
>                       (vreinterpretq_u8_s64): Likewise.
>                       (vreinterpretq_u8_s8): Likewise.
>                       (vreinterpretq_u8_u16): Likewise.
>                       (vreinterpretq_u8_u32): Likewise.
>                       (vreinterpretq_u8_u64): Likewise.
>                       (vreinterpretq_s32_f16): Likewise.
>                       (vreinterpretq_s32_f32): Likewise.
>                       (vreinterpretq_u16_f16): Likewise.
>                       (vreinterpretq_u16_f32): Likewise.
>                       (vreinterpretq_u32_f16): Likewise.
>                       (vreinterpretq_u32_f32): Likewise.
>                       (vreinterpretq_u64_f16): Likewise.
>                       (vreinterpretq_u64_f32): Likewise.
>                       (vreinterpretq_u8_f16): Likewise.
>                       (vreinterpretq_u8_f32): Likewise.
>                       (vreinterpretq_f16_f32): Likewise.
>                       (vreinterpretq_f16_s16): Likewise.
>                       (vreinterpretq_f16_s32): Likewise.
>                       (vreinterpretq_f16_s64): Likewise.
>                       (vreinterpretq_f16_s8): Likewise.
>                       (vreinterpretq_f16_u16): Likewise.
>                       (vreinterpretq_f16_u32): Likewise.
>                       (vreinterpretq_f16_u64): Likewise.
>                       (vreinterpretq_f16_u8): Likewise.
>                       (vreinterpretq_f32_f16): Likewise.
>                       (vreinterpretq_f32_s16): Likewise.
>                       (vreinterpretq_f32_s32): Likewise.
>                       (vreinterpretq_f32_s64): Likewise.
>                       (vreinterpretq_f32_s8): Likewise.
>                       (vreinterpretq_f32_u16): Likewise.
>                       (vreinterpretq_f32_u32): Likewise.
>                       (vreinterpretq_f32_u64): Likewise.
>                       (vreinterpretq_f32_u8): Likewise.
>                       (vreinterpretq_s16_f16): Likewise.
>                       (vreinterpretq_s16_f32): Likewise.
>                       (vreinterpretq_s64_f16): Likewise.
>                       (vreinterpretq_s64_f32): Likewise.
>                       (vreinterpretq_s8_f16): Likewise.
>                       (vreinterpretq_s8_f32): Likewise.
>                       (__arm_vreinterpretq_f16): Likewise.
>                       (__arm_vreinterpretq_f32): Likewise.
>                       (__arm_vreinterpretq_s16): Likewise.
>                       (__arm_vreinterpretq_s32): Likewise.
>                       (__arm_vreinterpretq_s64): Likewise.
>                       (__arm_vreinterpretq_s8): Likewise.
>                       (__arm_vreinterpretq_u16): Likewise.
>                       (__arm_vreinterpretq_u32): Likewise.
>                       (__arm_vreinterpretq_u64): Likewise.
>                       (__arm_vreinterpretq_u8): Likewise.
>                       * config/arm/arm_mve_types.h
> (__arm_vreinterpretq_s16_s32):
>               Remove.
>                       (__arm_vreinterpretq_s16_s64): Likewise.
>                       (__arm_vreinterpretq_s16_s8): Likewise.
>                       (__arm_vreinterpretq_s16_u16): Likewise.
>                       (__arm_vreinterpretq_s16_u32): Likewise.
>                       (__arm_vreinterpretq_s16_u64): Likewise.
>                       (__arm_vreinterpretq_s16_u8): Likewise.
>                       (__arm_vreinterpretq_s32_s16): Likewise.
>                       (__arm_vreinterpretq_s32_s64): Likewise.
>                       (__arm_vreinterpretq_s32_s8): Likewise.
>                       (__arm_vreinterpretq_s32_u16): Likewise.
>                       (__arm_vreinterpretq_s32_u32): Likewise.
>                       (__arm_vreinterpretq_s32_u64): Likewise.
>                       (__arm_vreinterpretq_s32_u8): Likewise.
>                       (__arm_vreinterpretq_s64_s16): Likewise.
>                       (__arm_vreinterpretq_s64_s32): Likewise.
>                       (__arm_vreinterpretq_s64_s8): Likewise.
>                       (__arm_vreinterpretq_s64_u16): Likewise.
>                       (__arm_vreinterpretq_s64_u32): Likewise.
>                       (__arm_vreinterpretq_s64_u64): Likewise.
>                       (__arm_vreinterpretq_s64_u8): Likewise.
>                       (__arm_vreinterpretq_s8_s16): Likewise.
>                       (__arm_vreinterpretq_s8_s32): Likewise.
>                       (__arm_vreinterpretq_s8_s64): Likewise.
>                       (__arm_vreinterpretq_s8_u16): Likewise.
>                       (__arm_vreinterpretq_s8_u32): Likewise.
>                       (__arm_vreinterpretq_s8_u64): Likewise.
>                       (__arm_vreinterpretq_s8_u8): Likewise.
>                       (__arm_vreinterpretq_u16_s16): Likewise.
>                       (__arm_vreinterpretq_u16_s32): Likewise.
>                       (__arm_vreinterpretq_u16_s64): Likewise.
>                       (__arm_vreinterpretq_u16_s8): Likewise.
>                       (__arm_vreinterpretq_u16_u32): Likewise.
>                       (__arm_vreinterpretq_u16_u64): Likewise.
>                       (__arm_vreinterpretq_u16_u8): Likewise.
>                       (__arm_vreinterpretq_u32_s16): Likewise.
>                       (__arm_vreinterpretq_u32_s32): Likewise.
>                       (__arm_vreinterpretq_u32_s64): Likewise.
>                       (__arm_vreinterpretq_u32_s8): Likewise.
>                       (__arm_vreinterpretq_u32_u16): Likewise.
>                       (__arm_vreinterpretq_u32_u64): Likewise.
>                       (__arm_vreinterpretq_u32_u8): Likewise.
>                       (__arm_vreinterpretq_u64_s16): Likewise.
>                       (__arm_vreinterpretq_u64_s32): Likewise.
>                       (__arm_vreinterpretq_u64_s64): Likewise.
>                       (__arm_vreinterpretq_u64_s8): Likewise.
>                       (__arm_vreinterpretq_u64_u16): Likewise.
>                       (__arm_vreinterpretq_u64_u32): Likewise.
>                       (__arm_vreinterpretq_u64_u8): Likewise.
>                       (__arm_vreinterpretq_u8_s16): Likewise.
>                       (__arm_vreinterpretq_u8_s32): Likewise.
>                       (__arm_vreinterpretq_u8_s64): Likewise.
>                       (__arm_vreinterpretq_u8_s8): Likewise.
>                       (__arm_vreinterpretq_u8_u16): Likewise.
>                       (__arm_vreinterpretq_u8_u32): Likewise.
>                       (__arm_vreinterpretq_u8_u64): Likewise.
>                       (__arm_vreinterpretq_s32_f16): Likewise.
>                       (__arm_vreinterpretq_s32_f32): Likewise.
>                       (__arm_vreinterpretq_s16_f16): Likewise.
>                       (__arm_vreinterpretq_s16_f32): Likewise.
>                       (__arm_vreinterpretq_s64_f16): Likewise.
>                       (__arm_vreinterpretq_s64_f32): Likewise.
>                       (__arm_vreinterpretq_s8_f16): Likewise.
>                       (__arm_vreinterpretq_s8_f32): Likewise.
>                       (__arm_vreinterpretq_u16_f16): Likewise.
>                       (__arm_vreinterpretq_u16_f32): Likewise.
>                       (__arm_vreinterpretq_u32_f16): Likewise.
>                       (__arm_vreinterpretq_u32_f32): Likewise.
>                       (__arm_vreinterpretq_u64_f16): Likewise.
>                       (__arm_vreinterpretq_u64_f32): Likewise.
>                       (__arm_vreinterpretq_u8_f16): Likewise.
>                       (__arm_vreinterpretq_u8_f32): Likewise.
>                       (__arm_vreinterpretq_f16_f32): Likewise.
>                       (__arm_vreinterpretq_f16_s16): Likewise.
>                       (__arm_vreinterpretq_f16_s32): Likewise.
>                       (__arm_vreinterpretq_f16_s64): Likewise.
>                       (__arm_vreinterpretq_f16_s8): Likewise.
>                       (__arm_vreinterpretq_f16_u16): Likewise.
>                       (__arm_vreinterpretq_f16_u32): Likewise.
>                       (__arm_vreinterpretq_f16_u64): Likewise.
>                       (__arm_vreinterpretq_f16_u8): Likewise.
>                       (__arm_vreinterpretq_f32_f16): Likewise.
>                       (__arm_vreinterpretq_f32_s16): Likewise.
>                       (__arm_vreinterpretq_f32_s32): Likewise.
>                       (__arm_vreinterpretq_f32_s64): Likewise.
>                       (__arm_vreinterpretq_f32_s8): Likewise.
>                       (__arm_vreinterpretq_f32_u16): Likewise.
>                       (__arm_vreinterpretq_f32_u32): Likewise.
>                       (__arm_vreinterpretq_f32_u64): Likewise.
>                       (__arm_vreinterpretq_f32_u8): Likewise.
>                       (__arm_vreinterpretq_s16): Likewise.
>                       (__arm_vreinterpretq_s32): Likewise.
>                       (__arm_vreinterpretq_s64): Likewise.
>                       (__arm_vreinterpretq_s8): Likewise.
>                       (__arm_vreinterpretq_u16): Likewise.
>                       (__arm_vreinterpretq_u32): Likewise.
>                       (__arm_vreinterpretq_u64): Likewise.
>                       (__arm_vreinterpretq_u8): Likewise.
>                       (__arm_vreinterpretq_f16): Likewise.
>                       (__arm_vreinterpretq_f32): Likewise.
>                       * config/arm/mve.md
> (@arm_mve_reinterpret<mode>): New
>               pattern.
>                       * config/arm/unspecs.md: (REINTERPRET): New
> unspec.
> 
>                       gcc/testsuite/
>                       * g++.target/arm/mve.exp: Add general-c++ and
> general directories.
>                       * g++.target/arm/mve/general-c++/nomve_fp_1.c:
> New test.
>                       * g++.target/arm/mve/general-c++/vreinterpretq_1.C:
> New test.
>                       * gcc.target/arm/mve/general-c/nomve_fp_1.c: New
> test.
>                       * gcc.target/arm/mve/general-c/vreinterpretq_1.c:
> New test.
>               ---
>                gcc/config/arm/arm-mve-builtins-base.cc       |   29 +
>                gcc/config/arm/arm-mve-builtins-base.def      |    2 +
>                gcc/config/arm/arm-mve-builtins-base.h        |    2 +
>                gcc/config/arm/arm-mve-builtins-shapes.cc     |   28 +
>                gcc/config/arm/arm-mve-builtins-shapes.h      |    8 +
>                gcc/config/arm/arm-mve-builtins.cc            |   60 +
>                gcc/config/arm/arm_mve.h                      |  300 ----
>                gcc/config/arm/arm_mve_types.h                | 1365 
> +-------------
> ---
>                gcc/config/arm/mve.md                         |   18 +
>                gcc/config/arm/unspecs.md                     |    1 +
>                gcc/testsuite/g++.target/arm/mve.exp          |    8 +-
>                .../arm/mve/general-c++/nomve_fp_1.c          |   15 +
>                .../arm/mve/general-c++/vreinterpretq_1.C     |   25 +
>                .../gcc.target/arm/mve/general-c/nomve_fp_1.c |   15 +
>                .../arm/mve/general-c/vreinterpretq_1.c       |   25 +
>                15 files changed, 286 insertions(+), 1615 deletions(-)
>                create mode 100644
> gcc/testsuite/g++.target/arm/mve/general-
>               c++/nomve_fp_1.c
>                create mode 100644
> gcc/testsuite/g++.target/arm/mve/general-
>               c++/vreinterpretq_1.C
>                create mode 100644
> gcc/testsuite/gcc.target/arm/mve/general-
>               c/nomve_fp_1.c
>                create mode 100644
> gcc/testsuite/gcc.target/arm/mve/general-
>               c/vreinterpretq_1.c
> 
>               diff --git a/gcc/config/arm/arm-mve-builtins-base.cc
> b/gcc/config/arm/arm-
>               mve-builtins-base.cc
>               index e9f285faf2b..ad8d500afc6 100644
>               --- a/gcc/config/arm/arm-mve-builtins-base.cc
>               +++ b/gcc/config/arm/arm-mve-builtins-base.cc
>               @@ -38,8 +38,37 @@ using namespace arm_mve;
> 
>                namespace {
> 
>               +/* Implements vreinterpretq_* intrinsics.  */
>               +class vreinterpretq_impl : public quiet<function_base>
>               +{
>               +  gimple *
>               +  fold (gimple_folder &f) const override
>               +  {
>               +    /* Punt to rtl if the effect of the reinterpret on 
> registers
> does not
>               +       conform to GCC's endianness model.  */
>               +    if (!targetm.can_change_mode_class (f.vector_mode (0),
>               +                                       f.vector_mode (1),
> VFP_REGS))
>               +      return NULL;
>               +
> 
> 
>       So we punt to an RTL pattern here if we cannot change mode class...
> 
>       [snip]
> 
> 
>               diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
>               index 35eab6c94bf..ab688396f97 100644
>               --- a/gcc/config/arm/mve.md
>               +++ b/gcc/config/arm/mve.md
>               @@ -10561,3 +10561,21 @@ (define_expand
>               "vcond_mask_<mode><MVE_vpred>"
>                    }
>                  DONE;
>                })
>               +
>               +;; Reinterpret operand 1 in operand 0's mode, without
> changing its contents.
>               +(define_expand "@arm_mve_reinterpret<mode>"
>               +  [(set (match_operand:MVE_vecs 0 "register_operand")
>               +       (unspec:MVE_vecs
>               +         [(match_operand 1 "arm_any_register_operand")]
>               +         REINTERPRET))]
>               +  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE
> (<MODE>mode))
>               +    || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE
>               (<MODE>mode))"
>               +  {
>               +    machine_mode src_mode = GET_MODE (operands[1]);
>               +    if (targetm.can_change_mode_class (<MODE>mode,
> src_mode,
>               VFP_REGS))
>               +      {
>               +       emit_move_insn (operands[0], gen_lowpart
> (<MODE>mode,
>               operands[1]));
>               +       DONE;
>               +      }
>               +  }
>               +)
> 
> 
>       ... But we still check can_change_mode_class in this pattern and if it's
> not true we emit the new REINTERPRET unspec
>       without a corresponding define_insn pattern. Won't that ICE? Would
> this case occur on big-endian targets?
> 
> 
> 
> 
> Looks like you are right. However, arm_mve.h is protected by:
> 
> #if __ARM_BIG_ENDIAN
> 
> #error "MVE intrinsics are not supported in Big-Endian mode."
> 
> 
> 
> 
> Just tried to hack my arm_mve.h to accept big-endian, and indeed we do ICE.
> 
> 
> 
> 
> In fact, this pattern and vreinterpretq_impl above are quite similar to the
> aarch64 implementation.
> 
> I tried with a sample
> 
> svint16_t foo(svint8_t value1)
> {
> return svreinterpret_s16_s8(value1);
> }
> and it seems aarch64-none-elf-gcc -march=armv8.2-a+sve -mbig-endian is OK,
> although
> aarch64_can_change_mode_class() has:
> if (BYTES_BIG_ENDIAN)
> ...
> if (from_sve_p && GET_MODE_UNIT_SIZE (from) != GET_MODE_UNIT_SIZE
> (to))
> return false;
> so it should have a similar problem? I', not sure why it doesn't ICE?

I believe that's because there's a pattern in aarch64-sve.md that converts 
everything into a simple set with the right modes forced in.

;; A pattern for handling type punning on big-endian targets.  We use a
;; special predicate for operand 1 to reduce the number of patterns.
(define_insn_and_split "*aarch64_sve_reinterpret<mode>"
  [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
        (unspec:SVE_ALL
          [(match_operand 1 "aarch64_any_register_operand" "w")]
          UNSPEC_REINTERPRET))]
  "TARGET_SVE"
  "#"
  "&& reload_completed"
  [(set (match_dup 0) (match_dup 1))]
  {
    operands[1] = aarch64_replace_reg_mode (operands[1], <MODE>mode);
  }
)

I guess since we don't claim to support big-endian MVE for now we probably 
don't need to handle it, but I wonder whether we should instead
be asserting that targetm.can_change_mode_class is true in the folding code and 
adding a comment that it for future big-endian support it should be handled 
properly in the .md file as on aarch64?

Thanks,
Kyrill


> Thanks,
> Christophe
> 
> 
> 
>       Thanks,
>       Kyrill
> 
> 
>               diff --git a/gcc/config/arm/unspecs.md
> b/gcc/config/arm/unspecs.md
>               index 84384ee798d..dccda283573 100644
>               --- a/gcc/config/arm/unspecs.md
>               +++ b/gcc/config/arm/unspecs.md
>               @@ -1255,4 +1255,5 @@ (define_c_enum "unspec" [
>                  SQRSHRL_64
>                  SQRSHRL_48
>                  VSHLCQ_M_
>               +  REINTERPRET
>                ])
> 
> 

Reply via email to