[PATCH v2][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.
Hi Kyrill, > This patch series depends on upstream patches "Armv8.1-M Mainline Security > Extension" [4], > "CLI and multilib support for Armv8.1-M Mainline MVE extensions" [5] and > "support for Armv8.1-M > Mainline scalar shifts" [6]. Patch (version v1) was approved before. The above patches on which this patch (version v1) depends are committed to trunk last month. (version v1) https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01338.html This patch (Version v2) is re-based on latest trunk resolving few conflicts. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? If ok, please commit on my behalf. I don't have the commit rights. Thanks, Srinath ## Hello, This patch creates the required framework for MVE ACLE intrinsics. The following changes are done in this patch to support MVE ACLE intrinsics. Header file arm_mve.h is added to source code, which contains the definitions of MVE ACLE intrinsics and different data types used in MVE. Machine description file mve.md is also added which contains the RTL patterns defined for MVE. A new reigster "p0" is added which is used in by MVE predicated patterns. A new register class "VPR_REG" is added and its contents are defined in REG_CLASS_CONTENTS. The vec-common.md file is modified to support the standard move patterns. The prefix of neon functions which are also used by MVE is changed from "neon_" to "simd_". eg: neon_immediate_valid_for_move changed to simd_immediate_valid_for_move. In the patch standard patterns mve_move, mve_store and move_load for MVE are added and neon.md and vfp.md files are modified to support this common patterns. Please refer to Arm reference manual [1] for more details. [1] https://developer.arm.com/docs/ddi0553/latest Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath gcc/ChangeLog: 2020-02-10 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config.gcc (arm_mve.h): Include mve intrinsics header file. * config/arm/aout.h (p0): Add new register name for MVE predicated cases. * config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define macro common to Neon and MVE. (ARM_BUILTIN_NEON_LANE_CHECK): Renamed to ARM_BUILTIN_SIMD_LANE_CHECK. (arm_init_simd_builtin_types): Disable poly types for MVE. (arm_init_neon_builtins): Move a check to arm_init_builtins function. (arm_init_builtins): Use ARM_BUILTIN_SIMD_LANE_CHECK instead of ARM_BUILTIN_NEON_LANE_CHECK. (mve_dereference_pointer): Add function. (arm_expand_builtin_args): Call to mve_dereference_pointer when MVE is enabled. (arm_expand_neon_builtin): Moved to arm_expand_builtin function. (arm_expand_builtin): Moved from arm_expand_neon_builtin function. * config/arm/arm-c.c (__ARM_FEATURE_MVE): Define macro for MVE and MVE with floating point enabled. * config/arm/arm-protos.h (neon_immediate_valid_for_move): Renamed to simd_immediate_valid_for_move. (simd_immediate_valid_for_move): Renamed from neon_immediate_valid_for_move function. * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Generate error if vfpv2 feature bit is disabled and mve feature bit is also disabled for HARD_FLOAT_ABI. (use_return_insn): Check to not push VFP regs for MVE. (aapcs_vfp_allocate): Add MVE check to have same Procedure Call Standard as Neon. (aapcs_vfp_allocate_return_reg): Likewise. (thumb2_legitimate_address_p): Check to return 0 on valid Thumb-2 address operand for MVE. (arm_rtx_costs_internal): MVE check to determine cost of rtx. (neon_valid_immediate): Rename to simd_valid_immediate. (simd_valid_immediate): Rename from neon_valid_immediate. (simd_valid_immediate): MVE check on size of vector is 128 bits. (neon_immediate_valid_for_move): Rename to simd_immediate_valid_for_move. (simd_immediate_valid_for_move): Rename from neon_immediate_valid_for_move. (neon_immediate_valid_for_logic): Modify call to neon_valid_immediate function. (neon_make_constant): Modify call to neon_valid_immediate function. (neon_vector_mem_operand): Return VFP register for POST_INC or PRE_DEC for MVE. (output_move_neon): Add MVE check to generate vldm/vstm instrcutions. (arm_compute_frame_layout): Calculate space for saved VFP registers for MVE. (arm_save_coproc_regs): Save coproc registers for MVE. (arm_print_operand): Add case 'E' to print memory operands for MVE. (arm_print_operand_address): Check to print register number for MVE. (arm_hard_regno_mode_ok): Check for
[PATCH v2][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hello Kyrill, In this patch (v2) all the review comments mentioned in previous patch (v1) are addressed. (v1) https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01395.html # Hello, This patch is part of MVE ACLE intrinsics framework. This patches add support to update (read/write) the APSR (Application Program Status Register) register and FPSCR (Floating-point Status and Control Register) register for MVE. This patch also enables thumb2 mov RTL patterns for MVE. A new feature bit vfp_base is added. This bit is enabled for all VFP, MVE and MVE with floating point extensions. This bit is used to enable the macro TARGET_VFP_BASE. For all the VFP instructions, RTL patterns, status and control registers are guarded by TARGET_HAVE_FLOAT. But this patch modifies that and the common instructions, RTL patterns, status and control registers bewteen MVE and VFP are guarded by TARGET_VFP_BASE macro. The RTL pattern set_fpscr and get_fpscr are updated to use VFPCC_REGNUM because few MVE intrinsics set/get carry bit of FPSCR register. Please refer to Arm reference manual [1] for more details. [1] https://developer.arm.com/docs/ddi0553/latest Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath gcc/ChangeLog: 2020-20-11 Andre Vieira Mihail Ionescu Srinath Parvathaneni * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base feature bit is on and -mfpu=auto is passed as compiler option, do not generate error on not finding any match fpu. Because in this case fpu is not required. * config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is enabled for MVE and also for all VFP extensions. (VFPv2): Modify fgroup to enable vfp_base feature bit when ever VFPv2 is enabled. (MVE): Define fgroup to enable feature bits mve, vfp_base and armv7em. (MVE_FP): Define fgroup to enable feature bits is fgroup MVE and FPv5 along with feature bits mve_float. (mve): Modify add options in armv8.1-m.main arch for MVE. (mve.fp): Modify add options in armv8.1-m.main arch for MVE with floating point. * config/arm/arm.c (use_return_insn): Replace the check with TARGET_VFP_BASE. (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with TARGET_VFP_BASE. (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as well. (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE as well. (arm_compute_frame_layout): Likewise. (arm_save_coproc_regs): Likewise. (arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM in MVE as well. (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. (arm_expand_epilogue_apcs_frame): Likewise. (arm_expand_epilogue): Likewise. (arm_conditional_register_usage): Likewise. (arm_declare_function_name): Add check to skip printing .fpu directive in assembly file when TARGET_VFP_BASE is enabled and fpu_to_print is "softvfp". * config/arm/arm.h (TARGET_VFP_BASE): Define. * config/arm/arm.md (arch): Add "mve" to arch. (eq_attr "arch" "mve"): Enable on TARGET_HAVE_MVE is true. (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. * config/arm/constraints.md (Uf): Define for MVE. * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify target guard to not allow for MVE. * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Move to volatile unspecs enum. (VUNSPEC_GET_FPSCR): Define. * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR and VMRS instructions which move to general-purpose Register from Floating-point Special register and vice-versa. (thumb2_movhi_fp16): Likewise. (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along with MCR and MRC instructions which set and get Floating-point Status and Control Register (FPSCR). (movdi_vfp): Modify pattern to enable Single-precision scalar float move in MVE. (thumb2_movdf_vfp): Modify pattern to enable Double-precision scalar float move patterns in MVE. (thumb2_movsfcc_vfp): Modify pattern to enable single float conditional code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check. (thumb2_movdfcc_vfp): Modify pattern to enable double float conditional code move patterns of VFP also in MVE by addi
[PATCH v2][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch.
Hello Kyrill, In this patch (v2) all the review comments mentioned in previous patch (v1) are addressed. (v1) https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01401.html # Hello, This patch is part of MVE ACLE intrinsics framework. The patch supports the use of emulation for the single-precision arithmetic operations for MVE. This changes are to support the MVE ACLE intrinsics which operates on vector floating point arithmetic operations. Please refer to Arm reference manual [1] for more details. [1] https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914 Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-11 Andre Vieira Srinath Parvathaneni * config/arm/arm.c (arm_libcall_uses_aapcs_base): Modify function to add emulator calls for dobule precision arithmetic operations for MVE. ### Attachment also inlined for ease of reply### >From af9d1eb4470c26564b69518bbec3fce297501fdd Mon Sep 17 00:00:00 2001 From: Srinath Parvathaneni Date: Tue, 11 Feb 2020 18:42:20 + Subject: [PATCH] [PATCH][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch. --- gcc/config/arm/arm.c | 22 ++- .../gcc.target/arm/mve/intrinsics/mve_libcall1.c | 70 ++ .../gcc.target/arm/mve/intrinsics/mve_libcall2.c | 70 ++ 3 files changed, 159 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c create mode 100644 gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall2.c diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 037f298..e00024b 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -5754,9 +5754,25 @@ arm_libcall_uses_aapcs_base (const_rtx libcall) /* Values from double-precision helper functions are returned in core registers if the selected core only supports single-precision arithmetic, even if we are using the hard-float ABI. The same is -true for single-precision helpers, but we will never be using the -hard-float ABI on a CPU which doesn't support single-precision -operations in hardware. */ +true for single-precision helpers except in case of MVE, because in +MVE we will be using the hard-float ABI on a CPU which doesn't support +single-precision operations in hardware. In MVE the following check +enables use of emulation for the single-precision arithmetic +operations. */ + if (TARGET_HAVE_MVE) + { + add_libcall (libcall_htab, optab_libfunc (add_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sdiv_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (smul_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (neg_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sub_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (eq_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (lt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (le_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (ge_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (gt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (unord_optab, SFmode)); + } add_libcall (libcall_htab, optab_libfunc (add_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (sdiv_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (smul_optab, DFmode)); diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c new file mode 100644 index 000..45f46b1 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-add-options arm_v8_1m_mve } */ + +float +foo (float a, float b, float c) +{ + return a + b + c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fadd" } } */ +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fadd" 2 } } */ + +float +foo1 (float a, float b, float c) +{ + return a - b - c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fsub" } } */ +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fsub" 2 } } */ + +float +foo2 (float a, float b, float c) +{ + return a * b * c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fmul" } } */ +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fmul" 2 } } */ + +float +foo3 (float b, float c) +{ + return b / c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fdiv" } } */ + +int +foo4 (float b, float c) +{ + return b < c; +} + +/*
[PATCH v1][GCC] aarch64: Add GCS build attributes support.
This patch adds support for aarch64 gcs build attributes. This support includes generating two new assembler directives .aeabi_subsection and .aeabi_attribute. These directives are generated as per the syntax mentioned in spec "Build Attributes for the ArmĀ® 64-bit Architecture (AArch64)" available at [1]. To check whether the assembler being used to build the toolchain supports these directives, a new gcc configure check is added in gcc/configure.ac. If the assembler support these directives, .aeabi_subsection and .aeabi_attribute directives are emitted in the generated assembly, when -mbranch-protection=gcs is passed. If the assembler does not support these directives, .note.gnu.property section will emit the relevant gcs information in the generated assembly, when -mbranch-protection=gcs is passed. This patch needs to be applied on top of GCC gcs patch series [2]. Bootstrapped on aarch64-none-linux-gnu and regression tested on aarch64-none-elf, no issues. Ok for master? Regards, Srinath. [1]: https://github.com/ARM-software/abi-aa/pull/230 [2]: https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/vendors/ARM/heads/gcs gcc/ChangeLog: 2024-09-11 Srinath Parvathaneni * config.in: Regenerated * config/aarch64/aarch64.cc (aarch64_emit_aeabi_attribute): New function declaration. (aarch64_emit_aeabi_subsection): Likewise. (aarch64_start_file): Emit gcs build attributes. (aarch64_file_end_indicate_exec_stack): Update gcs bit in note.gnu.property section. * configure: Regenerated. * configure.ac: Add gcc configure check. gcc/testsuite/ChangeLog: 2024-09-11 Srinath Parvathaneni * gcc.target/aarch64/build-attribute-gcs.c: New test. --- gcc/config.in | 6 +++ gcc/config/aarch64/a.out | Bin 0 -> 656 bytes gcc/config/aarch64/aarch64.cc | 43 ++ gcc/configure | 35 ++ gcc/configure.ac | 7 +++ .../gcc.target/aarch64/build-attribute-gcs.c | 24 ++ 6 files changed, 115 insertions(+) create mode 100644 gcc/config/aarch64/a.out create mode 100644 gcc/testsuite/gcc.target/aarch64/build-attribute-gcs.c diff --git a/gcc/config.in b/gcc/config.in index 7fcabbe5061..eb6024dfc90 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -379,6 +379,12 @@ #endif +/* Define if your assembler supports gcs build attributes. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_BUILD_ATTRIBUTES_GCS +#endif + + /* Define to the level of your assembler's compressed debug section support. */ #ifndef USED_FOR_TARGET diff --git a/gcc/config/aarch64/a.out b/gcc/config/aarch64/a.out new file mode 100644 index ..dd7982f2db625be166d33548dc5d00c7e7601629 GIT binary patch literal 656 zcmb<-^>JfjWMqH=MuzPS2p&w7f#Cvz$>0EHJ20>_upx>confdefs.h +fi + +# Check if we have binutils support for gcs build attributes. +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for gcs build attributes support" >&5 +$as_echo_n "checking assembler for gcs build attributes support... " >&6; } +if ${gcc_cv_as_aarch64_gcs_build_attributes+:} false; then : + $as_echo_n "(cached) " >&6 +else + gcc_cv_as_aarch64_gcs_build_attributes=no + if test x$gcc_cv_as != x; then +$as_echo ' + .aeabi_subsection .aeabi-feature-and-bits, 1, 0 + .aeabi_attribute 3, 1 +' > conftest.s +if { ac_try='$gcc_cv_as $gcc_cv_as_flags -o conftest.o conftest.s >&5' + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 + (eval $ac_try) 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; } +then + gcc_cv_as_aarch64_gcs_build_attributes=yes +else + echo "configure: failed program was" >&5 + cat conftest.s >&5 +fi +rm -f conftest.o conftest.s + fi +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_aarch64_gcs_build_attributes" >&5 +$as_echo "$gcc_cv_as_aarch64_gcs_build_attributes" >&6; } +if test $gcc_cv_as_aarch64_gcs_build_attributes = yes; then + +$as_echo "#define HAVE_AS_BUILD_ATTRIBUTES_GCS 1" >>confdefs.h + fi # Enable Branch Target Identification Mechanism and Return Address diff --git a/gcc/configure.ac b/gcc/configure.ac index d0b9865fc91..51b07417153 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -4388,6 +4388,13 @@ case "$target" in ldr x0, [[x2, #:gotpage_lo15:globalsym]] ],,[AC_DEFINE(HAVE_AS_SMALL_PIC_RELOCS, 1, [Define if your assembler supports relocs needed by -fpic.])]) +# Check if we have binutils support for gcs build attribute
[PATCH][GCC]: Add myself to MAINTAINERS
Hello, Add myself to MAINTAINERS file. Regards, SRI. ChangeLog: 2020-03-05 Srinath Parvathaneni * MAINTAINERS (Write After Approval): Add myself. ### Attachment also inlined for ease of reply### diff --git a/MAINTAINERS b/MAINTAINERS index 5e0ce37b6c511a6abfe0ff36dee0929f7d43e8be..be8016675bd6ccfebb6b78e893a3a840395f7535 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -540,6 +540,7 @@ Peter O'Gorman Andrea Ornstein Maxim Ostapenko Patrick Palka +Srinath Parvathaneni Devang Patel Andris Pavenis Fernando Pereira diff --git a/MAINTAINERS b/MAINTAINERS index 5e0ce37b6c511a6abfe0ff36dee0929f7d43e8be..be8016675bd6ccfebb6b78e893a3a840395f7535 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -540,6 +540,7 @@ Peter O'Gorman Andrea Ornstein Maxim Ostapenko Patrick Palka +Srinath Parvathaneni Devang Patel Andris Pavenis Fernando Pereira
[PATCH v3][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.
Hello Kyrill, This patch addresses all the comments in patch version v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540415.html Hello, This patch creates the required framework for MVE ACLE intrinsics. The following changes are done in this patch to support MVE ACLE intrinsics. Header file arm_mve.h is added to source code, which contains the definitions of MVE ACLE intrinsics and different data types used in MVE. Machine description file mve.md is also added which contains the RTL patterns defined for MVE. A new reigster "p0" is added which is used in by MVE predicated patterns. A new register class "VPR_REG" is added and its contents are defined in REG_CLASS_CONTENTS. The vec-common.md file is modified to support the standard move patterns. The prefix of neon functions which are also used by MVE is changed from "neon_" to "simd_". eg: neon_immediate_valid_for_move changed to simd_immediate_valid_for_move. In the patch standard patterns mve_move, mve_store and move_load for MVE are added and neon.md and vfp.md files are modified to support this common patterns. Please refer to Arm reference manual [1] for more details. [1] https://developer.arm.com/docs/ddi0553/latest Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config.gcc (arm_mve.h): Include mve intrinsics header file. * config/arm/aout.h (p0): Add new register name for MVE predicated cases. * config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define macro common to Neon and MVE. (ARM_BUILTIN_NEON_LANE_CHECK): Renamed to ARM_BUILTIN_SIMD_LANE_CHECK. (arm_init_simd_builtin_types): Disable poly types for MVE. (arm_init_neon_builtins): Move a check to arm_init_builtins function. (arm_init_builtins): Use ARM_BUILTIN_SIMD_LANE_CHECK instead of ARM_BUILTIN_NEON_LANE_CHECK. (mve_dereference_pointer): Add function. (arm_expand_builtin_args): Call to mve_dereference_pointer when MVE is enabled. (arm_expand_neon_builtin): Moved to arm_expand_builtin function. (arm_expand_builtin): Moved from arm_expand_neon_builtin function. * config/arm/arm-c.c (__ARM_FEATURE_MVE): Define macro for MVE and MVE with floating point enabled. * config/arm/arm-protos.h (neon_immediate_valid_for_move): Renamed to simd_immediate_valid_for_move. (simd_immediate_valid_for_move): Renamed from neon_immediate_valid_for_move function. * config/arm/arm.c (arm_options_perform_arch_sanity_checks): Generate error if vfpv2 feature bit is disabled and mve feature bit is also disabled for HARD_FLOAT_ABI. (use_return_insn): Check to not push VFP regs for MVE. (aapcs_vfp_allocate): Add MVE check to have same Procedure Call Standard as Neon. (aapcs_vfp_allocate_return_reg): Likewise. (thumb2_legitimate_address_p): Check to return 0 on valid Thumb-2 address operand for MVE. (arm_rtx_costs_internal): MVE check to determine cost of rtx. (neon_valid_immediate): Rename to simd_valid_immediate. (simd_valid_immediate): Rename from neon_valid_immediate. (simd_valid_immediate): MVE check on size of vector is 128 bits. (neon_immediate_valid_for_move): Rename to simd_immediate_valid_for_move. (simd_immediate_valid_for_move): Rename from neon_immediate_valid_for_move. (neon_immediate_valid_for_logic): Modify call to neon_valid_immediate function. (neon_make_constant): Modify call to neon_valid_immediate function. (neon_vector_mem_operand): Return VFP register for POST_INC or PRE_DEC for MVE. (output_move_neon): Add MVE check to generate vldm/vstm instrcutions. (arm_compute_frame_layout): Calculate space for saved VFP registers for MVE. (arm_save_coproc_regs): Save coproc registers for MVE. (arm_print_operand): Add case 'E' to print memory operands for MVE. (arm_print_operand_address): Check to print register number for MVE. (arm_hard_regno_mode_ok): Check for arm hard regno mode ok for MVE. (arm_modes_tieable_p): Check to allow structure mode for MVE. (arm_regno_class): Add VPR_REGNUM check. (arm_expand_epilogue_apcs_frame): MVE check to calculate epilogue code for APCS frame. (arm_expand_epilogue): MVE check for enabling pop instructions in epilogue. (arm_print_asm_arch_directives): Modify function to disable print of .arch_extension "mve" and "fp" for cases where MVE is enabled with "SOFT FLOAT ABI". (arm_vector_mode_suppor
[PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hello Kyrill, This patch addresses all the comments in patch version v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html Hello, This patch is part of MVE ACLE intrinsics framework. This patches add support to update (read/write) the APSR (Application Program Status Register) register and FPSCR (Floating-point Status and Control Register) register for MVE. This patch also enables thumb2 mov RTL patterns for MVE. A new feature bit vfp_base is added. This bit is enabled for all VFP, MVE and MVE with floating point extensions. This bit is used to enable the macro TARGET_VFP_BASE. For all the VFP instructions, RTL patterns, status and control registers are guarded by TARGET_HAVE_FLOAT. But this patch modifies that and the common instructions, RTL patterns, status and control registers bewteen MVE and VFP are guarded by TARGET_VFP_BASE macro. The RTL pattern set_fpscr and get_fpscr are updated to use VFPCC_REGNUM because few MVE intrinsics set/get carry bit of FPSCR register. Please refer to Arm reference manual [1] for more details. [1] https://developer.arm.com/docs/ddi0553/latest Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base feature bit is on and -mfpu=auto is passed as compiler option, do not generate error on not finding any match fpu. Because in this case fpu is not required. * config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is enabled for MVE and also for all VFP extensions. (VFPv2): Modify fgroup to enable vfp_base feature bit when ever VFPv2 is enabled. (MVE): Define fgroup to enable feature bits mve, vfp_base and armv7em. (MVE_FP): Define fgroup to enable feature bits is fgroup MVE and FPv5 along with feature bits mve_float. (mve): Modify add options in armv8.1-m.main arch for MVE. (mve.fp): Modify add options in armv8.1-m.main arch for MVE with floating point. * config/arm/arm.c (use_return_insn): Replace the check with TARGET_VFP_BASE. (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with TARGET_VFP_BASE. (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as well. (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE as well. (arm_compute_frame_layout): Likewise. (arm_save_coproc_regs): Likewise. (arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM in MVE as well. (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. (arm_expand_epilogue_apcs_frame): Likewise. (arm_expand_epilogue): Likewise. (arm_conditional_register_usage): Likewise. (arm_declare_function_name): Add check to skip printing .fpu directive in assembly file when TARGET_VFP_BASE is enabled and fpu_to_print is "softvfp". * config/arm/arm.h (TARGET_VFP_BASE): Define. * config/arm/arm.md (arch): Add "mve" to arch. (eq_attr "arch" "mve"): Enable on TARGET_HAVE_MVE is true. (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. * config/arm/constraints.md (Uf): Define to allow modification to FPCCR in MVE. * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify target guard to not allow for MVE. * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Move to volatile unspecs enum. (VUNSPEC_GET_FPSCR): Define. * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR and VMRS instructions which move to general-purpose Register from Floating-point Special register and vice-versa. (thumb2_movhi_fp16): Likewise. (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along with MCR and MRC instructions which set and get Floating-point Status and Control Register (FPSCR). (movdi_vfp): Modify pattern to enable Single-precision scalar float move in MVE. (thumb2_movdf_vfp): Modify pattern to enable Double-precision scalar float move patterns in MVE. (thumb2_movsfcc_vfp): Modify pattern to enable single float conditional code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check. (thumb2_movdfcc_vfp): Modify pattern to enable double float conditional
[PATCH v3][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch.
Hello Kyrill, This patch addresses all the comments in patch version v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540417.html Hello, This patch is part of MVE ACLE intrinsics framework. The patch supports the use of emulation for the single-precision arithmetic operations for MVE. This changes are to support the MVE ACLE intrinsics which operates on vector floating point arithmetic operations. Please refer to Arm reference manual [1] for more details. [1] https://developer.arm.com/docs/ddi0553/latest Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-06 Andre Vieira Srinath Parvathaneni * config/arm/arm.c (arm_libcall_uses_aapcs_base): Modify function to add emulator calls for dobule precision arithmetic operations for MVE. 2020-03-06 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/mve_libcall1.c: New test. * gcc.target/arm/mve/intrinsics/mve_libcall2.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index c28a475629c7fbad48730beed5550e0cffdf2e1b..40db35a2a8b6dedb4f536b4995e80c8b9a38b588 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -5754,9 +5754,25 @@ arm_libcall_uses_aapcs_base (const_rtx libcall) /* Values from double-precision helper functions are returned in core registers if the selected core only supports single-precision arithmetic, even if we are using the hard-float ABI. The same is -true for single-precision helpers, but we will never be using the -hard-float ABI on a CPU which doesn't support single-precision -operations in hardware. */ +true for single-precision helpers except in case of MVE, because in +MVE we will be using the hard-float ABI on a CPU which doesn't support +single-precision operations in hardware. In MVE the following check +enables use of emulation for the single-precision arithmetic +operations. */ + if (TARGET_HAVE_MVE) + { + add_libcall (libcall_htab, optab_libfunc (add_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sdiv_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (smul_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (neg_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sub_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (eq_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (lt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (le_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (ge_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (gt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (unord_optab, SFmode)); + } add_libcall (libcall_htab, optab_libfunc (add_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (sdiv_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (smul_optab, DFmode)); diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c new file mode 100644 index ..f89301228c577291fc3095420df1937e1a0c7104 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb -mfpu=auto" } */ + +float +foo (float a, float b, float c) +{ + return a + b + c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fadd" } } */ +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fadd" 2 } } */ + +float +foo1 (float a, float b, float c) +{ + return a - b - c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fsub" } } */ +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fsub" 2 } } */ + +float +foo2 (float a, float b, float c) +{ + return a * b * c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fmul" } } */ +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fmul" 2 } } */ + +float +foo3 (float b, float c) +{ + return b / c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fdiv" } } */ + +int +foo4 (float b, float c) +{ + return b < c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fcmplt" } } */ + +int +foo5 (float b, float c) +{ + return b > c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fcmpgt" } } */ + +int +foo6 (float b, float c) +{ + return b != c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fcmpeq" } } */ + +int +foo7 (float b, float c) +{ +
[PATCH v2][ARM][GCC][1/1x]: Patch to support MVE ACLE intrinsics with unary operand.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534343.html Hello, This patch supports MVE ACLE intrinsics vcvtq_f16_s16, vcvtq_f32_s32, vcvtq_f16_u16, vcvtq_f32_u32n vrndxq_f16, vrndxq_f32, vrndq_f16, vrndq_f32, vrndpq_f16, vrndpq_f32, vrndnq_f16, vrndnq_f32, vrndmq_f16, vrndmq_f32, vrndaq_f16, vrndaq_f32, vrev64q_f16, vrev64q_f32, vnegq_f16, vnegq_f32, vdupq_n_f16, vdupq_n_f32, vabsq_f16, vabsq_f32, vrev32q_f16, vcvttq_f32_f16, vcvtbq_f32_f16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (UNOP_NONE_NONE_QUALIFIERS): Define macro. (UNOP_NONE_SNONE_QUALIFIERS): Likewise. (UNOP_NONE_UNONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vrndxq_f16): Define macro. (vrndxq_f32): Likewise. (vrndq_f16) Likewise.: (vrndq_f32): Likewise. (vrndpq_f16): Likewise. (vrndpq_f32): Likewise. (vrndnq_f16): Likewise. (vrndnq_f32): Likewise. (vrndmq_f16): Likewise. (vrndmq_f32): Likewise. (vrndaq_f16): Likewise. (vrndaq_f32): Likewise. (vrev64q_f16): Likewise. (vrev64q_f32): Likewise. (vnegq_f16): Likewise. (vnegq_f32): Likewise. (vdupq_n_f16): Likewise. (vdupq_n_f32): Likewise. (vabsq_f16): Likewise. (vabsq_f32): Likewise. (vrev32q_f16): Likewise. (vcvttq_f32_f16): Likewise. (vcvtbq_f32_f16): Likewise. (vcvtq_f16_s16): Likewise. (vcvtq_f32_s32): Likewise. (vcvtq_f16_u16): Likewise. (vcvtq_f32_u32): Likewise. (__arm_vrndxq_f16): Define intrinsic. (__arm_vrndxq_f32): Likewise. (__arm_vrndq_f16): Likewise. (__arm_vrndq_f32): Likewise. (__arm_vrndpq_f16): Likewise. (__arm_vrndpq_f32): Likewise. (__arm_vrndnq_f16): Likewise. (__arm_vrndnq_f32): Likewise. (__arm_vrndmq_f16): Likewise. (__arm_vrndmq_f32): Likewise. (__arm_vrndaq_f16): Likewise. (__arm_vrndaq_f32): Likewise. (__arm_vrev64q_f16): Likewise. (__arm_vrev64q_f32): Likewise. (__arm_vnegq_f16): Likewise. (__arm_vnegq_f32): Likewise. (__arm_vdupq_n_f16): Likewise. (__arm_vdupq_n_f32): Likewise. (__arm_vabsq_f16): Likewise. (__arm_vabsq_f32): Likewise. (__arm_vrev32q_f16): Likewise. (__arm_vcvttq_f32_f16): Likewise. (__arm_vcvtbq_f32_f16): Likewise. (__arm_vcvtq_f16_s16): Likewise. (__arm_vcvtq_f32_s32): Likewise. (__arm_vcvtq_f16_u16): Likewise. (__arm_vcvtq_f32_u32): Likewise. (vrndxq): Define polymorphic variants. (vrndq): Likewise. (vrndpq): Likewise. (vrndnq): Likewise. (vrndmq): Likewise. (vrndaq): Likewise. (vrev64q): Likewise. (vnegq): Likewise. (vabsq): Likewise. (vrev32q): Likewise. (vcvtbq_f32): Likewise. (vcvttq_f32): Likewise. (vcvtq): Likewise. * config/arm/arm_mve_builtins.def (VAR2): Define. (VAR1): Define. * config/arm/mve.md (mve_vrndxq_f): Add RTL pattern. (mve_vrndq_f): Likewise. (mve_vrndpq_f): Likewise. (mve_vrndnq_f): Likewise. (mve_vrndmq_f): Likewise. (mve_vrndaq_f): Likewise. (mve_vrev64q_f): Likewise. (mve_vnegq_f): Likewise. (mve_vdupq_n_f): Likewise. (mve_vabsq_f): Likewise. (mve_vrev32q_fv8hf): Likewise. (mve_vcvttq_f32_f16v4sf): Likewise. (mve_vcvtbq_f32_f16v4sf): Likewise. (mve_vcvtq_to_f_): Likewise. gcc/testsuite/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vabsq_f16.c: New test. * gcc.target/arm/mve/intrinsics/vabsq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtbq_f32_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f16_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f16_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f32_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f32_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvttq_f32_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdupq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdupq_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vnegq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vnegq_f32.c
[PATCH v2][ARM][GCC][4/x]: MVE ACLE vector interleaving store intrinsics.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534328.html Hello, This patch supports MVE ACLE intrinsics vst4q_s8, vst4q_s16, vst4q_s32, vst4q_u8, vst4q_u16, vst4q_u32, vst4q_f16 and vst4q_f32. In this patch arm_mve_builtins.def file is added to the source code in which the builtins for MVE ACLE intrinsics are defined using builtin qualifiers. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (CF): Define mve_builtin_data. (VAR1): Define. (ARM_BUILTIN_MVE_PATTERN_START): Define. (arm_init_mve_builtins): Define function. (arm_init_builtins): Add TARGET_HAVE_MVE check. (arm_expand_builtin_1): Check the range of fcode. (arm_expand_mve_builtin): Define function to expand MVE builtins. (arm_expand_builtin): Check the range of fcode. * config/arm/arm_mve.h (__ARM_FEATURE_MVE): Define MVE floating point types. (__ARM_MVE_PRESERVE_USER_NAMESPACE): Define to protect user namespace. (vst4q_s8): Define macro. (vst4q_s16): Likewise. (vst4q_s32): Likewise. (vst4q_u8): Likewise. (vst4q_u16): Likewise. (vst4q_u32): Likewise. (vst4q_f16): Likewise. (vst4q_f32): Likewise. (__arm_vst4q_s8): Define inline builtin. (__arm_vst4q_s16): Likewise. (__arm_vst4q_s32): Likewise. (__arm_vst4q_u8): Likewise. (__arm_vst4q_u16): Likewise. (__arm_vst4q_u32): Likewise. (__arm_vst4q_f16): Likewise. (__arm_vst4q_f32): Likewise. (__ARM_mve_typeid): Define macro with MVE types. (__ARM_mve_coerce): Define macro with _Generic feature. (vst4q): Define polymorphic variant for different vst4q builtins. * config/arm/arm_mve_builtins.def: New file. * config/arm/iterators.md (VSTRUCT): Modify to allow XI and OI modes in MVE. * config/arm/mve.md (MVE_VLD_ST): Define iterator. (unspec): Define unspec. (mve_vst4q): Define RTL pattern. * config/arm/neon.md (mov): Modify expand to allow XI and OI modes in MVE. (neon_mov): Modify RTL define_insn to allow XI and OI modes in MVE. (define_split): Allow OI mode split for MVE after reload. (define_split): Allow XI mode split for MVE after reload. * config/arm/t-arm (arm.o): Add entry for arm_mve_builtins.def. (arm-builtins.o): Likewise. gcc/testsuite/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vst4q_f16.c: New test. * gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 28917363eeae51b7cc39576f3c3e77a0350b8877..b9ee45d5950ac9c1e12d88cd7d3ece1953dc51d0 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -432,6 +432,13 @@ static arm_builtin_datum neon_builtin_data[] = }; #undef CF +#define CF(N,X) CODE_FOR_mve_##N##X +static arm_builtin_datum mve_builtin_data[] = +{ +#include "arm_mve_builtins.def" +}; + +#undef CF #undef VAR1 #define VAR1(T, N, A) \ {#N, UP (A), CODE_FOR_arm_##N, 0, T##_QUALIFIERS}, @@ -736,6 +743,13 @@ enum arm_builtins #include "arm_acle_builtins.def" + ARM_BUILTIN_MVE_BASE, + +#undef VAR1 +#define VAR1(T, N, X) \ + ARM_BUILTIN_MVE_##N##X, +#include "arm_mve_builtins.def" + ARM_BUILTIN_MAX }; @@ -745,6 +759,9 @@ enum arm_builtins #define ARM_BUILTIN_NEON_PATTERN_START \ (ARM_BUILTIN_NEON_BASE + 1) +#define ARM_BUILTIN_MVE_PATTERN_START \ + (ARM_BUILTIN_MVE_BASE + 1) + #define ARM_BUILTIN_ACLE_PATTERN_START \ (ARM_BUILTIN_ACLE_BASE + 1) @@ -1276,6 +1293,22 @@ arm_init_acle_builtins (void) } } +/* Set up all the MVE builtins mentioned in arm_mve_builtins.def file. */ +static void +arm_init_mve_builtins (void) +{ + volatile unsigned int i, fcode = ARM_BUILTIN_MVE_PATTERN_START; + + arm_init_simd_builtin_scalar_types ();
[PATCH v2][ARM][GCC][1/2x]: MVE intrinsics with binary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534331.html Hello, This patch supports following MVE ACLE intrinsics with binary operand. vsubq_n_f16, vsubq_n_f32, vbrsrq_n_f16, vbrsrq_n_f32, vcvtq_n_f16_s16, vcvtq_n_f32_s32, vcvtq_n_f16_u16, vcvtq_n_f32_u32, vcreateq_f16, vcreateq_f32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraint "Rd" is added, which checks the constant is with in the range of 1 to 16. Also a new predicate "mve_imm_16" is added, to check the the matching constraint Rd. Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (BINOP_NONE_NONE_NONE_QUALIFIERS): Define qualifier for binary operands. (BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vsubq_n_f16): Define macro. (vsubq_n_f32): Likewise. (vbrsrq_n_f16): Likewise. (vbrsrq_n_f32): Likewise. (vcvtq_n_f16_s16): Likewise. (vcvtq_n_f32_s32): Likewise. (vcvtq_n_f16_u16): Likewise. (vcvtq_n_f32_u32): Likewise. (vcreateq_f16): Likewise. (vcreateq_f32): Likewise. (__arm_vsubq_n_f16): Define intrinsic. (__arm_vsubq_n_f32): Likewise. (__arm_vbrsrq_n_f16): Likewise. (__arm_vbrsrq_n_f32): Likewise. (__arm_vcvtq_n_f16_s16): Likewise. (__arm_vcvtq_n_f32_s32): Likewise. (__arm_vcvtq_n_f16_u16): Likewise. (__arm_vcvtq_n_f32_u32): Likewise. (__arm_vcreateq_f16): Likewise. (__arm_vcreateq_f32): Likewise. (vsubq): Define polymorphic variant. (vbrsrq): Likewise. (vcvtq_n): Likewise. * config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_NONE_QUALIFIERS): Use it. (BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise. * config/arm/constraints.md (Rd): Define constraint to check constant is in the range of 1 to 16. * config/arm/mve.md (mve_vsubq_n_f): Define RTL pattern. mve_vbrsrq_n_f: Likewise. mve_vcvtq_n_to_f_: Likewise. mve_vcreateq_f: Likewise. * config/arm/predicates.md (mve_imm_16): Define predicate to check the matching constraint Rd. gcc/testsuite/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vbrsrq_n_f16.c: New test. * gcc.target/arm/mve/intrinsics/vbrsrq_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f16_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f16_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f32_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f32_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsubq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vsubq_n_f32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index b61094303443960bb173219e46a4d8e351deeb25..658886831ff55a6fa3350f9a654be4887e115bbc 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -373,6 +373,30 @@ arm_unop_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define UNOP_UNONE_IMM_QUALIFIERS \ (arm_unop_unone_imm_qualifiers) +static enum arm_type_qualifiers +arm_binop_none_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none, qualifier_none }; +#define BINOP_NONE_NONE_NONE_QUALIFIERS \ + (arm_binop_none_none_none_qualifiers) + +static enum arm_type_qualifiers +arm_binop_none_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none, qualifier_immediate }; +#define BINOP_NONE_NONE_IMM_QUALIFIERS \ + (arm_binop_none_none_imm_qualifiers) + +static enum arm_type_qualifiers +arm_binop_none_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_unsigned, qualifier_immediate }; +#define BINOP_NONE_UNONE_IMM_QUALIFIERS \ + (arm_binop_none_unone_imm_qualifiers) + +static enum arm_type_qualifiers +arm_binop_none_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_unsigned, qualifier_unsigned }; +#define BINOP_NONE_UN
[PATCH v2][ARM][GCC][2/1x]: MVE intrinsics with unary operand.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534326.html Hello, This patch supports following MVE ACLE intrinsics with unary operand. vmvnq_n_s16, vmvnq_n_s32, vrev64q_s8, vrev64q_s16, vrev64q_s32, vcvtq_s16_f16, vcvtq_s32_f32, vrev64q_u8, vrev64q_u16, vrev64q_u32, vmvnq_n_u16, vmvnq_n_u32, vcvtq_u16_f16, vcvtq_u32_f32, vrev64q. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (UNOP_SNONE_SNONE_QUALIFIERS): Define. (UNOP_SNONE_NONE_QUALIFIERS): Likewise. (UNOP_SNONE_IMM_QUALIFIERS): Likewise. (UNOP_UNONE_NONE_QUALIFIERS): Likewise. (UNOP_UNONE_UNONE_QUALIFIERS): Likewise. (UNOP_UNONE_IMM_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vmvnq_n_s16): Define macro. (vmvnq_n_s32): Likewise. (vrev64q_s8): Likewise. (vrev64q_s16): Likewise. (vrev64q_s32): Likewise. (vcvtq_s16_f16): Likewise. (vcvtq_s32_f32): Likewise. (vrev64q_u8): Likewise. (vrev64q_u16): Likewise. (vrev64q_u32): Likewise. (vmvnq_n_u16): Likewise. (vmvnq_n_u32): Likewise. (vcvtq_u16_f16): Likewise. (vcvtq_u32_f32): Likewise. (__arm_vmvnq_n_s16): Define intrinsic. (__arm_vmvnq_n_s32): Likewise. (__arm_vrev64q_s8): Likewise. (__arm_vrev64q_s16): Likewise. (__arm_vrev64q_s32): Likewise. (__arm_vrev64q_u8): Likewise. (__arm_vrev64q_u16): Likewise. (__arm_vrev64q_u32): Likewise. (__arm_vmvnq_n_u16): Likewise. (__arm_vmvnq_n_u32): Likewise. (__arm_vcvtq_s16_f16): Likewise. (__arm_vcvtq_s32_f32): Likewise. (__arm_vcvtq_u16_f16): Likewise. (__arm_vcvtq_u32_f32): Likewise. (vrev64q): Define polymorphic variant. * config/arm/arm_mve_builtins.def (UNOP_SNONE_SNONE): Use it. (UNOP_SNONE_NONE): Likewise. (UNOP_SNONE_IMM): Likewise. (UNOP_UNONE_UNONE): Likewise. (UNOP_UNONE_NONE): Likewise. (UNOP_UNONE_IMM): Likewise. * config/arm/mve.md (mve_vrev64q_): Define RTL pattern. (mve_vcvtq_from_f_): Likewise. (mve_vmvnq_n_): Likewise. gcc/testsuite/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vcvtq_s16_f16.c: New test. * gcc.target/arm/mve/intrinsics/vcvtq_s32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_u16_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_u32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 7bf5ef5479722a6cb08cc030e1aa7d6d7fad4599..97354c245ff42e7db2934e1045b8707033511e11 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -337,6 +337,42 @@ arm_unop_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define UNOP_NONE_UNONE_QUALIFIERS \ (arm_unop_none_unone_qualifiers) +static enum arm_type_qualifiers +arm_unop_snone_snone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none }; +#define UNOP_SNONE_SNONE_QUALIFIERS \ + (arm_unop_snone_snone_qualifiers) + +static enum arm_type_qualifiers +arm_unop_snone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none }; +#define UNOP_SNONE_NONE_QUALIFIERS \ + (arm_unop_snone_none_qualifiers) + +static enum arm_type_qualifiers +arm_unop_snone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_immediate }; +#define UNOP_SNONE_IMM_QUALIFIERS \ + (arm_unop_snone_imm_qualifiers) + +static enum arm_type_qualifiers +arm_unop_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_none }; +#define UNOP_UNONE_NONE_QUALIFIERS \ + (arm_unop_unone_none_qualifiers
[PATCH v2][ARM][GCC][5/2x]: MVE intrinsics with binary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534330.html Hello, This patch supports following MVE ACLE intrinsics with binary operands. vqmovntq_u16, vqmovnbq_u16, vmulltq_poly_p8, vmullbq_poly_p8, vmovntq_u16, vmovnbq_u16, vmlaldavxq_u16, vmlaldavq_u16, vqmovuntq_s16, vqmovunbq_s16, vshlltq_n_u8, vshllbq_n_u8, vorrq_n_u16, vbicq_n_u16, vcmpneq_n_f16, vcmpneq_f16, vcmpltq_n_f16, vcmpltq_f16, vcmpleq_n_f16, vcmpleq_f16, vcmpgtq_n_f16, vcmpgtq_f16, vcmpgeq_n_f16, vcmpgeq_f16, vcmpeqq_n_f16, vcmpeqq_f16, vsubq_f16, vqmovntq_s16, vqmovnbq_s16, vqdmulltq_s16, vqdmulltq_n_s16, vqdmullbq_s16, vqdmullbq_n_s16, vorrq_f16, vornq_f16, vmulq_n_f16, vmulq_f16, vmovntq_s16, vmovnbq_s16, vmlsldavxq_s16, vmlsldavq_s16, vmlaldavxq_s16, vmlaldavq_s16, vminnmvq_f16, vminnmq_f16, vminnmavq_f16, vminnmaq_f16, vmaxnmvq_f16, vmaxnmq_f16, vmaxnmavq_f16, vmaxnmaq_f16, veorq_f16, vcmulq_rot90_f16, vcmulq_rot270_f16, vcmulq_rot180_f16, vcmulq_f16, vcaddq_rot90_f16, vcaddq_rot270_f16, vbicq_f16, vandq_f16, vaddq_n_f16, vabdq_f16, vshlltq_n_s8, vshllbq_n_s8, vorrq_n_s16, vbicq_n_s16, vqmovntq_u32, vqmovnbq_u32, vmulltq_poly_p16, vmullbq_poly_p16, vmovntq_u32, vmovnbq_u32, vmlaldavxq_u32, vmlaldavq_u32, vqmovuntq_s32, vqmovunbq_s32, vshlltq_n_u16, vshllbq_n_u16, vorrq_n_u32, vbicq_n_u32, vcmpneq_n_f32, vcmpneq_f32, vcmpltq_n_f32, vcmpltq_f32, vcmpleq_n_f32, vcmpleq_f32, vcmpgtq_n_f32, vcmpgtq_f32, vcmpgeq_n_f32, vcmpgeq_f32, vcmpeqq_n_f32, vcmpeqq_f32, vsubq_f32, vqmovntq_s32, vqmovnbq_s32, vqdmulltq_s32, vqdmulltq_n_s32, vqdmullbq_s32, vqdmullbq_n_s32, vorrq_f32, vornq_f32, vmulq_n_f32, vmulq_f32, vmovntq_s32, vmovnbq_s32, vmlsldavxq_s32, vmlsldavq_s32, vmlaldavxq_s32, vmlaldavq_s32, vminnmvq_f32, vminnmq_f32, vminnmavq_f32, vminnmaq_f32, vmaxnmvq_f32, vmaxnmq_f32, vmaxnmavq_f32, vmaxnmaq_f32, veorq_f32, vcmulq_rot90_f32, vcmulq_rot270_f32, vcmulq_rot180_f32, vcmulq_f32, vcaddq_rot90_f32, vcaddq_rot270_f32, vbicq_f32, vandq_f32, vaddq_n_f32, vabdq_f32, vshlltq_n_s16, vshllbq_n_s16, vorrq_n_s32, vbicq_n_s32, vrmlaldavhq_u32, vctp8q_m, vctp64q_m, vctp32q_m, vctp16q_m, vaddlvaq_u32, vrmlsldavhxq_s32, vrmlsldavhq_s32, vrmlaldavhxq_s32, vrmlaldavhq_s32, vcvttq_f16_f32, vcvtbq_f16_f32, vaddlvaq_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics The above intrinsics are defined using the already defined builtin qualifiers BINOP_NONE_NONE_IMM, BINOP_NONE_NONE_NONE, BINOP_UNONE_NONE_NONE, BINOP_UNONE_UNONE_IMM, BINOP_UNONE_UNONE_NONE, BINOP_UNONE_UNONE_UNONE. Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vqmovntq_u16): Define macro. (vqmovnbq_u16): Likewise. (vmulltq_poly_p8): Likewise. (vmullbq_poly_p8): Likewise. (vmovntq_u16): Likewise. (vmovnbq_u16): Likewise. (vmlaldavxq_u16): Likewise. (vmlaldavq_u16): Likewise. (vqmovuntq_s16): Likewise. (vqmovunbq_s16): Likewise. (vshlltq_n_u8): Likewise. (vshllbq_n_u8): Likewise. (vorrq_n_u16): Likewise. (vbicq_n_u16): Likewise. (vcmpneq_n_f16): Likewise. (vcmpneq_f16): Likewise. (vcmpltq_n_f16): Likewise. (vcmpltq_f16): Likewise. (vcmpleq_n_f16): Likewise. (vcmpleq_f16): Likewise. (vcmpgtq_n_f16): Likewise. (vcmpgtq_f16): Likewise. (vcmpgeq_n_f16): Likewise. (vcmpgeq_f16): Likewise. (vcmpeqq_n_f16): Likewise. (vcmpeqq_f16): Likewise. (vsubq_f16): Likewise. (vqmovntq_s16): Likewise. (vqmovnbq_s16): Likewise. (vqdmulltq_s16): Likewise. (vqdmulltq_n_s16): Likewise. (vqdmullbq_s16): Likewise. (vqdmullbq_n_s16): Likewise. (vorrq_f16): Likewise. (vornq_f16): Likewise. (vmulq_n_f16): Likewise. (vmulq_f16): Likewise. (vmovntq_s16): Likewise. (vmovnbq_s16): Likewise. (vmlsldavxq_s16): Likewise. (vmlsldavq_s16): Likewise. (vmlaldavxq_s16): Likewise. (vmlaldavq_s16): Likewise. (vminnmvq_f16): Likewise. (vminnmq_f16): Likewise. (vminnmavq_f16): Likewise. (vminnmaq_f16): Likewise. (vmaxnmvq_f16): Likewise. (vmaxnmq_f16): Likewise. (vmaxnmavq_f16): Likewise. (vmaxnmaq_f16): Likewise. (veorq_f16): Likewise. (vcmulq_rot90_f16): Likewise. (vcmulq_rot270_f16): Likewise. (vcmulq_rot180_f16): Likewise. (vcmulq_f16): Likewise. (vcaddq_rot90_f16): Likewise. (vcaddq_rot270_f16): Likewise. (vbicq_f16): Likewise
[PATCH v2][ARM][GCC][4/1x]: MVE intrinsics with unary operand.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534342.html Hello, This patch supports following MVE ACLE intrinsics with unary operand. vctp16q, vctp32q, vctp64q, vctp8q, vpnot. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics There are few conflicts in defining the machine registers, resolved by re-ordering VPR_REGNUM, APSRQ_REGNUM and APSRGE_REGNUM. Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (hi_UP): Define mode. * config/arm/arm.h (IS_VPR_REGNUM): Move. * config/arm/arm.md (VPR_REGNUM): Define before APSRQ_REGNUM. (APSRQ_REGNUM): Modify. (APSRGE_REGNUM): Modify. * config/arm/arm_mve.h (vctp16q): Define macro. (vctp32q): Likewise. (vctp64q): Likewise. (vctp8q): Likewise. (vpnot): Likewise. (__arm_vctp16q): Define intrinsic. (__arm_vctp32q): Likewise. (__arm_vctp64q): Likewise. (__arm_vctp8q): Likewise. (__arm_vpnot): Likewise. * config/arm/arm_mve_builtins.def (UNOP_UNONE_UNONE): Use builtin qualifier. * config/arm/mve.md (mve_vctpqhi): Define RTL pattern. (mve_vpnothi): Likewise. gcc/testsuite/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vctp16q.c: New test. * gcc.target/arm/mve/intrinsics/vctp32q.c: Likewise. * gcc.target/arm/mve/intrinsics/vctp64q.c: Likewise. * gcc.target/arm/mve/intrinsics/vctp8q.c: Likewise. * gcc.target/arm/mve/intrinsics/vpnot.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 0d5da5ea3133ca39b55b4ca76507e01956997c03..fd449458bcb1f9a899a16e432aa015d48b665868 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -415,6 +415,7 @@ arm_set_sat_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define hf_UP E_HFmode #define bf_UPE_BFmode #define si_UP E_SImode +#define hi_UPE_HImode #define void_UP E_VOIDmode #define sf_UP E_SFmode #define UP(X) X##_UP diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 912849f0acd36e9c8c3a00f4253a691b7085e72d..7f94e11c0ea23dfbdb7e64fafbdfc68d83411865 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -192,6 +192,11 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t; #define vcvtmq_u32_f32(__a) __arm_vcvtmq_u32_f32(__a) #define vcvtaq_u16_f16(__a) __arm_vcvtaq_u16_f16(__a) #define vcvtaq_u32_f32(__a) __arm_vcvtaq_u32_f32(__a) +#define vctp16q(__a) __arm_vctp16q(__a) +#define vctp32q(__a) __arm_vctp32q(__a) +#define vctp64q(__a) __arm_vctp64q(__a) +#define vctp8q(__a) __arm_vctp8q(__a) +#define vpnot(__a) __arm_vpnot(__a) #endif __extension__ extern __inline void @@ -703,6 +708,41 @@ __arm_vaddlvq_u32 (uint32x4_t __a) return __builtin_mve_vaddlvq_uv4si (__a); } +__extension__ extern __inline int64_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vctp16q (uint32_t __a) +{ + return __builtin_mve_vctp16qhi (__a); +} + +__extension__ extern __inline mve_pred16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vctp32q (uint32_t __a) +{ + return __builtin_mve_vctp32qhi (__a); +} + +__extension__ extern __inline mve_pred16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vctp64q (uint32_t __a) +{ + return __builtin_mve_vctp64qhi (__a); +} + +__extension__ extern __inline mve_pred16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vctp8q (uint32_t __a) +{ + return __builtin_mve_vctp8qhi (__a); +} + +__extension__ extern __inline mve_pred16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vpnot (mve_pred16_t __a) +{ + return __builtin_mve_vpnothi (__a); +} + #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */ __extension__ extern __inline void diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index 44807d6e8c4a4717c4f2fd2ef7015708ca3af4bc..5d5696965457e4fe138c194d7f3c3c5737bf68d0 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -71,3 +71,8 @@ VAR2 (UNOP_UNONE_NONE, vcvtaq_u, v8hi, v4si) VAR2 (UNOP_UNONE_IMM, vmvnq_n_u, v8hi, v4si) VAR1 (UNOP_UNONE_UNONE, vrev16q_u, v16qi) VAR1 (UNOP_UNONE_UNONE, vaddlvq_u, v4si) +VAR1 (UNOP_UNONE_UNONE, vctp16q, hi) +VAR1 (UNOP_UNONE_UNONE, vct
[PATCH v2][ARM][GCC][3/2x]: MVE intrinsics with binary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534329.html Hello, This patch supports following MVE ACLE intrinsics with binary operands. vaddlvq_p_s32, vaddlvq_p_u32, vcmpneq_s8, vcmpneq_s16, vcmpneq_s32, vcmpneq_u8, vcmpneq_u16, vcmpneq_u32, vshlq_s8, vshlq_s16, vshlq_s32, vshlq_u8, vshlq_u16, vshlq_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (BINOP_NONE_NONE_UNONE_QUALIFIERS): Define qualifier for binary operands. (BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise. (BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vaddlvq_p_s32): Define macro. (vaddlvq_p_u32): Likewise. (vcmpneq_s8): Likewise. (vcmpneq_s16): Likewise. (vcmpneq_s32): Likewise. (vcmpneq_u8): Likewise. (vcmpneq_u16): Likewise. (vcmpneq_u32): Likewise. (vshlq_s8): Likewise. (vshlq_s16): Likewise. (vshlq_s32): Likewise. (vshlq_u8): Likewise. (vshlq_u16): Likewise. (vshlq_u32): Likewise. (__arm_vaddlvq_p_s32): Define intrinsic. (__arm_vaddlvq_p_u32): Likewise. (__arm_vcmpneq_s8): Likewise. (__arm_vcmpneq_s16): Likewise. (__arm_vcmpneq_s32): Likewise. (__arm_vcmpneq_u8): Likewise. (__arm_vcmpneq_u16): Likewise. (__arm_vcmpneq_u32): Likewise. (__arm_vshlq_s8): Likewise. (__arm_vshlq_s16): Likewise. (__arm_vshlq_s32): Likewise. (__arm_vshlq_u8): Likewise. (__arm_vshlq_u16): Likewise. (__arm_vshlq_u32): Likewise. (vaddlvq_p): Define polymorphic variant. (vcmpneq): Likewise. (vshlq): Likewise. * config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_UNONE_QUALIFIERS): Use it. (BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise. (BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise. * config/arm/mve.md (mve_vaddlvq_p_v4si): Define RTL pattern. (mve_vcmpneq_): Likewise. (mve_vshlq_): Likewise. gcc/testsuite/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vaddlvq_p_s32.c: New test. * gcc.target/arm/mve/intrinsics/vaddlvq_p_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 5db4db498b71224a4b5a0f5b6aa3476b351f7fd3..264006bc8645d8e73149961debad8eb82bab8af0 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -415,6 +415,24 @@ arm_binop_unone_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define BINOP_UNONE_NONE_IMM_QUALIFIERS \ (arm_binop_unone_none_imm_qualifiers) +static enum arm_type_qualifiers +arm_binop_none_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none, qualifier_unsigned }; +#define BINOP_NONE_NONE_UNONE_QUALIFIERS \ + (arm_binop_none_none_unone_qualifiers) + +static enum arm_type_qualifiers +arm_binop_unone_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_none, qualifier_none }; +#define BINOP_UNONE_NONE_NONE_QUALIFIERS \ + (arm_binop_unone_none_none_qualifiers) + +static enum arm_type_qualifiers +arm_binop_unone_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_none }; +#define BINOP_UNONE_UNONE_NONE_QUALIFIERS \ + (arm_binop_unone_unone_none_qualifiers) + /* End of Qualifier for MVE builtins. */ /* void ([T element type] *, T, immediate). */ diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 37d5b1c0c4914efab1b765298a7ce15726e45183
[PATCH v2][ARM][GCC][3/1x]: MVE intrinsics with unary operand.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534321.html Hello, This patch supports following MVE ACLE intrinsics with unary operand. vdupq_n_s8, vdupq_n_s16, vdupq_n_s32, vabsq_s8, vabsq_s16, vabsq_s32, vclsq_s8, vclsq_s16, vclsq_s32, vclzq_s8, vclzq_s16, vclzq_s32, vnegq_s8, vnegq_s16, vnegq_s32, vaddlvq_s32, vaddvq_s8, vaddvq_s16, vaddvq_s32, vmovlbq_s8, vmovlbq_s16, vmovltq_s8, vmovltq_s16, vmvnq_s8, vmvnq_s16, vmvnq_s32, vrev16q_s8, vrev32q_s8, vrev32q_s16, vqabsq_s8, vqabsq_s16, vqabsq_s32, vqnegq_s8, vqnegq_s16, vqnegq_s32, vcvtaq_s16_f16, vcvtaq_s32_f32, vcvtnq_s16_f16, vcvtnq_s32_f32, vcvtpq_s16_f16, vcvtpq_s32_f32, vcvtmq_s16_f16, vcvtmq_s32_f32, vmvnq_u8, vmvnq_u16, vmvnq_u32, vdupq_n_u8, vdupq_n_u16, vdupq_n_u32, vclzq_u8, vclzq_u16, vclzq_u32, vaddvq_u8, vaddvq_u16, vaddvq_u32, vrev32q_u8, vrev32q_u16, vmovltq_u8, vmovltq_u16, vmovlbq_u8, vmovlbq_u16, vrev16q_u8, vaddlvq_u32, vcvtpq_u16_f16, vcvtpq_u32_f32, vcvtnq_u16_f16, vcvtmq_u16_f16, vcvtmq_u32_f32, vcvtaq_u16_f16, vcvtaq_u32_f32, vdupq_n, vabsq, vclsq, vclzq, vnegq, vaddlvq, vaddvq, vmovlbq, vmovltq, vmvnq, vrev16q, vrev32q, vqabsq, vqneg q. A new register class "EVEN_REGS" which allows only even registers is added in this patch. The new constraint "e" allows only reigsters of EVEN_REGS class. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm.h (enum reg_class): Define new class EVEN_REGS. * config/arm/arm_mve.h (vdupq_n_s8): Define macro. (vdupq_n_s16): Likewise. (vdupq_n_s32): Likewise. (vabsq_s8): Likewise. (vabsq_s16): Likewise. (vabsq_s32): Likewise. (vclsq_s8): Likewise. (vclsq_s16): Likewise. (vclsq_s32): Likewise. (vclzq_s8): Likewise. (vclzq_s16): Likewise. (vclzq_s32): Likewise. (vnegq_s8): Likewise. (vnegq_s16): Likewise. (vnegq_s32): Likewise. (vaddlvq_s32): Likewise. (vaddvq_s8): Likewise. (vaddvq_s16): Likewise. (vaddvq_s32): Likewise. (vmovlbq_s8): Likewise. (vmovlbq_s16): Likewise. (vmovltq_s8): Likewise. (vmovltq_s16): Likewise. (vmvnq_s8): Likewise. (vmvnq_s16): Likewise. (vmvnq_s32): Likewise. (vrev16q_s8): Likewise. (vrev32q_s8): Likewise. (vrev32q_s16): Likewise. (vqabsq_s8): Likewise. (vqabsq_s16): Likewise. (vqabsq_s32): Likewise. (vqnegq_s8): Likewise. (vqnegq_s16): Likewise. (vqnegq_s32): Likewise. (vcvtaq_s16_f16): Likewise. (vcvtaq_s32_f32): Likewise. (vcvtnq_s16_f16): Likewise. (vcvtnq_s32_f32): Likewise. (vcvtpq_s16_f16): Likewise. (vcvtpq_s32_f32): Likewise. (vcvtmq_s16_f16): Likewise. (vcvtmq_s32_f32): Likewise. (vmvnq_u8): Likewise. (vmvnq_u16): Likewise. (vmvnq_u32): Likewise. (vdupq_n_u8): Likewise. (vdupq_n_u16): Likewise. (vdupq_n_u32): Likewise. (vclzq_u8): Likewise. (vclzq_u16): Likewise. (vclzq_u32): Likewise. (vaddvq_u8): Likewise. (vaddvq_u16): Likewise. (vaddvq_u32): Likewise. (vrev32q_u8): Likewise. (vrev32q_u16): Likewise. (vmovltq_u8): Likewise. (vmovltq_u16): Likewise. (vmovlbq_u8): Likewise. (vmovlbq_u16): Likewise. (vrev16q_u8): Likewise. (vaddlvq_u32): Likewise. (vcvtpq_u16_f16): Likewise. (vcvtpq_u32_f32): Likewise. (vcvtnq_u16_f16): Likewise. (vcvtmq_u16_f16): Likewise. (vcvtmq_u32_f32): Likewise. (vcvtaq_u16_f16): Likewise. (vcvtaq_u32_f32): Likewise. (__arm_vdupq_n_s8): Define intrinsic. (__arm_vdupq_n_s16): Likewise. (__arm_vdupq_n_s32): Likewise. (__arm_vabsq_s8): Likewise. (__arm_vabsq_s16): Likewise. (__arm_vabsq_s32): Likewise. (__arm_vclsq_s8): Likewise. (__arm_vclsq_s16): Likewise. (__arm_vclsq_s32): Likewise. (__arm_vclzq_s8): Likewise. (__arm_vclzq_s16): Likewise. (__arm_vclzq_s32): Likewise. (__arm_vnegq_s8): Likewise. (__arm_vnegq_s16): Likewise. (__arm_vnegq_s32): Likewise. (__arm_vaddlvq_s32): Likewise. (__arm_vaddvq_s8): Likewise. (__arm_vaddvq_s16): Likewise. (__arm_vaddvq_s32): Likewise. (__arm_vmovlbq_s8): Likewise. (__arm_vm
[PATCH v2][ARM][GCC][2/2x]: MVE intrinsics with binary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534325.html Hello, This patch supports following MVE ACLE intrinsics with binary operands. vcvtq_n_s16_f16, vcvtq_n_s32_f32, vcvtq_n_u16_f16, vcvtq_n_u32_f32, vcreateq_u8, vcreateq_u16, vcreateq_u32, vcreateq_u64, vcreateq_s8, vcreateq_s16, vcreateq_s32, vcreateq_s64, vshrq_n_s8, vshrq_n_s16, vshrq_n_s32, vshrq_n_u8, vshrq_n_u16, vshrq_n_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Rb" and "Rf" are added, which checks the constant is with in the range of 1 to 8 and 1 to 32 respectively. Also a new predicates "mve_imm_8" and "mve_imm_32" are added, to check the the matching constraint Rb and Rf respectively. Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (BINOP_UNONE_UNONE_IMM_QUALIFIERS): Define qualifier for binary operands. (BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vcvtq_n_s16_f16): Define macro. (vcvtq_n_s32_f32): Likewise. (vcvtq_n_u16_f16): Likewise. (vcvtq_n_u32_f32): Likewise. (vcreateq_u8): Likewise. (vcreateq_u16): Likewise. (vcreateq_u32): Likewise. (vcreateq_u64): Likewise. (vcreateq_s8): Likewise. (vcreateq_s16): Likewise. (vcreateq_s32): Likewise. (vcreateq_s64): Likewise. (vshrq_n_s8): Likewise. (vshrq_n_s16): Likewise. (vshrq_n_s32): Likewise. (vshrq_n_u8): Likewise. (vshrq_n_u16): Likewise. (vshrq_n_u32): Likewise. (__arm_vcreateq_u8): Define intrinsic. (__arm_vcreateq_u16): Likewise. (__arm_vcreateq_u32): Likewise. (__arm_vcreateq_u64): Likewise. (__arm_vcreateq_s8): Likewise. (__arm_vcreateq_s16): Likewise. (__arm_vcreateq_s32): Likewise. (__arm_vcreateq_s64): Likewise. (__arm_vshrq_n_s8): Likewise. (__arm_vshrq_n_s16): Likewise. (__arm_vshrq_n_s32): Likewise. (__arm_vshrq_n_u8): Likewise. (__arm_vshrq_n_u16): Likewise. (__arm_vshrq_n_u32): Likewise. (__arm_vcvtq_n_s16_f16): Likewise. (__arm_vcvtq_n_s32_f32): Likewise. (__arm_vcvtq_n_u16_f16): Likewise. (__arm_vcvtq_n_u32_f32): Likewise. (vshrq_n): Define polymorphic variant. * config/arm/arm_mve_builtins.def (BINOP_UNONE_UNONE_IMM_QUALIFIERS): Use it. (BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise. * config/arm/constraints.md (Rb): Define constraint to check constant is in the range of 1 to 8. (Rf): Define constraint to check constant is in the range of 1 to 32. * config/arm/mve.md (mve_vcreateq_): Define RTL pattern. (mve_vshrq_n_): Likewise. (mve_vcvtq_n_from_f_): Likewise. * config/arm/predicates.md (mve_imm_8): Define predicate to check the matching constraint Rb. (mve_imm_32): Define predicate to check the matching constraint Rf. gcc/testsuite/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vcreateq_s16.c: New test. * gcc.target/arm/mve/intrinsics/vcreateq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_s64.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u64.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_s16_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_s32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_u16_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_u32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-bu
[PATCH v2][ARM][GCC][4/2x]: MVE intrinsics with binary operands.
, vqshlq_n_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Ra" and "Rg" are added. Ra checks the constant is with in the range of 0 to 7 where as Rg checks that the constant is one among 1, 2, 4 and 8. Also a new predicates "mve_imm_7" and "mve_imm_selective_upto_8" are added, to check the the matching constraint Ra and Rg respectively. The above intrinsics are defined using the already defined builtin qualifiers BINOP_NONE_NONE_IMM, BINOP_NONE_NONE_NONE, BINOP_NONE_NONE_UNONE, BINOP_UNONE_NONE_IMM, BINOP_UNONE_NONE_NONE, BINOP_UNONE_UNONE_IMM, BINOP_UNONE_UNONE_NONE, BINOP_UNONE_UNONE_UNONE. Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vsubq_u8): Define macro. (vsubq_n_u8): Likewise. (vrmulhq_u8): Likewise. (vrhaddq_u8): Likewise. (vqsubq_u8): Likewise. (vqsubq_n_u8): Likewise. (vqaddq_u8): Likewise. (vqaddq_n_u8): Likewise. (vorrq_u8): Likewise. (vornq_u8): Likewise. (vmulq_u8): Likewise. (vmulq_n_u8): Likewise. (vmulltq_int_u8): Likewise. (vmullbq_int_u8): Likewise. (vmulhq_u8): Likewise. (vmladavq_u8): Likewise. (vminvq_u8): Likewise. (vminq_u8): Likewise. (vmaxvq_u8): Likewise. (vmaxq_u8): Likewise. (vhsubq_u8): Likewise. (vhsubq_n_u8): Likewise. (vhaddq_u8): Likewise. (vhaddq_n_u8): Likewise. (veorq_u8): Likewise. (vcmpneq_n_u8): Likewise. (vcmphiq_u8): Likewise. (vcmphiq_n_u8): Likewise. (vcmpeqq_u8): Likewise. (vcmpeqq_n_u8): Likewise. (vcmpcsq_u8): Likewise. (vcmpcsq_n_u8): Likewise. (vcaddq_rot90_u8): Likewise. (vcaddq_rot270_u8): Likewise. (vbicq_u8): Likewise. (vandq_u8): Likewise. (vaddvq_p_u8): Likewise. (vaddvaq_u8): Likewise. (vaddq_n_u8): Likewise. (vabdq_u8): Likewise. (vshlq_r_u8): Likewise. (vrshlq_u8): Likewise. (vrshlq_n_u8): Likewise. (vqshlq_u8): Likewise. (vqshlq_r_u8): Likewise. (vqrshlq_u8): Likewise. (vqrshlq_n_u8): Likewise. (vminavq_s8): Likewise. (vminaq_s8): Likewise. (vmaxavq_s8): Likewise. (vmaxaq_s8): Likewise. (vbrsrq_n_u8): Likewise. (vshlq_n_u8): Likewise. (vrshrq_n_u8): Likewise. (vqshlq_n_u8): Likewise. (vcmpneq_n_s8): Likewise. (vcmpltq_s8): Likewise. (vcmpltq_n_s8): Likewise. (vcmpleq_s8): Likewise. (vcmpleq_n_s8): Likewise. (vcmpgtq_s8): Likewise. (vcmpgtq_n_s8): Likewise. (vcmpgeq_s8): Likewise. (vcmpgeq_n_s8): Likewise. (vcmpeqq_s8): Likewise. (vcmpeqq_n_s8): Likewise. (vqshluq_n_s8): Likewise. (vaddvq_p_s8): Likewise. (vsubq_s8): Likewise. (vsubq_n_s8): Likewise. (vshlq_r_s8): Likewise. (vrshlq_s8): Likewise. (vrshlq_n_s8): Likewise. (vrmulhq_s8): Likewise. (vrhaddq_s8): Likewise. (vqsubq_s8): Likewise. (vqsubq_n_s8): Likewise. (vqshlq_s8): Likewise. (vqshlq_r_s8): Likewise. (vqrshlq_s8): Likewise. (vqrshlq_n_s8): Likewise. (vqrdmulhq_s8): Likewise. (vqrdmulhq_n_s8): Likewise. (vqdmulhq_s8): Likewise. (vqdmulhq_n_s8): Likewise. (vqaddq_s8): Likewise. (vqaddq_n_s8): Likewise. (vorrq_s8): Likewise. (vornq_s8): Likewise. (vmulq_s8): Likewise. (vmulq_n_s8): Likewise. (vmulltq_int_s8): Likewise. (vmullbq_int_s8): Likewise. (vmulhq_s8): Likewise. (vmlsdavxq_s8): Likewise. (vmlsdavq_s8): Likewise. (vmladavxq_s8): Likewise. (vmladavq_s8): Likewise. (vminvq_s8): Likewise. (vminq_s8): Likewise. (vmaxvq_s8): Likewise. (vmaxq_s8): Likewise. (vhsubq_s8): Likewise. (vhsubq_n_s8): Likewise. (vhcaddq_rot90_s8): Likewise. (vhcaddq_rot270_s8): Likewise. (vhaddq_s8): Likewise. (vhaddq_n_s8): Likewise. (veorq_s8): Likewise. (vcaddq_rot90_s8): Likewise. (vcaddq_rot270_s8): Likewise. (vbrsrq_n_s8): Likewise. (vbicq_s8): Likewise. (vandq_s8): Likewise. (vaddvaq_s8): Likewise. (vaddq_n_s8): Likewise. (vabdq_s8): Likewise. (vshlq_n_s8): Likewise. (vrshrq_n_s8): Likewise. (vqshlq_n_s8): Likewise. (vsubq_u16): L
[PATCH v4][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hello Kyrill, This patch addresses all the comments in patch version v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html Please ignore version v3. (version v3) https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541783.html ### Hello, This patch is part of MVE ACLE intrinsics framework. This patches add support to update (read/write) the APSR (Application Program Status Register) register and FPSCR (Floating-point Status and Control Register) register for MVE. This patch also enables thumb2 mov RTL patterns for MVE. A new feature bit vfp_base is added. This bit is enabled for all VFP, MVE and MVE with floating point extensions. This bit is used to enable the macro TARGET_VFP_BASE. For all the VFP instructions, RTL patterns, status and control registers are guarded by TARGET_HAVE_FLOAT. But this patch modifies that and the common instructions, RTL patterns, status and control registers bewteen MVE and VFP are guarded by TARGET_VFP_BASE macro. The RTL pattern set_fpscr and get_fpscr are updated to use VFPCC_REGNUM because few MVE intrinsics set/get carry bit of FPSCR register. Please refer to Arm reference manual [1] for more details. [1] https://developer.arm.com/docs/ddi0553/latest Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath gcc/ChangeLog: 2020-03-06 Andre Vieira Mihail Ionescu Srinath Parvathaneni * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base feature bit is on and -mfpu=auto is passed as compiler option, do not generate error on not finding any matching fpu. Because in this case fpu is not required. * config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is enabled for MVE and also for all VFP extensions. (VFPv2): Modify fgroup to enable vfp_base feature bit when ever VFPv2 is enabled. (MVE): Define fgroup to enable feature bits mve, vfp_base and armv7em. (MVE_FP): Define fgroup to enable feature bits is fgroup MVE and FPv5 along with feature bits mve_float. (mve): Modify add options in armv8.1-m.main arch for MVE. (mve.fp): Modify add options in armv8.1-m.main arch for MVE with floating point. * config/arm/arm.c (use_return_insn): Replace the check with TARGET_VFP_BASE. (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with TARGET_VFP_BASE. (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as well. (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE as well. (arm_compute_frame_layout): Likewise. (arm_save_coproc_regs): Likewise. (arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM in MVE as well. (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. (arm_expand_epilogue_apcs_frame): Likewise. (arm_expand_epilogue): Likewise. (arm_conditional_register_usage): Likewise. (arm_declare_function_name): Add check to skip printing .fpu directive in assembly file when TARGET_VFP_BASE is enabled and fpu_to_print is "softvfp". * config/arm/arm.h (TARGET_VFP_BASE): Define. * config/arm/arm.md (arch): Add "mve" to arch. (eq_attr "arch" "mve"): Enable on TARGET_HAVE_MVE is true. (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. * config/arm/constraints.md (Uf): Define to allow modification to FPCCR in MVE. * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify target guard to not allow for MVE. * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Move to volatile unspecs enum. (VUNSPEC_GET_FPSCR): Define. * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR and VMRS instructions which move to general-purpose Register from Floating-point Special register and vice-versa. (thumb2_movhi_fp16): Likewise. (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along with MCR and MRC instructions which set and get Floating-point Status and Control Register (FPSCR). (movdi_vfp): Modify pattern to enable Single-precision scalar float move in MVE. (thumb2_movdf_vfp): Modify pattern to enable Double-precision scalar float move patterns in MVE. (thumb2_movsfcc_vfp): Modify pattern to enable single float conditional code move patterns of VFP also in MVE by adding TARGET_VFP_BAS
Re: [PATCH v3][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.
Hi Kyrill, > This is ok but please bootstrap it on arm-none-linux-gnueabihf as well. I have bootstrapped this patch on arm-none-linux-gnueabihf and found no issues. There is problem with git commit rights, could you commit this patch on my behalf. Regards SRI From: Kyrill Tkachov Sent: 12 March 2020 11:15 To: Srinath Parvathaneni ; gcc-patches@gcc.gnu.org Subject: Re: [PATCH v3][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch. Hi Srinath, On 3/10/20 6:19 PM, Srinath Parvathaneni wrote: > Hello Kyrill, > > This patch addresses all the comments in patch version v2. > (version v2) > https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540415.html > > > > Hello, > > This patch creates the required framework for MVE ACLE intrinsics. > > The following changes are done in this patch to support MVE ACLE > intrinsics. > > Header file arm_mve.h is added to source code, which contains the > definitions of MVE ACLE intrinsics > and different data types used in MVE. Machine description file mve.md > is also added which contains the > RTL patterns defined for MVE. > > A new reigster "p0" is added which is used in by MVE predicated > patterns. A new register class "VPR_REG" > is added and its contents are defined in REG_CLASS_CONTENTS. > > The vec-common.md file is modified to support the standard move > patterns. The prefix of neon functions > which are also used by MVE is changed from "neon_" to "simd_". > eg: neon_immediate_valid_for_move changed to > simd_immediate_valid_for_move. > > In the patch standard patterns mve_move, mve_store and move_load for > MVE are added and neon.md and vfp.md > files are modified to support this common patterns. > > Please refer to Arm reference manual [1] for more details. > > [1] https://developer.arm.com/docs/ddi0553/latest > > Regression tested on target arm-none-eabi and armeb-none-eabi and > found no regressions. > > Ok for trunk? This is ok but please bootstrap it on arm-none-linux-gnueabihf as well. Thanks, Kyrill > > Thanks, > Srinath > > gcc/ChangeLog: > > 2020-03-06 Andre Vieira > Mihail Ionescu > Srinath Parvathaneni > > * config.gcc (arm_mve.h): Include mve intrinsics header file. > * config/arm/aout.h (p0): Add new register name for MVE predicated > cases. > * config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define > macro > common to Neon and MVE. > (ARM_BUILTIN_NEON_LANE_CHECK): Renamed to > ARM_BUILTIN_SIMD_LANE_CHECK. > (arm_init_simd_builtin_types): Disable poly types for MVE. > (arm_init_neon_builtins): Move a check to arm_init_builtins > function. > (arm_init_builtins): Use ARM_BUILTIN_SIMD_LANE_CHECK instead of > ARM_BUILTIN_NEON_LANE_CHECK. > (mve_dereference_pointer): Add function. > (arm_expand_builtin_args): Call to mve_dereference_pointer > when MVE is > enabled. > (arm_expand_neon_builtin): Moved to arm_expand_builtin function. > (arm_expand_builtin): Moved from arm_expand_neon_builtin function. > * config/arm/arm-c.c (__ARM_FEATURE_MVE): Define macro for MVE > and MVE > with floating point enabled. > * config/arm/arm-protos.h (neon_immediate_valid_for_move): > Renamed to > simd_immediate_valid_for_move. > (simd_immediate_valid_for_move): Renamed from > neon_immediate_valid_for_move function. > * config/arm/arm.c (arm_options_perform_arch_sanity_checks): > Generate > error if vfpv2 feature bit is disabled and mve feature bit is also > disabled for HARD_FLOAT_ABI. > (use_return_insn): Check to not push VFP regs for MVE. > (aapcs_vfp_allocate): Add MVE check to have same Procedure > Call Standard > as Neon. > (aapcs_vfp_allocate_return_reg): Likewise. > (thumb2_legitimate_address_p): Check to return 0 on valid Thumb-2 > address operand for MVE. > (arm_rtx_costs_internal): MVE check to determine cost of rtx. > (neon_valid_immediate): Rename to simd_valid_immediate. > (simd_valid_immediate): Rename from neon_valid_immediate. > (simd_valid_immediate): MVE check on size of vector is 128 bits. > (neon_immediate_valid_for_move): Rename to > simd_immediate_valid_for_move. > (simd_immediate_valid_for_move): Rename from > neon_immediate_valid_for_move. > (neon_immediate_valid_for_logic): Modify call to > neon_valid_immediate > function. > (neon_make_constant): Modify call to neon_valid_immediate > f
Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hi Kyrill, > Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the other patches in this series) I have bootstrapped this patch on arm-none-linux-gnueabihf and found no issues. There is problem with git commit rights, could you commit this patch on my behalf. Regards SRI. From: Kyrill Tkachov Sent: 12 March 2020 11:16 To: Srinath Parvathaneni ; gcc-patches@gcc.gnu.org Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch. Hi Srinath, On 3/10/20 6:19 PM, Srinath Parvathaneni wrote: > Hello Kyrill, > > This patch addresses all the comments in patch version v2. > (version v2) > https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html > > > > > Hello, > > This patch is part of MVE ACLE intrinsics framework. > This patches add support to update (read/write) the APSR (Application > Program Status Register) > register and FPSCR (Floating-point Status and Control Register) > register for MVE. > This patch also enables thumb2 mov RTL patterns for MVE. > > A new feature bit vfp_base is added. This bit is enabled for all VFP, > MVE and MVE with floating point > extensions. This bit is used to enable the macro TARGET_VFP_BASE. For > all the VFP instructions, RTL patterns, > status and control registers are guarded by TARGET_HAVE_FLOAT. But > this patch modifies that and the > common instructions, RTL patterns, status and control registers > bewteen MVE and VFP are guarded by > TARGET_VFP_BASE macro. > > The RTL pattern set_fpscr and get_fpscr are updated to use > VFPCC_REGNUM because few MVE intrinsics > set/get carry bit of FPSCR register. > > Please refer to Arm reference manual [1] for more details. > [1] https://developer.arm.com/docs/ddi0553/latest > > Regression tested on target arm-none-eabi and armeb-none-eabi and > found no regressions. > > Ok for trunk? Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the other patches in this series) Thanks, Kyrill > > Thanks, > Srinath > gcc/ChangeLog: > > 2020-03-06 Andre Vieira > Mihail Ionescu > Srinath Parvathaneni > > * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When > vfp_base > feature bit is on and -mfpu=auto is passed as compiler option, > do not > generate error on not finding any match fpu. Because in this > case fpu > is not required. > * config/arm/arm-cpus.in (vfp_base): Define feature bit, this > bit is > enabled for MVE and also for all VFP extensions. > (VFPv2): Modify fgroup to enable vfp_base feature bit when > ever VFPv2 > is enabled. > (MVE): Define fgroup to enable feature bits mve, vfp_base and > armv7em. > (MVE_FP): Define fgroup to enable feature bits is fgroup MVE > and FPv5 > along with feature bits mve_float. > (mve): Modify add options in armv8.1-m.main arch for MVE. > (mve.fp): Modify add options in armv8.1-m.main arch for MVE with > floating point. > * config/arm/arm.c (use_return_insn): Replace the > check with TARGET_VFP_BASE. > (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with > TARGET_VFP_BASE. > (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || > TARGET_HAVE_MVE" > with TARGET_VFP_BASE, to allow cost calculations for copies in > MVE as > well. > (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with > TARGET_VFP_BASE, to allow space calculation for VFP registers > in MVE > as well. > (arm_compute_frame_layout): Likewise. > (arm_save_coproc_regs): Likewise. > (arm_fixed_condition_code_regs): Modify to enable using > VFPCC_REGNUM > in MVE as well. > (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || > TARGET_HAVE_MVE" > with equivalent macro TARGET_VFP_BASE. > (arm_expand_epilogue_apcs_frame): Likewise. > (arm_expand_epilogue): Likewise. > (arm_conditional_register_usage): Likewise. > (arm_declare_function_name): Add check to skip printing .fpu > directive > in assembly file when TARGET_VFP_BASE is enabled and > fpu_to_print is > "softvfp". > * config/arm/arm.h (TARGET_VFP_BASE): Define. > * config/arm/arm.md (arch): Add "mve" to arch. > (eq_attr "arch" "mve"): Enable on TARGET_HAVE_MVE is true. > (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT > || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. > * config/arm/constraints.md
[PATCH v2][ARM][GCC][1/3x]: MVE intrinsics with ternary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534358.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vabavq_s8, vabavq_s16, vabavq_s32, vbicq_m_n_s16, vbicq_m_n_s32, vbicq_m_n_u16, vbicq_m_n_u32, vcmpeqq_m_f16, vcmpeqq_m_f32, vcvtaq_m_s16_f16, vcvtaq_m_u16_f16, vcvtaq_m_s32_f32, vcvtaq_m_u32_f32, vcvtq_m_f16_s16, vcvtq_m_f16_u16, vcvtq_m_f32_s32, vcvtq_m_f32_u32, vqrshrnbq_n_s16, vqrshrnbq_n_u16, vqrshrnbq_n_s32, vqrshrnbq_n_u32, vqrshrunbq_n_s16, vqrshrunbq_n_s32, vrmlaldavhaq_s32, vrmlaldavhaq_u32, vshlcq_s8, vshlcq_u8, vshlcq_s16, vshlcq_u16, vshlcq_s32, vshlcq_u32, vabavq_s8, vabavq_s16, vabavq_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Define qualifier for ternary operands. (TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_NONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vabavq_s8): Define macro. (vabavq_s16): Likewise. (vabavq_s32): Likewise. (vbicq_m_n_s16): Likewise. (vbicq_m_n_s32): Likewise. (vbicq_m_n_u16): Likewise. (vbicq_m_n_u32): Likewise. (vcmpeqq_m_f16): Likewise. (vcmpeqq_m_f32): Likewise. (vcvtaq_m_s16_f16): Likewise. (vcvtaq_m_u16_f16): Likewise. (vcvtaq_m_s32_f32): Likewise. (vcvtaq_m_u32_f32): Likewise. (vcvtq_m_f16_s16): Likewise. (vcvtq_m_f16_u16): Likewise. (vcvtq_m_f32_s32): Likewise. (vcvtq_m_f32_u32): Likewise. (vqrshrnbq_n_s16): Likewise. (vqrshrnbq_n_u16): Likewise. (vqrshrnbq_n_s32): Likewise. (vqrshrnbq_n_u32): Likewise. (vqrshrunbq_n_s16): Likewise. (vqrshrunbq_n_s32): Likewise. (vrmlaldavhaq_s32): Likewise. (vrmlaldavhaq_u32): Likewise. (vshlcq_s8): Likewise. (vshlcq_u8): Likewise. (vshlcq_s16): Likewise. (vshlcq_u16): Likewise. (vshlcq_s32): Likewise. (vshlcq_u32): Likewise. (vabavq_u8): Likewise. (vabavq_u16): Likewise. (vabavq_u32): Likewise. (__arm_vabavq_s8): Define intrinsic. (__arm_vabavq_s16): Likewise. (__arm_vabavq_s32): Likewise. (__arm_vabavq_u8): Likewise. (__arm_vabavq_u16): Likewise. (__arm_vabavq_u32): Likewise. (__arm_vbicq_m_n_s16): Likewise. (__arm_vbicq_m_n_s32): Likewise. (__arm_vbicq_m_n_u16): Likewise. (__arm_vbicq_m_n_u32): Likewise. (__arm_vqrshrnbq_n_s16): Likewise. (__arm_vqrshrnbq_n_u16): Likewise. (__arm_vqrshrnbq_n_s32): Likewise. (__arm_vqrshrnbq_n_u32): Likewise. (__arm_vqrshrunbq_n_s16): Likewise. (__arm_vqrshrunbq_n_s32): Likewise. (__arm_vrmlaldavhaq_s32): Likewise. (__arm_vrmlaldavhaq_u32): Likewise. (__arm_vshlcq_s8): Likewise. (__arm_vshlcq_u8): Likewise. (__arm_vshlcq_s16): Likewise. (__arm_vshlcq_u16): Likewise. (__arm_vshlcq_s32): Likewise. (__arm_vshlcq_u32): Likewise. (__arm_vcmpeqq_m_f16): Likewise. (__arm_vcmpeqq_m_f32): Likewise. (__arm_vcvtaq_m_s16_f16): Likewise. (__arm_vcvtaq_m_u16_f16): Likewise. (__arm_vcvtaq_m_s32_f32): Likewise. (__arm_vcvtaq_m_u32_f32): Likewise. (__arm_vcvtq_m_f16_s16): Likewise. (__arm_vcvtq_m_f16_u16): Likewise. (__arm_vcvtq_m_f32_s32): Likewise. (__arm_vcvtq_m_f32_u32): Likewise. (vcvtaq_m): Define polymorphic variant. (vcvtq_m): Likewise. (vabavq): Likewise. (vshlcq): Likewise. (vbicq_m_n): Likewise. (vqrshrnbq_n): Likewise. (vqrshrunbq_n): Likewise. * config/arm/arm_mve_builtins.def (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Use the builtin
[PATCH v2][ARM][GCC][2/3x]: MVE intrinsics with ternary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534355.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vpselq_u8, vpselq_s8, vrev64q_m_u8, vqrdmlashq_n_u8, vqrdmlahq_n_u8, vqdmlahq_n_u8, vmvnq_m_u8, vmlasq_n_u8, vmlaq_n_u8, vmladavq_p_u8, vmladavaq_u8, vminvq_p_u8, vmaxvq_p_u8, vdupq_m_n_u8, vcmpneq_m_u8, vcmpneq_m_n_u8, vcmphiq_m_u8, vcmphiq_m_n_u8, vcmpeqq_m_u8, vcmpeqq_m_n_u8, vcmpcsq_m_u8, vcmpcsq_m_n_u8, vclzq_m_u8, vaddvaq_p_u8, vsriq_n_u8, vsliq_n_u8, vshlq_m_r_u8, vrshlq_m_n_u8, vqshlq_m_r_u8, vqrshlq_m_n_u8, vminavq_p_s8, vminaq_m_s8, vmaxavq_p_s8, vmaxaq_m_s8, vcmpneq_m_s8, vcmpneq_m_n_s8, vcmpltq_m_s8, vcmpltq_m_n_s8, vcmpleq_m_s8, vcmpleq_m_n_s8, vcmpgtq_m_s8, vcmpgtq_m_n_s8, vcmpgeq_m_s8, vcmpgeq_m_n_s8, vcmpeqq_m_s8, vcmpeqq_m_n_s8, vshlq_m_r_s8, vrshlq_m_n_s8, vrev64q_m_s8, vqshlq_m_r_s8, vqrshlq_m_n_s8, vqnegq_m_s8, vqabsq_m_s8, vnegq_m_s8, vmvnq_m_s8, vmlsdavxq_p_s8, vmlsdavq_p_s8, vmladavxq_p_s8, vmladavq_p_s8, vminvq_p_s8, vmaxvq_p_s8, vdupq_m_n_s8, vclzq_m_s8, vclsq_m_s8, vaddvaq_p_s8, vabsq_m_s8, vqrdmlsdhxq_s8, vqrdmlsdhq_s8, vqrdmlashq_n_s8, vqrdmlahq_n_s8, vqrdmladhxq_s8, vqrdmladhq_s8, vqdmlsdhxq_s8, vqdmlsdhq_s8, vqdmlahq_n_s8, vqdmladhxq_s8, vqdmladhq_s8, vmlsdavaxq_s8, vmlsdavaq_s8, vmlasq_n_s8, vmlaq_n_s8, vmladavaxq_s8, vmladavaq_s8, vsriq_n_s8, vsliq_n_s8, vpselq_u16, vpselq_s16, vrev64q_m_u16, vqrdmlashq_n_u16, vqrdmlahq_n_u16, vqdmlahq_n_u16, vmvnq_m_u16, vmlasq_n_u16, vmlaq_n_u16, vmladavq_p_u16, vmladavaq_u16, vminvq_p_u16, vmaxvq_p_u16, vdupq_m_n_u16, vcmpneq_m_u16, vcmpneq_m_n_u16, vcmphiq_m_u16, vcmphiq_m_n_u16, vcmpeqq_m_u16, vcmpeqq_m_n_u16, vcmpcsq_m_u16, vcmpcsq_m_n_u16, vclzq_m_u16, vaddvaq_p_u16, vsriq_n_u16, vsliq_n_u16, vshlq_m_r_u16, vrshlq_m_n_u16, vqshlq_m_r_u16, vqrshlq_m_n_u16, vminavq_p_s16, vminaq_m_s16, vmaxavq_p_s16, vmaxaq_m_s16, vcmpneq_m_s16, vcmpneq_m_n_s16, vcmpltq_m_s16, vcmpltq_m_n_s16, vcmpleq_m_s16, vcmpleq_m_n_s16, vcmpgtq_m_s16, vcmpgtq_m_n_s16, vcmpgeq_m_s16, vcmpgeq_m_n_s16, vcmpeqq_m_s16, vcmpeqq_m_n_s16, vshlq_m_r_s16, vrshlq_m_n_s16, vrev64q_m_s16, vqshlq_m_r_s16, vqrshlq_m_n_s16, vqnegq_m_s16, vqabsq_m_s16, vnegq_m_s16, vmvnq_m_s16, vmlsdavxq_p_s16, vmlsdavq_p_s16, vmladavxq_p_s16, vmladavq_p_s16, vminvq_p_s16, vmaxvq_p_s16, vdupq_m_n_s16, vclzq_m_s16, vclsq_m_s16, vaddvaq_p_s16, vabsq_m_s16, vqrdmlsdhxq_s16, vqrdmlsdhq_s16, vqrdmlashq_n_s16, vqrdmlahq_n_s16, vqrdmladhxq_s16, vqrdmladhq_s16, vqdmlsdhxq_s16, vqdmlsdhq_s16, vqdmlahq_n_s16, vqdmladhxq_s16, vqdmladhq_s16, vmlsdavaxq_s16, vmlsdavaq_s16, vmlasq_n_s16, vmlaq_n_s16, vmladavaxq_s16, vmladavaq_s16, vsriq_n_s16, vsliq_n_s16, vpselq_u32, vpselq_s32, vrev64q_m_u32, vqrdmlashq_n_u32, vqrdmlahq_n_u32, vqdmlahq_n_u32, vmvnq_m_u32, vmlasq_n_u32, vmlaq_n_u32, vmladavq_p_u32, vmladavaq_u32, vminvq_p_u32, vmaxvq_p_u32, vdupq_m_n_u32, vcmpneq_m_u32, vcmpneq_m_n_u32, vcmphiq_m_u32, vcmphiq_m_n_u32, vcmpeqq_m_u32, vcmpeqq_m_n_u32, vcmpcsq_m_u32, vcmpcsq_m_n_u32, vclzq_m_u32, vaddvaq_p_u32, vsriq_n_u32, vsliq_n_u32, vshlq_m_r_u32, vrshlq_m_n_u32, vqshlq_m_r_u32, vqrshlq_m_n_u32, vminavq_p_s32, vminaq_m_s32, vmaxavq_p_s32, vmaxaq_m_s32, vcmpneq_m_s32, vcmpneq_m_n_s32, vcmpltq_m_s32, vcmpltq_m_n_s32, vcmpleq_m_s32, vcmpleq_m_n_s32, vcmpgtq_m_s32, vcmpgtq_m_n_s32, vcmpgeq_m_s32, vcmpgeq_m_n_s32, vcmpeqq_m_s32, vcmpeqq_m_n_s32, vshlq_m_r_s32, vrshlq_m_n_s32, vrev64q_m_s32, vqshlq_m_r_s32, vqrshlq_m_n_s32, vqnegq_m_s32, vqabsq_m_s32, vnegq_m_s32, vmvnq_m_s32, vmlsdavxq_p_s32, vmlsdavq_p_s32, vmladavxq_p_s32, vmladavq_p_s32, vminvq_p_s32, vmaxvq_p_s32, vdupq_m_n_s32, vclzq_m_s32, vclsq_m_s32, vaddvaq_p_s32, vabsq_m_s32, vqrdmlsdhxq_s32, vqrdmlsdhq_s32, vqrdmlashq_n_s32, vqrdmlahq_n_s32, vqrdmladhxq_s32, vqrdmladhq_s32, vqdmlsdhxq_s32, vqdmlsdhq_s32, vqdmlahq_n_s32, vqdmladhxq_s32, vqdmladhq_s32, vmlsdavaxq_s32, vmlsdavaq_s32, vmlasq_n_s32, vmlaq_n_s32, vmladavaxq_s32, vmladavaq_s32, vsriq_n_s32, vsliq_n_s32, vpselq_u64, vpselq_s64. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Rc" and "Re" are added, which checks the constant is with in the range of 0 to 15 and 0 to 31 respectively. Also a new predicates "mve_imm_15" and "mve_imm_31" are added, to check the the matching constraint Rc and Re respectively. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-25 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vpselq_u8): Define macro. (vpselq_s8): Likewise. (vrev64q_m_u8): Likewise. (vqrdmlashq_n_u8): Likewise. (vqrdmlahq_n_u8): Likewise. (vqdmlahq_n_u8): Likewise. (vmvnq_m_u8): L
[PATCH v2][ARM][GCC][3/3x]: MVE intrinsics with ternary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534322.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vrmlaldavhaxq_s32, vrmlsldavhaq_s32, vrmlsldavhaxq_s32, vaddlvaq_p_s32, vcvtbq_m_f16_f32, vcvtbq_m_f32_f16, vcvttq_m_f16_f32, vcvttq_m_f32_f16, vrev16q_m_s8, vrev32q_m_f16, vrmlaldavhq_p_s32, vrmlaldavhxq_p_s32, vrmlsldavhq_p_s32, vrmlsldavhxq_p_s32, vaddlvaq_p_u32, vrev16q_m_u8, vrmlaldavhq_p_u32, vmvnq_m_n_s16, vorrq_m_n_s16, vqrshrntq_n_s16, vqshrnbq_n_s16, vqshrntq_n_s16, vrshrnbq_n_s16, vrshrntq_n_s16, vshrnbq_n_s16, vshrntq_n_s16, vcmlaq_f16, vcmlaq_rot180_f16, vcmlaq_rot270_f16, vcmlaq_rot90_f16, vfmaq_f16, vfmaq_n_f16, vfmasq_n_f16, vfmsq_f16, vmlaldavaq_s16, vmlaldavaxq_s16, vmlsldavaq_s16, vmlsldavaxq_s16, vabsq_m_f16, vcvtmq_m_s16_f16, vcvtnq_m_s16_f16, vcvtpq_m_s16_f16, vcvtq_m_s16_f16, vdupq_m_n_f16, vmaxnmaq_m_f16, vmaxnmavq_p_f16, vmaxnmvq_p_f16, vminnmaq_m_f16, vminnmavq_p_f16, vminnmvq_p_f16, vmlaldavq_p_s16, vmlaldavxq_p_s16, vmlsldavq_p_s16, vmlsldavxq_p_s16, vmovlbq_m_s8, vmovltq_m_s8, vmovnbq_m_s16, vmovntq_m_s16, vnegq_m_f16, vpselq_f16, vqmovnbq_m_s16, vqmovntq_m_s16, vrev32q_m_s8, vrev64q_m_f16, vrndaq_m_f16, vrndmq_m_f16, vrndnq_m_f16, vrndpq_m_f16, vrndq_m_f16, vrndxq_m_f16, vcmpeqq_m_n_f16, vcmpgeq_m_f16, vcmpgeq_m_n_f16, vcmpgtq_m_f16, vcmpgtq_m_n_f16, vcmpleq_m_f16, vcmpleq_m_n_f16, vcmpltq_m_f16, vcmpltq_m_n_f16, vcmpneq_m_f16, vcmpneq_m_n_f16, vmvnq_m_n_u16, vorrq_m_n_u16, vqrshruntq_n_s16, vqshrunbq_n_s16, vqshruntq_n_s16, vcvtmq_m_u16_f16, vcvtnq_m_u16_f16, vcvtpq_m_u16_f16, vcvtq_m_u16_f16, vqmovunbq_m_s16, vqmovuntq_m_s16, vqrshrntq_n_u16, vqshrnbq_n_u16, vqshrntq_n_u16, vrshrnbq_n_u16, vrshrntq_n_u16, vshrnbq_n_u16, vshrntq_n_u16, vmlaldavaq_u16, vmlaldavaxq_u16, vmlaldavq_p_u16, vmlaldavxq_p_u16, vmovlbq_m_u8, vmovltq_m_u8, vmovnbq_m_u16, vmovntq_m_u16, vqmovnbq_m_u16, vqmovntq_m_u16, vrev32q_m_u8, vmvnq_m_n_s32, vorrq_m_n_s32, vqrshrntq_n_s32, vqshrnbq_n_s32, vqshrntq_n_s32, vrshrnbq_n_s32, vrshrntq_n_s32, vshrnbq_n_s32, vshrntq_n_s32, vcmlaq_f32, vcmlaq_rot180_f32, vcmlaq_rot270_f32, vcmlaq_rot90_f32, vfmaq_f32, vfmaq_n_f32, vfmasq_n_f32, vfmsq_f32, vmlaldavaq_s32, vmlaldavaxq_s32, vmlsldavaq_s32, vmlsldavaxq_s32, vabsq_m_f32, vcvtmq_m_s32_f32, vcvtnq_m_s32_f32, vcvtpq_m_s32_f32, vcvtq_m_s32_f32, vdupq_m_n_f32, vmaxnmaq_m_f32, vmaxnmavq_p_f32, vmaxnmvq_p_f32, vminnmaq_m_f32, vminnmavq_p_f32, vminnmvq_p_f32, vmlaldavq_p_s32, vmlaldavxq_p_s32, vmlsldavq_p_s32, vmlsldavxq_p_s32, vmovlbq_m_s16, vmovltq_m_s16, vmovnbq_m_s32, vmovntq_m_s32, vnegq_m_f32, vpselq_f32, vqmovnbq_m_s32, vqmovntq_m_s32, vrev32q_m_s16, vrev64q_m_f32, vrndaq_m_f32, vrndmq_m_f32, vrndnq_m_f32, vrndpq_m_f32, vrndq_m_f32, vrndxq_m_f32, vcmpeqq_m_n_f32, vcmpgeq_m_f32, vcmpgeq_m_n_f32, vcmpgtq_m_f32, vcmpgtq_m_n_f32, vcmpleq_m_f32, vcmpleq_m_n_f32, vcmpltq_m_f32, vcmpltq_m_n_f32, vcmpneq_m_f32, vcmpneq_m_n_f32, vmvnq_m_n_u32, vorrq_m_n_u32, vqrshruntq_n_s32, vqshrunbq_n_s32, vqshruntq_n_s32, vcvtmq_m_u32_f32, vcvtnq_m_u32_f32, vcvtpq_m_u32_f32, vcvtq_m_u32_f32, vqmovunbq_m_s32, vqmovuntq_m_s32, vqrshrntq_n_u32, vqshrnbq_n_u32, vqshrntq_n_u32, vrshrnbq_n_u32, vrshrntq_n_u32, vshrnbq_n_u32, vshrntq_n_u32, vmlaldavaq_u32, vmlaldavaxq_u32, vmlaldavq_p_u32, vmlaldavxq_p_u32, vmovlbq_m_u16, vmovltq_m_u16, vmovnbq_m_u32, vmovntq_m_u32, vqmovnbq_m_u32, vqmovntq_m_u32, vrev32q_m_u16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-29 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vrmlaldavhaxq_s32): Define macro. (vrmlsldavhaq_s32): Likewise. (vrmlsldavhaxq_s32): Likewise. (vaddlvaq_p_s32): Likewise. (vcvtbq_m_f16_f32): Likewise. (vcvtbq_m_f32_f16): Likewise. (vcvttq_m_f16_f32): Likewise. (vcvttq_m_f32_f16): Likewise. (vrev16q_m_s8): Likewise. (vrev32q_m_f16): Likewise. (vrmlaldavhq_p_s32): Likewise. (vrmlaldavhxq_p_s32): Likewise. (vrmlsldavhq_p_s32): Likewise. (vrmlsldavhxq_p_s32): Likewise. (vaddlvaq_p_u32): Likewise. (vrev16q_m_u8): Likewise. (vrmlaldavhq_p_u32): Likewise. (vmvnq_m_n_s16): Likewise. (vorrq_m_n_s16): Likewise. (vqrshrntq_n_s16): Likewise. (vqshrnbq_n_s16): Likewise. (vqshrntq_n_s16): Likewise. (vrshrnbq_n_s16): Likewise. (vrshrntq_n_s16): Likewise. (vshrnbq_n_s16): Likewise. (vshrntq_n_s16): Likewise. (vcmlaq_f16): Likewise. (vcmlaq_rot180_f16): Likewise. (vcmlaq_rot270_f16): Likewise
Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hi Kyrill, I have re-based this patch, please commit the following patch on my behalf. https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541826.html Regards, SRI. From: Gcc-patches on behalf of Srinath Parvathaneni Sent: 16 March 2020 10:54 To: Kyrill Tkachov ; gcc-patches@gcc.gnu.org Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch. Hi Kyrill, > Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the other patches in this series) I have bootstrapped this patch on arm-none-linux-gnueabihf and found no issues. There is problem with git commit rights, could you commit this patch on my behalf. Regards SRI. From: Kyrill Tkachov Sent: 12 March 2020 11:16 To: Srinath Parvathaneni ; gcc-patches@gcc.gnu.org Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch. Hi Srinath, On 3/10/20 6:19 PM, Srinath Parvathaneni wrote: > Hello Kyrill, > > This patch addresses all the comments in patch version v2. > (version v2) > https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html > > > > > Hello, > > This patch is part of MVE ACLE intrinsics framework. > This patches add support to update (read/write) the APSR (Application > Program Status Register) > register and FPSCR (Floating-point Status and Control Register) > register for MVE. > This patch also enables thumb2 mov RTL patterns for MVE. > > A new feature bit vfp_base is added. This bit is enabled for all VFP, > MVE and MVE with floating point > extensions. This bit is used to enable the macro TARGET_VFP_BASE. For > all the VFP instructions, RTL patterns, > status and control registers are guarded by TARGET_HAVE_FLOAT. But > this patch modifies that and the > common instructions, RTL patterns, status and control registers > bewteen MVE and VFP are guarded by > TARGET_VFP_BASE macro. > > The RTL pattern set_fpscr and get_fpscr are updated to use > VFPCC_REGNUM because few MVE intrinsics > set/get carry bit of FPSCR register. > > Please refer to Arm reference manual [1] for more details. > [1] https://developer.arm.com/docs/ddi0553/latest > > Regression tested on target arm-none-eabi and armeb-none-eabi and > found no regressions. > > Ok for trunk? Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the other patches in this series) Thanks, Kyrill > > Thanks, > Srinath > gcc/ChangeLog: > > 2020-03-06 Andre Vieira > Mihail Ionescu > Srinath Parvathaneni > > * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When > vfp_base > feature bit is on and -mfpu=auto is passed as compiler option, > do not > generate error on not finding any match fpu. Because in this > case fpu > is not required. > * config/arm/arm-cpus.in (vfp_base): Define feature bit, this > bit is > enabled for MVE and also for all VFP extensions. > (VFPv2): Modify fgroup to enable vfp_base feature bit when > ever VFPv2 > is enabled. > (MVE): Define fgroup to enable feature bits mve, vfp_base and > armv7em. > (MVE_FP): Define fgroup to enable feature bits is fgroup MVE > and FPv5 > along with feature bits mve_float. > (mve): Modify add options in armv8.1-m.main arch for MVE. > (mve.fp): Modify add options in armv8.1-m.main arch for MVE with > floating point. > * config/arm/arm.c (use_return_insn): Replace the > check with TARGET_VFP_BASE. > (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with > TARGET_VFP_BASE. > (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || > TARGET_HAVE_MVE" > with TARGET_VFP_BASE, to allow cost calculations for copies in > MVE as > well. > (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with > TARGET_VFP_BASE, to allow space calculation for VFP registers > in MVE > as well. > (arm_compute_frame_layout): Likewise. > (arm_save_coproc_regs): Likewise. > (arm_fixed_condition_code_regs): Modify to enable using > VFPCC_REGNUM > in MVE as well. > (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || > TARGET_HAVE_MVE" > with equivalent macro TARGET_VFP_BASE. > (arm_expand_epilogue_apcs_frame): Likewise. > (arm_expand_epilogue): Likewise. > (arm_conditional_register_usage): Likewise. > (arm_declare_function_name): Add check to skip printing .fpu > directive > in assembly file when TARGET_VFP_BASE is enabled and > fpu_to_print is > "softvfp&q
[PATCH v4][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch.
Hello Kyrill, > What is the point of repeating the scan-assembler directives here? The > first scan-assembler should be redundant given the scan-assembler-times ? > Otherwise ok. In this patch testcases are modified by removing the redundant scan-assembler directive. Please commit to trunk on my behalf. Regards, SRI. Hello, This patch is part of MVE ACLE intrinsics framework. The patch supports the use of emulation for the single-precision arithmetic operations for MVE. This changes are to support the MVE ACLE intrinsics which operates on vector floating point arithmetic operations. Please refer to Arm reference manual [1] for more details. [1] https://developer.arm.com/docs/ddi0553/latest Regression tested on target arm-none-eabi and armeb-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-06 Andre Vieira Srinath Parvathaneni * config/arm/arm.c (arm_libcall_uses_aapcs_base): Modify function to add emulator calls for dobule precision arithmetic operations for MVE. 2020-03-06 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/mve_libcall1.c: New test. * gcc.target/arm/mve/intrinsics/mve_libcall2.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index b40904a40e0979af4285fdbd85bfae55abea25dd..b3dfa285f016268714002b0217964c81e0a4840e 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -5754,9 +5754,25 @@ arm_libcall_uses_aapcs_base (const_rtx libcall) /* Values from double-precision helper functions are returned in core registers if the selected core only supports single-precision arithmetic, even if we are using the hard-float ABI. The same is -true for single-precision helpers, but we will never be using the -hard-float ABI on a CPU which doesn't support single-precision -operations in hardware. */ +true for single-precision helpers except in case of MVE, because in +MVE we will be using the hard-float ABI on a CPU which doesn't support +single-precision operations in hardware. In MVE the following check +enables use of emulation for the single-precision arithmetic +operations. */ + if (TARGET_HAVE_MVE) + { + add_libcall (libcall_htab, optab_libfunc (add_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sdiv_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (smul_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (neg_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sub_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (eq_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (lt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (le_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (ge_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (gt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (unord_optab, SFmode)); + } add_libcall (libcall_htab, optab_libfunc (add_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (sdiv_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (smul_optab, DFmode)); diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c new file mode 100644 index ..7c38d3102d26d8d7f6358258018a993df8298b4d --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_libcall1.c @@ -0,0 +1,67 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb -mfpu=auto" } */ + +float +foo (float a, float b, float c) +{ + return a + b + c; +} + +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fadd" 2 } } */ + +float +foo1 (float a, float b, float c) +{ + return a - b - c; +} + +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fsub" 2 } } */ + +float +foo2 (float a, float b, float c) +{ + return a * b * c; +} + +/* { dg-final { scan-assembler-times "bl\\t__aeabi_fmul" 2 } } */ + +float +foo3 (float b, float c) +{ + return b / c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fdiv" } } */ + +int +foo4 (float b, float c) +{ + return b < c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fcmplt" } } */ + +int +foo5 (float b, float c) +{ + return b > c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fcmpgt" } } */ + +int +foo6 (float b, float c) +{ + return b != c; +} + +/* { dg-final { scan-assembler "bl\\t__aeabi_fcmpeq" } } */ + +int +foo7 (float b, float c) +{ + return b == c; +} + +/* { dg-
[PATCH v3][ARM][GCC][1/3x]: MVE intrinsics with ternary operands.
Hello Kyrill, Following patch is the rebased version of v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542068.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vabavq_s8, vabavq_s16, vabavq_s32, vbicq_m_n_s16, vbicq_m_n_s32, vbicq_m_n_u16, vbicq_m_n_u32, vcmpeqq_m_f16, vcmpeqq_m_f32, vcvtaq_m_s16_f16, vcvtaq_m_u16_f16, vcvtaq_m_s32_f32, vcvtaq_m_u32_f32, vcvtq_m_f16_s16, vcvtq_m_f16_u16, vcvtq_m_f32_s32, vcvtq_m_f32_u32, vqrshrnbq_n_s16, vqrshrnbq_n_u16, vqrshrnbq_n_s32, vqrshrnbq_n_u32, vqrshrunbq_n_s16, vqrshrunbq_n_s32, vrmlaldavhaq_s32, vrmlaldavhaq_u32, vshlcq_s8, vshlcq_u8, vshlcq_s16, vshlcq_u16, vshlcq_s32, vshlcq_u32, vabavq_s8, vabavq_s16, vabavq_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Define qualifier for ternary operands. (TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_NONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vabavq_s8): Define macro. (vabavq_s16): Likewise. (vabavq_s32): Likewise. (vbicq_m_n_s16): Likewise. (vbicq_m_n_s32): Likewise. (vbicq_m_n_u16): Likewise. (vbicq_m_n_u32): Likewise. (vcmpeqq_m_f16): Likewise. (vcmpeqq_m_f32): Likewise. (vcvtaq_m_s16_f16): Likewise. (vcvtaq_m_u16_f16): Likewise. (vcvtaq_m_s32_f32): Likewise. (vcvtaq_m_u32_f32): Likewise. (vcvtq_m_f16_s16): Likewise. (vcvtq_m_f16_u16): Likewise. (vcvtq_m_f32_s32): Likewise. (vcvtq_m_f32_u32): Likewise. (vqrshrnbq_n_s16): Likewise. (vqrshrnbq_n_u16): Likewise. (vqrshrnbq_n_s32): Likewise. (vqrshrnbq_n_u32): Likewise. (vqrshrunbq_n_s16): Likewise. (vqrshrunbq_n_s32): Likewise. (vrmlaldavhaq_s32): Likewise. (vrmlaldavhaq_u32): Likewise. (vshlcq_s8): Likewise. (vshlcq_u8): Likewise. (vshlcq_s16): Likewise. (vshlcq_u16): Likewise. (vshlcq_s32): Likewise. (vshlcq_u32): Likewise. (vabavq_u8): Likewise. (vabavq_u16): Likewise. (vabavq_u32): Likewise. (__arm_vabavq_s8): Define intrinsic. (__arm_vabavq_s16): Likewise. (__arm_vabavq_s32): Likewise. (__arm_vabavq_u8): Likewise. (__arm_vabavq_u16): Likewise. (__arm_vabavq_u32): Likewise. (__arm_vbicq_m_n_s16): Likewise. (__arm_vbicq_m_n_s32): Likewise. (__arm_vbicq_m_n_u16): Likewise. (__arm_vbicq_m_n_u32): Likewise. (__arm_vqrshrnbq_n_s16): Likewise. (__arm_vqrshrnbq_n_u16): Likewise. (__arm_vqrshrnbq_n_s32): Likewise. (__arm_vqrshrnbq_n_u32): Likewise. (__arm_vqrshrunbq_n_s16): Likewise. (__arm_vqrshrunbq_n_s32): Likewise. (__arm_vrmlaldavhaq_s32): Likewise. (__arm_vrmlaldavhaq_u32): Likewise. (__arm_vshlcq_s8): Likewise. (__arm_vshlcq_u8): Likewise. (__arm_vshlcq_s16): Likewise. (__arm_vshlcq_u16): Likewise. (__arm_vshlcq_s32): Likewise. (__arm_vshlcq_u32): Likewise. (__arm_vcmpeqq_m_f16): Likewise. (__arm_vcmpeqq_m_f32): Likewise. (__arm_vcvtaq_m_s16_f16): Likewise. (__arm_vcvtaq_m_u16_f16): Likewise. (__arm_vcvtaq_m_s32_f32): Likewise. (__arm_vcvtaq_m_u32_f32): Likewise. (__arm_vcvtq_m_f16_s16): Likewise. (__arm_vcvtq_m_f16_u16): Likewise. (__arm_vcvtq_m_f32_s32): Likewise. (__arm_vcvtq_m_f32_u32): Likewise. (vcvtaq_m): Define polymorphic variant. (vcvtq_m): Likewise. (vabavq): Likewise. (vshlcq): Likewise. (vbicq_m_n): Likewise. (vqrshrnbq_n): Likewise. (vqrshrunbq_n): Likewise. * config/arm/arm_mve_builtins.def (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Use the builtin qualifer
[PATCH v3][ARM][GCC][2/3x]: MVE intrinsics with ternary operands.
Hello Kyrill, Following patch is the rebased version of v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542067.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vpselq_u8, vpselq_s8, vrev64q_m_u8, vqrdmlashq_n_u8, vqrdmlahq_n_u8, vqdmlahq_n_u8, vmvnq_m_u8, vmlasq_n_u8, vmlaq_n_u8, vmladavq_p_u8, vmladavaq_u8, vminvq_p_u8, vmaxvq_p_u8, vdupq_m_n_u8, vcmpneq_m_u8, vcmpneq_m_n_u8, vcmphiq_m_u8, vcmphiq_m_n_u8, vcmpeqq_m_u8, vcmpeqq_m_n_u8, vcmpcsq_m_u8, vcmpcsq_m_n_u8, vclzq_m_u8, vaddvaq_p_u8, vsriq_n_u8, vsliq_n_u8, vshlq_m_r_u8, vrshlq_m_n_u8, vqshlq_m_r_u8, vqrshlq_m_n_u8, vminavq_p_s8, vminaq_m_s8, vmaxavq_p_s8, vmaxaq_m_s8, vcmpneq_m_s8, vcmpneq_m_n_s8, vcmpltq_m_s8, vcmpltq_m_n_s8, vcmpleq_m_s8, vcmpleq_m_n_s8, vcmpgtq_m_s8, vcmpgtq_m_n_s8, vcmpgeq_m_s8, vcmpgeq_m_n_s8, vcmpeqq_m_s8, vcmpeqq_m_n_s8, vshlq_m_r_s8, vrshlq_m_n_s8, vrev64q_m_s8, vqshlq_m_r_s8, vqrshlq_m_n_s8, vqnegq_m_s8, vqabsq_m_s8, vnegq_m_s8, vmvnq_m_s8, vmlsdavxq_p_s8, vmlsdavq_p_s8, vmladavxq_p_s8, vmladavq_p_s8, vminvq_p_s8, vmaxvq_p_s8, vdupq_m_n_s8, vclzq_m_s8, vclsq_m_s8, vaddvaq_p_s8, vabsq_m_s8, vqrdmlsdhxq_s8, vqrdmlsdhq_s8, vqrdmlashq_n_s8, vqrdmlahq_n_s8, vqrdmladhxq_s8, vqrdmladhq_s8, vqdmlsdhxq_s8, vqdmlsdhq_s8, vqdmlahq_n_s8, vqdmladhxq_s8, vqdmladhq_s8, vmlsdavaxq_s8, vmlsdavaq_s8, vmlasq_n_s8, vmlaq_n_s8, vmladavaxq_s8, vmladavaq_s8, vsriq_n_s8, vsliq_n_s8, vpselq_u16, vpselq_s16, vrev64q_m_u16, vqrdmlashq_n_u16, vqrdmlahq_n_u16, vqdmlahq_n_u16, vmvnq_m_u16, vmlasq_n_u16, vmlaq_n_u16, vmladavq_p_u16, vmladavaq_u16, vminvq_p_u16, vmaxvq_p_u16, vdupq_m_n_u16, vcmpneq_m_u16, vcmpneq_m_n_u16, vcmphiq_m_u16, vcmphiq_m_n_u16, vcmpeqq_m_u16, vcmpeqq_m_n_u16, vcmpcsq_m_u16, vcmpcsq_m_n_u16, vclzq_m_u16, vaddvaq_p_u16, vsriq_n_u16, vsliq_n_u16, vshlq_m_r_u16, vrshlq_m_n_u16, vqshlq_m_r_u16, vqrshlq_m_n_u16, vminavq_p_s16, vminaq_m_s16, vmaxavq_p_s16, vmaxaq_m_s16, vcmpneq_m_s16, vcmpneq_m_n_s16, vcmpltq_m_s16, vcmpltq_m_n_s16, vcmpleq_m_s16, vcmpleq_m_n_s16, vcmpgtq_m_s16, vcmpgtq_m_n_s16, vcmpgeq_m_s16, vcmpgeq_m_n_s16, vcmpeqq_m_s16, vcmpeqq_m_n_s16, vshlq_m_r_s16, vrshlq_m_n_s16, vrev64q_m_s16, vqshlq_m_r_s16, vqrshlq_m_n_s16, vqnegq_m_s16, vqabsq_m_s16, vnegq_m_s16, vmvnq_m_s16, vmlsdavxq_p_s16, vmlsdavq_p_s16, vmladavxq_p_s16, vmladavq_p_s16, vminvq_p_s16, vmaxvq_p_s16, vdupq_m_n_s16, vclzq_m_s16, vclsq_m_s16, vaddvaq_p_s16, vabsq_m_s16, vqrdmlsdhxq_s16, vqrdmlsdhq_s16, vqrdmlashq_n_s16, vqrdmlahq_n_s16, vqrdmladhxq_s16, vqrdmladhq_s16, vqdmlsdhxq_s16, vqdmlsdhq_s16, vqdmlahq_n_s16, vqdmladhxq_s16, vqdmladhq_s16, vmlsdavaxq_s16, vmlsdavaq_s16, vmlasq_n_s16, vmlaq_n_s16, vmladavaxq_s16, vmladavaq_s16, vsriq_n_s16, vsliq_n_s16, vpselq_u32, vpselq_s32, vrev64q_m_u32, vqrdmlashq_n_u32, vqrdmlahq_n_u32, vqdmlahq_n_u32, vmvnq_m_u32, vmlasq_n_u32, vmlaq_n_u32, vmladavq_p_u32, vmladavaq_u32, vminvq_p_u32, vmaxvq_p_u32, vdupq_m_n_u32, vcmpneq_m_u32, vcmpneq_m_n_u32, vcmphiq_m_u32, vcmphiq_m_n_u32, vcmpeqq_m_u32, vcmpeqq_m_n_u32, vcmpcsq_m_u32, vcmpcsq_m_n_u32, vclzq_m_u32, vaddvaq_p_u32, vsriq_n_u32, vsliq_n_u32, vshlq_m_r_u32, vrshlq_m_n_u32, vqshlq_m_r_u32, vqrshlq_m_n_u32, vminavq_p_s32, vminaq_m_s32, vmaxavq_p_s32, vmaxaq_m_s32, vcmpneq_m_s32, vcmpneq_m_n_s32, vcmpltq_m_s32, vcmpltq_m_n_s32, vcmpleq_m_s32, vcmpleq_m_n_s32, vcmpgtq_m_s32, vcmpgtq_m_n_s32, vcmpgeq_m_s32, vcmpgeq_m_n_s32, vcmpeqq_m_s32, vcmpeqq_m_n_s32, vshlq_m_r_s32, vrshlq_m_n_s32, vrev64q_m_s32, vqshlq_m_r_s32, vqrshlq_m_n_s32, vqnegq_m_s32, vqabsq_m_s32, vnegq_m_s32, vmvnq_m_s32, vmlsdavxq_p_s32, vmlsdavq_p_s32, vmladavxq_p_s32, vmladavq_p_s32, vminvq_p_s32, vmaxvq_p_s32, vdupq_m_n_s32, vclzq_m_s32, vclsq_m_s32, vaddvaq_p_s32, vabsq_m_s32, vqrdmlsdhxq_s32, vqrdmlsdhq_s32, vqrdmlashq_n_s32, vqrdmlahq_n_s32, vqrdmladhxq_s32, vqrdmladhq_s32, vqdmlsdhxq_s32, vqdmlsdhq_s32, vqdmlahq_n_s32, vqdmladhxq_s32, vqdmladhq_s32, vmlsdavaxq_s32, vmlsdavaq_s32, vmlasq_n_s32, vmlaq_n_s32, vmladavaxq_s32, vmladavaq_s32, vsriq_n_s32, vsliq_n_s32, vpselq_u64, vpselq_s64. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Rc" and "Re" are added, which checks the constant is with in the range of 0 to 15 and 0 to 31 respectively. Also a new predicates "mve_imm_15" and "mve_imm_31" are added, to check the the matching constraint Rc and Re respectively. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-25 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vpselq_u8): Define macro. (vpselq_s8): Likewise. (vrev64q_m_u8): Likewise. (vqrdmlashq_n_u8): Likewise. (vqrdmlahq_n_u8): Likewise. (vqdmlahq_n_u8): Likewise. (vmvnq_m_u8): L
[PATCH v3][ARM][GCC][3/3x]: MVE intrinsics with ternary operands.
Hello Kyrill, Following patch is the rebased version of v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542068.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vrmlaldavhaxq_s32, vrmlsldavhaq_s32, vrmlsldavhaxq_s32, vaddlvaq_p_s32, vcvtbq_m_f16_f32, vcvtbq_m_f32_f16, vcvttq_m_f16_f32, vcvttq_m_f32_f16, vrev16q_m_s8, vrev32q_m_f16, vrmlaldavhq_p_s32, vrmlaldavhxq_p_s32, vrmlsldavhq_p_s32, vrmlsldavhxq_p_s32, vaddlvaq_p_u32, vrev16q_m_u8, vrmlaldavhq_p_u32, vmvnq_m_n_s16, vorrq_m_n_s16, vqrshrntq_n_s16, vqshrnbq_n_s16, vqshrntq_n_s16, vrshrnbq_n_s16, vrshrntq_n_s16, vshrnbq_n_s16, vshrntq_n_s16, vcmlaq_f16, vcmlaq_rot180_f16, vcmlaq_rot270_f16, vcmlaq_rot90_f16, vfmaq_f16, vfmaq_n_f16, vfmasq_n_f16, vfmsq_f16, vmlaldavaq_s16, vmlaldavaxq_s16, vmlsldavaq_s16, vmlsldavaxq_s16, vabsq_m_f16, vcvtmq_m_s16_f16, vcvtnq_m_s16_f16, vcvtpq_m_s16_f16, vcvtq_m_s16_f16, vdupq_m_n_f16, vmaxnmaq_m_f16, vmaxnmavq_p_f16, vmaxnmvq_p_f16, vminnmaq_m_f16, vminnmavq_p_f16, vminnmvq_p_f16, vmlaldavq_p_s16, vmlaldavxq_p_s16, vmlsldavq_p_s16, vmlsldavxq_p_s16, vmovlbq_m_s8, vmovltq_m_s8, vmovnbq_m_s16, vmovntq_m_s16, vnegq_m_f16, vpselq_f16, vqmovnbq_m_s16, vqmovntq_m_s16, vrev32q_m_s8, vrev64q_m_f16, vrndaq_m_f16, vrndmq_m_f16, vrndnq_m_f16, vrndpq_m_f16, vrndq_m_f16, vrndxq_m_f16, vcmpeqq_m_n_f16, vcmpgeq_m_f16, vcmpgeq_m_n_f16, vcmpgtq_m_f16, vcmpgtq_m_n_f16, vcmpleq_m_f16, vcmpleq_m_n_f16, vcmpltq_m_f16, vcmpltq_m_n_f16, vcmpneq_m_f16, vcmpneq_m_n_f16, vmvnq_m_n_u16, vorrq_m_n_u16, vqrshruntq_n_s16, vqshrunbq_n_s16, vqshruntq_n_s16, vcvtmq_m_u16_f16, vcvtnq_m_u16_f16, vcvtpq_m_u16_f16, vcvtq_m_u16_f16, vqmovunbq_m_s16, vqmovuntq_m_s16, vqrshrntq_n_u16, vqshrnbq_n_u16, vqshrntq_n_u16, vrshrnbq_n_u16, vrshrntq_n_u16, vshrnbq_n_u16, vshrntq_n_u16, vmlaldavaq_u16, vmlaldavaxq_u16, vmlaldavq_p_u16, vmlaldavxq_p_u16, vmovlbq_m_u8, vmovltq_m_u8, vmovnbq_m_u16, vmovntq_m_u16, vqmovnbq_m_u16, vqmovntq_m_u16, vrev32q_m_u8, vmvnq_m_n_s32, vorrq_m_n_s32, vqrshrntq_n_s32, vqshrnbq_n_s32, vqshrntq_n_s32, vrshrnbq_n_s32, vrshrntq_n_s32, vshrnbq_n_s32, vshrntq_n_s32, vcmlaq_f32, vcmlaq_rot180_f32, vcmlaq_rot270_f32, vcmlaq_rot90_f32, vfmaq_f32, vfmaq_n_f32, vfmasq_n_f32, vfmsq_f32, vmlaldavaq_s32, vmlaldavaxq_s32, vmlsldavaq_s32, vmlsldavaxq_s32, vabsq_m_f32, vcvtmq_m_s32_f32, vcvtnq_m_s32_f32, vcvtpq_m_s32_f32, vcvtq_m_s32_f32, vdupq_m_n_f32, vmaxnmaq_m_f32, vmaxnmavq_p_f32, vmaxnmvq_p_f32, vminnmaq_m_f32, vminnmavq_p_f32, vminnmvq_p_f32, vmlaldavq_p_s32, vmlaldavxq_p_s32, vmlsldavq_p_s32, vmlsldavxq_p_s32, vmovlbq_m_s16, vmovltq_m_s16, vmovnbq_m_s32, vmovntq_m_s32, vnegq_m_f32, vpselq_f32, vqmovnbq_m_s32, vqmovntq_m_s32, vrev32q_m_s16, vrev64q_m_f32, vrndaq_m_f32, vrndmq_m_f32, vrndnq_m_f32, vrndpq_m_f32, vrndq_m_f32, vrndxq_m_f32, vcmpeqq_m_n_f32, vcmpgeq_m_f32, vcmpgeq_m_n_f32, vcmpgtq_m_f32, vcmpgtq_m_n_f32, vcmpleq_m_f32, vcmpleq_m_n_f32, vcmpltq_m_f32, vcmpltq_m_n_f32, vcmpneq_m_f32, vcmpneq_m_n_f32, vmvnq_m_n_u32, vorrq_m_n_u32, vqrshruntq_n_s32, vqshrunbq_n_s32, vqshruntq_n_s32, vcvtmq_m_u32_f32, vcvtnq_m_u32_f32, vcvtpq_m_u32_f32, vcvtq_m_u32_f32, vqmovunbq_m_s32, vqmovuntq_m_s32, vqrshrntq_n_u32, vqshrnbq_n_u32, vqshrntq_n_u32, vrshrnbq_n_u32, vrshrntq_n_u32, vshrnbq_n_u32, vshrntq_n_u32, vmlaldavaq_u32, vmlaldavaxq_u32, vmlaldavq_p_u32, vmlaldavxq_p_u32, vmovlbq_m_u16, vmovltq_m_u16, vmovnbq_m_u32, vmovntq_m_u32, vqmovnbq_m_u32, vqmovntq_m_u32, vrev32q_m_u16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-29 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vrmlaldavhaxq_s32): Define macro. (vrmlsldavhaq_s32): Likewise. (vrmlsldavhaxq_s32): Likewise. (vaddlvaq_p_s32): Likewise. (vcvtbq_m_f16_f32): Likewise. (vcvtbq_m_f32_f16): Likewise. (vcvttq_m_f16_f32): Likewise. (vcvttq_m_f32_f16): Likewise. (vrev16q_m_s8): Likewise. (vrev32q_m_f16): Likewise. (vrmlaldavhq_p_s32): Likewise. (vrmlaldavhxq_p_s32): Likewise. (vrmlsldavhq_p_s32): Likewise. (vrmlsldavhxq_p_s32): Likewise. (vaddlvaq_p_u32): Likewise. (vrev16q_m_u8): Likewise. (vrmlaldavhq_p_u32): Likewise. (vmvnq_m_n_s16): Likewise. (vorrq_m_n_s16): Likewise. (vqrshrntq_n_s16): Likewise. (vqshrnbq_n_s16): Likewise. (vqshrntq_n_s16): Likewise. (vrshrnbq_n_s16): Likewise. (vrshrntq_n_s16): Likewise. (vshrnbq_n_s16): Likewise. (vshrntq_n_s16): Likewise. (vcmlaq_f16): Likewise. (vcmlaq_rot180_f16): Likewise. (vcmlaq_rot270_f16): Likewise
[PATCH v2][ARM][GCC][1/4x]: MVE intrinsics with quaternary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534332.html Hello, This patch supports following MVE ACLE intrinsics with quaternary operands. vsriq_m_n_s8, vsubq_m_s8, vsubq_x_s8, vcvtq_m_n_f16_u16, vcvtq_x_n_f16_u16, vqshluq_m_n_s8, vabavq_p_s8, vsriq_m_n_u8, vshlq_m_u8, vshlq_x_u8, vsubq_m_u8, vsubq_x_u8, vabavq_p_u8, vshlq_m_s8, vshlq_x_s8, vcvtq_m_n_f16_s16, vcvtq_x_n_f16_s16, vsriq_m_n_s16, vsubq_m_s16, vsubq_x_s16, vcvtq_m_n_f32_u32, vcvtq_x_n_f32_u32, vqshluq_m_n_s16, vabavq_p_s16, vsriq_m_n_u16, vshlq_m_u16, vshlq_x_u16, vsubq_m_u16, vsubq_x_u16, vabavq_p_u16, vshlq_m_s16, vshlq_x_s16, vcvtq_m_n_f32_s32, vcvtq_x_n_f32_s32, vsriq_m_n_s32, vsubq_m_s32, vsubq_x_s32, vqshluq_m_n_s32, vabavq_p_s32, vsriq_m_n_u32, vshlq_m_u32, vshlq_x_u32, vsubq_m_u32, vsubq_x_u32, vabavq_p_u32, vshlq_m_s32, vshlq_x_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-29 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Define builtin qualifier. (QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vsriq_m_n_s8): Define macro. (vsubq_m_s8): Likewise. (vcvtq_m_n_f16_u16): Likewise. (vqshluq_m_n_s8): Likewise. (vabavq_p_s8): Likewise. (vsriq_m_n_u8): Likewise. (vshlq_m_u8): Likewise. (vsubq_m_u8): Likewise. (vabavq_p_u8): Likewise. (vshlq_m_s8): Likewise. (vcvtq_m_n_f16_s16): Likewise. (vsriq_m_n_s16): Likewise. (vsubq_m_s16): Likewise. (vcvtq_m_n_f32_u32): Likewise. (vqshluq_m_n_s16): Likewise. (vabavq_p_s16): Likewise. (vsriq_m_n_u16): Likewise. (vshlq_m_u16): Likewise. (vsubq_m_u16): Likewise. (vabavq_p_u16): Likewise. (vshlq_m_s16): Likewise. (vcvtq_m_n_f32_s32): Likewise. (vsriq_m_n_s32): Likewise. (vsubq_m_s32): Likewise. (vqshluq_m_n_s32): Likewise. (vabavq_p_s32): Likewise. (vsriq_m_n_u32): Likewise. (vshlq_m_u32): Likewise. (vsubq_m_u32): Likewise. (vabavq_p_u32): Likewise. (vshlq_m_s32): Likewise. (__arm_vsriq_m_n_s8): Define intrinsic. (__arm_vsubq_m_s8): Likewise. (__arm_vqshluq_m_n_s8): Likewise. (__arm_vabavq_p_s8): Likewise. (__arm_vsriq_m_n_u8): Likewise. (__arm_vshlq_m_u8): Likewise. (__arm_vsubq_m_u8): Likewise. (__arm_vabavq_p_u8): Likewise. (__arm_vshlq_m_s8): Likewise. (__arm_vsriq_m_n_s16): Likewise. (__arm_vsubq_m_s16): Likewise. (__arm_vqshluq_m_n_s16): Likewise. (__arm_vabavq_p_s16): Likewise. (__arm_vsriq_m_n_u16): Likewise. (__arm_vshlq_m_u16): Likewise. (__arm_vsubq_m_u16): Likewise. (__arm_vabavq_p_u16): Likewise. (__arm_vshlq_m_s16): Likewise. (__arm_vsriq_m_n_s32): Likewise. (__arm_vsubq_m_s32): Likewise. (__arm_vqshluq_m_n_s32): Likewise. (__arm_vabavq_p_s32): Likewise. (__arm_vsriq_m_n_u32): Likewise. (__arm_vshlq_m_u32): Likewise. (__arm_vsubq_m_u32): Likewise. (__arm_vabavq_p_u32): Likewise. (__arm_vshlq_m_s32): Likewise. (__arm_vcvtq_m_n_f16_u16): Likewise. (__arm_vcvtq_m_n_f16_s16): Likewise. (__arm_vcvtq_m_n_f32_u32): Likewise. (__arm_vcvtq_m_n_f32_s32): Likewise. (vcvtq_m_n): Define polymorphic variant. (vqshluq_m_n): Likewise. (vshlq_m): Likewise. (vsriq_m_n): Likewise. (vsubq_m): Likewise. (vabavq_p): Likewise. * config/arm/arm_mve_builtins.def (QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Use builtin qualifier. (QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise
[PATCH v2][ARM][GCC][3/4x]: MVE intrinsics with quaternary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534324.html Hello, This patch supports following MVE ACLE intrinsics with quaternary operands. vmlaldavaq_p_s16, vmlaldavaq_p_s32, vmlaldavaq_p_u16, vmlaldavaq_p_u32, vmlaldavaxq_p_s16, vmlaldavaxq_p_s32, vmlaldavaxq_p_u16, vmlaldavaxq_p_u32, vmlsldavaq_p_s16, vmlsldavaq_p_s32, vmlsldavaxq_p_s16, vmlsldavaxq_p_s32, vmullbq_poly_m_p16, vmullbq_poly_m_p8, vmulltq_poly_m_p16, vmulltq_poly_m_p8, vqdmullbq_m_n_s16, vqdmullbq_m_n_s32, vqdmullbq_m_s16, vqdmullbq_m_s32, vqdmulltq_m_n_s16, vqdmulltq_m_n_s32, vqdmulltq_m_s16, vqdmulltq_m_s32, vqrshrnbq_m_n_s16, vqrshrnbq_m_n_s32, vqrshrnbq_m_n_u16, vqrshrnbq_m_n_u32, vqrshrntq_m_n_s16, vqrshrntq_m_n_s32, vqrshrntq_m_n_u16, vqrshrntq_m_n_u32, vqrshrunbq_m_n_s16, vqrshrunbq_m_n_s32, vqrshruntq_m_n_s16, vqrshruntq_m_n_s32, vqshrnbq_m_n_s16, vqshrnbq_m_n_s32, vqshrnbq_m_n_u16, vqshrnbq_m_n_u32, vqshrntq_m_n_s16, vqshrntq_m_n_s32, vqshrntq_m_n_u16, vqshrntq_m_n_u32, vqshrunbq_m_n_s16, vqshrunbq_m_n_s32, vqshruntq_m_n_s16, vqshruntq_m_n_s32, vrmlaldavhaq_p_s32, vrmlaldavhaq_p_u32, vrmlaldavhaxq_p_s32, vrmlsldavhaq_p_s32, vrmlsldavhaxq_p_s32, vrshrnbq_m_n_s16, vrshrnbq_m_n_s32, vrshrnbq_m_n_u16, vrshrnbq_m_n_u32, vrshrntq_m_n_s16, vrshrntq_m_n_s32, vrshrntq_m_n_u16, vrshrntq_m_n_u32, vshllbq_m_n_s16, vshllbq_m_n_s8, vshllbq_m_n_u16, vshllbq_m_n_u8, vshlltq_m_n_s16, vshlltq_m_n_s8, vshlltq_m_n_u16, vshlltq_m_n_u8, vshrnbq_m_n_s16, vshrnbq_m_n_s32, vshrnbq_m_n_u16, vshrnbq_m_n_u32, vshrntq_m_n_s16, vshrntq_m_n_s32, vshrntq_m_n_u16, vshrntq_m_n_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-31 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-protos.h (arm_mve_immediate_check): * config/arm/arm.c (arm_mve_immediate_check): Define fuction to check mode and interger value. * config/arm/arm_mve.h (vmlaldavaq_p_s32): Define macro. (vmlaldavaq_p_s16): Likewise. (vmlaldavaq_p_u32): Likewise. (vmlaldavaq_p_u16): Likewise. (vmlaldavaxq_p_s32): Likewise. (vmlaldavaxq_p_s16): Likewise. (vmlaldavaxq_p_u32): Likewise. (vmlaldavaxq_p_u16): Likewise. (vmlsldavaq_p_s32): Likewise. (vmlsldavaq_p_s16): Likewise. (vmlsldavaxq_p_s32): Likewise. (vmlsldavaxq_p_s16): Likewise. (vmullbq_poly_m_p8): Likewise. (vmullbq_poly_m_p16): Likewise. (vmulltq_poly_m_p8): Likewise. (vmulltq_poly_m_p16): Likewise. (vqdmullbq_m_n_s32): Likewise. (vqdmullbq_m_n_s16): Likewise. (vqdmullbq_m_s32): Likewise. (vqdmullbq_m_s16): Likewise. (vqdmulltq_m_n_s32): Likewise. (vqdmulltq_m_n_s16): Likewise. (vqdmulltq_m_s32): Likewise. (vqdmulltq_m_s16): Likewise. (vqrshrnbq_m_n_s32): Likewise. (vqrshrnbq_m_n_s16): Likewise. (vqrshrnbq_m_n_u32): Likewise. (vqrshrnbq_m_n_u16): Likewise. (vqrshrntq_m_n_s32): Likewise. (vqrshrntq_m_n_s16): Likewise. (vqrshrntq_m_n_u32): Likewise. (vqrshrntq_m_n_u16): Likewise. (vqrshrunbq_m_n_s32): Likewise. (vqrshrunbq_m_n_s16): Likewise. (vqrshruntq_m_n_s32): Likewise. (vqrshruntq_m_n_s16): Likewise. (vqshrnbq_m_n_s32): Likewise. (vqshrnbq_m_n_s16): Likewise. (vqshrnbq_m_n_u32): Likewise. (vqshrnbq_m_n_u16): Likewise. (vqshrntq_m_n_s32): Likewise. (vqshrntq_m_n_s16): Likewise. (vqshrntq_m_n_u32): Likewise. (vqshrntq_m_n_u16): Likewise. (vqshrunbq_m_n_s32): Likewise. (vqshrunbq_m_n_s16): Likewise. (vqshruntq_m_n_s32): Likewise. (vqshruntq_m_n_s16): Likewise. (vrmlaldavhaq_p_s32): Likewise. (vrmlaldavhaq_p_u32): Likewise. (vrmlaldavhaxq_p_s32): Likewise. (vrmlsldavhaq_p_s32): Likewise. (vrmlsldavhaxq_p_s32): Likewise. (vrshrnbq_m_n_s32): Likewise. (vrshrnbq_m_n_s16): Likewise. (vrshrnbq_m_n_u32): Likewise. (vrshrnbq_m_n_u16): Likewise. (vrshrntq_m_n_s32): Likewise. (vrshrntq_m_n_s16): Likewise. (vrshrntq_m_n_u32): Likewise. (vrshrntq_m_n_u16): Likewise. (vshllbq_m_n_s8): Likewise. (vshllbq_m_n_s16): Likewise. (vshllbq_m_n_u8): Likewise. (vshllbq_m_n_u16): Likewise. (vshlltq_m_n_s8): Likewise. (vshlltq_m_n_s16): Likewise. (vshlltq_m_n_u8): Likewise. (vshlltq_m_n_u16): Likewise. (vshrnbq_m_n_s32): Likewise. (vshrnbq_m_n_s16): Likewise. (vshrnbq_m_n_u32
[PATCH v2][ARM][GCC][2/4x]: MVE intrinsics with quaternary operands.
, vsubq_m_n_s32, vsubq_m_n_s16, vsubq_m_n_u8, vsubq_m_n_u32, vsubq_m_n_u16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-30 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vabdq_m_s8): Define macro. (vabdq_m_s32): Likewise. (vabdq_m_s16): Likewise. (vabdq_m_u8): Likewise. (vabdq_m_u32): Likewise. (vabdq_m_u16): Likewise. (vaddq_m_n_s8): Likewise. (vaddq_m_n_s32): Likewise. (vaddq_m_n_s16): Likewise. (vaddq_m_n_u8): Likewise. (vaddq_m_n_u32): Likewise. (vaddq_m_n_u16): Likewise. (vaddq_m_s8): Likewise. (vaddq_m_s32): Likewise. (vaddq_m_s16): Likewise. (vaddq_m_u8): Likewise. (vaddq_m_u32): Likewise. (vaddq_m_u16): Likewise. (vandq_m_s8): Likewise. (vandq_m_s32): Likewise. (vandq_m_s16): Likewise. (vandq_m_u8): Likewise. (vandq_m_u32): Likewise. (vandq_m_u16): Likewise. (vbicq_m_s8): Likewise. (vbicq_m_s32): Likewise. (vbicq_m_s16): Likewise. (vbicq_m_u8): Likewise. (vbicq_m_u32): Likewise. (vbicq_m_u16): Likewise. (vbrsrq_m_n_s8): Likewise. (vbrsrq_m_n_s32): Likewise. (vbrsrq_m_n_s16): Likewise. (vbrsrq_m_n_u8): Likewise. (vbrsrq_m_n_u32): Likewise. (vbrsrq_m_n_u16): Likewise. (vcaddq_rot270_m_s8): Likewise. (vcaddq_rot270_m_s32): Likewise. (vcaddq_rot270_m_s16): Likewise. (vcaddq_rot270_m_u8): Likewise. (vcaddq_rot270_m_u32): Likewise. (vcaddq_rot270_m_u16): Likewise. (vcaddq_rot90_m_s8): Likewise. (vcaddq_rot90_m_s32): Likewise. (vcaddq_rot90_m_s16): Likewise. (vcaddq_rot90_m_u8): Likewise. (vcaddq_rot90_m_u32): Likewise. (vcaddq_rot90_m_u16): Likewise. (veorq_m_s8): Likewise. (veorq_m_s32): Likewise. (veorq_m_s16): Likewise. (veorq_m_u8): Likewise. (veorq_m_u32): Likewise. (veorq_m_u16): Likewise. (vhaddq_m_n_s8): Likewise. (vhaddq_m_n_s32): Likewise. (vhaddq_m_n_s16): Likewise. (vhaddq_m_n_u8): Likewise. (vhaddq_m_n_u32): Likewise. (vhaddq_m_n_u16): Likewise. (vhaddq_m_s8): Likewise. (vhaddq_m_s32): Likewise. (vhaddq_m_s16): Likewise. (vhaddq_m_u8): Likewise. (vhaddq_m_u32): Likewise. (vhaddq_m_u16): Likewise. (vhcaddq_rot270_m_s8): Likewise. (vhcaddq_rot270_m_s32): Likewise. (vhcaddq_rot270_m_s16): Likewise. (vhcaddq_rot90_m_s8): Likewise. (vhcaddq_rot90_m_s32): Likewise. (vhcaddq_rot90_m_s16): Likewise. (vhsubq_m_n_s8): Likewise. (vhsubq_m_n_s32): Likewise. (vhsubq_m_n_s16): Likewise. (vhsubq_m_n_u8): Likewise. (vhsubq_m_n_u32): Likewise. (vhsubq_m_n_u16): Likewise. (vhsubq_m_s8): Likewise. (vhsubq_m_s32): Likewise. (vhsubq_m_s16): Likewise. (vhsubq_m_u8): Likewise. (vhsubq_m_u32): Likewise. (vhsubq_m_u16): Likewise. (vmaxq_m_s8): Likewise. (vmaxq_m_s32): Likewise. (vmaxq_m_s16): Likewise. (vmaxq_m_u8): Likewise. (vmaxq_m_u32): Likewise. (vmaxq_m_u16): Likewise. (vminq_m_s8): Likewise. (vminq_m_s32): Likewise. (vminq_m_s16): Likewise. (vminq_m_u8): Likewise. (vminq_m_u32): Likewise. (vminq_m_u16): Likewise. (vmladavaq_p_s8): Likewise. (vmladavaq_p_s32): Likewise. (vmladavaq_p_s16): Likewise. (vmladavaq_p_u8): Likewise. (vmladavaq_p_u32): Likewise. (vmladavaq_p_u16): Likewise. (vmladavaxq_p_s8): Likewise. (vmladavaxq_p_s32): Likewise. (vmladavaxq_p_s16): Likewise. (vmlaq_m_n_s8): Likewise. (vmlaq_m_n_s32): Likewise. (vmlaq_m_n_s16): Likewise. (vmlaq_m_n_u8): Likewise. (vmlaq_m_n_u32): Likewise. (vmlaq_m_n_u16): Likewise. (vmlasq_m_n_s8): Likewise. (vmlasq_m_n_s32): Likewise. (vmlasq_m_n_s16): Likewise. (vmlasq_m_n_u8): Likewise. (vmlasq_m_n_u32): Likewise. (vmlasq_m_n_u16): Likewise. (vmlsdavaq_p_s8): Likewise. (vmlsdavaq_p_s32): Likewise. (vmlsdavaq_p_s16): Likewise. (vmlsdavaxq_p_s8): Likewise. (vmlsdavaxq_p_s32): Likewise. (vmlsdavaxq_p_s16): Likewise. (vmulhq_m_s8): Likewise. (vmulhq_m_s32): Likewise. (vmulhq_m_s16): Likewise. (vmulhq_m_u8): Likewise. (vmulhq_m_u32): Likewise
[PATCH v2][ARM][GCC][4/4x]: MVE intrinsics with quaternary operands.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534345.html Hello, This patch supports following MVE ACLE intrinsics with quaternary operands. vabdq_m_f32, vabdq_m_f16, vaddq_m_f32, vaddq_m_f16, vaddq_m_n_f32, vaddq_m_n_f16, vandq_m_f32, vandq_m_f16, vbicq_m_f32, vbicq_m_f16, vbrsrq_m_n_f32, vbrsrq_m_n_f16, vcaddq_rot270_m_f32, vcaddq_rot270_m_f16, vcaddq_rot90_m_f32, vcaddq_rot90_m_f16, vcmlaq_m_f32, vcmlaq_m_f16, vcmlaq_rot180_m_f32, vcmlaq_rot180_m_f16, vcmlaq_rot270_m_f32, vcmlaq_rot270_m_f16, vcmlaq_rot90_m_f32, vcmlaq_rot90_m_f16, vcmulq_m_f32, vcmulq_m_f16, vcmulq_rot180_m_f32, vcmulq_rot180_m_f16, vcmulq_rot270_m_f32, vcmulq_rot270_m_f16, vcmulq_rot90_m_f32, vcmulq_rot90_m_f16, vcvtq_m_n_s32_f32, vcvtq_m_n_s16_f16, vcvtq_m_n_u32_f32, vcvtq_m_n_u16_f16, veorq_m_f32, veorq_m_f16, vfmaq_m_f32, vfmaq_m_f16, vfmaq_m_n_f32, vfmaq_m_n_f16, vfmasq_m_n_f32, vfmasq_m_n_f16, vfmsq_m_f32, vfmsq_m_f16, vmaxnmq_m_f32, vmaxnmq_m_f16, vminnmq_m_f32, vminnmq_m_f16, vmulq_m_f32, vmulq_m_f16, vmulq_m_n_f32, vmulq_m_n_f16, vornq_m_f32, vornq_m_f16, vorrq_m_f32, vorrq_m_f16, vsubq_m_f32, vsubq_m_f16, vsubq_m_n_f32, vsubq_m_n_f16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-31 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vabdq_m_f32): Define macro. (vabdq_m_f16): Likewise. (vaddq_m_f32): Likewise. (vaddq_m_f16): Likewise. (vaddq_m_n_f32): Likewise. (vaddq_m_n_f16): Likewise. (vandq_m_f32): Likewise. (vandq_m_f16): Likewise. (vbicq_m_f32): Likewise. (vbicq_m_f16): Likewise. (vbrsrq_m_n_f32): Likewise. (vbrsrq_m_n_f16): Likewise. (vcaddq_rot270_m_f32): Likewise. (vcaddq_rot270_m_f16): Likewise. (vcaddq_rot90_m_f32): Likewise. (vcaddq_rot90_m_f16): Likewise. (vcmlaq_m_f32): Likewise. (vcmlaq_m_f16): Likewise. (vcmlaq_rot180_m_f32): Likewise. (vcmlaq_rot180_m_f16): Likewise. (vcmlaq_rot270_m_f32): Likewise. (vcmlaq_rot270_m_f16): Likewise. (vcmlaq_rot90_m_f32): Likewise. (vcmlaq_rot90_m_f16): Likewise. (vcmulq_m_f32): Likewise. (vcmulq_m_f16): Likewise. (vcmulq_rot180_m_f32): Likewise. (vcmulq_rot180_m_f16): Likewise. (vcmulq_rot270_m_f32): Likewise. (vcmulq_rot270_m_f16): Likewise. (vcmulq_rot90_m_f32): Likewise. (vcmulq_rot90_m_f16): Likewise. (vcvtq_m_n_s32_f32): Likewise. (vcvtq_m_n_s16_f16): Likewise. (vcvtq_m_n_u32_f32): Likewise. (vcvtq_m_n_u16_f16): Likewise. (veorq_m_f32): Likewise. (veorq_m_f16): Likewise. (vfmaq_m_f32): Likewise. (vfmaq_m_f16): Likewise. (vfmaq_m_n_f32): Likewise. (vfmaq_m_n_f16): Likewise. (vfmasq_m_n_f32): Likewise. (vfmasq_m_n_f16): Likewise. (vfmsq_m_f32): Likewise. (vfmsq_m_f16): Likewise. (vmaxnmq_m_f32): Likewise. (vmaxnmq_m_f16): Likewise. (vminnmq_m_f32): Likewise. (vminnmq_m_f16): Likewise. (vmulq_m_f32): Likewise. (vmulq_m_f16): Likewise. (vmulq_m_n_f32): Likewise. (vmulq_m_n_f16): Likewise. (vornq_m_f32): Likewise. (vornq_m_f16): Likewise. (vorrq_m_f32): Likewise. (vorrq_m_f16): Likewise. (vsubq_m_f32): Likewise. (vsubq_m_f16): Likewise. (vsubq_m_n_f32): Likewise. (vsubq_m_n_f16): Likewise. (__attribute__): Likewise. (__arm_vabdq_m_f32): Likewise. (__arm_vabdq_m_f16): Likewise. (__arm_vaddq_m_f32): Likewise. (__arm_vaddq_m_f16): Likewise. (__arm_vaddq_m_n_f32): Likewise. (__arm_vaddq_m_n_f16): Likewise. (__arm_vandq_m_f32): Likewise. (__arm_vandq_m_f16): Likewise. (__arm_vbicq_m_f32): Likewise. (__arm_vbicq_m_f16): Likewise. (__arm_vbrsrq_m_n_f32): Likewise. (__arm_vbrsrq_m_n_f16): Likewise. (__arm_vcaddq_rot270_m_f32): Likewise. (__arm_vcaddq_rot270_m_f16): Likewise. (__arm_vcaddq_rot90_m_f32): Likewise. (__arm_vcaddq_rot90_m_f16): Likewise. (__arm_vcmlaq_m_f32): Likewise. (__arm_vcmlaq_m_f16): Likewise. (__arm_vcmlaq_rot180_m_f32): Likewise. (__arm_vcmlaq_rot180_m_f16): Likewise. (__arm_vcmlaq_rot270_m_f32): Likewise. (__arm_vcmlaq_rot270_m_f16): Likewise. (__arm_vcmlaq_rot90_m_f32): Likewise. (__arm_vcmlaq_rot90_m_f16): Likewise. (__arm_vcmulq_m_f32): Likewise
Re: [PATCH v3][ARM][GCC][1/3x]: MVE intrinsics with ternary operands.
Hi Kyrill, This patches was already committed. I have resend this by mistake. Sorry, please ignore this. Regards SRI. From: Gcc-patches on behalf of Srinath Parvathaneni Sent: 18 March 2020 11:18 To: gcc-patches@gcc.gnu.org Subject: [PATCH v3][ARM][GCC][1/3x]: MVE intrinsics with ternary operands. Hello Kyrill, Following patch is the rebased version of v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542068.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vabavq_s8, vabavq_s16, vabavq_s32, vbicq_m_n_s16, vbicq_m_n_s32, vbicq_m_n_u16, vbicq_m_n_u32, vcmpeqq_m_f16, vcmpeqq_m_f32, vcvtaq_m_s16_f16, vcvtaq_m_u16_f16, vcvtaq_m_s32_f32, vcvtaq_m_u32_f32, vcvtq_m_f16_s16, vcvtq_m_f16_u16, vcvtq_m_f32_s32, vcvtq_m_f32_u32, vqrshrnbq_n_s16, vqrshrnbq_n_u16, vqrshrnbq_n_s32, vqrshrnbq_n_u32, vqrshrunbq_n_s16, vqrshrunbq_n_s32, vrmlaldavhaq_s32, vrmlaldavhaq_u32, vshlcq_s8, vshlcq_u8, vshlcq_s16, vshlcq_u16, vshlcq_s32, vshlcq_u32, vabavq_s8, vabavq_s16, vabavq_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Define qualifier for ternary operands. (TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_NONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vabavq_s8): Define macro. (vabavq_s16): Likewise. (vabavq_s32): Likewise. (vbicq_m_n_s16): Likewise. (vbicq_m_n_s32): Likewise. (vbicq_m_n_u16): Likewise. (vbicq_m_n_u32): Likewise. (vcmpeqq_m_f16): Likewise. (vcmpeqq_m_f32): Likewise. (vcvtaq_m_s16_f16): Likewise. (vcvtaq_m_u16_f16): Likewise. (vcvtaq_m_s32_f32): Likewise. (vcvtaq_m_u32_f32): Likewise. (vcvtq_m_f16_s16): Likewise. (vcvtq_m_f16_u16): Likewise. (vcvtq_m_f32_s32): Likewise. (vcvtq_m_f32_u32): Likewise. (vqrshrnbq_n_s16): Likewise. (vqrshrnbq_n_u16): Likewise. (vqrshrnbq_n_s32): Likewise. (vqrshrnbq_n_u32): Likewise. (vqrshrunbq_n_s16): Likewise. (vqrshrunbq_n_s32): Likewise. (vrmlaldavhaq_s32): Likewise. (vrmlaldavhaq_u32): Likewise. (vshlcq_s8): Likewise. (vshlcq_u8): Likewise. (vshlcq_s16): Likewise. (vshlcq_u16): Likewise. (vshlcq_s32): Likewise. (vshlcq_u32): Likewise. (vabavq_u8): Likewise. (vabavq_u16): Likewise. (vabavq_u32): Likewise. (__arm_vabavq_s8): Define intrinsic. (__arm_vabavq_s16): Likewise. (__arm_vabavq_s32): Likewise. (__arm_vabavq_u8): Likewise. (__arm_vabavq_u16): Likewise. (__arm_vabavq_u32): Likewise. (__arm_vbicq_m_n_s16): Likewise. (__arm_vbicq_m_n_s32): Likewise. (__arm_vbicq_m_n_u16): Likewise. (__arm_vbicq_m_n_u32): Likewise. (__arm_vqrshrnbq_n_s16): Likewise. (__arm_vqrshrnbq_n_u16): Likewise. (__arm_vqrshrnbq_n_s32): Likewise. (__arm_vqrshrnbq_n_u32): Likewise. (__arm_vqrshrunbq_n_s16): Likewise. (__arm_vqrshrunbq_n_s32): Likewise. (__arm_vrmlaldavhaq_s32): Likewise. (__arm_vrmlaldavhaq_u32): Likewise. (__arm_vshlcq_s8): Likewise. (__arm_vshlcq_u8): Likewise. (__arm_vshlcq_s16): Likewise. (__arm_vshlcq_u16): Likewise. (__arm_vshlcq_s32): Likewise. (__arm_vshlcq_u32): Likewise. (__arm_vcmpeqq_m_f16): Likewise. (__arm_vcmpeqq_m_f32): Likewise. (__arm_vcvtaq_m_s16_f16): Likewise. (__arm_vcvtaq_m_u16_f16): Likewise. (__arm_vcvtaq_m_s32_f32): Likewise. (__arm_vcvtaq_m_u32_f32): Likewise. (__arm_vcvtq_m_f16_s16): Likewise. (__arm_vcvtq_m_f16_u16): Likewise. (__arm_vcvtq_m_f32_s32): Likewise. (__arm_vcvtq_m_f32_u32): Likewise
[PATCH v4][ARM][GCC][2/3x]: MVE intrinsics with ternary operands.
Hello Kyrill, Following patch is the rebased version of v3. (version v3) https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542207.html Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vpselq_u8, vpselq_s8, vrev64q_m_u8, vqrdmlashq_n_u8, vqrdmlahq_n_u8, vqdmlahq_n_u8, vmvnq_m_u8, vmlasq_n_u8, vmlaq_n_u8, vmladavq_p_u8, vmladavaq_u8, vminvq_p_u8, vmaxvq_p_u8, vdupq_m_n_u8, vcmpneq_m_u8, vcmpneq_m_n_u8, vcmphiq_m_u8, vcmphiq_m_n_u8, vcmpeqq_m_u8, vcmpeqq_m_n_u8, vcmpcsq_m_u8, vcmpcsq_m_n_u8, vclzq_m_u8, vaddvaq_p_u8, vsriq_n_u8, vsliq_n_u8, vshlq_m_r_u8, vrshlq_m_n_u8, vqshlq_m_r_u8, vqrshlq_m_n_u8, vminavq_p_s8, vminaq_m_s8, vmaxavq_p_s8, vmaxaq_m_s8, vcmpneq_m_s8, vcmpneq_m_n_s8, vcmpltq_m_s8, vcmpltq_m_n_s8, vcmpleq_m_s8, vcmpleq_m_n_s8, vcmpgtq_m_s8, vcmpgtq_m_n_s8, vcmpgeq_m_s8, vcmpgeq_m_n_s8, vcmpeqq_m_s8, vcmpeqq_m_n_s8, vshlq_m_r_s8, vrshlq_m_n_s8, vrev64q_m_s8, vqshlq_m_r_s8, vqrshlq_m_n_s8, vqnegq_m_s8, vqabsq_m_s8, vnegq_m_s8, vmvnq_m_s8, vmlsdavxq_p_s8, vmlsdavq_p_s8, vmladavxq_p_s8, vmladavq_p_s8, vminvq_p_s8, vmaxvq_p_s8, vdupq_m_n_s8, vclzq_m_s8, vclsq_m_s8, vaddvaq_p_s8, vabsq_m_s8, vqrdmlsdhxq_s8, vqrdmlsdhq_s8, vqrdmlashq_n_s8, vqrdmlahq_n_s8, vqrdmladhxq_s8, vqrdmladhq_s8, vqdmlsdhxq_s8, vqdmlsdhq_s8, vqdmlahq_n_s8, vqdmladhxq_s8, vqdmladhq_s8, vmlsdavaxq_s8, vmlsdavaq_s8, vmlasq_n_s8, vmlaq_n_s8, vmladavaxq_s8, vmladavaq_s8, vsriq_n_s8, vsliq_n_s8, vpselq_u16, vpselq_s16, vrev64q_m_u16, vqrdmlashq_n_u16, vqrdmlahq_n_u16, vqdmlahq_n_u16, vmvnq_m_u16, vmlasq_n_u16, vmlaq_n_u16, vmladavq_p_u16, vmladavaq_u16, vminvq_p_u16, vmaxvq_p_u16, vdupq_m_n_u16, vcmpneq_m_u16, vcmpneq_m_n_u16, vcmphiq_m_u16, vcmphiq_m_n_u16, vcmpeqq_m_u16, vcmpeqq_m_n_u16, vcmpcsq_m_u16, vcmpcsq_m_n_u16, vclzq_m_u16, vaddvaq_p_u16, vsriq_n_u16, vsliq_n_u16, vshlq_m_r_u16, vrshlq_m_n_u16, vqshlq_m_r_u16, vqrshlq_m_n_u16, vminavq_p_s16, vminaq_m_s16, vmaxavq_p_s16, vmaxaq_m_s16, vcmpneq_m_s16, vcmpneq_m_n_s16, vcmpltq_m_s16, vcmpltq_m_n_s16, vcmpleq_m_s16, vcmpleq_m_n_s16, vcmpgtq_m_s16, vcmpgtq_m_n_s16, vcmpgeq_m_s16, vcmpgeq_m_n_s16, vcmpeqq_m_s16, vcmpeqq_m_n_s16, vshlq_m_r_s16, vrshlq_m_n_s16, vrev64q_m_s16, vqshlq_m_r_s16, vqrshlq_m_n_s16, vqnegq_m_s16, vqabsq_m_s16, vnegq_m_s16, vmvnq_m_s16, vmlsdavxq_p_s16, vmlsdavq_p_s16, vmladavxq_p_s16, vmladavq_p_s16, vminvq_p_s16, vmaxvq_p_s16, vdupq_m_n_s16, vclzq_m_s16, vclsq_m_s16, vaddvaq_p_s16, vabsq_m_s16, vqrdmlsdhxq_s16, vqrdmlsdhq_s16, vqrdmlashq_n_s16, vqrdmlahq_n_s16, vqrdmladhxq_s16, vqrdmladhq_s16, vqdmlsdhxq_s16, vqdmlsdhq_s16, vqdmlahq_n_s16, vqdmladhxq_s16, vqdmladhq_s16, vmlsdavaxq_s16, vmlsdavaq_s16, vmlasq_n_s16, vmlaq_n_s16, vmladavaxq_s16, vmladavaq_s16, vsriq_n_s16, vsliq_n_s16, vpselq_u32, vpselq_s32, vrev64q_m_u32, vqrdmlashq_n_u32, vqrdmlahq_n_u32, vqdmlahq_n_u32, vmvnq_m_u32, vmlasq_n_u32, vmlaq_n_u32, vmladavq_p_u32, vmladavaq_u32, vminvq_p_u32, vmaxvq_p_u32, vdupq_m_n_u32, vcmpneq_m_u32, vcmpneq_m_n_u32, vcmphiq_m_u32, vcmphiq_m_n_u32, vcmpeqq_m_u32, vcmpeqq_m_n_u32, vcmpcsq_m_u32, vcmpcsq_m_n_u32, vclzq_m_u32, vaddvaq_p_u32, vsriq_n_u32, vsliq_n_u32, vshlq_m_r_u32, vrshlq_m_n_u32, vqshlq_m_r_u32, vqrshlq_m_n_u32, vminavq_p_s32, vminaq_m_s32, vmaxavq_p_s32, vmaxaq_m_s32, vcmpneq_m_s32, vcmpneq_m_n_s32, vcmpltq_m_s32, vcmpltq_m_n_s32, vcmpleq_m_s32, vcmpleq_m_n_s32, vcmpgtq_m_s32, vcmpgtq_m_n_s32, vcmpgeq_m_s32, vcmpgeq_m_n_s32, vcmpeqq_m_s32, vcmpeqq_m_n_s32, vshlq_m_r_s32, vrshlq_m_n_s32, vrev64q_m_s32, vqshlq_m_r_s32, vqrshlq_m_n_s32, vqnegq_m_s32, vqabsq_m_s32, vnegq_m_s32, vmvnq_m_s32, vmlsdavxq_p_s32, vmlsdavq_p_s32, vmladavxq_p_s32, vmladavq_p_s32, vminvq_p_s32, vmaxvq_p_s32, vdupq_m_n_s32, vclzq_m_s32, vclsq_m_s32, vaddvaq_p_s32, vabsq_m_s32, vqrdmlsdhxq_s32, vqrdmlsdhq_s32, vqrdmlashq_n_s32, vqrdmlahq_n_s32, vqrdmladhxq_s32, vqrdmladhq_s32, vqdmlsdhxq_s32, vqdmlsdhq_s32, vqdmlahq_n_s32, vqdmladhxq_s32, vqdmladhq_s32, vmlsdavaxq_s32, vmlsdavaq_s32, vmlasq_n_s32, vmlaq_n_s32, vmladavaxq_s32, vmladavaq_s32, vsriq_n_s32, vsliq_n_s32, vpselq_u64, vpselq_s64. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Rc" and "Re" are added, which checks the constant is with in the range of 0 to 15 and 0 to 31 respectively. Also a new predicates "mve_imm_15" and "mve_imm_31" are added, to check the the matching constraint Rc and Re respectively. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-25 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vpselq_u8): Define macro. (vpselq_s8): Likewise. (vrev64q_m_u8): Likewise. (vqrdmlashq_n_u8): Likewise. (vqrdmlahq_n_u8): Likewise. (vqdmlahq_n_u8): Likewise. (vmvnq_m_u8): L
[PATCH v2][ARM][GCC][1/5x]: MVE store intrinsics.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534334.html Hello, This patch supports the following MVE ACLE store intrinsics. vstrbq_scatter_offset_s8, vstrbq_scatter_offset_s32, vstrbq_scatter_offset_s16, vstrbq_scatter_offset_u8, vstrbq_scatter_offset_u32, vstrbq_scatter_offset_u16, vstrbq_s8, vstrbq_s32, vstrbq_s16, vstrbq_u8, vstrbq_u32, vstrbq_u16, vstrwq_scatter_base_s32, vstrwq_scatter_base_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (STRS_QUALIFIERS): Define builtin qualifier. (STRU_QUALIFIERS): Likewise. (STRSS_QUALIFIERS): Likewise. (STRSU_QUALIFIERS): Likewise. (STRSBS_QUALIFIERS): Likewise. (STRSBU_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vstrbq_s8): Define macro. (vstrbq_u8): Likewise. (vstrbq_u16): Likewise. (vstrbq_scatter_offset_s8): Likewise. (vstrbq_scatter_offset_u8): Likewise. (vstrbq_scatter_offset_u16): Likewise. (vstrbq_s16): Likewise. (vstrbq_u32): Likewise. (vstrbq_scatter_offset_s16): Likewise. (vstrbq_scatter_offset_u32): Likewise. (vstrbq_s32): Likewise. (vstrbq_scatter_offset_s32): Likewise. (vstrwq_scatter_base_s32): Likewise. (vstrwq_scatter_base_u32): Likewise. (__arm_vstrbq_scatter_offset_s8): Define intrinsic. (__arm_vstrbq_scatter_offset_s32): Likewise. (__arm_vstrbq_scatter_offset_s16): Likewise. (__arm_vstrbq_scatter_offset_u8): Likewise. (__arm_vstrbq_scatter_offset_u32): Likewise. (__arm_vstrbq_scatter_offset_u16): Likewise. (__arm_vstrbq_s8): Likewise. (__arm_vstrbq_s32): Likewise. (__arm_vstrbq_s16): Likewise. (__arm_vstrbq_u8): Likewise. (__arm_vstrbq_u32): Likewise. (__arm_vstrbq_u16): Likewise. (__arm_vstrwq_scatter_base_s32): Likewise. (__arm_vstrwq_scatter_base_u32): Likewise. (vstrbq): Define polymorphic variant. (vstrbq_scatter_offset): Likewise. (vstrwq_scatter_base): Likewise. * config/arm/arm_mve_builtins.def (STRS_QUALIFIERS): Use builtin qualifier. (STRU_QUALIFIERS): Likewise. (STRSS_QUALIFIERS): Likewise. (STRSU_QUALIFIERS): Likewise. (STRSBS_QUALIFIERS): Likewise. (STRSBU_QUALIFIERS): Likewise. * config/arm/mve.md (MVE_B_ELEM): Define mode attribute iterator. (VSTRWSBQ): Define iterators. (VSTRBSOQ): Likewise. (VSTRBQ): Likewise. (mve_vstrbq_): Define RTL pattern. (mve_vstrbq_scatter_offset_): Likewise. (mve_vstrwq_scatter_base_v4si): Likewise. gcc/testsuite/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vstrbq_s16.c: New test. * gcc.target/arm/mve/intrinsics/vstrbq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_u32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 26f0379f62b95886414d2eb4d7c6a6c4fc235e60..b285f074285116ce621e324b644d43efb6538b9d 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -579,6 +579,39 @@ arm_quadop_unone_unone_unone_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS \ (arm_quadop_unone_unone_unone_none_unone_qualifiers) +static enum arm_type_qualifiers +arm_strs_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_void, qualifier_pointer, qualifier_none }; +#define STRS_QUALIFIERS (arm_strs_qualifiers) + +static enum
[PATCH v2][ARM][GCC][2/5x]: MVE load intrinsics.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534338.html Hello, This patch supports the following MVE ACLE load intrinsics. vldrbq_gather_offset_u8, vldrbq_gather_offset_s8, vldrbq_s8, vldrbq_u8, vldrbq_gather_offset_u16, vldrbq_gather_offset_s16, vldrbq_s16, vldrbq_u16, vldrbq_gather_offset_u32, vldrbq_gather_offset_s32, vldrbq_s32, vldrbq_u32, vldrwq_gather_base_s32, vldrwq_gather_base_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (LDRGU_QUALIFIERS): Define builtin qualifier. (LDRGS_QUALIFIERS): Likewise. (LDRS_QUALIFIERS): Likewise. (LDRU_QUALIFIERS): Likewise. (LDRGBS_QUALIFIERS): Likewise. (LDRGBU_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vldrbq_gather_offset_u8): Define macro. (vldrbq_gather_offset_s8): Likewise. (vldrbq_s8): Likewise. (vldrbq_u8): Likewise. (vldrbq_gather_offset_u16): Likewise. (vldrbq_gather_offset_s16): Likewise. (vldrbq_s16): Likewise. (vldrbq_u16): Likewise. (vldrbq_gather_offset_u32): Likewise. (vldrbq_gather_offset_s32): Likewise. (vldrbq_s32): Likewise. (vldrbq_u32): Likewise. (vldrwq_gather_base_s32): Likewise. (vldrwq_gather_base_u32): Likewise. (__arm_vldrbq_gather_offset_u8): Define intrinsic. (__arm_vldrbq_gather_offset_s8): Likewise. (__arm_vldrbq_s8): Likewise. (__arm_vldrbq_u8): Likewise. (__arm_vldrbq_gather_offset_u16): Likewise. (__arm_vldrbq_gather_offset_s16): Likewise. (__arm_vldrbq_s16): Likewise. (__arm_vldrbq_u16): Likewise. (__arm_vldrbq_gather_offset_u32): Likewise. (__arm_vldrbq_gather_offset_s32): Likewise. (__arm_vldrbq_s32): Likewise. (__arm_vldrbq_u32): Likewise. (__arm_vldrwq_gather_base_s32): Likewise. (__arm_vldrwq_gather_base_u32): Likewise. (vldrbq_gather_offset): Define polymorphic variant. * config/arm/arm_mve_builtins.def (LDRGU_QUALIFIERS): Use builtin qualifier. (LDRGS_QUALIFIERS): Likewise. (LDRS_QUALIFIERS): Likewise. (LDRU_QUALIFIERS): Likewise. (LDRGBS_QUALIFIERS): Likewise. (LDRGBU_QUALIFIERS): Likewise. * config/arm/mve.md (VLDRBGOQ): Define iterator. (VLDRBQ): Likewise. (VLDRWGBQ): Likewise. (mve_vldrbq_gather_offset_): Define RTL pattern. (mve_vldrbq_): Likewise. (mve_vldrwq_gather_base_v4si): Likewise. gcc/testsuite/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_s16.c: New test. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_u32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index b285f074285116ce621e324b644d43efb6538b9d..aced55f52d317e8deafdc6a6804db3b80c00fd80 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -612,6 +612,36 @@ arm_strsbu_qualifiers[SIMD_MAX_BUILTIN_ARGS] qualifier_unsigned}; #define STRSBU_QUALIFIERS (arm_strsbu_qualifiers) +static enum arm_type_qualifiers +arm_ldrgu_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_pointer, qualifier_unsigned}; +#define LDRGU_QUALIFIERS (arm_ldrgu_qualifiers) + +static enum arm_type_qualifiers +arm_ldrgs_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_pointer, qualifier_unsigned}; +#define LDRGS_QUALIFIERS (arm_ldrgs_qualifiers) + +static enum arm_type_qualifiers
[PATCH v2][ARM][GCC][3/5x]: MVE store intrinsics with predicated suffix.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534337.html Hello, This patch supports the following MVE ACLE store intrinsics with predicated suffix. vstrbq_p_s8, vstrbq_p_s32, vstrbq_p_s16, vstrbq_p_u8, vstrbq_p_u32, vstrbq_p_u16, vstrbq_scatter_offset_p_s8, vstrbq_scatter_offset_p_s32, vstrbq_scatter_offset_p_s16, vstrbq_scatter_offset_p_u8, vstrbq_scatter_offset_p_u32, vstrbq_scatter_offset_p_u16, vstrwq_scatter_base_p_s32, vstrwq_scatter_base_p_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (STRS_P_QUALIFIERS): Define builtin qualifier. (STRU_P_QUALIFIERS): Likewise. (STRSU_P_QUALIFIERS): Likewise. (STRSS_P_QUALIFIERS): Likewise. (STRSBS_P_QUALIFIERS): Likewise. (STRSBU_P_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vstrbq_p_s8): Define macro. (vstrbq_p_s32): Likewise. (vstrbq_p_s16): Likewise. (vstrbq_p_u8): Likewise. (vstrbq_p_u32): Likewise. (vstrbq_p_u16): Likewise. (vstrbq_scatter_offset_p_s8): Likewise. (vstrbq_scatter_offset_p_s32): Likewise. (vstrbq_scatter_offset_p_s16): Likewise. (vstrbq_scatter_offset_p_u8): Likewise. (vstrbq_scatter_offset_p_u32): Likewise. (vstrbq_scatter_offset_p_u16): Likewise. (vstrwq_scatter_base_p_s32): Likewise. (vstrwq_scatter_base_p_u32): Likewise. (__arm_vstrbq_p_s8): Define intrinsic. (__arm_vstrbq_p_s32): Likewise. (__arm_vstrbq_p_s16): Likewise. (__arm_vstrbq_p_u8): Likewise. (__arm_vstrbq_p_u32): Likewise. (__arm_vstrbq_p_u16): Likewise. (__arm_vstrbq_scatter_offset_p_s8): Likewise. (__arm_vstrbq_scatter_offset_p_s32): Likewise. (__arm_vstrbq_scatter_offset_p_s16): Likewise. (__arm_vstrbq_scatter_offset_p_u8): Likewise. (__arm_vstrbq_scatter_offset_p_u32): Likewise. (__arm_vstrbq_scatter_offset_p_u16): Likewise. (__arm_vstrwq_scatter_base_p_s32): Likewise. (__arm_vstrwq_scatter_base_p_u32): Likewise. (vstrbq_p): Define polymorphic variant. (vstrbq_scatter_offset_p): Likewise. (vstrwq_scatter_base_p): Likewise. * config/arm/arm_mve_builtins.def (STRS_P_QUALIFIERS): Use builtin qualifier. (STRU_P_QUALIFIERS): Likewise. (STRSU_P_QUALIFIERS): Likewise. (STRSS_P_QUALIFIERS): Likewise. (STRSBS_P_QUALIFIERS): Likewise. (STRSBU_P_QUALIFIERS): Likewise. * config/arm/mve.md (mve_vstrbq_scatter_offset_p_): Define RTL pattern. (mve_vstrwq_scatter_base_p_v4si): Likewise. (mve_vstrbq_p_): Likewise. gcc/testsuite/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vstrbq_p_s16.c: New test. * gcc.target/arm/mve/intrinsics/vstrbq_p_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_p_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_p_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_p_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_p_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrbq_scatter_offset_p_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_u32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index aced55f52d317e8deafdc6a6804db3b80c00fd80..c87fa3118510e4de90ac9afe08608fb2315f4809 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -613,6 +613,41 @@ arm_strsbu_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define STRSBU_QUALIFIERS (arm_strsbu_qualifiers) static enum arm_type_qualifiers +arm_strs_p_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_void, qualifier_pointer, qualifier_none, qualifier_unsigned}; +#define STRS_P_QUALIFIERS (arm_strs_p_qualifiers) + +static enum arm_type_qualifiers +arm_stru_p_qualifiers
[PATCH v2][ARM][GCC][5/5x]: MVE ACLE load intrinsics which load a byte, halfword, or word from memory.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534352.html Hello, This patch supports the following MVE ACLE load intrinsics which load a byte, halfword, or word from memory. vld1q_s8, vld1q_s32, vld1q_s16, vld1q_u8, vld1q_u32, vld1q_u16, vldrhq_gather_offset_s32, vldrhq_gather_offset_s16, vldrhq_gather_offset_u32, vldrhq_gather_offset_u16, vldrhq_gather_offset_z_s32, vldrhq_gather_offset_z_s16, vldrhq_gather_offset_z_u32, vldrhq_gather_offset_z_u16, vldrhq_gather_shifted_offset_s32,vldrwq_f32, vldrwq_z_f32, vldrhq_gather_shifted_offset_s16, vldrhq_gather_shifted_offset_u32, vldrhq_gather_shifted_offset_u16, vldrhq_gather_shifted_offset_z_s32, vldrhq_gather_shifted_offset_z_s16, vldrhq_gather_shifted_offset_z_u32, vldrhq_gather_shifted_offset_z_u16, vldrhq_s32, vldrhq_s16, vldrhq_u32, vldrhq_u16, vldrhq_z_s32, vldrhq_z_s16, vldrhq_z_u32, vldrhq_z_u16, vldrwq_s32, vldrwq_u32, vldrwq_z_s32, vldrwq_z_u32, vld1q_f32, vld1q_f16, vldrhq_f16, vldrhq_z_f16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vld1q_s8): Define macro. (vld1q_s32): Likewise. (vld1q_s16): Likewise. (vld1q_u8): Likewise. (vld1q_u32): Likewise. (vld1q_u16): Likewise. (vldrhq_gather_offset_s32): Likewise. (vldrhq_gather_offset_s16): Likewise. (vldrhq_gather_offset_u32): Likewise. (vldrhq_gather_offset_u16): Likewise. (vldrhq_gather_offset_z_s32): Likewise. (vldrhq_gather_offset_z_s16): Likewise. (vldrhq_gather_offset_z_u32): Likewise. (vldrhq_gather_offset_z_u16): Likewise. (vldrhq_gather_shifted_offset_s32): Likewise. (vldrhq_gather_shifted_offset_s16): Likewise. (vldrhq_gather_shifted_offset_u32): Likewise. (vldrhq_gather_shifted_offset_u16): Likewise. (vldrhq_gather_shifted_offset_z_s32): Likewise. (vldrhq_gather_shifted_offset_z_s16): Likewise. (vldrhq_gather_shifted_offset_z_u32): Likewise. (vldrhq_gather_shifted_offset_z_u16): Likewise. (vldrhq_s32): Likewise. (vldrhq_s16): Likewise. (vldrhq_u32): Likewise. (vldrhq_u16): Likewise. (vldrhq_z_s32): Likewise. (vldrhq_z_s16): Likewise. (vldrhq_z_u32): Likewise. (vldrhq_z_u16): Likewise. (vldrwq_s32): Likewise. (vldrwq_u32): Likewise. (vldrwq_z_s32): Likewise. (vldrwq_z_u32): Likewise. (vld1q_f32): Likewise. (vld1q_f16): Likewise. (vldrhq_f16): Likewise. (vldrhq_z_f16): Likewise. (vldrwq_f32): Likewise. (vldrwq_z_f32): Likewise. (__arm_vld1q_s8): Define intrinsic. (__arm_vld1q_s32): Likewise. (__arm_vld1q_s16): Likewise. (__arm_vld1q_u8): Likewise. (__arm_vld1q_u32): Likewise. (__arm_vld1q_u16): Likewise. (__arm_vldrhq_gather_offset_s32): Likewise. (__arm_vldrhq_gather_offset_s16): Likewise. (__arm_vldrhq_gather_offset_u32): Likewise. (__arm_vldrhq_gather_offset_u16): Likewise. (__arm_vldrhq_gather_offset_z_s32): Likewise. (__arm_vldrhq_gather_offset_z_s16): Likewise. (__arm_vldrhq_gather_offset_z_u32): Likewise. (__arm_vldrhq_gather_offset_z_u16): Likewise. (__arm_vldrhq_gather_shifted_offset_s32): Likewise. (__arm_vldrhq_gather_shifted_offset_s16): Likewise. (__arm_vldrhq_gather_shifted_offset_u32): Likewise. (__arm_vldrhq_gather_shifted_offset_u16): Likewise. (__arm_vldrhq_gather_shifted_offset_z_s32): Likewise. (__arm_vldrhq_gather_shifted_offset_z_s16): Likewise. (__arm_vldrhq_gather_shifted_offset_z_u32): Likewise. (__arm_vldrhq_gather_shifted_offset_z_u16): Likewise. (__arm_vldrhq_s32): Likewise. (__arm_vldrhq_s16): Likewise. (__arm_vldrhq_u32): Likewise. (__arm_vldrhq_u16): Likewise. (__arm_vldrhq_z_s32): Likewise. (__arm_vldrhq_z_s16): Likewise. (__arm_vldrhq_z_u32): Likewise. (__arm_vldrhq_z_u16): Likewise. (__arm_vldrwq_s32): Likewise. (__arm_vldrwq_u32): Likewise. (__arm_vldrwq_z_s32): Likewise. (__arm_vldrwq_z_u32): Likewise. (__arm_vld1q_f32): Likewise. (__arm_vld1q_f16): Likewise. (__arm_vldrwq_f32): Likewise. (__arm_vldrwq_z_f32): Likewise. (__arm_vldrhq_z_f16): Likewise. (__arm_vldrhq_f16): Likewise. (vld1q): Define polymorphic variant. (vldrhq_gather_offset): Likewise
[PATCH v2][ARM][GCC][8/5x]: Remaining MVE store intrinsics which stores an half word, word and double word to memory.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534340.html Hello, This patch supports the following MVE ACLE store intrinsics which stores an halfword, word or double word to memory. vstrdq_scatter_base_p_s64, vstrdq_scatter_base_p_u64, vstrdq_scatter_base_s64, vstrdq_scatter_base_u64, vstrdq_scatter_offset_p_s64, vstrdq_scatter_offset_p_u64, vstrdq_scatter_offset_s64, vstrdq_scatter_offset_u64, vstrdq_scatter_shifted_offset_p_s64, vstrdq_scatter_shifted_offset_p_u64, vstrdq_scatter_shifted_offset_s64, vstrdq_scatter_shifted_offset_u64, vstrhq_scatter_offset_f16, vstrhq_scatter_offset_p_f16, vstrhq_scatter_shifted_offset_f16, vstrhq_scatter_shifted_offset_p_f16, vstrwq_scatter_base_f32, vstrwq_scatter_base_p_f32, vstrwq_scatter_offset_f32, vstrwq_scatter_offset_p_f32, vstrwq_scatter_offset_p_s32, vstrwq_scatter_offset_p_u32, vstrwq_scatter_offset_s32, vstrwq_scatter_offset_u32, vstrwq_scatter_shifted_offset_f32, vstrwq_scatter_shifted_offset_p_f32, vstrwq_scatter_shifted_offset_p_s32, vstrwq_scatter_shifted_offset_p_u32, vstrwq_scatter_shifted_offset_s32, vstrwq_scatter_shifted_offset_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch a new predicate "Ri" is defined to check the immediate is in the range of +/-1016 and multiple of 8. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-05 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vstrdq_scatter_base_p_s64): Define macro. (vstrdq_scatter_base_p_u64): Likewise. (vstrdq_scatter_base_s64): Likewise. (vstrdq_scatter_base_u64): Likewise. (vstrdq_scatter_offset_p_s64): Likewise. (vstrdq_scatter_offset_p_u64): Likewise. (vstrdq_scatter_offset_s64): Likewise. (vstrdq_scatter_offset_u64): Likewise. (vstrdq_scatter_shifted_offset_p_s64): Likewise. (vstrdq_scatter_shifted_offset_p_u64): Likewise. (vstrdq_scatter_shifted_offset_s64): Likewise. (vstrdq_scatter_shifted_offset_u64): Likewise. (vstrhq_scatter_offset_f16): Likewise. (vstrhq_scatter_offset_p_f16): Likewise. (vstrhq_scatter_shifted_offset_f16): Likewise. (vstrhq_scatter_shifted_offset_p_f16): Likewise. (vstrwq_scatter_base_f32): Likewise. (vstrwq_scatter_base_p_f32): Likewise. (vstrwq_scatter_offset_f32): Likewise. (vstrwq_scatter_offset_p_f32): Likewise. (vstrwq_scatter_offset_p_s32): Likewise. (vstrwq_scatter_offset_p_u32): Likewise. (vstrwq_scatter_offset_s32): Likewise. (vstrwq_scatter_offset_u32): Likewise. (vstrwq_scatter_shifted_offset_f32): Likewise. (vstrwq_scatter_shifted_offset_p_f32): Likewise. (vstrwq_scatter_shifted_offset_p_s32): Likewise. (vstrwq_scatter_shifted_offset_p_u32): Likewise. (vstrwq_scatter_shifted_offset_s32): Likewise. (vstrwq_scatter_shifted_offset_u32): Likewise. (__arm_vstrdq_scatter_base_p_s64): Define intrinsic. (__arm_vstrdq_scatter_base_p_u64): Likewise. (__arm_vstrdq_scatter_base_s64): Likewise. (__arm_vstrdq_scatter_base_u64): Likewise. (__arm_vstrdq_scatter_offset_p_s64): Likewise. (__arm_vstrdq_scatter_offset_p_u64): Likewise. (__arm_vstrdq_scatter_offset_s64): Likewise. (__arm_vstrdq_scatter_offset_u64): Likewise. (__arm_vstrdq_scatter_shifted_offset_p_s64): Likewise. (__arm_vstrdq_scatter_shifted_offset_p_u64): Likewise. (__arm_vstrdq_scatter_shifted_offset_s64): Likewise. (__arm_vstrdq_scatter_shifted_offset_u64): Likewise. (__arm_vstrwq_scatter_offset_p_s32): Likewise. (__arm_vstrwq_scatter_offset_p_u32): Likewise. (__arm_vstrwq_scatter_offset_s32): Likewise. (__arm_vstrwq_scatter_offset_u32): Likewise. (__arm_vstrwq_scatter_shifted_offset_p_s32): Likewise. (__arm_vstrwq_scatter_shifted_offset_p_u32): Likewise. (__arm_vstrwq_scatter_shifted_offset_s32): Likewise. (__arm_vstrwq_scatter_shifted_offset_u32): Likewise. (__arm_vstrhq_scatter_offset_f16): Likewise. (__arm_vstrhq_scatter_offset_p_f16): Likewise. (__arm_vstrhq_scatter_shifted_offset_f16): Likewise. (__arm_vstrhq_scatter_shifted_offset_p_f16): Likewise. (__arm_vstrwq_scatter_base_f32): Likewise. (__arm_vstrwq_scatter_base_p_f32): Likewise. (__arm_vstrwq_scatter_offset_f32): Likewise. (__arm_vstrwq_scatter_offset_p_f32): Likewise. (__arm_vstrwq_scatter_shifted_offset_f32): Likewise. (__arm_vstrwq_scatter_shifted_offset_p_f32): Likewise. (vstrhq_scat
[PATCH v2][ARM][GCC][7/5x]: MVE store intrinsics which stores byte,half word or word to memory.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534335.html Hello, This patch supports the following MVE ACLE store intrinsics which stores a byte, halfword, or word to memory. vst1q_f32, vst1q_f16, vst1q_s8, vst1q_s32, vst1q_s16, vst1q_u8, vst1q_u32, vst1q_u16, vstrhq_f16, vstrhq_scatter_offset_s32, vstrhq_scatter_offset_s16, vstrhq_scatter_offset_u32, vstrhq_scatter_offset_u16, vstrhq_scatter_offset_p_s32, vstrhq_scatter_offset_p_s16, vstrhq_scatter_offset_p_u32, vstrhq_scatter_offset_p_u16, vstrhq_scatter_shifted_offset_s32, vstrhq_scatter_shifted_offset_s16, vstrhq_scatter_shifted_offset_u32, vstrhq_scatter_shifted_offset_u16, vstrhq_scatter_shifted_offset_p_s32, vstrhq_scatter_shifted_offset_p_s16, vstrhq_scatter_shifted_offset_p_u32, vstrhq_scatter_shifted_offset_p_u16, vstrhq_s32, vstrhq_s16, vstrhq_u32, vstrhq_u16, vstrhq_p_f16, vstrhq_p_s32, vstrhq_p_s16, vstrhq_p_u32, vstrhq_p_u16, vstrwq_f32, vstrwq_s32, vstrwq_u32, vstrwq_p_f32, vstrwq_p_s32, vstrwq_p_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vst1q_f32): Define macro. (vst1q_f16): Likewise. (vst1q_s8): Likewise. (vst1q_s32): Likewise. (vst1q_s16): Likewise. (vst1q_u8): Likewise. (vst1q_u32): Likewise. (vst1q_u16): Likewise. (vstrhq_f16): Likewise. (vstrhq_scatter_offset_s32): Likewise. (vstrhq_scatter_offset_s16): Likewise. (vstrhq_scatter_offset_u32): Likewise. (vstrhq_scatter_offset_u16): Likewise. (vstrhq_scatter_offset_p_s32): Likewise. (vstrhq_scatter_offset_p_s16): Likewise. (vstrhq_scatter_offset_p_u32): Likewise. (vstrhq_scatter_offset_p_u16): Likewise. (vstrhq_scatter_shifted_offset_s32): Likewise. (vstrhq_scatter_shifted_offset_s16): Likewise. (vstrhq_scatter_shifted_offset_u32): Likewise. (vstrhq_scatter_shifted_offset_u16): Likewise. (vstrhq_scatter_shifted_offset_p_s32): Likewise. (vstrhq_scatter_shifted_offset_p_s16): Likewise. (vstrhq_scatter_shifted_offset_p_u32): Likewise. (vstrhq_scatter_shifted_offset_p_u16): Likewise. (vstrhq_s32): Likewise. (vstrhq_s16): Likewise. (vstrhq_u32): Likewise. (vstrhq_u16): Likewise. (vstrhq_p_f16): Likewise. (vstrhq_p_s32): Likewise. (vstrhq_p_s16): Likewise. (vstrhq_p_u32): Likewise. (vstrhq_p_u16): Likewise. (vstrwq_f32): Likewise. (vstrwq_s32): Likewise. (vstrwq_u32): Likewise. (vstrwq_p_f32): Likewise. (vstrwq_p_s32): Likewise. (vstrwq_p_u32): Likewise. (__arm_vst1q_s8): Define intrinsic. (__arm_vst1q_s32): Likewise. (__arm_vst1q_s16): Likewise. (__arm_vst1q_u8): Likewise. (__arm_vst1q_u32): Likewise. (__arm_vst1q_u16): Likewise. (__arm_vstrhq_scatter_offset_s32): Likewise. (__arm_vstrhq_scatter_offset_s16): Likewise. (__arm_vstrhq_scatter_offset_u32): Likewise. (__arm_vstrhq_scatter_offset_u16): Likewise. (__arm_vstrhq_scatter_offset_p_s32): Likewise. (__arm_vstrhq_scatter_offset_p_s16): Likewise. (__arm_vstrhq_scatter_offset_p_u32): Likewise. (__arm_vstrhq_scatter_offset_p_u16): Likewise. (__arm_vstrhq_scatter_shifted_offset_s32): Likewise. (__arm_vstrhq_scatter_shifted_offset_s16): Likewise. (__arm_vstrhq_scatter_shifted_offset_u32): Likewise. (__arm_vstrhq_scatter_shifted_offset_u16): Likewise. (__arm_vstrhq_scatter_shifted_offset_p_s32): Likewise. (__arm_vstrhq_scatter_shifted_offset_p_s16): Likewise. (__arm_vstrhq_scatter_shifted_offset_p_u32): Likewise. (__arm_vstrhq_scatter_shifted_offset_p_u16): Likewise. (__arm_vstrhq_s32): Likewise. (__arm_vstrhq_s16): Likewise. (__arm_vstrhq_u32): Likewise. (__arm_vstrhq_u16): Likewise. (__arm_vstrhq_p_s32): Likewise. (__arm_vstrhq_p_s16): Likewise. (__arm_vstrhq_p_u32): Likewise. (__arm_vstrhq_p_u16): Likewise. (__arm_vstrwq_s32): Likewise. (__arm_vstrwq_u32): Likewise. (__arm_vstrwq_p_s32): Likewise. (__arm_vstrwq_p_u32): Likewise. (__arm_vstrwq_p_f32): Likewise. (__arm_vstrwq_f32): Likewise. (__arm_vst1q_f32): Likewise. (__arm_vst1q_f16): Likewise. (__arm_vstrhq_f16): Likewise. (__arm_vstrhq_p_f16): Likewise. (vst1q): Define polymorphic variant
[PATCH v2][ARM][GCC][6/5x]: Remaining MVE load intrinsics which loads half word and word or double word from memory.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534354.html Hello, This patch supports the following Remaining MVE ACLE load intrinsics which load an halfword, word or double word from memory. vldrdq_gather_base_s64, vldrdq_gather_base_u64, vldrdq_gather_base_z_s64, vldrdq_gather_base_z_u64, vldrdq_gather_offset_s64, vldrdq_gather_offset_u64, vldrdq_gather_offset_z_s64, vldrdq_gather_offset_z_u64, vldrdq_gather_shifted_offset_s64, vldrdq_gather_shifted_offset_u64, vldrdq_gather_shifted_offset_z_s64, vldrdq_gather_shifted_offset_z_u64, vldrhq_gather_offset_f16, vldrhq_gather_offset_z_f16, vldrhq_gather_shifted_offset_f16, vldrhq_gather_shifted_offset_z_f16, vldrwq_gather_base_f32, vldrwq_gather_base_z_f32, vldrwq_gather_offset_f32, vldrwq_gather_offset_s32, vldrwq_gather_offset_u32, vldrwq_gather_offset_z_f32, vldrwq_gather_offset_z_s32, vldrwq_gather_offset_z_u32, vldrwq_gather_shifted_offset_f32, vldrwq_gather_shifted_offset_s32, vldrwq_gather_shifted_offset_u32, vldrwq_gather_shifted_offset_z_f32, vldrwq_gather_shifted_offset_z_s32, vldrwq_gather_shifted_offset_z_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vldrdq_gather_base_s64): Define macro. (vldrdq_gather_base_u64): Likewise. (vldrdq_gather_base_z_s64): Likewise. (vldrdq_gather_base_z_u64): Likewise. (vldrdq_gather_offset_s64): Likewise. (vldrdq_gather_offset_u64): Likewise. (vldrdq_gather_offset_z_s64): Likewise. (vldrdq_gather_offset_z_u64): Likewise. (vldrdq_gather_shifted_offset_s64): Likewise. (vldrdq_gather_shifted_offset_u64): Likewise. (vldrdq_gather_shifted_offset_z_s64): Likewise. (vldrdq_gather_shifted_offset_z_u64): Likewise. (vldrhq_gather_offset_f16): Likewise. (vldrhq_gather_offset_z_f16): Likewise. (vldrhq_gather_shifted_offset_f16): Likewise. (vldrhq_gather_shifted_offset_z_f16): Likewise. (vldrwq_gather_base_f32): Likewise. (vldrwq_gather_base_z_f32): Likewise. (vldrwq_gather_offset_f32): Likewise. (vldrwq_gather_offset_s32): Likewise. (vldrwq_gather_offset_u32): Likewise. (vldrwq_gather_offset_z_f32): Likewise. (vldrwq_gather_offset_z_s32): Likewise. (vldrwq_gather_offset_z_u32): Likewise. (vldrwq_gather_shifted_offset_f32): Likewise. (vldrwq_gather_shifted_offset_s32): Likewise. (vldrwq_gather_shifted_offset_u32): Likewise. (vldrwq_gather_shifted_offset_z_f32): Likewise. (vldrwq_gather_shifted_offset_z_s32): Likewise. (vldrwq_gather_shifted_offset_z_u32): Likewise. (__arm_vldrdq_gather_base_s64): Define intrinsic. (__arm_vldrdq_gather_base_u64): Likewise. (__arm_vldrdq_gather_base_z_s64): Likewise. (__arm_vldrdq_gather_base_z_u64): Likewise. (__arm_vldrdq_gather_offset_s64): Likewise. (__arm_vldrdq_gather_offset_u64): Likewise. (__arm_vldrdq_gather_offset_z_s64): Likewise. (__arm_vldrdq_gather_offset_z_u64): Likewise. (__arm_vldrdq_gather_shifted_offset_s64): Likewise. (__arm_vldrdq_gather_shifted_offset_u64): Likewise. (__arm_vldrdq_gather_shifted_offset_z_s64): Likewise. (__arm_vldrdq_gather_shifted_offset_z_u64): Likewise. (__arm_vldrwq_gather_offset_s32): Likewise. (__arm_vldrwq_gather_offset_u32): Likewise. (__arm_vldrwq_gather_offset_z_s32): Likewise. (__arm_vldrwq_gather_offset_z_u32): Likewise. (__arm_vldrwq_gather_shifted_offset_s32): Likewise. (__arm_vldrwq_gather_shifted_offset_u32): Likewise. (__arm_vldrwq_gather_shifted_offset_z_s32): Likewise. (__arm_vldrwq_gather_shifted_offset_z_u32): Likewise. (__arm_vldrhq_gather_offset_f16): Likewise. (__arm_vldrhq_gather_offset_z_f16): Likewise. (__arm_vldrhq_gather_shifted_offset_f16): Likewise. (__arm_vldrhq_gather_shifted_offset_z_f16): Likewise. (__arm_vldrwq_gather_base_f32): Likewise. (__arm_vldrwq_gather_base_z_f32): Likewise. (__arm_vldrwq_gather_offset_f32): Likewise. (__arm_vldrwq_gather_offset_z_f32): Likewise. (__arm_vldrwq_gather_shifted_offset_f32): Likewise. (__arm_vldrwq_gather_shifted_offset_z_f32): Likewise. (vldrhq_gather_offset): Define polymorphic variant. (vldrhq_gather_offset_z): Likewise. (vldrhq_gather_shifted_offset): Likewise. (vldrhq_gather_shifted_offset_z): Likewise
[PATCH v2][ARM][GCC][4/5x]: MVE load intrinsics with zero(_z) suffix.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534333.html Hello, This patch supports the following MVE ACLE load intrinsics with zero(_z) suffix. * ``_z`` (zero) which indicates false-predicated lanes are filled with zeroes, these are only used for load instructions. vldrbq_gather_offset_z_s16, vldrbq_gather_offset_z_u8, vldrbq_gather_offset_z_s32, vldrbq_gather_offset_z_u16, vldrbq_gather_offset_z_u32, vldrbq_gather_offset_z_s8, vldrbq_z_s16, vldrbq_z_u8, vldrbq_z_s8, vldrbq_z_s32, vldrbq_z_u16, vldrbq_z_u32, vldrwq_gather_base_z_u32, vldrwq_gather_base_z_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (LDRGBS_Z_QUALIFIERS): Define builtin qualifier. (LDRGBU_Z_QUALIFIERS): Likewise. (LDRGS_Z_QUALIFIERS): Likewise. (LDRGU_Z_QUALIFIERS): Likewise. (LDRS_Z_QUALIFIERS): Likewise. (LDRU_Z_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vldrbq_gather_offset_z_s16): Define macro. (vldrbq_gather_offset_z_u8): Likewise. (vldrbq_gather_offset_z_s32): Likewise. (vldrbq_gather_offset_z_u16): Likewise. (vldrbq_gather_offset_z_u32): Likewise. (vldrbq_gather_offset_z_s8): Likewise. (vldrbq_z_s16): Likewise. (vldrbq_z_u8): Likewise. (vldrbq_z_s8): Likewise. (vldrbq_z_s32): Likewise. (vldrbq_z_u16): Likewise. (vldrbq_z_u32): Likewise. (vldrwq_gather_base_z_u32): Likewise. (vldrwq_gather_base_z_s32): Likewise. (__arm_vldrbq_gather_offset_z_s8): Define intrinsic. (__arm_vldrbq_gather_offset_z_s32): Likewise. (__arm_vldrbq_gather_offset_z_s16): Likewise. (__arm_vldrbq_gather_offset_z_u8): Likewise. (__arm_vldrbq_gather_offset_z_u32): Likewise. (__arm_vldrbq_gather_offset_z_u16): Likewise. (__arm_vldrbq_z_s8): Likewise. (__arm_vldrbq_z_s32): Likewise. (__arm_vldrbq_z_s16): Likewise. (__arm_vldrbq_z_u8): Likewise. (__arm_vldrbq_z_u32): Likewise. (__arm_vldrbq_z_u16): Likewise. (__arm_vldrwq_gather_base_z_s32): Likewise. (__arm_vldrwq_gather_base_z_u32): Likewise. (vldrbq_gather_offset_z): Define polymorphic variant. * config/arm/arm_mve_builtins.def (LDRGBS_Z_QUALIFIERS): Use builtin qualifier. (LDRGBU_Z_QUALIFIERS): Likewise. (LDRGS_Z_QUALIFIERS): Likewise. (LDRGU_Z_QUALIFIERS): Likewise. (LDRS_Z_QUALIFIERS): Likewise. (LDRU_Z_QUALIFIERS): Likewise. * config/arm/mve.md (mve_vldrbq_gather_offset_z_): Define RTL pattern. (mve_vldrbq_z_): Likewise. (mve_vldrwq_gather_base_z_v4si): Likewise. gcc/testsuite/ChangeLog: Likewise. 2019-11-01 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_s16.c: New test. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_gather_offset_z_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_z_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_z_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_z_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_z_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_z_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_z_u32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index c87fa3118510e4de90ac9afe08608fb2315f4809..c3deb9efc8849019141b6430543e93605fda4af4 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -677,6 +677,40 @@ arm_ldrgbu_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate}; #define LDRGBU_QUALIFIERS (arm_ldrgbu_qualifiers) +static enum arm_type_qualifiers +arm_ldrgbs_z_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_unsigned, qualifier_immediate, + qualifier_unsigned
[PATCH v2][ARM][GCC][6x]:MVE ACLE vaddq intrinsics using arithmetic plus operator.
Hello Kyrill, This patch addresses all the comments in patch version v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534349.html Hello, This patch supports following MVE ACLE vaddq intrinsics. The RTL patterns for this intrinsics are added using arithmetic "plus" operator. vaddq_s8, vaddq_s16, vaddq_s32, vaddq_u8, vaddq_u16, vaddq_u32, vaddq_f16, vaddq_f32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-19 Srinath Parvathaneni Andre Vieira Mihail Ionescu * config/arm/arm_mve.h (vaddq_s8): Define macro. (vaddq_s16): Likewise. (vaddq_s32): Likewise. (vaddq_u8): Likewise. (vaddq_u16): Likewise. (vaddq_u32): Likewise. (vaddq_f16): Likewise. (vaddq_f32): Likewise. (__arm_vaddq_s8): Define intrinsic. (__arm_vaddq_s16): Likewise. (__arm_vaddq_s32): Likewise. (__arm_vaddq_u8): Likewise. (__arm_vaddq_u16): Likewise. (__arm_vaddq_u32): Likewise. (__arm_vaddq_f16): Likewise. (__arm_vaddq_f32): Likewise. (vaddq): Define polymorphic variant. * config/arm/iterators.md (VNIM): Define mode iterator for common types Neon, IWMMXT and MVE. (VNINOTM): Likewise. * config/arm/mve.md (mve_vaddq): Define RTL pattern. (mve_vaddq_f): Define RTL pattern. * config/arm/neon.md (add3): Rename to addv4hf3 RTL pattern. (addv8hf3_neon): Define RTL pattern. * config/arm/vec-common.md (add3): Modify standard add RTL pattern to support MVE. (addv8hf3): Define standard RTL pattern for MVE and Neon. (add3): Modify existing standard add RTL pattern for Neon and IWMMXT. gcc/testsuite/ChangeLog: 2020-03-19 Srinath Parvathaneni Andre Vieira Mihail Ionescu * gcc.target/arm/mve/intrinsics/vaddq_f16.c: New test. * gcc.target/arm/mve/intrinsics/vaddq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vaddq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vaddq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vaddq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vaddq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vaddq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vaddq_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 5ea42bd6a5bd98d5c77a0e7da3464ba6b431770b..55c256910bb7f4c616ea592be699f7f4fc3f17f7 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -1898,6 +1898,14 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t; #define vstrwq_scatter_shifted_offset_p_u32(__base, __offset, __value, __p) __arm_vstrwq_scatter_shifted_offset_p_u32(__base, __offset, __value, __p) #define vstrwq_scatter_shifted_offset_s32(__base, __offset, __value) __arm_vstrwq_scatter_shifted_offset_s32(__base, __offset, __value) #define vstrwq_scatter_shifted_offset_u32(__base, __offset, __value) __arm_vstrwq_scatter_shifted_offset_u32(__base, __offset, __value) +#define vaddq_s8(__a, __b) __arm_vaddq_s8(__a, __b) +#define vaddq_s16(__a, __b) __arm_vaddq_s16(__a, __b) +#define vaddq_s32(__a, __b) __arm_vaddq_s32(__a, __b) +#define vaddq_u8(__a, __b) __arm_vaddq_u8(__a, __b) +#define vaddq_u16(__a, __b) __arm_vaddq_u16(__a, __b) +#define vaddq_u32(__a, __b) __arm_vaddq_u32(__a, __b) +#define vaddq_f16(__a, __b) __arm_vaddq_f16(__a, __b) +#define vaddq_f32(__a, __b) __arm_vaddq_f32(__a, __b) #endif __extension__ extern __inline void @@ -12341,6 +12349,48 @@ __arm_vstrwq_scatter_shifted_offset_u32 (uint32_t * __base, uint32x4_t __offset, __builtin_mve_vstrwq_scatter_shifted_offset_uv4si ((__builtin_neon_si *) __base, __offset, __value); } +__extension__ extern __inline int8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vaddq_s8 (int8x16_t __a, int8x16_t __b) +{ + return __a + __b; +} + +__extension__ extern __inline int16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vaddq_s16 (int16x8_t __a, int16x8_t __b) +{ + return __a + __b; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vaddq_s32 (int32x4_t __a, int32x4_t __b) +{ + return __a + __b; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vaddq_u8 (uint8x16_t __a, uint8x16_t __b) +{ + return __a + __b; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __ar
[PATCH v2][ARM][GCC][7x]: MVE vreinterpretq and vuninitializedq intrinsics.
Hello Kyrill, This patch addresses all the comments in patch version v2. (version v2) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534351.html Hello, This patch supports following MVE ACLE intrinsics. vreinterpretq_s16_s32, vreinterpretq_s16_s64, vreinterpretq_s16_s8, vreinterpretq_s16_u16, vreinterpretq_s16_u32, vreinterpretq_s16_u64, vreinterpretq_s16_u8, vreinterpretq_s32_s16, vreinterpretq_s32_s64, vreinterpretq_s32_s8, vreinterpretq_s32_u16, vreinterpretq_s32_u32, vreinterpretq_s32_u64, vreinterpretq_s32_u8, vreinterpretq_s64_s16, vreinterpretq_s64_s32, vreinterpretq_s64_s8, vreinterpretq_s64_u16, vreinterpretq_s64_u32, vreinterpretq_s64_u64, vreinterpretq_s64_u8, vreinterpretq_s8_s16, vreinterpretq_s8_s32, vreinterpretq_s8_s64, vreinterpretq_s8_u16, vreinterpretq_s8_u32, vreinterpretq_s8_u64, vreinterpretq_s8_u8, vreinterpretq_u16_s16, vreinterpretq_u16_s32, vreinterpretq_u16_s64, vreinterpretq_u16_s8, vreinterpretq_u16_u32, vreinterpretq_u16_u64, vreinterpretq_u16_u8, vreinterpretq_u32_s16, vreinterpretq_u32_s32, vreinterpretq_u32_s64, vreinterpretq_u32_s8, vreinterpretq_u32_u16, vreinterpretq_u32_u64, vreinterpretq_u32_u8, vreinterpretq_u64_s16, vreinterpretq_u64_s32, vreinterpretq_u64_s64, vreinterpretq_u64_s8, vreinterpretq_u64_u16, vreinterpretq_u64_u32, vreinterpretq_u64_u8, vreinterpretq_u8_s16, vreinterpretq_u8_s32, vreinterpretq_u8_s64, vreinterpretq_u8_s8, vreinterpretq_u8_u16, vreinterpretq_u8_u32, vreinterpretq_u8_u64, vreinterpretq_s32_f16, vreinterpretq_s32_f32, vreinterpretq_u16_f16, vreinterpretq_u16_f32, vreinterpretq_u32_f16, vreinterpretq_u32_f32, vreinterpretq_u64_f16, vreinterpretq_u64_f32, vreinterpretq_u8_f16, vreinterpretq_u8_f32, vreinterpretq_f16_f32, vreinterpretq_f16_s16, vreinterpretq_f16_s32, vreinterpretq_f16_s64, vreinterpretq_f16_s8, vreinterpretq_f16_u16, vreinterpretq_f16_u32, vreinterpretq_f16_u64, vreinterpretq_f16_u8, vreinterpretq_f32_f16, vreinterpretq_f32_s16, vreinterpretq_f32_s32, vreinterpretq_f32_s64, vreinterpretq_f32_s8, vreinterpretq_f32_u16, vreinterpretq_f32_u32, vreinterpretq_f32_u64, vreinterpretq_f32_u8, vreinterpretq_s16_f16, vreinterpretq_s16_f32, vreinterpretq_s64_f16, vreinterpretq_s64_f32, vreinterpretq_s8_f16, vreinterpretq_s8_f32, vuninitializedq_u8, vuninitializedq_u16, vuninitializedq_u32, vuninitializedq_u64, vuninitializedq_s8, vuninitializedq_s16, vuninitializedq_s32, vuninitializedq_s64, vuninitializedq_f16, vuninitializedq_f32 and vuninitializedq. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-19 Srinath Parvathaneni * config/arm/arm_mve.h (vreinterpretq_s16_s32): Define macro. (vreinterpretq_s16_s64): Likewise. (vreinterpretq_s16_s8): Likewise. (vreinterpretq_s16_u16): Likewise. (vreinterpretq_s16_u32): Likewise. (vreinterpretq_s16_u64): Likewise. (vreinterpretq_s16_u8): Likewise. (vreinterpretq_s32_s16): Likewise. (vreinterpretq_s32_s64): Likewise. (vreinterpretq_s32_s8): Likewise. (vreinterpretq_s32_u16): Likewise. (vreinterpretq_s32_u32): Likewise. (vreinterpretq_s32_u64): Likewise. (vreinterpretq_s32_u8): Likewise. (vreinterpretq_s64_s16): Likewise. (vreinterpretq_s64_s32): Likewise. (vreinterpretq_s64_s8): Likewise. (vreinterpretq_s64_u16): Likewise. (vreinterpretq_s64_u32): Likewise. (vreinterpretq_s64_u64): Likewise. (vreinterpretq_s64_u8): Likewise. (vreinterpretq_s8_s16): Likewise. (vreinterpretq_s8_s32): Likewise. (vreinterpretq_s8_s64): Likewise. (vreinterpretq_s8_u16): Likewise. (vreinterpretq_s8_u32): Likewise. (vreinterpretq_s8_u64): Likewise. (vreinterpretq_s8_u8): Likewise. (vreinterpretq_u16_s16): Likewise. (vreinterpretq_u16_s32): Likewise. (vreinterpretq_u16_s64): Likewise. (vreinterpretq_u16_s8): Likewise. (vreinterpretq_u16_u32): Likewise. (vreinterpretq_u16_u64): Likewise. (vreinterpretq_u16_u8): Likewise. (vreinterpretq_u32_s16): Likewise. (vreinterpretq_u32_s32): Likewise. (vreinterpretq_u32_s64): Likewise. (vreinterpretq_u32_s8): Likewise. (vreinterpretq_u32_u16): Likewise. (vreinterpretq_u32_u64): Likewise. (vreinterpretq_u32_u8): Likewise. (vreinterpretq_u64_s16): Likewise. (vreinterpretq_u64_s32): Likewise. (vreinterpretq_u64_s64): Likewise. (vreinterpretq_u64_s8): Likewise. (vreinterpretq_u64_u16): Likewise. (vreinterpretq_u64_u32): Likewise. (vreinterpretq_u64_u8): Likewise. (vreinterpretq_u8_s16): Likewise
[PATCH v2][ARM][GCC][1/8x]: MVE ACLE vidup, vddup, viwdup and vdwdup intrinsics with writeback.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534341.html Hello, This patch supports following MVE ACLE intrinsics with writeback. vddupq_m_n_u8, vddupq_m_n_u32, vddupq_m_n_u16, vddupq_m_wb_u8, vddupq_m_wb_u16, vddupq_m_wb_u32, vddupq_n_u8, vddupq_n_u32, vddupq_n_u16, vddupq_wb_u8, vddupq_wb_u16, vddupq_wb_u32, vdwdupq_m_n_u8, vdwdupq_m_n_u32, vdwdupq_m_n_u16, vdwdupq_m_wb_u8, vdwdupq_m_wb_u32, vdwdupq_m_wb_u16, vdwdupq_n_u8, vdwdupq_n_u32, vdwdupq_n_u16, vdwdupq_wb_u8, vdwdupq_wb_u32, vdwdupq_wb_u16, vidupq_m_n_u8, vidupq_m_n_u32, vidupq_m_n_u16, vidupq_m_wb_u8, vidupq_m_wb_u16, vidupq_m_wb_u32, vidupq_n_u8, vidupq_n_u32, vidupq_n_u16, vidupq_wb_u8, vidupq_wb_u16, vidupq_wb_u32, viwdupq_m_n_u8, viwdupq_m_n_u32, viwdupq_m_n_u16, viwdupq_m_wb_u8, viwdupq_m_wb_u32, viwdupq_m_wb_u16, viwdupq_n_u8, viwdupq_n_u32, viwdupq_n_u16, viwdupq_wb_u8, viwdupq_wb_u32, viwdupq_wb_u16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-20 Srinath Parvathaneni Andre Vieira Mihail Ionescu * config/arm/arm-builtins.c (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Define quinary builtin qualifier. * config/arm/arm_mve.h (vddupq_m_n_u8): Define macro. (vddupq_m_n_u32): Likewise. (vddupq_m_n_u16): Likewise. (vddupq_m_wb_u8): Likewise. (vddupq_m_wb_u16): Likewise. (vddupq_m_wb_u32): Likewise. (vddupq_n_u8): Likewise. (vddupq_n_u32): Likewise. (vddupq_n_u16): Likewise. (vddupq_wb_u8): Likewise. (vddupq_wb_u16): Likewise. (vddupq_wb_u32): Likewise. (vdwdupq_m_n_u8): Likewise. (vdwdupq_m_n_u32): Likewise. (vdwdupq_m_n_u16): Likewise. (vdwdupq_m_wb_u8): Likewise. (vdwdupq_m_wb_u32): Likewise. (vdwdupq_m_wb_u16): Likewise. (vdwdupq_n_u8): Likewise. (vdwdupq_n_u32): Likewise. (vdwdupq_n_u16): Likewise. (vdwdupq_wb_u8): Likewise. (vdwdupq_wb_u32): Likewise. (vdwdupq_wb_u16): Likewise. (vidupq_m_n_u8): Likewise. (vidupq_m_n_u32): Likewise. (vidupq_m_n_u16): Likewise. (vidupq_m_wb_u8): Likewise. (vidupq_m_wb_u16): Likewise. (vidupq_m_wb_u32): Likewise. (vidupq_n_u8): Likewise. (vidupq_n_u32): Likewise. (vidupq_n_u16): Likewise. (vidupq_wb_u8): Likewise. (vidupq_wb_u16): Likewise. (vidupq_wb_u32): Likewise. (viwdupq_m_n_u8): Likewise. (viwdupq_m_n_u32): Likewise. (viwdupq_m_n_u16): Likewise. (viwdupq_m_wb_u8): Likewise. (viwdupq_m_wb_u32): Likewise. (viwdupq_m_wb_u16): Likewise. (viwdupq_n_u8): Likewise. (viwdupq_n_u32): Likewise. (viwdupq_n_u16): Likewise. (viwdupq_wb_u8): Likewise. (viwdupq_wb_u32): Likewise. (viwdupq_wb_u16): Likewise. (__arm_vddupq_m_n_u8): Define intrinsic. (__arm_vddupq_m_n_u32): Likewise. (__arm_vddupq_m_n_u16): Likewise. (__arm_vddupq_m_wb_u8): Likewise. (__arm_vddupq_m_wb_u16): Likewise. (__arm_vddupq_m_wb_u32): Likewise. (__arm_vddupq_n_u8): Likewise. (__arm_vddupq_n_u32): Likewise. (__arm_vddupq_n_u16): Likewise. (__arm_vdwdupq_m_n_u8): Likewise. (__arm_vdwdupq_m_n_u32): Likewise. (__arm_vdwdupq_m_n_u16): Likewise. (__arm_vdwdupq_m_wb_u8): Likewise. (__arm_vdwdupq_m_wb_u32): Likewise. (__arm_vdwdupq_m_wb_u16): Likewise. (__arm_vdwdupq_n_u8): Likewise. (__arm_vdwdupq_n_u32): Likewise. (__arm_vdwdupq_n_u16): Likewise. (__arm_vdwdupq_wb_u8): Likewise. (__arm_vdwdupq_wb_u32): Likewise. (__arm_vdwdupq_wb_u16): Likewise. (__arm_vidupq_m_n_u8): Likewise. (__arm_vidupq_m_n_u32): Likewise. (__arm_vidupq_m_n_u16): Likewise. (__arm_vidupq_n_u8): Likewise. (__arm_vidupq_m_wb_u8): Likewise. (__arm_vidupq_m_wb_u16): Likewise. (__arm_vidupq_m_wb_u32): Likewise. (__arm_vidupq_n_u32): Likewise. (__arm_vidupq_n_u16): Likewise. (__arm_vidupq_wb_u8): Likewise. (__arm_vidupq_wb_u16): Likewise. (__arm_vidupq_wb_u32): Likewise. (__arm_vddupq_wb_u8): Likewise. (__arm_vddupq_wb_u16): Likewise. (__arm_vddupq_wb_u32): Likewise. (__arm_viwdupq_m_n_u8): Likewise. (__arm_viwdupq_m_n_u32): Likewise. (__arm_viwdupq_m_n_u16): Likewise. (__arm_viwdupq_m_wb_u8): Likewise. (__arm_viwdupq_m_wb_u32): Likewise. (__arm_viwdupq_m_wb_u16
[PATCH v2][ARM][GCC][2/8x]: MVE ACLE gather load and scatter store intrinsics with writeback.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534357.html Hello, This patch supports following MVE ACLE intrinsics with writeback. vldrdq_gather_base_wb_s64, vldrdq_gather_base_wb_u64, vldrdq_gather_base_wb_z_s64, vldrdq_gather_base_wb_z_u64, vldrwq_gather_base_wb_f32, vldrwq_gather_base_wb_s32, vldrwq_gather_base_wb_u32, vldrwq_gather_base_wb_z_f32, vldrwq_gather_base_wb_z_s32, vldrwq_gather_base_wb_z_u32, vstrdq_scatter_base_wb_p_s64, vstrdq_scatter_base_wb_p_u64, vstrdq_scatter_base_wb_s64, vstrdq_scatter_base_wb_u64, vstrwq_scatter_base_wb_p_s32, vstrwq_scatter_base_wb_p_f32, vstrwq_scatter_base_wb_p_u32, vstrwq_scatter_base_wb_s32, vstrwq_scatter_base_wb_u32, vstrwq_scatter_base_wb_f32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-20 Srinath Parvathaneni Andre Vieira Mihail Ionescu * config/arm/arm-builtins.c (LDRGBWBS_QUALIFIERS): Define builtin qualifier. (LDRGBWBU_QUALIFIERS): Likewise. (LDRGBWBS_Z_QUALIFIERS): Likewise. (LDRGBWBU_Z_QUALIFIERS): Likewise. (STRSBWBS_QUALIFIERS): Likewise. (STRSBWBU_QUALIFIERS): Likewise. (STRSBWBS_P_QUALIFIERS): Likewise. (STRSBWBU_P_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vldrdq_gather_base_wb_s64): Define macro. (vldrdq_gather_base_wb_u64): Likewise. (vldrdq_gather_base_wb_z_s64): Likewise. (vldrdq_gather_base_wb_z_u64): Likewise. (vldrwq_gather_base_wb_f32): Likewise. (vldrwq_gather_base_wb_s32): Likewise. (vldrwq_gather_base_wb_u32): Likewise. (vldrwq_gather_base_wb_z_f32): Likewise. (vldrwq_gather_base_wb_z_s32): Likewise. (vldrwq_gather_base_wb_z_u32): Likewise. (vstrdq_scatter_base_wb_p_s64): Likewise. (vstrdq_scatter_base_wb_p_u64): Likewise. (vstrdq_scatter_base_wb_s64): Likewise. (vstrdq_scatter_base_wb_u64): Likewise. (vstrwq_scatter_base_wb_p_s32): Likewise. (vstrwq_scatter_base_wb_p_f32): Likewise. (vstrwq_scatter_base_wb_p_u32): Likewise. (vstrwq_scatter_base_wb_s32): Likewise. (vstrwq_scatter_base_wb_u32): Likewise. (vstrwq_scatter_base_wb_f32): Likewise. (__arm_vldrdq_gather_base_wb_s64): Define intrinsic. (__arm_vldrdq_gather_base_wb_u64): Likewise. (__arm_vldrdq_gather_base_wb_z_s64): Likewise. (__arm_vldrdq_gather_base_wb_z_u64): Likewise. (__arm_vldrwq_gather_base_wb_s32): Likewise. (__arm_vldrwq_gather_base_wb_u32): Likewise. (__arm_vldrwq_gather_base_wb_z_s32): Likewise. (__arm_vldrwq_gather_base_wb_z_u32): Likewise. (__arm_vstrdq_scatter_base_wb_s64): Likewise. (__arm_vstrdq_scatter_base_wb_u64): Likewise. (__arm_vstrdq_scatter_base_wb_p_s64): Likewise. (__arm_vstrdq_scatter_base_wb_p_u64): Likewise. (__arm_vstrwq_scatter_base_wb_p_s32): Likewise. (__arm_vstrwq_scatter_base_wb_p_u32): Likewise. (__arm_vstrwq_scatter_base_wb_s32): Likewise. (__arm_vstrwq_scatter_base_wb_u32): Likewise. (__arm_vldrwq_gather_base_wb_f32): Likewise. (__arm_vldrwq_gather_base_wb_z_f32): Likewise. (__arm_vstrwq_scatter_base_wb_f32): Likewise. (__arm_vstrwq_scatter_base_wb_p_f32): Likewise. (vstrwq_scatter_base_wb): Define polymorphic variant. (vstrwq_scatter_base_wb_p): Likewise. (vstrdq_scatter_base_wb_p): Likewise. (vstrdq_scatter_base_wb): Likewise. * config/arm/arm_mve_builtins.def (LDRGBWBS_QUALIFIERS): Use builtin qualifier. * config/arm/mve.md (mve_vstrwq_scatter_base_wb_v4si): Define RTL pattern. (mve_vstrwq_scatter_base_wb_add_v4si): Likewise. (mve_vstrwq_scatter_base_wb_v4si_insn): Likewise. (mve_vstrwq_scatter_base_wb_p_v4si): Likewise. (mve_vstrwq_scatter_base_wb_p_add_v4si): Likewise. (mve_vstrwq_scatter_base_wb_p_v4si_insn): Likewise. (mve_vstrwq_scatter_base_wb_fv4sf): Likewise. (mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise. (mve_vstrwq_scatter_base_wb_fv4sf_insn): Likewise. (mve_vstrwq_scatter_base_wb_p_fv4sf): Likewise. (mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise. (mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Likewise. (mve_vstrdq_scatter_base_wb_v2di): Likewise. (mve_vstrdq_scatter_base_wb_add_v2di): Likewise. (mve_vstrdq_scatter_base_wb_v2di_insn): Likewise. (mve_vstrdq_scatter_base_wb_p_v2di): Likewise. (mve_vstrdq_scatter_base_wb_p_add_v2di): Likewise
[PATCH v2][ARM][GCC][9x]: MVE ACLE predicated intrinsics with (dont-care) variant.
, vrev64q_x_s8, vrev64q_x_u16, vrev64q_x_u32, vrev64q_x_u8, vrhaddq_x_s16, vrhaddq_x_s32, vrhaddq_x_s8, vrhaddq_x_u16, vrhaddq_x_u32, vrhaddq_x_u8, vrmulhq_x_s16, vrmulhq_x_s32, vrmulhq_x_s8, vrmulhq_x_u16, vrmulhq_x_u32, vrmulhq_x_u8, vrndaq_x_f16, vrndaq_x_f32, vrndmq_x_f16, vrndmq_x_f32, vrndnq_x_f16, vrndnq_x_f32, vrndpq_x_f16, vrndpq_x_f32, vrndq_x_f16, vrndq_x_f32, vrndxq_x_f16, vrndxq_x_f32, vrshlq_x_s16, vrshlq_x_s32, vrshlq_x_s8, vrshlq_x_u16, vrshlq_x_u32, vrshlq_x_u8, vrshrq_x_n_s16, vrshrq_x_n_s32, vrshrq_x_n_s8, vrshrq_x_n_u16, vrshrq_x_n_u32, vrshrq_x_n_u8, vshllbq_x_n_s16, vshllbq_x_n_s8, vshllbq_x_n_u16, vshllbq_x_n_u8, vshlltq_x_n_s16, vshlltq_x_n_s8, vshlltq_x_n_u16, vshlltq_x_n_u8, vshlq_x_n_s16, vshlq_x_n_s32, vshlq_x_n_s8, vshlq_x_n_u16, vshlq_x_n_u32, vshlq_x_n_u8, vshlq_x_s16, vshlq_x_s32, vshlq_x_s8, vshlq_x_u16, vshlq_x_u32, vshlq_x_u8, vshrq_x_n_s16, vshrq_x_n_s32, vshrq_x_n_s8, vshrq_x_n_u16, vshrq_x_n_u32, vshrq_x_n_u8, vsubq_x_f16, vsubq_x_f32, vsubq_x_n_f16, vsubq_x_n_f32, vsubq_x_n_s16, vsubq_x_n_s32, vsubq_x_n_s8, vsubq_x_n_u16, vsubq_x_n_u32, vsubq_x_n_u8, vsubq_x_s16, vsubq_x_s32, vsubq_x_s8, vsubq_x_u16, vsubq_x_u32, vsubq_x_u8. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-20 Srinath Parvathaneni * config/arm/arm_mve.h (vddupq_x_n_u8): Define macro. (vddupq_x_n_u16): Likewise. (vddupq_x_n_u32): Likewise. (vddupq_x_wb_u8): Likewise. (vddupq_x_wb_u16): Likewise. (vddupq_x_wb_u32): Likewise. (vdwdupq_x_n_u8): Likewise. (vdwdupq_x_n_u16): Likewise. (vdwdupq_x_n_u32): Likewise. (vdwdupq_x_wb_u8): Likewise. (vdwdupq_x_wb_u16): Likewise. (vdwdupq_x_wb_u32): Likewise. (vidupq_x_n_u8): Likewise. (vidupq_x_n_u16): Likewise. (vidupq_x_n_u32): Likewise. (vidupq_x_wb_u8): Likewise. (vidupq_x_wb_u16): Likewise. (vidupq_x_wb_u32): Likewise. (viwdupq_x_n_u8): Likewise. (viwdupq_x_n_u16): Likewise. (viwdupq_x_n_u32): Likewise. (viwdupq_x_wb_u8): Likewise. (viwdupq_x_wb_u16): Likewise. (viwdupq_x_wb_u32): Likewise. (vdupq_x_n_s8): Likewise. (vdupq_x_n_s16): Likewise. (vdupq_x_n_s32): Likewise. (vdupq_x_n_u8): Likewise. (vdupq_x_n_u16): Likewise. (vdupq_x_n_u32): Likewise. (vminq_x_s8): Likewise. (vminq_x_s16): Likewise. (vminq_x_s32): Likewise. (vminq_x_u8): Likewise. (vminq_x_u16): Likewise. (vminq_x_u32): Likewise. (vmaxq_x_s8): Likewise. (vmaxq_x_s16): Likewise. (vmaxq_x_s32): Likewise. (vmaxq_x_u8): Likewise. (vmaxq_x_u16): Likewise. (vmaxq_x_u32): Likewise. (vabdq_x_s8): Likewise. (vabdq_x_s16): Likewise. (vabdq_x_s32): Likewise. (vabdq_x_u8): Likewise. (vabdq_x_u16): Likewise. (vabdq_x_u32): Likewise. (vabsq_x_s8): Likewise. (vabsq_x_s16): Likewise. (vabsq_x_s32): Likewise. (vaddq_x_s8): Likewise. (vaddq_x_s16): Likewise. (vaddq_x_s32): Likewise. (vaddq_x_n_s8): Likewise. (vaddq_x_n_s16): Likewise. (vaddq_x_n_s32): Likewise. (vaddq_x_u8): Likewise. (vaddq_x_u16): Likewise. (vaddq_x_u32): Likewise. (vaddq_x_n_u8): Likewise. (vaddq_x_n_u16): Likewise. (vaddq_x_n_u32): Likewise. (vclsq_x_s8): Likewise. (vclsq_x_s16): Likewise. (vclsq_x_s32): Likewise. (vclzq_x_s8): Likewise. (vclzq_x_s16): Likewise. (vclzq_x_s32): Likewise. (vclzq_x_u8): Likewise. (vclzq_x_u16): Likewise. (vclzq_x_u32): Likewise. (vnegq_x_s8): Likewise. (vnegq_x_s16): Likewise. (vnegq_x_s32): Likewise. (vmulhq_x_s8): Likewise. (vmulhq_x_s16): Likewise. (vmulhq_x_s32): Likewise. (vmulhq_x_u8): Likewise. (vmulhq_x_u16): Likewise. (vmulhq_x_u32): Likewise. (vmullbq_poly_x_p8): Likewise. (vmullbq_poly_x_p16): Likewise. (vmullbq_int_x_s8): Likewise. (vmullbq_int_x_s16): Likewise. (vmullbq_int_x_s32): Likewise. (vmullbq_int_x_u8): Likewise. (vmullbq_int_x_u16): Likewise. (vmullbq_int_x_u32): Likewise. (vmulltq_poly_x_p8): Likewise. (vmulltq_poly_x_p16): Likewise. (vmulltq_int_x_s8): Likewise. (vmulltq_int_x_s16): Likewise. (vmulltq_int_x_s32): Likewise. (vmulltq_int_x_u8): Likewise. (vmulltq_int_x_u16): Likewise. (vmulltq_int_x_u32): Likewise. (vmulq_x_s8): Likewise. (vmulq_x_s16
[PATCH v3][ARM][GCC][9x]: MVE ACLE predicated intrinsics with (dont-care) variant.
, vrev64q_x_s8, vrev64q_x_u16, vrev64q_x_u32, vrev64q_x_u8, vrhaddq_x_s16, vrhaddq_x_s32, vrhaddq_x_s8, vrhaddq_x_u16, vrhaddq_x_u32, vrhaddq_x_u8, vrmulhq_x_s16, vrmulhq_x_s32, vrmulhq_x_s8, vrmulhq_x_u16, vrmulhq_x_u32, vrmulhq_x_u8, vrndaq_x_f16, vrndaq_x_f32, vrndmq_x_f16, vrndmq_x_f32, vrndnq_x_f16, vrndnq_x_f32, vrndpq_x_f16, vrndpq_x_f32, vrndq_x_f16, vrndq_x_f32, vrndxq_x_f16, vrndxq_x_f32, vrshlq_x_s16, vrshlq_x_s32, vrshlq_x_s8, vrshlq_x_u16, vrshlq_x_u32, vrshlq_x_u8, vrshrq_x_n_s16, vrshrq_x_n_s32, vrshrq_x_n_s8, vrshrq_x_n_u16, vrshrq_x_n_u32, vrshrq_x_n_u8, vshllbq_x_n_s16, vshllbq_x_n_s8, vshllbq_x_n_u16, vshllbq_x_n_u8, vshlltq_x_n_s16, vshlltq_x_n_s8, vshlltq_x_n_u16, vshlltq_x_n_u8, vshlq_x_n_s16, vshlq_x_n_s32, vshlq_x_n_s8, vshlq_x_n_u16, vshlq_x_n_u32, vshlq_x_n_u8, vshlq_x_s16, vshlq_x_s32, vshlq_x_s8, vshlq_x_u16, vshlq_x_u32, vshlq_x_u8, vshrq_x_n_s16, vshrq_x_n_s32, vshrq_x_n_s8, vshrq_x_n_u16, vshrq_x_n_u32, vshrq_x_n_u8, vsubq_x_f16, vsubq_x_f32, vsubq_x_n_f16, vsubq_x_n_f32, vsubq_x_n_s16, vsubq_x_n_s32, vsubq_x_n_s8, vsubq_x_n_u16, vsubq_x_n_u32, vsubq_x_n_u8, vsubq_x_s16, vsubq_x_s32, vsubq_x_s8, vsubq_x_u16, vsubq_x_u32, vsubq_x_u8. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-20 Srinath Parvathaneni * config/arm/arm_mve.h (vddupq_x_n_u8): Define macro. (vddupq_x_n_u16): Likewise. (vddupq_x_n_u32): Likewise. (vddupq_x_wb_u8): Likewise. (vddupq_x_wb_u16): Likewise. (vddupq_x_wb_u32): Likewise. (vdwdupq_x_n_u8): Likewise. (vdwdupq_x_n_u16): Likewise. (vdwdupq_x_n_u32): Likewise. (vdwdupq_x_wb_u8): Likewise. (vdwdupq_x_wb_u16): Likewise. (vdwdupq_x_wb_u32): Likewise. (vidupq_x_n_u8): Likewise. (vidupq_x_n_u16): Likewise. (vidupq_x_n_u32): Likewise. (vidupq_x_wb_u8): Likewise. (vidupq_x_wb_u16): Likewise. (vidupq_x_wb_u32): Likewise. (viwdupq_x_n_u8): Likewise. (viwdupq_x_n_u16): Likewise. (viwdupq_x_n_u32): Likewise. (viwdupq_x_wb_u8): Likewise. (viwdupq_x_wb_u16): Likewise. (viwdupq_x_wb_u32): Likewise. (vdupq_x_n_s8): Likewise. (vdupq_x_n_s16): Likewise. (vdupq_x_n_s32): Likewise. (vdupq_x_n_u8): Likewise. (vdupq_x_n_u16): Likewise. (vdupq_x_n_u32): Likewise. (vminq_x_s8): Likewise. (vminq_x_s16): Likewise. (vminq_x_s32): Likewise. (vminq_x_u8): Likewise. (vminq_x_u16): Likewise. (vminq_x_u32): Likewise. (vmaxq_x_s8): Likewise. (vmaxq_x_s16): Likewise. (vmaxq_x_s32): Likewise. (vmaxq_x_u8): Likewise. (vmaxq_x_u16): Likewise. (vmaxq_x_u32): Likewise. (vabdq_x_s8): Likewise. (vabdq_x_s16): Likewise. (vabdq_x_s32): Likewise. (vabdq_x_u8): Likewise. (vabdq_x_u16): Likewise. (vabdq_x_u32): Likewise. (vabsq_x_s8): Likewise. (vabsq_x_s16): Likewise. (vabsq_x_s32): Likewise. (vaddq_x_s8): Likewise. (vaddq_x_s16): Likewise. (vaddq_x_s32): Likewise. (vaddq_x_n_s8): Likewise. (vaddq_x_n_s16): Likewise. (vaddq_x_n_s32): Likewise. (vaddq_x_u8): Likewise. (vaddq_x_u16): Likewise. (vaddq_x_u32): Likewise. (vaddq_x_n_u8): Likewise. (vaddq_x_n_u16): Likewise. (vaddq_x_n_u32): Likewise. (vclsq_x_s8): Likewise. (vclsq_x_s16): Likewise. (vclsq_x_s32): Likewise. (vclzq_x_s8): Likewise. (vclzq_x_s16): Likewise. (vclzq_x_s32): Likewise. (vclzq_x_u8): Likewise. (vclzq_x_u16): Likewise. (vclzq_x_u32): Likewise. (vnegq_x_s8): Likewise. (vnegq_x_s16): Likewise. (vnegq_x_s32): Likewise. (vmulhq_x_s8): Likewise. (vmulhq_x_s16): Likewise. (vmulhq_x_s32): Likewise. (vmulhq_x_u8): Likewise. (vmulhq_x_u16): Likewise. (vmulhq_x_u32): Likewise. (vmullbq_poly_x_p8): Likewise. (vmullbq_poly_x_p16): Likewise. (vmullbq_int_x_s8): Likewise. (vmullbq_int_x_s16): Likewise. (vmullbq_int_x_s32): Likewise. (vmullbq_int_x_u8): Likewise. (vmullbq_int_x_u16): Likewise. (vmullbq_int_x_u32): Likewise. (vmulltq_poly_x_p8): Likewise. (vmulltq_poly_x_p16): Likewise. (vmulltq_int_x_s8): Likewise. (vmulltq_int_x_s16): Likewise. (vmulltq_int_x_s32): Likewise. (vmulltq_int_x_u8): Likewise. (vmulltq_int_x_u16): Likewise. (vmulltq_int_x_u32): Likewise. (vmulq_x_s8): Likewise. (vmulq_x_s16
[PATCH v2][ARM][GCC][10x]: MVE ACLE intrinsics "add with carry across beats" and "beat-wise substract".
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534348.html Hello, This patch supports following MVE ACLE "add with carry across beats" intrinsics and "beat-wise substract" intrinsics. vadciq_s32, vadciq_u32, vadciq_m_s32, vadciq_m_u32, vadcq_s32, vadcq_u32, vadcq_m_s32, vadcq_m_u32, vsbciq_s32, vsbciq_u32, vsbciq_m_s32, vsbciq_m_u32, vsbcq_s32, vsbcq_u32, vsbcq_m_s32, vsbcq_m_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-20 Srinath Parvathaneni Andre Vieira Mihail Ionescu * config/arm/arm-builtins.c (ARM_BUILTIN_GET_FPSCR_NZCVQC): Define. (ARM_BUILTIN_SET_FPSCR_NZCVQC): Likewise. (arm_init_mve_builtins): Add "__builtin_arm_get_fpscr_nzcvqc" and "__builtin_arm_set_fpscr_nzcvqc" to arm_builtin_decls array. (arm_expand_builtin): Define case ARM_BUILTIN_GET_FPSCR_NZCVQC and ARM_BUILTIN_SET_FPSCR_NZCVQC. * config/arm/arm_mve.h (vadciq_s32): Define macro. (vadciq_u32): Likewise. (vadciq_m_s32): Likewise. (vadciq_m_u32): Likewise. (vadcq_s32): Likewise. (vadcq_u32): Likewise. (vadcq_m_s32): Likewise. (vadcq_m_u32): Likewise. (vsbciq_s32): Likewise. (vsbciq_u32): Likewise. (vsbciq_m_s32): Likewise. (vsbciq_m_u32): Likewise. (vsbcq_s32): Likewise. (vsbcq_u32): Likewise. (vsbcq_m_s32): Likewise. (vsbcq_m_u32): Likewise. (__arm_vadciq_s32): Define intrinsic. (__arm_vadciq_u32): Likewise. (__arm_vadciq_m_s32): Likewise. (__arm_vadciq_m_u32): Likewise. (__arm_vadcq_s32): Likewise. (__arm_vadcq_u32): Likewise. (__arm_vadcq_m_s32): Likewise. (__arm_vadcq_m_u32): Likewise. (__arm_vsbciq_s32): Likewise. (__arm_vsbciq_u32): Likewise. (__arm_vsbciq_m_s32): Likewise. (__arm_vsbciq_m_u32): Likewise. (__arm_vsbcq_s32): Likewise. (__arm_vsbcq_u32): Likewise. (__arm_vsbcq_m_s32): Likewise. (__arm_vsbcq_m_u32): Likewise. (vadciq_m): Define polymorphic variant. (vadciq): Likewise. (vadcq_m): Likewise. (vadcq): Likewise. (vsbciq_m): Likewise. (vsbciq): Likewise. (vsbcq_m): Likewise. (vsbcq): Likewise. * config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_NONE): Use builtin qualifier. (BINOP_UNONE_UNONE_UNONE): Likewise. (QUADOP_NONE_NONE_NONE_NONE_UNONE): Likewise. (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE): Likewise. * config/arm/mve.md (VADCIQ): Define iterator. (VADCIQ_M): Likewise. (VSBCQ): Likewise. (VSBCQ_M): Likewise. (VSBCIQ): Likewise. (VSBCIQ_M): Likewise. (VADCQ): Likewise. (VADCQ_M): Likewise. (mve_vadciq_m_v4si): Define RTL pattern. (mve_vadciq_v4si): Likewise. (mve_vadcq_m_v4si): Likewise. (mve_vadcq_v4si): Likewise. (mve_vsbciq_m_v4si): Likewise. (mve_vsbciq_v4si): Likewise. (mve_vsbcq_m_v4si): Likewise. (mve_vsbcq_v4si): Likewise. (get_fpscr_nzcvqc): Define isns. (set_fpscr_nzcvqc): Define isns. * config/arm/unspecs.md (UNSPEC_GET_FPSCR_NZCVQC): Define. (UNSPEC_SET_FPSCR_NZCVQC): Define. gcc/testsuite/ChangeLog: 2020-03-20 Srinath Parvathaneni Andre Vieira Mihail Ionescu * gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: New test. * gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vadciq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vadciq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vadcq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vadcq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbciq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbciq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbcq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsbcq_u32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins
[PATCH v2][ARM][GCC][11x]: MVE ACLE vector interleaving store and deinterleaving load intrinsics and also aliases to vstr and vldr intrinsics.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534347.html Hello, This patch supports following MVE ACLE intrinsics which are aliases of vstr and vldr intrinsics. vst1q_p_u8, vst1q_p_s8, vld1q_z_u8, vld1q_z_s8, vst1q_p_u16, vst1q_p_s16, vld1q_z_u16, vld1q_z_s16, vst1q_p_u32, vst1q_p_s32, vld1q_z_u32, vld1q_z_s32, vld1q_z_f16, vst1q_p_f16, vld1q_z_f32, vst1q_p_f32. This patch also supports following MVE ACLE vector deinterleaving loads and vector interleaving stores. vst2q_s8, vst2q_u8, vld2q_s8, vld2q_u8, vld4q_s8, vld4q_u8, vst2q_s16, vst2q_u16, vld2q_s16, vld2q_u16, vld4q_s16, vld4q_u16, vst2q_s32, vst2q_u32, vld2q_s32, vld2q_u32, vld4q_s32, vld4q_u32, vld4q_f16, vld2q_f16, vst2q_f16, vld4q_f32, vld2q_f32, vst2q_f32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-20 Srinath Parvathaneni Andre Vieira Mihail Ionescu * config/arm/arm_mve.h (vst1q_p_u8): Define macro. (vst1q_p_s8): Likewise. (vst2q_s8): Likewise. (vst2q_u8): Likewise. (vld1q_z_u8): Likewise. (vld1q_z_s8): Likewise. (vld2q_s8): Likewise. (vld2q_u8): Likewise. (vld4q_s8): Likewise. (vld4q_u8): Likewise. (vst1q_p_u16): Likewise. (vst1q_p_s16): Likewise. (vst2q_s16): Likewise. (vst2q_u16): Likewise. (vld1q_z_u16): Likewise. (vld1q_z_s16): Likewise. (vld2q_s16): Likewise. (vld2q_u16): Likewise. (vld4q_s16): Likewise. (vld4q_u16): Likewise. (vst1q_p_u32): Likewise. (vst1q_p_s32): Likewise. (vst2q_s32): Likewise. (vst2q_u32): Likewise. (vld1q_z_u32): Likewise. (vld1q_z_s32): Likewise. (vld2q_s32): Likewise. (vld2q_u32): Likewise. (vld4q_s32): Likewise. (vld4q_u32): Likewise. (vld4q_f16): Likewise. (vld2q_f16): Likewise. (vld1q_z_f16): Likewise. (vst2q_f16): Likewise. (vst1q_p_f16): Likewise. (vld4q_f32): Likewise. (vld2q_f32): Likewise. (vld1q_z_f32): Likewise. (vst2q_f32): Likewise. (vst1q_p_f32): Likewise. (__arm_vst1q_p_u8): Define intrinsic. (__arm_vst1q_p_s8): Likewise. (__arm_vst2q_s8): Likewise. (__arm_vst2q_u8): Likewise. (__arm_vld1q_z_u8): Likewise. (__arm_vld1q_z_s8): Likewise. (__arm_vld2q_s8): Likewise. (__arm_vld2q_u8): Likewise. (__arm_vld4q_s8): Likewise. (__arm_vld4q_u8): Likewise. (__arm_vst1q_p_u16): Likewise. (__arm_vst1q_p_s16): Likewise. (__arm_vst2q_s16): Likewise. (__arm_vst2q_u16): Likewise. (__arm_vld1q_z_u16): Likewise. (__arm_vld1q_z_s16): Likewise. (__arm_vld2q_s16): Likewise. (__arm_vld2q_u16): Likewise. (__arm_vld4q_s16): Likewise. (__arm_vld4q_u16): Likewise. (__arm_vst1q_p_u32): Likewise. (__arm_vst1q_p_s32): Likewise. (__arm_vst2q_s32): Likewise. (__arm_vst2q_u32): Likewise. (__arm_vld1q_z_u32): Likewise. (__arm_vld1q_z_s32): Likewise. (__arm_vld2q_s32): Likewise. (__arm_vld2q_u32): Likewise. (__arm_vld4q_s32): Likewise. (__arm_vld4q_u32): Likewise. (__arm_vld4q_f16): Likewise. (__arm_vld2q_f16): Likewise. (__arm_vld1q_z_f16): Likewise. (__arm_vst2q_f16): Likewise. (__arm_vst1q_p_f16): Likewise. (__arm_vld4q_f32): Likewise. (__arm_vld2q_f32): Likewise. (__arm_vld1q_z_f32): Likewise. (__arm_vst2q_f32): Likewise. (__arm_vst1q_p_f32): Likewise. (vld1q_z): Define polymorphic variant. (vld2q): Likewise. (vld4q): Likewise. (vst1q_p): Likewise. (vst2q): Likewise. * config/arm/arm_mve_builtins.def (STORE1): Use builtin qualifier. (LOAD1): Likewise. * config/arm/mve.md (mve_vst2q): Define RTL pattern. (mve_vld2q): Likewise. (mve_vld4q): Likewise. gcc/testsuite/ChangeLog: 2020-03-20 Srinath Parvathaneni Andre Vieira Mihail Ionescu * gcc.target/arm/mve/intrinsics/vld1q_z_f16.c: New test. * gcc.target/arm/mve/intrinsics/vld1q_z_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_u32.c: Likewise. * gcc.target/arm/mve
[PATCH v2][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534346.html Hello, This patch supports following MVE ACLE intrinsics to get and set vector lane. vsetq_lane_f16, vsetq_lane_f32, vsetq_lane_s16, vsetq_lane_s32, vsetq_lane_s8, vsetq_lane_s64, vsetq_lane_u8, vsetq_lane_u16, vsetq_lane_u32, vsetq_lane_u64, vgetq_lane_f16, vgetq_lane_f32, vgetq_lane_s16, vgetq_lane_s32, vgetq_lane_s8, vgetq_lane_s64, vgetq_lane_u8, vgetq_lane_u16, vgetq_lane_u32, vgetq_lane_u64. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-08 Srinath Parvathaneni Andre Vieira Mihail Ionescu * config/arm/arm_mve.h (vsetq_lane_f16): Define macro. (vsetq_lane_f32): Likewise. (vsetq_lane_s16): Likewise. (vsetq_lane_s32): Likewise. (vsetq_lane_s8): Likewise. (vsetq_lane_s64): Likewise. (vsetq_lane_u8): Likewise. (vsetq_lane_u16): Likewise. (vsetq_lane_u32): Likewise. (vsetq_lane_u64): Likewise. (vgetq_lane_f16): Likewise. (vgetq_lane_f32): Likewise. (vgetq_lane_s16): Likewise. (vgetq_lane_s32): Likewise. (vgetq_lane_s8): Likewise. (vgetq_lane_s64): Likewise. (vgetq_lane_u8): Likewise. (vgetq_lane_u16): Likewise. (vgetq_lane_u32): Likewise. (vgetq_lane_u64): Likewise. (__ARM_NUM_LANES): Likewise. (__ARM_LANEQ): Likewise. (__ARM_CHECK_LANEQ): Likewise. (__arm_vsetq_lane_s16): Define intrinsic. (__arm_vsetq_lane_s32): Likewise. (__arm_vsetq_lane_s8): Likewise. (__arm_vsetq_lane_s64): Likewise. (__arm_vsetq_lane_u8): Likewise. (__arm_vsetq_lane_u16): Likewise. (__arm_vsetq_lane_u32): Likewise. (__arm_vsetq_lane_u64): Likewise. (__arm_vgetq_lane_s16): Likewise. (__arm_vgetq_lane_s32): Likewise. (__arm_vgetq_lane_s8): Likewise. (__arm_vgetq_lane_s64): Likewise. (__arm_vgetq_lane_u8): Likewise. (__arm_vgetq_lane_u16): Likewise. (__arm_vgetq_lane_u32): Likewise. (__arm_vgetq_lane_u64): Likewise. (__arm_vsetq_lane_f16): Likewise. (__arm_vsetq_lane_f32): Likewise. (__arm_vgetq_lane_f16): Likewise. (__arm_vgetq_lane_f32): Likewise. (vgetq_lane): Define polymorphic variant. (vsetq_lane): Likewise. * config/arm/mve.md (mve_vec_extract): Define RTL pattern. (mve_vec_extractv2didi): Likewise. (mve_vec_extract_sext_internal): Likewise. (mve_vec_extract_zext_internal): Likewise. (mve_vec_set_internal): Likewise. (mve_vec_setv2di_internal): Likewise. * config/arm/neon.md (vec_set): Move RTL pattern to vec-common.md file. (vec_extract): Rename to "neon_vec_extract". (vec_extractv2didi): Rename to "neon_vec_extractv2didi". * config/arm/vec-common.md (vec_extract): Define RTL pattern common for MVE and NEON. (vec_set): Move RTL pattern from neon.md and modify to accept both MVE and NEON. gcc/testsuite/ChangeLog: 2019-11-08 Srinath Parvathaneni Andre Vieira Mihail Ionescu * gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c: New test. * gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c: Likewise. * gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise. * gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c: Likewise. ### Attachment also inlined for ease of reply##
[PATCH v2][ARM][GCC][13x]: MVE ACLE scalar shift intrinsics.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534350.html Hello, This patch supports following MVE ACLE scalar shift intrinsics. sqrshr, sqrshrl, sqrshrl_sat48, sqshl, sqshll, srshr, srshrl, uqrshl, uqrshll, uqrshll_sat48, uqshl, uqshll, urshr, urshrl, lsll, asrl. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-08 Srinath Parvathaneni * config/arm/arm-builtins.c (LSLL_QUALIFIERS): Define builtin qualifier. (UQSHL_QUALIFIERS): Likewise. (ASRL_QUALIFIERS): Likewise. (SQSHL_QUALIFIERS): Likewise. * config/arm/arm_mve.h (__ARM_BIG_ENDIAN): Check to not support MVE in Big-Endian Mode. (sqrshr): Define macro. (sqrshrl): Likewise. (sqrshrl_sat48): Likewise. (sqshl): Likewise. (sqshll): Likewise. (srshr): Likewise. (srshrl): Likewise. (uqrshl): Likewise. (uqrshll): Likewise. (uqrshll_sat48): Likewise. (uqshl): Likewise. (uqshll): Likewise. (urshr): Likewise. (urshrl): Likewise. (lsll): Likewise. (asrl): Likewise. (__arm_lsll): Define intrinsic. (__arm_asrl): Likewise. (__arm_uqrshll): Likewise. (__arm_uqrshll_sat48): Likewise. (__arm_sqrshrl): Likewise. (__arm_sqrshrl_sat48): Likewise. (__arm_uqshll): Likewise. (__arm_urshrl): Likewise. (__arm_srshrl): Likewise. (__arm_sqshll): Likewise. (__arm_uqrshl): Likewise. (__arm_sqrshr): Likewise. (__arm_uqshl): Likewise. (__arm_urshr): Likewise. (__arm_sqshl): Likewise. (__arm_srshr): Likewise. * config/arm/arm_mve_builtins.def (LSLL_QUALIFIERS): Use builtin qualifier. (UQSHL_QUALIFIERS): Likewise. (ASRL_QUALIFIERS): Likewise. (SQSHL_QUALIFIERS): Likewise. * config/arm/mve.md (mve_uqrshll_sat_di): Define RTL pattern. (mve_sqrshrl_sat_di): Likewise (mve_uqrshl_si): Likewise (mve_sqrshr_si): Likewise (mve_uqshll_di): Likewise (mve_urshrl_di): Likewise (mve_uqshl_si): Likewise (mve_urshr_si): Likewise (mve_sqshl_si): Likewise (mve_srshr_si): Likewise (mve_srshrl_di): Likewise (mve_sqshll_di): Likewise gcc/testsuite/ChangeLog: 2019-11-08 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/asrl.c: New test. * gcc.target/arm/mve/intrinsics/lsll.c: Likewise. * gcc.target/arm/mve/intrinsics/sqrshr.c: Likewise. * gcc.target/arm/mve/intrinsics/sqrshrl_sat48.c: Likewise. * gcc.target/arm/mve/intrinsics/sqrshrl_sat64.c: Likewise. * gcc.target/arm/mve/intrinsics/sqshl.c: Likewise. * gcc.target/arm/mve/intrinsics/sqshll.c: Likewise. * gcc.target/arm/mve/intrinsics/srshr.c: Likewise. * gcc.target/arm/mve/intrinsics/srshrl.c: Likewise. * gcc.target/arm/mve/intrinsics/uqrshl.c: Likewise. * gcc.target/arm/mve/intrinsics/uqrshll_sat48.c: Likewise. * gcc.target/arm/mve/intrinsics/uqrshll_sat64.c: Likewise. * gcc.target/arm/mve/intrinsics/uqshl.c: Likewise. * gcc.target/arm/mve/intrinsics/uqshll.c: Likewise. * gcc.target/arm/mve/intrinsics/urshr.c: Likewise. * gcc.target/arm/mve/intrinsics/urshrl.c: Likewise. * lib/target-supports.exp: (check_effective_target_arm_v8_1m_mve_fp_ok_nocache): Modify to not support MVE floating point in Big Endian mode. (check_effective_target_arm_v8_1m_mve_ok_nocache): Modify to not support MVE integer in Big Endian mode. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 96d8adcd37eb3caf51c71d66af0331f9d1924b92..56f0db21ea95dcd738877daba27f1cb60f0d5a32 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -762,6 +762,26 @@ arm_strsbwbu_p_qualifiers[SIMD_MAX_BUILTIN_ARGS] qualifier_unsigned, qualifier_unsigned}; #define STRSBWBU_P_QUALIFIERS (arm_strsbwbu_p_qualifiers) +static enum arm_type_qualifiers +arm_lsll_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_none}; +#define LSLL_QUALIFIERS (arm_lsll_qualifiers) + +static enum arm_type_qualifiers +arm_uqshl_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_const}; +#define UQSHL_QUALIFIERS (arm_uqshl_qualifiers) + +static enum arm_type_qualifiers +arm_asrl_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none
[PATCH v2][ARM][GCC][14x]: MVE ACLE whole vector left shift with carry intrinsics.
Hello Kyrill, Following patch is the rebased version of v1. (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534344.html Hello, This patch supports following MVE ACLE whole vector left shift with carry intrinsics. vshlcq_m_s8, vshlcq_m_s16, vshlcq_m_s32, vshlcq_m_u8, vshlcq_m_u16, vshlcq_m_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-08 Srinath Parvathaneni Andre Vieira Mihail Ionescu * config/arm/arm_mve.h (vshlcq_m_s8): Define macro. (vshlcq_m_u8): Likewise. (vshlcq_m_s16): Likewise. (vshlcq_m_u16): Likewise. (vshlcq_m_s32): Likewise. (vshlcq_m_u32): Likewise. (__arm_vshlcq_m_s8): Define intrinsic. (__arm_vshlcq_m_u8): Likewise. (__arm_vshlcq_m_s16): Likewise. (__arm_vshlcq_m_u16): Likewise. (__arm_vshlcq_m_s32): Likewise. (__arm_vshlcq_m_u32): Likewise. (vshlcq_m): Define polymorphic variant. * config/arm/arm_mve_builtins.def (QUADOP_NONE_NONE_UNONE_IMM_UNONE): Use builtin qualifier. (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE): Likewise. * config/arm/mve.md (mve_vshlcq_m_vec_): Define RTL pattern. (mve_vshlcq_m_carry_): Likewise. (mve_vshlcq_m_): Likewise. gcc/testsuite/ChangeLog: 2019-11-08 Srinath Parvathaneni Andre Vieira Mihail Ionescu * gcc.target/arm/mve/intrinsics/vshlcq_m_s16.c: New test. * gcc.target/arm/mve/intrinsics/vshlcq_m_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlcq_m_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlcq_m_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlcq_m_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlcq_m_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index f2d80ee636003ac58f70ddc25db15e129e228906..14b6ec857bffd85b67c781f554faffde1b2abc6b 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -2546,6 +2546,12 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t; #define urshrl(__p0, __p1) __arm_urshrl(__p0, __p1) #define lsll(__p0, __p1) __arm_lsll(__p0, __p1) #define asrl(__p0, __p1) __arm_asrl(__p0, __p1) +#define vshlcq_m_s8(__a, __b, __imm, __p) __arm_vshlcq_m_s8(__a, __b, __imm, __p) +#define vshlcq_m_u8(__a, __b, __imm, __p) __arm_vshlcq_m_u8(__a, __b, __imm, __p) +#define vshlcq_m_s16(__a, __b, __imm, __p) __arm_vshlcq_m_s16(__a, __b, __imm, __p) +#define vshlcq_m_u16(__a, __b, __imm, __p) __arm_vshlcq_m_u16(__a, __b, __imm, __p) +#define vshlcq_m_s32(__a, __b, __imm, __p) __arm_vshlcq_m_s32(__a, __b, __imm, __p) +#define vshlcq_m_u32(__a, __b, __imm, __p) __arm_vshlcq_m_u32(__a, __b, __imm, __p) #endif /* For big-endian, GCC's vector indices are reversed within each 64 bits @@ -16671,6 +16677,60 @@ __arm_srshr (int32_t value, const int shift) return __builtin_mve_srshr_si (value, shift); } +__extension__ extern __inline int8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vshlcq_m_s8 (int8x16_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) +{ + int8x16_t __res = __builtin_mve_vshlcq_m_vec_sv16qi (__a, *__b, __imm, __p); + *__b = __builtin_mve_vshlcq_m_carry_sv16qi (__a, *__b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint8x16_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vshlcq_m_u8 (uint8x16_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) +{ + uint8x16_t __res = __builtin_mve_vshlcq_m_vec_uv16qi (__a, *__b, __imm, __p); + *__b = __builtin_mve_vshlcq_m_carry_uv16qi (__a, *__b, __imm, __p); + return __res; +} + +__extension__ extern __inline int16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vshlcq_m_s16 (int16x8_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) +{ + int16x8_t __res = __builtin_mve_vshlcq_m_vec_sv8hi (__a, *__b, __imm, __p); + *__b = __builtin_mve_vshlcq_m_carry_sv8hi (__a, *__b, __imm, __p); + return __res; +} + +__extension__ extern __inline uint16x8_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vshlcq_m_u16 (uint16x8_t __a, uint32_t * __b, const int __imm, mve_pred16_t __p) +{ + uint16x8_t __res = __builtin_mve_vshlcq_m_vec_uv8hi (__a, *__b, __imm, __p); + *__b = __builtin_mve_vshlcq_m_carry_uv8hi (__a, *__b, __imm, __p); + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vshlcq_m_s32 (int32x4_t __a, uint32_t
[GCC][ARM][PATCH]: Add support for MVE ACLE intrinsics polymorphic variants for +mve.fp option.
) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-24 Srinath Parvathaneni * config/arm/arm_mve.h (vaddlvq): Move the polymorphic variant to the common section of both MVE Integer and MVE Floating Point. (vaddvq): Likewise. (vaddlvq_p): Likewise. (vaddvaq): Likewise. (vaddvq_p): Likewise. (vcmpcsq): Likewise. (vmlsdavxq): Likewise. (vmlsdavq): Likewise. (vmladavxq): Likewise. (vmladavq): Likewise. (vminvq): Likewise. (vminavq): Likewise. (vmaxvq): Likewise. (vmaxavq): Likewise. (vmlaldavq): Likewise. (vcmphiq): Likewise. (vaddlvaq): Likewise. (vrmlaldavhq): Likewise. (vrmlaldavhxq): Likewise. (vrmlsldavhq): Likewise. (vrmlsldavhxq): Likewise. (vmlsldavxq): Likewise. (vmlsldavq): Likewise. (vabavq): Likewise. (vrmlaldavhaq): Likewise. (vcmpgeq_m_n): Likewise. (vmlsdavxq_p): Likewise. (vmlsdavq_p): Likewise. (vmlsdavaxq): Likewise. (vmlsdavaq): Likewise. (vaddvaq_p): Likewise. (vcmpcsq_m_n): Likewise. (vcmpcsq_m): Likewise. (vmladavxq_p): Likewise. (vmladavq_p): Likewise. (vmladavaxq): Likewise. (vmladavaq): Likewise. (vminvq_p): Likewise. (vminavq_p): Likewise. (vmaxvq_p): Likewise. (vmaxavq_p): Likewise. (vcmphiq_m): Likewise. (vaddlvaq_p): Likewise. (vmlaldavaq): Likewise. (vmlaldavaxq): Likewise. (vmlaldavq_p): Likewise. (vmlaldavxq_p): Likewise. (vmlsldavaq): Likewise. (vmlsldavaxq): Likewise. (vmlsldavq_p): Likewise. (vmlsldavxq_p): Likewise. (vrmlaldavhaxq): Likewise. (vrmlaldavhq_p): Likewise. (vrmlaldavhxq_p): Likewise. (vrmlsldavhaq): Likewise. (vrmlsldavhaxq): Likewise. (vrmlsldavhq_p): Likewise. (vrmlsldavhxq_p): Likewise. (vabavq_p): Likewise. (vmladavaq_p): Likewise. (vstrbq_scatter_offset): Likewise. (vstrbq_p): Likewise. (vstrbq_scatter_offset_p): Likewise. (vstrdq_scatter_base_p): Likewise. (vstrdq_scatter_base): Likewise. (vstrdq_scatter_offset_p): Likewise. (vstrdq_scatter_offset): Likewise. (vstrdq_scatter_shifted_offset_p): Likewise. (vstrdq_scatter_shifted_offset): Likewise. (vmaxq_x): Likewise. (vminq_x): Likewise. (vmovlbq_x): Likewise. (vmovltq_x): Likewise. (vmulhq_x): Likewise. (vmullbq_int_x): Likewise. (vmullbq_poly_x): Likewise. (vmulltq_int_x): Likewise. (vmulltq_poly_x): Likewise. (vstrbq): Likewise. gcc/testsuite/ChangeLog: 2020-03-24 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vcmpcsq_m_n_u16.c: Modify * gcc.target/arm/mve/intrinsics/vcmpcsq_m_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpcsq_m_n_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpgeq_m_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpgeq_m_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpgeq_m_n_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpgtq_m_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpgtq_m_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpleq_m_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpleq_m_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpltq_m_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpltq_m_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_m_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_m_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c: Likewise. rb12837.patch.gz Description: application/gzip
[GCC][ARM][PATCH]: Add MVE ACLE intrinsics vbicq_n_* polymorphic variant support.
Hello, For the following MVE ACLE intrinsics, polymorphic variant support is missing on the trunk. vbicq_n_s16, vbicq_n_s32, vbicq_n_u16 and vbicq_n_u32. This patch add the polymorphic variant support for above intrinsics. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-30 Srinath Parvathaneni * config/arm/arm_mve.h (vbicq): Define MVE intrinsic polymorphic variant. (__arm_vbicq): Likewise. 2020-03-30 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vbicq_n_s16.c: Modify. * gcc.target/arm/mve/intrinsics/vbicq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 14b6ec857bffd85b67c781f554faffde1b2abc6b..d14ca24068e0bc5e75fa1917718a09a2ec3d9cbe 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -20704,6 +20704,10 @@ extern void *__ARM_undef; #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce1(__p1, int)), \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce1(__p1, int)), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce1(__p1, int)), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce1(__p1, int)), \ int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ @@ -24095,6 +24099,10 @@ extern void *__ARM_undef; #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce1(__p1, int)), \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce1(__p1, int)), \ + int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce1(__p1, int)), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int32_t]: __arm_vbicq_n_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce1(__p1, int)), \ int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s16.c index 96b9699d6b166954f0d5900abc47c64e18416d0c..ecc48503fc2d2b16c9135c412df22f748874750c 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s16.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s16.c @@ -10,4 +10,10 @@ foo (int16x8_t a) return vbicq_n_s16 (a, 1); } -/* { dg-final { scan-assembler "vbic.i16" } } */ +int16x8_t +foo1 (int16x8_t a) +{ + return vbicq (a, 1); +} + +/* { dg-final { scan-assembler-times "vbic.i16" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s32.c index b262c835134b932968013b9abedb54c7ab901ae0..013cdf15cfd974ab82379d95e68b5a1bdf8936ca 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vbicq_n_s32.c @@ -10,4 +10,10 @@ foo (int32x4_t a) return vbicq_n_s32 (a, 1); } -/* { dg-final { scan-assembler "vbic.i32" } } */ +int32x4_t +foo1 (int32x4
[GCC][PATCH][ARM]: Fix for MVE ACLE intrinsics with writeback (PR94317).
Hello, Following MVE ACLE intrinsics have an issue with writeback to the base address. vldrdq_gather_base_wb_s64, vldrdq_gather_base_wb_u64, vldrdq_gather_base_wb_z_s64, vldrdq_gather_base_wb_z_u64, vldrwq_gather_base_wb_s32, vldrwq_gather_base_wb_u32, vldrwq_gather_base_wb_z_s32, vldrwq_gather_base_wb_z_u32, vldrwq_gather_base_wb_f32, vldrwq_gather_base_wb_z_f32. This patch fixes the bug reported in PR94317 by adding separate builtin calls to update the result and writeback to base address for the above intrinsics. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-03-31 Srinath Parvathaneni PR target/94317 * config/arm/arm-builtins.c (LDRGBWBXU_QUALIFIERS): Define. (LDRGBWBXU_Z_QUALIFIERS): Likewise. * config/arm/arm_mve.h (__arm_vldrdq_gather_base_wb_s64): Modify intrinsic defintion by adding a new builtin call to writeback into base address. (__arm_vldrdq_gather_base_wb_u64): Likewise. (__arm_vldrdq_gather_base_wb_z_s64): Likewise. (__arm_vldrdq_gather_base_wb_z_u64): Likewise. (__arm_vldrwq_gather_base_wb_s32): Likewise. (__arm_vldrwq_gather_base_wb_u32): Likewise. (__arm_vldrwq_gather_base_wb_z_s32): Likewise. (__arm_vldrwq_gather_base_wb_z_u32): Likewise. (__arm_vldrwq_gather_base_wb_f32): Likewise. (__arm_vldrwq_gather_base_wb_z_f32): Likewise. * config/arm/arm_mve_builtins.def (vldrwq_gather_base_wb_z_u): Modify builtin's qualifier. (vldrdq_gather_base_wb_z_u): Likewise. (vldrwq_gather_base_wb_u): Likewise. (vldrdq_gather_base_wb_u): Likewise. (vldrwq_gather_base_wb_z_s): Likewise. (vldrwq_gather_base_wb_z_f): Likewise. (vldrdq_gather_base_wb_z_s): Likewise. (vldrwq_gather_base_wb_s): Likewise. (vldrwq_gather_base_wb_f): Likewise. (vldrdq_gather_base_wb_s): Likewise. (vldrwq_gather_base_nowb_z_u): Define builtin. (vldrdq_gather_base_nowb_z_u): Likewise. (vldrwq_gather_base_nowb_u): Likewise. (vldrdq_gather_base_nowb_u): Likewise. (vldrwq_gather_base_nowb_z_s): Likewise. (vldrwq_gather_base_nowb_z_f): Likewise. (vldrdq_gather_base_nowb_z_s): Likewise. (vldrwq_gather_base_nowb_s): Likewise. (vldrwq_gather_base_nowb_f): Likewise. (vldrdq_gather_base_nowb_s): Likewise. * config/arm/mve.md (mve_vldrwq_gather_base_nowb_v4si): Define RTL pattern. (mve_vldrwq_gather_base_wb_v4si): Modify RTL pattern. (mve_vldrwq_gather_base_nowb_z_v4si): Define RTL pattern. (mve_vldrwq_gather_base_wb_z_v4si): Modify RTL pattern. (mve_vldrwq_gather_base_wb_fv4sf): Modify RTL pattern. (mve_vldrwq_gather_base_nowb_fv4sf): Define RTL pattern. (mve_vldrwq_gather_base_wb_z_fv4sf): Modify RTL pattern. (mve_vldrwq_gather_base_nowb_z_fv4sf): Define RTL pattern. (mve_vldrdq_gather_base_nowb_v4di): Define RTL pattern. (mve_vldrdq_gather_base_wb_v4di): Modify RTL pattern. (mve_vldrdq_gather_base_nowb_z_v4di): Define RTL pattern. (mve_vldrdq_gather_base_wb_z_v4di): Modify RTL pattern. gcc/testsuite/ChangeLog: 2020-03-31 Srinath Parvathaneni PR target/94317 * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c: Modify * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 56f0db21ea95dcd738877daba27f1cb60f0d5a32..832b9107424fd9a4a0ee272b773b3d0929172370 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -719,6 +719,17 @@ arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] (arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers) static enum arm_type_qualifiers +arm_ldrgbwbxu_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate}; +#d
[GCC][PATCH][ARM]: Modify the MVE polymorphic variant arguments to match the MVE intrinsic definition.
Hello, When MVE intrinsic's are called, few implicit typecasting are done on the formal arguments to match the intrinsic parameters. But when same intrinsics are called through MVE polymorphic variants, _Generic feature used here does strict type checking and fails to match the exact intrinsic. This patch corrects the behaviour of polymorphic variants and match the expected intrinsic by explicitly typecasting the polymorphic variant's arguments. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-04-22 Srinath Parvathaneni * config/arm/arm_mve.h (__arm_vbicq_n_u16): Modify function parameter's datatype. (__arm_vbicq_n_s16): Likewise. (__arm_vbicq_n_u32): Likewise. (__arm_vbicq_n_s32): Likewise. (__arm_vbicq): Likewise. (__arm_vbicq_n_s16): Modify MVE polymorphic variant argument's datatype. (__arm_vbicq_n_s32): Likewise. (__arm_vbicq_n_u16): Likewise. (__arm_vbicq_n_u32): Likewise. (__arm_vdupq_m_n_s8): Likewise. (__arm_vdupq_m_n_s16): Likewise. (__arm_vdupq_m_n_s32): Likewise. (__arm_vdupq_m_n_u8): Likewise. (__arm_vdupq_m_n_u16): Likewise. (__arm_vdupq_m_n_u32): Likewise. (__arm_vdupq_m_n_f16): Likewise. (__arm_vdupq_m_n_f32): Likewise. (__arm_vldrhq_gather_offset_s16): Likewise. (__arm_vldrhq_gather_offset_s32): Likewise. (__arm_vldrhq_gather_offset_u16): Likewise. (__arm_vldrhq_gather_offset_u32): Likewise. (__arm_vldrhq_gather_offset_f16): Likewise. (__arm_vldrhq_gather_offset_z_s16): Likewise. (__arm_vldrhq_gather_offset_z_s32): Likewise. (__arm_vldrhq_gather_offset_z_u16): Likewise. (__arm_vldrhq_gather_offset_z_u32): Likewise. (__arm_vldrhq_gather_offset_z_f16): Likewise. (__arm_vldrhq_gather_shifted_offset_s16): Likewise. (__arm_vldrhq_gather_shifted_offset_s32): Likewise. (__arm_vldrhq_gather_shifted_offset_u16): Likewise. (__arm_vldrhq_gather_shifted_offset_u32): Likewise. (__arm_vldrhq_gather_shifted_offset_f16): Likewise. (__arm_vldrhq_gather_shifted_offset_z_s16): Likewise. (__arm_vldrhq_gather_shifted_offset_z_s32): Likewise. (__arm_vldrhq_gather_shifted_offset_z_u16): Likewise. (__arm_vldrhq_gather_shifted_offset_z_u32): Likewise. (__arm_vldrhq_gather_shifted_offset_z_f16): Likewise. (__arm_vldrwq_gather_offset_s32): Likewise. (__arm_vldrwq_gather_offset_u32): Likewise. (__arm_vldrwq_gather_offset_f32): Likewise. (__arm_vldrwq_gather_offset_z_s32): Likewise. (__arm_vldrwq_gather_offset_z_u32): Likewise. (__arm_vldrwq_gather_offset_z_f32): Likewise. (__arm_vldrwq_gather_shifted_offset_s32): Likewise. (__arm_vldrwq_gather_shifted_offset_u32): Likewise. (__arm_vldrwq_gather_shifted_offset_f32): Likewise. (__arm_vldrwq_gather_shifted_offset_z_s32): Likewise. (__arm_vldrwq_gather_shifted_offset_z_u32): Likewise. (__arm_vldrwq_gather_shifted_offset_z_f32): Likewise. (__arm_vdwdupq_x_n_u8): Likewise. (__arm_vdwdupq_x_n_u16): Likewise. (__arm_vdwdupq_x_n_u32): Likewise. (__arm_viwdupq_x_n_u8): Likewise. (__arm_viwdupq_x_n_u16): Likewise. (__arm_viwdupq_x_n_u32): Likewise. (__arm_vidupq_x_n_u8): Likewise. (__arm_vddupq_x_n_u8): Likewise. (__arm_vidupq_x_n_u16): Likewise. (__arm_vddupq_x_n_u16): Likewise. (__arm_vidupq_x_n_u32): Likewise. (__arm_vddupq_x_n_u32): Likewise. (__arm_vldrdq_gather_offset_s64): Likewise. (__arm_vldrdq_gather_offset_u64): Likewise. (__arm_vldrdq_gather_offset_z_s64): Likewise. (__arm_vldrdq_gather_offset_z_u64): Likewise. (__arm_vldrdq_gather_shifted_offset_s64): Likewise. (__arm_vldrdq_gather_shifted_offset_u64): Likewise. (__arm_vldrdq_gather_shifted_offset_z_s64): Likewise. (__arm_vldrdq_gather_shifted_offset_z_u64): Likewise. (__arm_vidupq_m_n_u8): Likewise. (__arm_vidupq_m_n_u16): Likewise. (__arm_vidupq_m_n_u32): Likewise. (__arm_vddupq_m_n_u8): Likewise. (__arm_vddupq_m_n_u16): Likewise. (__arm_vddupq_m_n_u32): Likewise. (__arm_vidupq_n_u16): Likewise. (__arm_vidupq_n_u32): Likewise. (__arm_vidupq_n_u8): Likewise. (__arm_vddupq_n_u16): Likewise. (__arm_vddupq_n_u32): Likewise. (__arm_vddupq_n_u8): Likewise. gcc/testsuite/ChangeLog: 2020-04-22 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/mve_vddupq_m_n_u16.c: New test.
[GCC][PATCH][ARM]: Change arm constraint name from "e" to "Te".
Hello, This patches changes the constraint "e" to "Te". Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-04-24 Srinath Parvathaneni * config/arm/constraints.md (e): Remove constraint. (Te): Define constraint. * config/arm/mve.md (vaddvq_): Modify constraint in operand 0 from "e" to "Te". (vaddvaq_): Likewise. (vaddvq_p_): Likewise. (vmladavq_): Likewise. (vmladavxq_s): Likewise. (vmlsdavq_s): Likewise. (vmlsdavxq_s): Likewise. (vaddvaq_p_): Likewise. (vmladavaq_): Likewise. (vmladavq_p_): Likewise. (vmladavxq_p_s): Likewise. (vmlsdavq_p_s): Likewise. (vmlsdavxq_p_s): Likewise. (vmlsdavaxq_s): Likewise. (vmlsdavaq_s): Likewise. (vmladavaxq_s): Likewise. (vmladavaq_p_): Likewise. (vmladavaxq_p_s): Likewise. (vmlsdavaq_p_s): Likewise. (vmlsdavaxq_p_s): Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md index 41a85e2713b59f75f0b191b6be03a7ac8f6e321e..fed6c7c84032dd8aba45142b59b980b4a6240d6d 100644 --- a/gcc/config/arm/constraints.md +++ b/gcc/config/arm/constraints.md @@ -32,7 +32,7 @@ ;; The following multi-letter normal constraints have been used: ;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, DN, Dm, Dl, DL, Do, Dv, Dy, Di, -;; Dt, Dp, Dz, Tu +;; Dt, Dp, Dz, Tu, Te ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe ;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz, Rd, Rf, Rb, Ra, ;; Rg, Ri @@ -50,8 +50,8 @@ (define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS" "MVE FPCCR register") -(define_register_constraint "e" "TARGET_HAVE_MVE ? EVEN_REG : NO_REGS" - "MVE EVEN registers @code{r0}, @code{r2}, @code{r4}, @code{r6}, @code{r8}, +(define_register_constraint "Te" "TARGET_HAVE_MVE ? EVEN_REG : NO_REGS" + "EVEN core registers @code{r0}, @code{r2}, @code{r4}, @code{r6}, @code{r8}, @code{r10}, @code{r12}, @code{r14}") (define_constraint "Rd" diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 9cb18ef50e8320b2d7160346361343c4ab1685b9..f43dabbfd4f15b602f0627a9b0ea423064501e51 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -1102,7 +1102,7 @@ ;; (define_insn "mve_vaddvq_" [ - (set (match_operand:SI 0 "s_register_operand" "=e") + (set (match_operand:SI 0 "s_register_operand" "=Te") (unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w")] VADDVQ)) ] @@ -1477,7 +1477,7 @@ ;; (define_insn "mve_vaddvaq_" [ - (set (match_operand:SI 0 "s_register_operand" "=e") + (set (match_operand:SI 0 "s_register_operand" "=Te") (unspec:SI [(match_operand:SI 1 "s_register_operand" "0") (match_operand:MVE_2 2 "s_register_operand" "w")] VADDVAQ)) @@ -1492,7 +1492,7 @@ ;; (define_insn "mve_vaddvq_p_" [ - (set (match_operand:SI 0 "s_register_operand" "=e") + (set (match_operand:SI 0 "s_register_operand" "=Te") (unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w") (match_operand:HI 2 "vpr_register_operand" "Up")] VADDVQ_P)) @@ -2032,7 +2032,7 @@ ;; (define_insn "mve_vmladavq_" [ - (set (match_operand:SI 0 "s_register_operand" "=e") + (set (match_operand:SI 0 "s_register_operand" "=Te") (unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w") (match_operand:MVE_2 2 "s_register_operand" "w")] VMLADAVQ)) @@ -2047,7 +2047,7 @@ ;; (define_insn "mve_vmladavxq_s" [ - (set (match_operand:SI 0 "s_register_operand" "=e") + (set (match_operand:SI 0 "s_register_operand" "=Te") (unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w") (match_operand:MVE_2 2 "s_register_operand" "w")] VMLADAVXQ_S)) @@ -2062,7 +2062,7 @@ ;; (define_insn "mve_vmlsdavq_s" [ - (set (match_operand:SI 0 "s_register_operand" "=e") + (set (match_operand:SI 0 "s_register_operand" "=Te") (unspec:SI [(match_operand:MVE_2 1 "s_register_operand" "w") (match_operan
[GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).
Hello, Few MVE intrinsics like vldrbq_s32, vldrhq_s32 etc., the assembler instructions generated by current compiler are wrong. eg: vldrbq_s32 generates an assembly instructions `vldrb.s32 q0,[ip]`. But as per Arm-arm second argument in above instructions must also be a low register (<= r7). This patch fixes this issue by creating a new predicate "mve_memory_operand" and constraint "Ux" which allows low registers as arguments to the generated instructions depending on the mode of the argument. A new constraint "Ul" is created to handle loading to PC-relative addressing modes for vector store/load intrinsiscs. All the corresponding MVE intrinsic generating wrong code-gen as vldrbq_s32 are modified in this patch. Please refer to M-profile Vector Extension (MVE) intrinsics [1] and Armv8-M Architecture Reference Manual [2] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics [2] https://developer.arm.com/docs/ddi0553/latest Regression tested on arm-none-eabi and found no regressions. Ok for trunk? If ok, can I also back-port this patch to GCC-10 branch? Thanks, Srinath. gcc/ChangeLog: 2020-05-13 Srinath Parvathaneni Andre Vieira PR target/94959 * config/arm/arm-protos.h (arm_mode_base_reg_class): Function declaration. (mve_vector_mem_operand): Likewise. * config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check the load from memory to a core register is legitimate for give mode. (mve_vector_mem_operand): Define function. (arm_print_operand): Modify comment. (arm_mode_base_reg_class): Define. * config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE. * config/arm/constraints.md (Ux): Likewise. (Ul): Likewise. * config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also add support for missing Vector Store Register and Vector Load Register. Add a new alternative to support load from memory to PC (or label) in vector store/load. (mve_vstrbq_): Modify constraint Us to Ux. (mve_vldrbq_): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vldrbq_z_): Modify constraint Us to Ux. (mve_vldrhq_fv8hf): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vldrhq_): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vldrhq_z_fv8hf): Likewise. (mve_vldrhq_z_): Likewise. (mve_vldrwq_fv4sf): Likewise. (mve_vldrwq_v4si): Likewise. (mve_vldrwq_z_fv4sf): Likewise. (mve_vldrwq_z_v4si): Likewise. (mve_vld1q_f): Modify constriant Us to Ux. (mve_vld1q_): Likewise. (mve_vstrhq_fv8hf): Modify constriant Us to Ux, predicate to mve_memory_operand. (mve_vstrhq_p_fv8hf): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vstrhq_p_): Likewise. (mve_vstrhq_): Modify constriant Us to Ux, predicate to mve_memory_operand. (mve_vstrwq_fv4sf): Modify constriant Us to Ux. (mve_vstrwq_p_fv4sf): Modify constriant Us to Ux and also modify the MVE instructions to emit. (mve_vstrwq_p_v4si): Likewise. (mve_vstrwq_v4si): Likewise.Modify constriant Us to Ux. * config/arm/predicates.md (mve_memory_operand): Define. gcc/testsuite/ChangeLog: 2020-05-13 Srinath Parvathaneni PR target/94959 * gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Modify. * gcc.target/arm/mve/intrinsics/vld1q_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vldrbq_s8.c: Likewise. * gcc.ta
RE: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).
Hi, > -Original Message- > From: Christophe Lyon > Sent: 13 May 2020 11:20 > To: Srinath Parvathaneni > Cc: gcc Patches ; Richard Earnshaw > > Subject: Re: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE > vector load/store intrinsics (PR94959). > > Hi, > > > On Wed, 13 May 2020 at 11:47, Srinath Parvathaneni > wrote: > > > > Hello, > > > > Few MVE intrinsics like vldrbq_s32, vldrhq_s32 etc., the assembler > instructions generated by current compiler are wrong. > > eg: vldrbq_s32 generates an assembly instructions `vldrb.s32 q0,[ip]`. > > But as per Arm-arm second argument in above instructions must also be a > low register (<= r7). > > How did you catch this? Does gas complain? This was caught by running CMSIS-DSP testsuite and yes gas was complaining. > > If so, then do the existing tests pass (I mean before your patch)? All the existing tests are passing because the instructions generated in this tests always has the low registers ( <=R7) for second argument and that is because these tests are simple intrinsics calls. But modifying the testcase as mentioned in PR94959, surfaced the problem fixed by this patch. > > > This patch fixes this issue by creating a new predicate > "mve_memory_operand" and constraint "Ux" which allows low registers > > as arguments to the generated instructions depending on the mode of the > argument. > > A new constraint "Ul" is created to handle loading to PC-relative addressing > modes for vector store/load intrinsiscs. > > > > All the corresponding MVE intrinsic generating wrong code-gen as > vldrbq_s32 are modified in this patch. > > > > > > Please refer to M-profile Vector Extension (MVE) intrinsics [1] and Armv8- > M Architecture Reference Manual [2] for more details. > > [1] https://developer.arm.com/architectures/instruction-sets/simd- > isas/helium/mve-intrinsics > > [2] https://developer.arm.com/docs/ddi0553/latest > > > > Regression tested on arm-none-eabi and found no regressions. > > > > Ok for trunk? If ok, can I also back-port this patch to GCC-10 branch? > > > > Thanks, > > Srinath. > > > > gcc/ChangeLog: > > > > 2020-05-13 Srinath Parvathaneni > > Andre Vieira > > > > PR target/94959 > > * config/arm/arm-protos.h (arm_mode_base_reg_class): Function > > declaration. > > (mve_vector_mem_operand): Likewise. > > * config/arm/arm.c (thumb2_legitimate_address_p): For MVE target > check > > the load from memory to a core register is legitimate for give mode. > > (mve_vector_mem_operand): Define function. > > (arm_print_operand): Modify comment. > > (arm_mode_base_reg_class): Define. > > * config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check > for > > TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on > TRUE. > > * config/arm/constraints.md (Ux): Likewise. > > (Ul): Likewise. > > * config/arm/mve.md (mve_mov): Replace constraint Us with Ux and > also > > add support for missing Vector Store Register and Vector Load > > Register. > > Add a new alternative to support load from memory to PC (or label) > > in > > vector store/load. > > (mve_vstrbq_): Modify constraint Us to Ux. > > (mve_vldrbq_): Modify constriant Us to Ux, predicate to > > mve_memory_operand and also modify the MVE instructions to emit. > > (mve_vldrbq_z_): Modify constraint Us to Ux. > > (mve_vldrhq_fv8hf): Modify constriant Us to Ux, predicate to > > mve_memory_operand and also modify the MVE instructions to emit. > > (mve_vldrhq_): Modify constriant Us to Ux, predicate to > > mve_memory_operand and also modify the MVE instructions to emit. > > (mve_vldrhq_z_fv8hf): Likewise. > > (mve_vldrhq_z_): Likewise. > > (mve_vldrwq_fv4sf): Likewise. > > (mve_vldrwq_v4si): Likewise. > > (mve_vldrwq_z_fv4sf): Likewise. > > (mve_vldrwq_z_v4si): Likewise. > > (mve_vld1q_f): Modify constriant Us to Ux. > > (mve_vld1q_): Likewise. > > (mve_vstrhq_fv8hf): Modify constriant Us to Ux, predicate to > > mve_memory_operand. > > (mve_vstrhq_p_fv8hf): Modify constriant Us to Ux, predicate to > > mve_memory_operand and also modify the MVE instructions to emit. > > (mve_vstrhq_p_): Likewise. > > (mve_vs
[ARM][wwwdocs]: Document Armv8.1-M, Helium Intrinsics and Cortex-M55 changes.
M-profile related changes in GCC-10. ### Attachment also inlined for ease of reply### diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html index d1a7df0a9259292d097c1c3b9daeab56329ea435..57ca749da72ed64da37b3eb5404cf5cde8be44dd 100644 --- a/htdocs/gcc-10/changes.html +++ b/htdocs/gcc-10/changes.html @@ -717,6 +717,7 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); Arm Cortex-A77 (cortex-a77). Arm Cortex-A76AE (cortex-a76ae). Arm Cortex-M35P (cortex-m35p). + Arm Cortex-M55 (cortex-m55). The GCC identifiers can be used as arguments to the -mcpu or -mtune options, @@ -733,6 +734,16 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); added: this M-profile feature is no longer restricted to targets with MOVT. For example, -mcpu=cortex-m0 now supports this option. + Support for the + https://developer.arm.com/architectures/cpu-architecture/m-profile";> + Armv8.1-M Mainline Architecture has been added. + + Armv8.1-M Mainline can be enabled by using the -march=armv8.1-m.main command line option. + + Support for the + https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/helium-intrinsics";> + MVE beta ACLE intrinsics has been added. These intrinsics can be enabled by including the arm_mve.h header file + and passing the +mve or +mve.fp option extensions (for example: -march=armv8.1-m.main+mve). Support for the Custom Datapath Extension beta ACLE https://developer.arm.com/docs/101028/0010/custom-datapath-extension";> intrinsics has been added. diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html index d1a7df0a9259292d097c1c3b9daeab56329ea435..57ca749da72ed64da37b3eb5404cf5cde8be44dd 100644 --- a/htdocs/gcc-10/changes.html +++ b/htdocs/gcc-10/changes.html @@ -717,6 +717,7 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); Arm Cortex-A77 (cortex-a77). Arm Cortex-A76AE (cortex-a76ae). Arm Cortex-M35P (cortex-m35p). + Arm Cortex-M55 (cortex-m55). The GCC identifiers can be used as arguments to the -mcpu or -mtune options, @@ -733,6 +734,16 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); added: this M-profile feature is no longer restricted to targets with MOVT. For example, -mcpu=cortex-m0 now supports this option. + Support for the + https://developer.arm.com/architectures/cpu-architecture/m-profile";> + Armv8.1-M Mainline Architecture has been added. + + Armv8.1-M Mainline can be enabled by using the -march=armv8.1-m.main command line option. + + Support for the + https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/helium-intrinsics";> + MVE beta ACLE intrinsics has been added. These intrinsics can be enabled by including the arm_mve.h header file + and passing the +mve or +mve.fp option extensions (for example: -march=armv8.1-m.main+mve). Support for the Custom Datapath Extension beta ACLE https://developer.arm.com/docs/101028/0010/custom-datapath-extension";> intrinsics has been added.
[ARM][wwwdocs]: Document Armv8.1-M Mainline Security Extensions changes.
Armv8.1-M Mainline Security Extensions related changes in GCC-10. ### Attachment also inlined for ease of reply### diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html index 57ca749da72ed64da37b3eb5404cf5cde8be44dd..10bf3b78c7769b73c808bd2c2fe60ebbfc9c887e 100644 --- a/htdocs/gcc-10/changes.html +++ b/htdocs/gcc-10/changes.html @@ -747,6 +747,9 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); Support for the Custom Datapath Extension beta ACLE https://developer.arm.com/docs/101028/0010/custom-datapath-extension";> intrinsics has been added. + Support for Armv8.1-M Mainline Security Extensions architecture has been added. The -mcmse option, + when used in combination with an Armv8.1-M Mainline architecture (for example, -march=armv8.1.m.main -mcmse), + now leads to the generation of improved code sequences when changing security states. diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html index 57ca749da72ed64da37b3eb5404cf5cde8be44dd..10bf3b78c7769b73c808bd2c2fe60ebbfc9c887e 100644 --- a/htdocs/gcc-10/changes.html +++ b/htdocs/gcc-10/changes.html @@ -747,6 +747,9 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); Support for the Custom Datapath Extension beta ACLE https://developer.arm.com/docs/101028/0010/custom-datapath-extension";> intrinsics has been added. + Support for Armv8.1-M Mainline Security Extensions architecture has been added. The -mcmse option, + when used in combination with an Armv8.1-M Mainline architecture (for example, -march=armv8.1.m.main -mcmse), + now leads to the generation of improved code sequences when changing security states.
[ARM]: Fix typo in documentation.
The command line option to enable Armv8.1-M Mainline Security Extensions has a typo and this patch corrects it. Committed it under the obvious rule. ### Attachment also inlined for ease of reply### diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html index 5bcffdd7e7b6014cbf577b7e738c199a6a63ec0b..2bc25d923b334678f418f443ec6ba9c763fd0e97 100644 --- a/htdocs/gcc-10/changes.html +++ b/htdocs/gcc-10/changes.html @@ -820,7 +820,7 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); https://developer.arm.com/docs/101028/0010/custom-datapath-extension";> intrinsics has been added. Support for Armv8.1-M Mainline Security Extensions architecture has been added. The -mcmse option, - when used in combination with an Armv8.1-M Mainline architecture (for example: -march=armv8.1.m.main -mcmse), + when used in combination with an Armv8.1-M Mainline architecture (for example: -march=armv8.1-m.main -mcmse), now leads to the generation of improved code sequences when changing security states. diff --git a/htdocs/gcc-10/changes.html b/htdocs/gcc-10/changes.html index 5bcffdd7e7b6014cbf577b7e738c199a6a63ec0b..2bc25d923b334678f418f443ec6ba9c763fd0e97 100644 --- a/htdocs/gcc-10/changes.html +++ b/htdocs/gcc-10/changes.html @@ -820,7 +820,7 @@ typedef svbool_t pred512 __attribute__((arm_sve_vector_bits(512))); https://developer.arm.com/docs/101028/0010/custom-datapath-extension";> intrinsics has been added. Support for Armv8.1-M Mainline Security Extensions architecture has been added. The -mcmse option, - when used in combination with an Armv8.1-M Mainline architecture (for example: -march=armv8.1.m.main -mcmse), + when used in combination with an Armv8.1-M Mainline architecture (for example: -march=armv8.1-m.main -mcmse), now leads to the generation of improved code sequences when changing security states.
RE: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).
Hi Martin, > -Original Message- > From: Martin LiŔka > Sent: 20 May 2020 11:51 > To: Srinath Parvathaneni ; Christophe Lyon > > Cc: Richard Earnshaw ; gcc Patches patc...@gcc.gnu.org> > Subject: Re: [GCC][PATCH][ARM]: Fix the wrong code-gen generated by MVE > vector load/store intrinsics (PR94959). > > Hi. > > You forgot to install testsuite/ChangeLog entries. Please do it so that > comparison of future gcc-changelog won't miss these entries. Thanks for pointing out this, I have installed testsuite/ChangeLog entries and committed the patch. Regards, SRI. > > Martin
[PATCH][GCC] arm: Fix the MVE ACLE vbicq intrinsics.
Hello, Following MVE intrinsic testcases are failing in GCC testsuite. Directory: gcc.target/arm/mve/intrinsics/ Testcases: vbicq_f16.c, vbicq_f32.c, vbicq_s16.c, vbicq_s32.c, vbicq_s8.c ,vbicq_u16.c, vbicq_u32.c and vbicq_u8.c. This patch fixes the vbicq intrinsics by modifying the intrinsic parameters and polymorphic variants in "arm_mve.h" header file. Please refer to M-profile Vector Extension (MVE) intrinsics [1]for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for master and gcc-10 branch? Thanks, Srinath. gcc/ChangeLog: 2020-05-20 Srinath Parvathaneni * config/arm/arm_mve.h (__arm_vbicq_n_u16): Correct the intrinsic arguments. (__arm_vbicq_n_s16): Likewise. (__arm_vbicq_n_u32): Likewise. (__arm_vbicq_n_s32): Likewise. (__arm_vbicq): Modify polymorphic variant. gcc/testsuite/ChangeLog: 2020-05-20 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vbicq_f16.c: Modify. * gcc.target/arm/mve/intrinsics/vbicq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 1002512a98f9364403f66eba0e320fe5070bdc3a..9bc5c97db8fea15d8140d966bc501b8a457a1abf 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -6361,7 +6361,7 @@ __arm_vorrq_n_u16 (uint16x8_t __a, const int __imm) __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_u16 (uint16x8_t __a, const uint16_t __imm) +__arm_vbicq_n_u16 (uint16x8_t __a, const int __imm) { return __builtin_mve_vbicq_n_uv8hi (__a, __imm); } @@ -6473,7 +6473,7 @@ __arm_vorrq_n_s16 (int16x8_t __a, const int __imm) __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_s16 (int16x8_t __a, const int16_t __imm) +__arm_vbicq_n_s16 (int16x8_t __a, const int __imm) { return __builtin_mve_vbicq_n_sv8hi (__a, __imm); } @@ -6564,7 +6564,7 @@ __arm_vorrq_n_u32 (uint32x4_t __a, const int __imm) __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_u32 (uint32x4_t __a, const uint32_t __imm) +__arm_vbicq_n_u32 (uint32x4_t __a, const int __imm) { return __builtin_mve_vbicq_n_uv4si (__a, __imm); } @@ -6676,7 +6676,7 @@ __arm_vorrq_n_s32 (int32x4_t __a, const int __imm) __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_s32 (int32x4_t __a, const int32_t __imm) +__arm_vbicq_n_s32 (int32x4_t __a, const int __imm) { return __builtin_mve_vbicq_n_sv4si (__a, __imm); } @@ -23182,7 +23182,7 @@ __arm_vorrq (uint16x8_t __a, const int __imm) __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint16x8_t __a, const uint16_t __imm) +__arm_vbicq (uint16x8_t __a, const int __imm) { return __arm_vbicq_n_u16 (__a, __imm); } @@ -23294,7 +23294,7 @@ __arm_vorrq (int16x8_t __a, const int __imm) __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int16x8_t __a, const int16_t __imm) +__arm_vbicq (int16x8_t __a, const int __imm) { return __arm_vbicq_n_s16 (__a, __imm); } @@ -23385,7 +23385,7 @@ __arm_vorrq (uint32x4_t __a, const int __imm) __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint32x4_t __a, const uint32_t __imm) +__arm_vbicq (uint32x4_t __a, const int __imm) { return __arm_vbicq_n_u32 (__a, __imm); } @@ -23497,7 +23497,7 @@ __arm_vorrq (int32x4_t __a, const int __imm) __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int32x4_t __a, const int32_t __imm) +__arm_vbicq (int32x4_t __a, const int __imm) { return __arm_vbicq_n_s32 (__a, __imm); } @@ -35963,10 +35963,10 @@ extern void *__ARM_undef; #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Gen
[GCC][PATCH][ARM]: Correct the grouping of operands in MVE vector scatter store intrinsics (PR94735).
Hello, The operands in RTL patterns of MVE vector scatter store intrinsics are wrongly grouped, because of which few vector loads and stores instructions are wrongly getting optimized out with -O2. A new predicate "mve_scatter_memory" is defined in this patch, this predicate returns TRUE on matching: (mem(reg)) for MVE scatter store intrinsics. This patch fixes the issue by adding define_expand pattern with "mve_scatter_memory" predicate and calls the corresponding define_insn by passing register_operand as first argument. This register_operand is extracted from the operand with "mve_scatter_memory" predicate in define_expand pattern. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2020-06-02 Srinath Parvathaneni PR target/94735 * config/arm//predicates.md (mve_scatter_memory): Define to match (mem (reg)) for scatter store memory. * config/arm/mve.md (mve_vstrbq_scatter_offset_): Modify define_insn to define_expand. (mve_vstrbq_scatter_offset_p_): Likewise. (mve_vstrhq_scatter_offset_): Likewise. (mve_vstrhq_scatter_shifted_offset_p_): Likewise. (mve_vstrhq_scatter_shifted_offset_): Likewise. (mve_vstrdq_scatter_offset_p_v2di): Likewise. (mve_vstrdq_scatter_offset_v2di): Likewise. (mve_vstrdq_scatter_shifted_offset_p_v2di): Likewise. (mve_vstrdq_scatter_shifted_offset_v2di): Likewise. (mve_vstrhq_scatter_offset_fv8hf): Likewise. (mve_vstrhq_scatter_offset_p_fv8hf): Likewise. (mve_vstrhq_scatter_shifted_offset_fv8hf): Likewise. (mve_vstrhq_scatter_shifted_offset_p_fv8hf): Likewise. (mve_vstrwq_scatter_offset_fv4sf): Likewise. (mve_vstrwq_scatter_offset_p_fv4sf): Likewise. (mve_vstrwq_scatter_offset_p_v4si): Likewise. (mve_vstrwq_scatter_offset_v4si): Likewise. (mve_vstrwq_scatter_shifted_offset_fv4sf): Likewise. (mve_vstrwq_scatter_shifted_offset_p_fv4sf): Likewise. (mve_vstrwq_scatter_shifted_offset_p_v4si): Likewise. (mve_vstrwq_scatter_shifted_offset_v4si): Likewise. (mve_vstrbq_scatter_offset__insn): Define insn for scatter stores. (mve_vstrbq_scatter_offset_p__insn): Likewise. (mve_vstrhq_scatter_offset__insn): Likewise. (mve_vstrhq_scatter_shifted_offset_p__insn): Likewise. (mve_vstrhq_scatter_shifted_offset__insn): Likewise. (mve_vstrdq_scatter_offset_p_v2di_insn): Likewise. (mve_vstrdq_scatter_offset_v2di_insn): Likewise. (mve_vstrdq_scatter_shifted_offset_p_v2di_insn): Likewise. (mve_vstrdq_scatter_shifted_offset_v2di_insn): Likewise. (mve_vstrhq_scatter_offset_fv8hf_insn): Likewise. (mve_vstrhq_scatter_offset_p_fv8hf_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_fv8hf_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_p_fv8hf_insn): Likewise. (mve_vstrwq_scatter_offset_fv4sf_insn): Likewise. (mve_vstrwq_scatter_offset_p_fv4sf_insn): Likewise. (mve_vstrwq_scatter_offset_p_v4si_insn): Likewise. (mve_vstrwq_scatter_offset_v4si_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_fv4sf_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_p_fv4sf_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_p_v4si_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_v4si_insn): Likewise. gcc/testsuite/ChangeLog: 2020-06-02 Srinath Parvathanenisrinath.parvathan...@arm.com PR target/94735 * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_base.c: New test. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_base_p.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_offset.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_offset_p.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_shifted_offset.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_shifted_offset_p.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 986fbfe2abae5f1e91e65f1ff5c84709c43c4617..3a57901bd5bcd770832d59dc77cd92b6d9b5ecb4 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -8102,22 +8102,29 @@ ;; ;; [vstrbq_scatter_offset_s vstrbq_scatter_offset_u] ;; -(define_insn "mve_vstrbq_scatter_offset_" - [(set (match_operand: 0 "memory_operand" "=Us") - (unspec: - [(match_operand:MVE_2 1 "s_register_operand" "w") -(match_ope
[PATCH][GCC] arm: Fix the MVE ACLE vaddq_m polymorphic variants.
Hello, This patch fixes the MVE ACLE vaddq_m polymorphic variants by modifying the corresponding intrinsic parameters and vaddq_m polymorphic variant's _Generic case entries in "arm_mve.h" header file. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for master and gcc-10 branch? Thanks, Srinath. gcc/ChangeLog: 2020-06-04 Srinath Parvathaneni * config/arm/arm_mve.h (__arm_vaddq_m_n_s8): Correct the intrinsic arguments. (__arm_vaddq_m_n_s32): Likewise. (__arm_vaddq_m_n_s16): Likewise. (__arm_vaddq_m_n_u8): Likewise. (__arm_vaddq_m_n_u32): Likewise. (__arm_vaddq_m_n_u16): Likewise. (__arm_vaddq_m): Modify polymorphic variant. gcc/testsuite/ChangeLog: 2020-06-04 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/mve_vaddq_m.c: New test. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 1002512a98f9364403f66eba0e320fe5070bdc3a..9ea146189883c09c71842f10d25dc9924b5ae7e3 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -9713,42 +9713,42 @@ __arm_vabdq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pr __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_sv16qi (__inactive, __a, __b, __p); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_sv4si (__inactive, __a, __b, __p); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_sv8hi (__inactive, __a, __b, __p); } __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_uv16qi (__inactive, __a, __b, __p); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_uv4si (__inactive, __a, __b, __p); } __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_uv8hi (__inactive, __a, __b, __p); } @@ -26493,42 +26493,42 @@ __arm_vabdq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16 __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p) +__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p) { return __arm_vaddq_m_n_s8 (__inactive, __a, __b, __p); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p) +__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p) { return __arm_vaddq_m_n_s32 (__inactive, __a, __b, __p); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p) +__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p) { return __arm_vaddq_m_n_s16 (__inactive, __a, __b, __p); } __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (uint8x16_t __inactive, uint8x16_t __a, uint8_t __b, mve_pred16_t __p)
[PATCH][GCC] arm: Fix MVE scalar shift intrinsics code-gen.
Hello, This patch modifies the MVE scalar shift RTL patterns. The current patterns have wrong constraints and predicates due to which the values returned from MVE scalar shift instructions are overwritten in the code-gen. example: $ cat x.c #include "arm_mve.h" int32_t foo(int64_t acc, int shift) { return sqrshrl_sat48 (acc, shift); } Code-gen before applying this patch: $ arm-none-eabi-gcc -march=armv8.1-m.main+mve -mfloat-abi=hard -O2 -S $ cat x.s foo: push{r4, r5} sqrshrl r0, r1, #48, r2 > (a) mov r0, r4 > (b) pop {r4, r5} bx lr Code-gen after applying this patch: foo: sqrshrl r0, r1, #48, r2 bx lr In the current compiler the return value (r0) from sqrshrl (a) is getting overwritten by the mov statement (b). This patch fixes above issue. Please refer to M-profile Vector Extension (MVE) intrinsics [1]for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for master and gcc-10 branch? Thanks, Srinath. gcc/ChangeLog: 2020-06-12 Srinath Parvathaneni * config/arm/mve.md (mve_uqrshll_sat_di): Correct the predicate and constraint of all the operands. (mve_sqrshrl_sat_di): Likewise. (mve_uqrshl_si): Likewise. (mve_sqrshr_si): Likewise. (mve_uqshll_di): Likewise. (mve_urshrl_di): Likewise. (mve_uqshl_si): Likewise. (mve_urshr_si): Likewise. (mve_sqshl_si): Likewise. (mve_srshr_si): Likewise. (mve_srshrl_di): Likewise. (mve_sqshll_di): Likewise. * config/arm/predicates.md (arm_low_register_operand): Define. gcc/testsuite/ChangeLog: 2020-06-12 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c: New test. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c: Likewise. diff Description: diff
[PATCH][GCC-10 Backport] arm: Fix unintentional fall throughs in arm.c
Hi all, This small patch fix some unintentional fall-throughs in `mve_vector_mem_operand'. Regtested and bootstraped on arm-linux-gnueabihf. Okay for GCC-10 branch? Regards, Srinath gcc/ChangeLog 2020-06-09 Srinath Parvathaneni Backported from mainline 2020-05-28 Andrea Corallo * config/arm/arm.c (mve_vector_mem_operand): Fix unwanted fall-throughs. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 01bc1b8ae9b72700ca5ae0840ee4496fd686b623..a7b7c55a763c66382bc140a4c504840c509df84c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -13302,26 +13302,31 @@ mve_vector_mem_operand (machine_mode mode, rtx op, bool strict) if (abs_hwi (val)) return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM) || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V8HImode: case E_V8HFmode: if (abs (val) <= 255) return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM) || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V8QImode: case E_V4QImode: if (abs_hwi (val)) return (reg_no <= LAST_LO_REGNUM || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V4HImode: case E_V4HFmode: if (val % 2 == 0 && abs (val) <= 254) return (reg_no <= LAST_LO_REGNUM || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V4SImode: case E_V4SFmode: if (val % 4 == 0 && abs (val) <= 508) return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM) || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V2DImode: case E_V2DFmode: case E_TImode: diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 01bc1b8ae9b72700ca5ae0840ee4496fd686b623..a7b7c55a763c66382bc140a4c504840c509df84c 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -13302,26 +13302,31 @@ mve_vector_mem_operand (machine_mode mode, rtx op, bool strict) if (abs_hwi (val)) return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM) || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V8HImode: case E_V8HFmode: if (abs (val) <= 255) return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM) || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V8QImode: case E_V4QImode: if (abs_hwi (val)) return (reg_no <= LAST_LO_REGNUM || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V4HImode: case E_V4HFmode: if (val % 2 == 0 && abs (val) <= 254) return (reg_no <= LAST_LO_REGNUM || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V4SImode: case E_V4SFmode: if (val % 4 == 0 && abs (val) <= 508) return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM) || (!strict && reg_no >= FIRST_PSEUDO_REGISTER)); + return FALSE; case E_V2DImode: case E_V2DFmode: case E_TImode:
[PATCH][GCC-10 Backport] arm: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).
Hello, Few MVE intrinsics like vldrbq_s32, vldrhq_s32 etc., the assembler instructions generated by current compiler are wrong. eg: vldrbq_s32 generates an assembly instructions `vldrb.s32 q0,[ip]`. But as per Arm-arm second argument in above instructions must also be a low register (<= r7). This patch fixes this issue by creating a new predicate "mve_memory_operand" and constraint "Ux" which allows low registers as arguments to the generated instructions depending on the mode of the argument. A new constraint "Ul" is created to handle loading to PC-relative addressing modes for vector store/load intrinsiscs. All the corresponding MVE intrinsic generating wrong code-gen as vldrbq_s32 are modified in this patch. Please refer to M-profile Vector Extension (MVE) intrinsics [1] and Armv8-M Architecture Reference Manual [2] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics [2] https://developer.arm.com/docs/ddi0553/latest Regression tested on arm-none-eabi and found no regressions. Ok for gcc-10 branch? Thanks, Srinath. gcc/ChangeLog: 2020-06-09 Srinath Parvathaneni Backported from mainline 2020-05-20 Srinath Parvathaneni Andre Vieira PR target/94959 * config/arm/arm-protos.h (arm_mode_base_reg_class): Function declaration. (mve_vector_mem_operand): Likewise. * config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check the load from memory to a core register is legitimate for give mode. (mve_vector_mem_operand): Define function. (arm_print_operand): Modify comment. (arm_mode_base_reg_class): Define. * config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE. * config/arm/constraints.md (Ux): Likewise. (Ul): Likewise. * config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also add support for missing Vector Store Register and Vector Load Register. Add a new alternative to support load from memory to PC (or label) in vector store/load. (mve_vstrbq_): Modify constraint Us to Ux. (mve_vldrbq_): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vldrbq_z_): Modify constraint Us to Ux. (mve_vldrhq_fv8hf): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vldrhq_): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vldrhq_z_fv8hf): Likewise. (mve_vldrhq_z_): Likewise. (mve_vldrwq_fv4sf): Likewise. (mve_vldrwq_v4si): Likewise. (mve_vldrwq_z_fv4sf): Likewise. (mve_vldrwq_z_v4si): Likewise. (mve_vld1q_f): Modify constriant Us to Ux. (mve_vld1q_): Likewise. (mve_vstrhq_fv8hf): Modify constriant Us to Ux, predicate to mve_memory_operand. (mve_vstrhq_p_fv8hf): Modify constriant Us to Ux, predicate to mve_memory_operand and also modify the MVE instructions to emit. (mve_vstrhq_p_): Likewise. (mve_vstrhq_): Modify constriant Us to Ux, predicate to mve_memory_operand. (mve_vstrwq_fv4sf): Modify constriant Us to Ux. (mve_vstrwq_p_fv4sf): Modify constriant Us to Ux and also modify the MVE instructions to emit. (mve_vstrwq_p_v4si): Likewise. (mve_vstrwq_v4si): Likewise.Modify constriant Us to Ux. * config/arm/predicates.md (mve_memory_operand): Define. gcc/testsuite/ChangeLog: 2020-06-09 Srinath Parvathaneni Backported from mainline 2020-05-20 Srinath Parvathaneni PR target/94959 * gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Modify. * gcc.target/arm/mve/intrinsics/mve_vldr.c: New test. * gcc.target/arm/mve/intrinsics/mve_vldr_z.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstr.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstr_p.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_f16.c: Modify. * gcc.target/arm/mve/intrinsics/vld1q_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vld1q_z_s16.c: Likewise. * gcc.target/arm/mve/i
[PATCH][GCC-10 Backport] arm: Fix the MVE ACLE vbicq intrinsics.
Hello, Following MVE intrinsic testcases are failing in GCC testsuite. Directory: gcc.target/arm/mve/intrinsics/ Testcases: vbicq_f16.c, vbicq_f32.c, vbicq_s16.c, vbicq_s32.c, vbicq_s8.c ,vbicq_u16.c, vbicq_u32.c and vbicq_u8.c. This patch fixes the vbicq intrinsics by modifying the intrinsic parameters and polymorphic variants in "arm_mve.h" header file. Please refer to M-profile Vector Extension (MVE) intrinsics [1]for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for GCC-10 branch? Thanks, Srinath. gcc/ChangeLog: 2020-05-20 Srinath Parvathaneni Backported from mainline 2020-06-04 Srinath Parvathaneni * config/arm/arm_mve.h (__arm_vbicq_n_u16): Correct the intrinsic arguments. (__arm_vbicq_n_s16): Likewise. (__arm_vbicq_n_u32): Likewise. (__arm_vbicq_n_s32): Likewise. (__arm_vbicq): Modify polymorphic variant. gcc/testsuite/ChangeLog: 2020-05-20 Srinath Parvathaneni Backported from mainline 2020-06-04 Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vbicq_f16.c: Modify. * gcc.target/arm/mve/intrinsics/vbicq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vbicq_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 1002512a98f9364403f66eba0e320fe5070bdc3a..9bc5c97db8fea15d8140d966bc501b8a457a1abf 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -6361,7 +6361,7 @@ __arm_vorrq_n_u16 (uint16x8_t __a, const int __imm) __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_u16 (uint16x8_t __a, const uint16_t __imm) +__arm_vbicq_n_u16 (uint16x8_t __a, const int __imm) { return __builtin_mve_vbicq_n_uv8hi (__a, __imm); } @@ -6473,7 +6473,7 @@ __arm_vorrq_n_s16 (int16x8_t __a, const int __imm) __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_s16 (int16x8_t __a, const int16_t __imm) +__arm_vbicq_n_s16 (int16x8_t __a, const int __imm) { return __builtin_mve_vbicq_n_sv8hi (__a, __imm); } @@ -6564,7 +6564,7 @@ __arm_vorrq_n_u32 (uint32x4_t __a, const int __imm) __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_u32 (uint32x4_t __a, const uint32_t __imm) +__arm_vbicq_n_u32 (uint32x4_t __a, const int __imm) { return __builtin_mve_vbicq_n_uv4si (__a, __imm); } @@ -6676,7 +6676,7 @@ __arm_vorrq_n_s32 (int32x4_t __a, const int __imm) __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq_n_s32 (int32x4_t __a, const int32_t __imm) +__arm_vbicq_n_s32 (int32x4_t __a, const int __imm) { return __builtin_mve_vbicq_n_sv4si (__a, __imm); } @@ -23182,7 +23182,7 @@ __arm_vorrq (uint16x8_t __a, const int __imm) __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint16x8_t __a, const uint16_t __imm) +__arm_vbicq (uint16x8_t __a, const int __imm) { return __arm_vbicq_n_u16 (__a, __imm); } @@ -23294,7 +23294,7 @@ __arm_vorrq (int16x8_t __a, const int __imm) __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int16x8_t __a, const int16_t __imm) +__arm_vbicq (int16x8_t __a, const int __imm) { return __arm_vbicq_n_s16 (__a, __imm); } @@ -23385,7 +23385,7 @@ __arm_vorrq (uint32x4_t __a, const int __imm) __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (uint32x4_t __a, const uint32_t __imm) +__arm_vbicq (uint32x4_t __a, const int __imm) { return __arm_vbicq_n_u32 (__a, __imm); } @@ -23497,7 +23497,7 @@ __arm_vorrq (int32x4_t __a, const int __imm) __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vbicq (int32x4_t __a, const int32_t __imm) +__arm_vbicq (int32x4_t __a, const int __imm) { return __arm_vbicq_n_s32 (__a, __imm); } @@ -35963,10 +35963,10 @@ e
[PATCH][GCC-10 Backport] arm: Correct the grouping of operands in MVE vector scatter store intrinsics (PR94735).
Hello, The operands in RTL patterns of MVE vector scatter store intrinsics are wrongly grouped, because of which few vector loads and stores instructions are wrongly getting optimized out with -O2. A new predicate "mve_scatter_memory" is defined in this patch, this predicate returns TRUE on matching: (mem(reg)) for MVE scatter store intrinsics. This patch fixes the issue by adding define_expand pattern with "mve_scatter_memory" predicate and calls the corresponding define_insn by passing register_operand as first argument. This register_operand is extracted from the operand with "mve_scatter_memory" predicate in define_expand pattern. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for GCC-10 branch? Thanks, Srinath. gcc/ChangeLog: 2020-06-09 Srinath Parvathaneni Backported from mainline 2020-06-04 Srinath Parvathaneni PR target/94735 * config/arm//predicates.md (mve_scatter_memory): Define to match (mem (reg)) for scatter store memory. * config/arm/mve.md (mve_vstrbq_scatter_offset_): Modify define_insn to define_expand. (mve_vstrbq_scatter_offset_p_): Likewise. (mve_vstrhq_scatter_offset_): Likewise. (mve_vstrhq_scatter_shifted_offset_p_): Likewise. (mve_vstrhq_scatter_shifted_offset_): Likewise. (mve_vstrdq_scatter_offset_p_v2di): Likewise. (mve_vstrdq_scatter_offset_v2di): Likewise. (mve_vstrdq_scatter_shifted_offset_p_v2di): Likewise. (mve_vstrdq_scatter_shifted_offset_v2di): Likewise. (mve_vstrhq_scatter_offset_fv8hf): Likewise. (mve_vstrhq_scatter_offset_p_fv8hf): Likewise. (mve_vstrhq_scatter_shifted_offset_fv8hf): Likewise. (mve_vstrhq_scatter_shifted_offset_p_fv8hf): Likewise. (mve_vstrwq_scatter_offset_fv4sf): Likewise. (mve_vstrwq_scatter_offset_p_fv4sf): Likewise. (mve_vstrwq_scatter_offset_p_v4si): Likewise. (mve_vstrwq_scatter_offset_v4si): Likewise. (mve_vstrwq_scatter_shifted_offset_fv4sf): Likewise. (mve_vstrwq_scatter_shifted_offset_p_fv4sf): Likewise. (mve_vstrwq_scatter_shifted_offset_p_v4si): Likewise. (mve_vstrwq_scatter_shifted_offset_v4si): Likewise. (mve_vstrbq_scatter_offset__insn): Define insn for scatter stores. (mve_vstrbq_scatter_offset_p__insn): Likewise. (mve_vstrhq_scatter_offset__insn): Likewise. (mve_vstrhq_scatter_shifted_offset_p__insn): Likewise. (mve_vstrhq_scatter_shifted_offset__insn): Likewise. (mve_vstrdq_scatter_offset_p_v2di_insn): Likewise. (mve_vstrdq_scatter_offset_v2di_insn): Likewise. (mve_vstrdq_scatter_shifted_offset_p_v2di_insn): Likewise. (mve_vstrdq_scatter_shifted_offset_v2di_insn): Likewise. (mve_vstrhq_scatter_offset_fv8hf_insn): Likewise. (mve_vstrhq_scatter_offset_p_fv8hf_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_fv8hf_insn): Likewise. (mve_vstrhq_scatter_shifted_offset_p_fv8hf_insn): Likewise. (mve_vstrwq_scatter_offset_fv4sf_insn): Likewise. (mve_vstrwq_scatter_offset_p_fv4sf_insn): Likewise. (mve_vstrwq_scatter_offset_p_v4si_insn): Likewise. (mve_vstrwq_scatter_offset_v4si_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_fv4sf_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_p_fv4sf_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_p_v4si_insn): Likewise. (mve_vstrwq_scatter_shifted_offset_v4si_insn): Likewise. gcc/testsuite/ChangeLog: 2020-06-09 Srinath Parvathaneni Backported from mainline 2020-06-04 Srinath Parvathaneni PR target/94735 * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_base.c: New test. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_base_p.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_offset.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_offset_p.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_shifted_offset.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_vstore_scatter_shifted_offset_p.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 986fbfe2abae5f1e91e65f1ff5c84709c43c4617..3a57901bd5bcd770832d59dc77cd92b6d9b5ecb4 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -8102,22 +8102,29 @@ ;; ;; [vstrbq_scatter_offset_s vstrbq_scatter_offset_u] ;; -(define_insn "mve_vstrbq_scatter_offset_" - [(set (match_operand: 0 "memory_operand" "=Us") - (unspec: -
[PATCH][GCC-10 Backport] arm: Fix MVE scalar shift intrinsics code-gen.
Hello, This patch modifies the MVE scalar shift RTL patterns. The current patterns have wrong constraints and predicates due to which the values returned from MVE scalar shift instructions are overwritten in the code-gen. example: $ cat x.c int32_t foo(int64_t acc, int shift) { return sqrshrl_sat48 (acc, shift); } Code-gen before applying this patch: $ arm-none-eabi-gcc -march=armv8.1-m.main+mve -mfloat-abi=hard -O2 -S $ cat x.s foo: push{r4, r5} sqrshrl r0, r1, #48, r2 > (a) mov r0, r4 > (b) pop {r4, r5} bx lr Code-gen after applying this patch: foo: sqrshrl r0, r1, #48, r2 bx lr In the current compiler the return value (r0) from sqrshrl (a) is getting overwritten by the mov statement (b). This patch fixes above issue. Regression tested on arm-none-eabi and found no regressions. Ok for gcc-10 branch? Thanks, Srinath. 2020-06-12 Srinath Parvathaneni gcc/ * config/arm/mve.md (mve_uqrshll_sat_di): Correct the predicate and constraint of all the operands. (mve_sqrshrl_sat_di): Likewise. (mve_uqrshl_si): Likewise. (mve_sqrshr_si): Likewise. (mve_uqshll_di): Likewise. (mve_urshrl_di): Likewise. (mve_uqshl_si): Likewise. (mve_urshr_si): Likewise. (mve_sqshl_si): Likewise. (mve_srshr_si): Likewise. (mve_srshrl_di): Likewise. (mve_sqshll_di): Likewise. * config/arm/predicates.md (arm_low_register_operand): Define. gcc/testsuite/ * gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c: New test. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c: Likewise. (cherry picked from commit 6af598703f919b56f628c496843cdfe6f0cb8276) ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index 3a57901bd5bcd770832d59dc77cd92b6d9b5ecb4..9758862ac2bb40805dc5b66c9b05466fffcf91df 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -11344,9 +11344,9 @@ ;; [uqrshll_di] ;; (define_insn "mve_uqrshll_sat_di" - [(set (match_operand:DI 0 "arm_general_register_operand" "+r") - (unspec:DI [(match_operand:DI 1 "arm_general_register_operand" "r") - (match_operand:SI 2 "s_register_operand" "r")] + [(set (match_operand:DI 0 "arm_low_register_operand" "=l") + (unspec:DI [(match_operand:DI 1 "arm_low_register_operand" "0") + (match_operand:SI 2 "register_operand" "r")] UQRSHLLQ))] "TARGET_HAVE_MVE" "uqrshll%?\\t%Q1, %R1, #, %2" @@ -11356,9 +11356,9 @@ ;; [sqrshrl_di] ;; (define_insn "mve_sqrshrl_sat_di" - [(set (match_operand:DI 0 "arm_general_register_operand" "+r") - (unspec:DI [(match_operand:DI 1 "arm_general_register_operand" "r") - (match_operand:SI 2 "s_register_operand" "r")] + [(set (match_operand:DI 0 "arm_low_register_operand" "=l") + (unspec:DI [(match_operand:DI 1 "arm_low_register_operand" "0") + (match_operand:SI 2 "register_operand" "r")] SQRSHRLQ))] "TARGET_HAVE_MVE" "sqrshrl%?\\t%Q1, %R1, #, %2" @@ -11368,9 +11368,9 @@ ;; [uqrshl_si] ;; (define_insn "mve_uqrshl_si" - [(set (match_operand:SI 0 "arm_general_register_operand" "+r") - (unspec:SI [(match_operand:SI 1 "arm_general_register_operand" "r") - (match_operand:SI 2 "s_register_operand" "r")] + [(set (match_operand:SI 0 "arm_general_register_operand" "=r") + (unspec:SI [(match_operand:SI 1 "arm_general_register_operand" "0") + (match_operand:SI 2 "register_operand" "r")] UQRSHL))] "TARGET_HAVE_MVE" "uqrshl%?\\t%1, %2" @@ -11380,9 +11380,9 @@ ;; [sqrshr_si] ;; (define_insn "mve_sqrshr_si" - [(set (match_operand:SI 0 "arm_general_register_operand" "+r") - (unspec:SI [(match_operand:SI 1 "arm_general_register_operand" "r") - (match_operand:SI 2 "s_register_operand" "r")] + [(set (match_operand:SI 0 "arm_general_register_operand" "=r") + (unspec:SI [(match_operand:SI 1 "arm_general_register_operand" "0") + (match_operand:SI 2 "register_operand" "r")] SQRSHR))] "TARGET_HAVE_MVE" &quo
[PATCH][GCC-10 Backport] arm: Fix the MVE ACLE vaddq_m polymorphic variants.
Hello, This patch fixes the MVE ACLE vaddq_m polymorphic variants by modifying the corresponding intrinsic parameters and vaddq_m polymorphic variant's _Generic case entries in "arm_mve.h" header file. Regression tested on arm-none-eabi and found no regressions. Ok for gcc-10 branch? Thanks, Srinath. 2020-06-04 Srinath Parvathaneni gcc/ * config/arm/arm_mve.h (__arm_vaddq_m_n_s8): Correct the intrinsic arguments. (__arm_vaddq_m_n_s32): Likewise. (__arm_vaddq_m_n_s16): Likewise. (__arm_vaddq_m_n_u8): Likewise. (__arm_vaddq_m_n_u32): Likewise. (__arm_vaddq_m_n_u16): Likewise. (__arm_vaddq_m): Modify polymorphic variant. gcc/testsuite/ * gcc.target/arm/mve/intrinsics/mve_vaddq_m.c: New test. (cherry picked from commit dc39db873670bea8d8e655444387ceaa53a01a79) ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 9bc5c97db8fea15d8140d966bc501b8a457a1abf..a801705ced582105df60ccdc79a7500b320e12d4 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -9713,42 +9713,42 @@ __arm_vabdq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pr __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_sv16qi (__inactive, __a, __b, __p); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_sv4si (__inactive, __a, __b, __p); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_sv8hi (__inactive, __a, __b, __p); } __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_uv16qi (__inactive, __a, __b, __p); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_uv4si (__inactive, __a, __b, __p); } __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16_t __b, mve_pred16_t __p) +__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, int __b, mve_pred16_t __p) { return __builtin_mve_vaddq_m_n_uv8hi (__inactive, __a, __b, __p); } @@ -26493,42 +26493,42 @@ __arm_vabdq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16 __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p) +__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p) { return __arm_vaddq_m_n_s8 (__inactive, __a, __b, __p); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p) +__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p) { return __arm_vaddq_m_n_s32 (__inactive, __a, __b, __p); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p) +__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p) { return __arm_vaddq_m_n_s16 (__inactive, __a, __b, __p); } __extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vaddq_m (uint8x16_t __inactive, uint8x16_t __a, uint8_t __b, mve_pred16_t __p) +__arm_vaddq_m (uint8x16_t __inactive, uint8x16_t __a, int __b, mve_pred16_t __p) { return __arm_vaddq_m_n_u8 (__inactive, __a, __b, __p); } __extension__ extern __in
[PATCH][GCC] arm: Fix the failing mve scalar shift execution tests.
Hello, In GCC testsuite the MVE scalar shift execution tests (mve_scalar_shifts[1-4].c) are failings because of executing them on target hardware which doesn't support MVE instructions. This patch restricts those tests to execute only on target hardware that support MVE instructions. Regression tested on arm-none-eabi and found no regressions. Ok for master? Ok for GCC-10 branch? Thanks, Srinath. 2020-06-18 Srinath Parvathaneni gcc/ * doc/sourcebuild.texi (arm_v8_1m_mve_fp_ok): Add item. (arm_mve_hw): Likewise. gcc/testsuite/ * gcc.target/arm/mve/mve.exp (check_effective_target_arm_mve_hw): Check availability of hardware supporting MVE instructions. * lib/target-supports.exp (check_effective_target_arm_mve_hw): Define. ### Attachment also inlined for ease of reply### diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index e5554c4666eb217844598352ced5a9597652f07c..a12af822443ad41afba816ed640d80d182859239 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1925,6 +1925,15 @@ ARM target supports options to generate instructions from ARMv8.1-M with the M-Profile Vector Extension (MVE). Some multilibs may be incompatible with these options. +@item arm_v8_1m_mve_fp_ok +ARM target supports options to generate instructions from ARMv8.1-M with +the Half-precision floating-point instructions (HP), Floating-point Extension +(FP) along with M-Profile Vector Extension (MVE). Some multilibs may be +incompatible with these options. + +@item arm_mve_hw +Test system supports executing MVE instructions. + @item arm_v8m_main_cde ARM target supports options to generate instructions from ARMv8-M with the Custom Datapath Extension (CDE). Some multilibs may be incompatible diff --git a/gcc/testsuite/gcc.target/arm/mve/mve.exp b/gcc/testsuite/gcc.target/arm/mve/mve.exp index e84cb068940e2cb4ed7bbdd51b14d52cec3e323f..aaa3b78b2a9400070d0ca33cb0e98bd0fad7d408 100644 --- a/gcc/testsuite/gcc.target/arm/mve/mve.exp +++ b/gcc/testsuite/gcc.target/arm/mve/mve.exp @@ -24,6 +24,11 @@ if ![istarget arm*-*-*] then { # Load support procs. load_lib gcc-dg.exp +# Exit immediately if the target does not support MVE instructions. +if ![check_effective_target_arm_mve_hw] then { +return +} + # If a testcase doesn't have special options, use these. global DEFAULT_CFLAGS if ![info exists DEFAULT_CFLAGS] then { diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 862a0735b49582f7f35a9a4009f5752cf8f7e3ba..ab0ee32fc911622b4716c272df4e656698b81a70 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -4671,6 +4671,24 @@ proc check_effective_target_arm_cmse_ok {} { } "-mcmse"]; } +# Return 1 if the target supports executing MVE instructions, 0 +# otherwise. + +proc check_effective_target_arm_mve_hw {} { +return [check_runtime arm_mve_hw_available { + int + main (void) + { + long long a = 16; + int b = 3; + asm ("sqrshrl %Q1, %R1, #64, %2" + : "=l" (a) + : "0" (a), "r" (b)); + return (a != 2); + } +} ""] +} + # Return 1 if this is an ARM target where ARMv8-M Security Extensions with # clearing instructions (clrm, vscclrm, vstr/vldr with FPCXT) is available. diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index e5554c4666eb217844598352ced5a9597652f07c..a12af822443ad41afba816ed640d80d182859239 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1925,6 +1925,15 @@ ARM target supports options to generate instructions from ARMv8.1-M with the M-Profile Vector Extension (MVE). Some multilibs may be incompatible with these options. +@item arm_v8_1m_mve_fp_ok +ARM target supports options to generate instructions from ARMv8.1-M with +the Half-precision floating-point instructions (HP), Floating-point Extension +(FP) along with M-Profile Vector Extension (MVE). Some multilibs may be +incompatible with these options. + +@item arm_mve_hw +Test system supports executing MVE instructions. + @item arm_v8m_main_cde ARM target supports options to generate instructions from ARMv8-M with the Custom Datapath Extension (CDE). Some multilibs may be incompatible diff --git a/gcc/testsuite/gcc.target/arm/mve/mve.exp b/gcc/testsuite/gcc.target/arm/mve/mve.exp index e84cb068940e2cb4ed7bbdd51b14d52cec3e323f..aaa3b78b2a9400070d0ca33cb0e98bd0fad7d408 100644 --- a/gcc/testsuite/gcc.target/arm/mve/mve.exp +++ b/gcc/testsuite/gcc.target/arm/mve/mve.exp @@ -24,6 +24,11 @@ if ![istarget arm*-*-*] then { # Load support procs. load_lib gcc-dg.exp +# Exit immediately if the target does not support MVE instructions. +if ![check_effective_target_arm_mve_hw] then { +return +} + # If a testcase doesn't have special option
RE: [PATCH][GCC] arm: Fix the failing mve scalar shift execution tests.
Hi, > -Original Message- > From: Christophe Lyon > Sent: 18 June 2020 14:38 > To: Srinath Parvathaneni > Cc: gcc Patches > Subject: Re: [PATCH][GCC] arm: Fix the failing mve scalar shift execution > tests. > > Hi, > > > On Thu, 18 Jun 2020 at 15:30, Srinath Parvathaneni > wrote: > > > > Hello, > > > > In GCC testsuite the MVE scalar shift execution tests > > (mve_scalar_shifts[1-4].c) are failings because of executing them on > > target hardware which doesn't support MVE instructions. This patch > restricts those tests to execute only on target hardware that support MVE > instructions. > > > > Regression tested on arm-none-eabi and found no regressions. > > > > Ok for master? Ok for GCC-10 branch? > > > > Thanks, > > Srinath. > > > > 2020-06-18 Srinath Parvathaneni > > > > gcc/ > > * doc/sourcebuild.texi (arm_v8_1m_mve_fp_ok): Add item. > > (arm_mve_hw): Likewise. > > > > gcc/testsuite/ > > * gcc.target/arm/mve/mve.exp (check_effective_target_arm_mve_hw): > Check > > availability of hardware supporting MVE instructions. > > * lib/target-supports.exp (check_effective_target_arm_mve_hw): > Define. > > > > > > ### Attachment also inlined for ease of reply > ### > > > > > > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index > > > e5554c4666eb217844598352ced5a9597652f07c..a12af822443ad41afba816ed > 640d > > 80d182859239 100644 > > --- a/gcc/doc/sourcebuild.texi > > +++ b/gcc/doc/sourcebuild.texi > > @@ -1925,6 +1925,15 @@ ARM target supports options to generate > > instructions from ARMv8.1-M with the M-Profile Vector Extension > > (MVE). Some multilibs may be incompatible with these options. > > > > +@item arm_v8_1m_mve_fp_ok > > +ARM target supports options to generate instructions from ARMv8.1-M > > +with the Half-precision floating-point instructions (HP), > > +Floating-point Extension > > +(FP) along with M-Profile Vector Extension (MVE). Some multilibs may > > +be incompatible with these options. > > + > > +@item arm_mve_hw > > +Test system supports executing MVE instructions. > > + > > @item arm_v8m_main_cde > > ARM target supports options to generate instructions from ARMv8-M > > with the Custom Datapath Extension (CDE). Some multilibs may be > > incompatible diff --git a/gcc/testsuite/gcc.target/arm/mve/mve.exp > > b/gcc/testsuite/gcc.target/arm/mve/mve.exp > > index > > > e84cb068940e2cb4ed7bbdd51b14d52cec3e323f..aaa3b78b2a9400070d0ca33 > cb0e9 > > 8bd0fad7d408 100644 > > --- a/gcc/testsuite/gcc.target/arm/mve/mve.exp > > +++ b/gcc/testsuite/gcc.target/arm/mve/mve.exp > > @@ -24,6 +24,11 @@ if ![istarget arm*-*-*] then { # Load support > > procs. > > load_lib gcc-dg.exp > > > > +# Exit immediately if the target does not support MVE instructions. > > +if ![check_effective_target_arm_mve_hw] then { > > +return > > +} > > So you are going to skip all tests, not only the ones with "dg-do run", right? No, that was a mistake, I wanted to skip only the tests with "dg-do run". > Don't you want to add this effective target test to the offending tests only? Done it now and attached updated cover letter and patch in this e-mail. Thanks for pointing this out. > > Thanks, > > Christophe > > > + > > # If a testcase doesn't have special options, use these. > > global DEFAULT_CFLAGS > > if ![info exists DEFAULT_CFLAGS] then { diff --git > > a/gcc/testsuite/lib/target-supports.exp > > b/gcc/testsuite/lib/target-supports.exp > > index > > > 862a0735b49582f7f35a9a4009f5752cf8f7e3ba..ab0ee32fc911622b4716c272d > f4e > > 656698b81a70 100644 > > --- a/gcc/testsuite/lib/target-supports.exp > > +++ b/gcc/testsuite/lib/target-supports.exp > > @@ -4671,6 +4671,24 @@ proc check_effective_target_arm_cmse_ok {} { > > } "-mcmse"]; > > } > > > > +# Return 1 if the target supports executing MVE instructions, 0 # > > +otherwise. > > + > > +proc check_effective_target_arm_mve_hw {} { > > +return [check_runtime arm_mve_hw_available { > > + int > > + main (void) > > + { > > + long long a = 16; > > + int b = 3; > > + asm ("sqrshrl %Q1, %R1, #64, %2" > > + : "=l" (a) > > + : "0" (a), "r" (b)); > &
RE: [PATCH][GCC-10 Backport] arm: Fix MVE scalar shift intrinsics code-gen.
Hi, > -Original Message- > From: Christophe Lyon > Sent: 18 June 2020 16:06 > To: Kyrylo Tkachov > Cc: Srinath Parvathaneni ; gcc- > patc...@gcc.gnu.org > Subject: Re: [PATCH][GCC-10 Backport] arm: Fix MVE scalar shift intrinsics > code-gen. > > Hi, > > On Thu, 18 Jun 2020 at 11:43, Kyrylo Tkachov > wrote: > > > > > > > > > -Original Message- > > > From: Srinath Parvathaneni > > > Sent: 17 June 2020 17:17 > > > To: gcc-patches@gcc.gnu.org > > > Cc: Kyrylo Tkachov > > > Subject: [PATCH][GCC-10 Backport] arm: Fix MVE scalar shift > > > intrinsics code- gen. > > > > > > Hello, > > > > > > This patch modifies the MVE scalar shift RTL patterns. The current > > > patterns have wrong constraints and predicates due to which the > > > values returned from MVE scalar shift instructions are overwritten > > > in the code-gen. > > > > > > example: > > > $ cat x.c > > > int32_t foo(int64_t acc, int shift) { > > > return sqrshrl_sat48 (acc, shift); } > > > > > > Code-gen before applying this patch: > > > $ arm-none-eabi-gcc -march=armv8.1-m.main+mve -mfloat-abi=hard -O2 > > > -S $ cat x.s > > > foo: > > >push{r4, r5} > > >sqrshrl r0, r1, #48, r2 > (a) > > >mov r0, r4 > (b) > > >pop {r4, r5} > > >bx lr > > > > > > Code-gen after applying this patch: > > > foo: > > >sqrshrl r0, r1, #48, r2 > > >bx lr > > > > > > In the current compiler the return value (r0) from sqrshrl (a) is > > > getting overwritten by the mov statement (b). > > > This patch fixes above issue. > > > > > > Regression tested on arm-none-eabi and found no regressions. > > > > > > Ok for gcc-10 branch? > > > > > > > Ok. > > Thanks, > > Kyrill > > > > > Thanks, > > > Srinath. > > > > > > 2020-06-12 Srinath Parvathaneni > > > > > > gcc/ > > > * config/arm/mve.md (mve_uqrshll_sat_di): Correct the > > > predicate > > > and constraint of all the operands. > > > (mve_sqrshrl_sat_di): Likewise. > > > (mve_uqrshl_si): Likewise. > > > (mve_sqrshr_si): Likewise. > > > (mve_uqshll_di): Likewise. > > > (mve_urshrl_di): Likewise. > > > (mve_uqshl_si): Likewise. > > > (mve_urshr_si): Likewise. > > > (mve_sqshl_si): Likewise. > > > (mve_srshr_si): Likewise. > > > (mve_srshrl_di): Likewise. > > > (mve_sqshll_di): Likewise. > > > * config/arm/predicates.md (arm_low_register_operand): Define. > > > > > > gcc/testsuite/ > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c: New test. > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c: Likewise. > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c: Likewise. > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c: Likewise. > > > > > These new tests fails to link on arm-linux-gnueabi* targets with no m-profile > multilibs: > FAIL: gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c (test for excess > errors) Excess errors: > /aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/tools/arm-none-linux- > gnueabi/bin/ld: > error: ./mve_scalar_shifts1.o: conflicting architecture profiles M/A > /aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/tools/arm-none-linux- > gnueabi/bin/ld: > failed to merge target specific data of file ./mve_scalar_shifts1.o > > probably because arm_v8_1m_mve_ok only checks that compilation is OK > and does not try to link > > (same applies to the trunk commit) Following patch is posted to fix above failures. https://gcc.gnu.org/pipermail/gcc-patches/2020-June/548507.html Regards, SRI. > > Christophe > > > > > (cherry picked from commit > 6af598703f919b56f628c496843cdfe6f0cb8276) > > > > > > > > > ### Attachment also inlined for ease of reply > > > ### > > > > > > > > > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index > > > > 3a57901bd5bcd770832d59dc77cd92b6d9b5ecb4..9758862ac2bb40805dc5b6 > > > 6c9b05466fffcf91df 100644 > > > --- a/gcc/config/arm/mve.md > > > +++ b/gcc/config/arm/mve.md > > > @@ -11344,9 +11344,9 @@ > > > ;; [uqrshll_di] > &g
RE: [PATCH][GCC-10 Backport] arm: Fix MVE scalar shift intrinsics code-gen.
Hi, > -Original Message- > From: Christophe Lyon > Sent: 18 June 2020 16:44 > To: Srinath Parvathaneni > Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov > Subject: Re: [PATCH][GCC-10 Backport] arm: Fix MVE scalar shift intrinsics > code-gen. > > On Thu, 18 Jun 2020 at 17:34, Srinath Parvathaneni > wrote: > > > > Hi, > > > > > -Original Message- > > > From: Christophe Lyon > > > Sent: 18 June 2020 16:06 > > > To: Kyrylo Tkachov > > > Cc: Srinath Parvathaneni ; gcc- > > > patc...@gcc.gnu.org > > > Subject: Re: [PATCH][GCC-10 Backport] arm: Fix MVE scalar shift > > > intrinsics code-gen. > > > > > > Hi, > > > > > > On Thu, 18 Jun 2020 at 11:43, Kyrylo Tkachov > > > > > > wrote: > > > > > > > > > > > > > > > > > -Original Message- > > > > > From: Srinath Parvathaneni > > > > > Sent: 17 June 2020 17:17 > > > > > To: gcc-patches@gcc.gnu.org > > > > > Cc: Kyrylo Tkachov > > > > > Subject: [PATCH][GCC-10 Backport] arm: Fix MVE scalar shift > > > > > intrinsics code- gen. > > > > > > > > > > Hello, > > > > > > > > > > This patch modifies the MVE scalar shift RTL patterns. The > > > > > current patterns have wrong constraints and predicates due to > > > > > which the values returned from MVE scalar shift instructions are > > > > > overwritten in the code-gen. > > > > > > > > > > example: > > > > > $ cat x.c > > > > > int32_t foo(int64_t acc, int shift) { > > > > > return sqrshrl_sat48 (acc, shift); } > > > > > > > > > > Code-gen before applying this patch: > > > > > $ arm-none-eabi-gcc -march=armv8.1-m.main+mve -mfloat-abi=hard > > > > > -O2 -S $ cat x.s > > > > > foo: > > > > >push{r4, r5} > > > > >sqrshrl r0, r1, #48, r2 > (a) > > > > >mov r0, r4 > (b) > > > > >pop {r4, r5} > > > > >bx lr > > > > > > > > > > Code-gen after applying this patch: > > > > > foo: > > > > >sqrshrl r0, r1, #48, r2 > > > > >bx lr > > > > > > > > > > In the current compiler the return value (r0) from sqrshrl (a) > > > > > is getting overwritten by the mov statement (b). > > > > > This patch fixes above issue. > > > > > > > > > > Regression tested on arm-none-eabi and found no regressions. > > > > > > > > > > Ok for gcc-10 branch? > > > > > > > > > > > > > Ok. > > > > Thanks, > > > > Kyrill > > > > > > > > > Thanks, > > > > > Srinath. > > > > > > > > > > 2020-06-12 Srinath Parvathaneni > > > > > > > > > > gcc/ > > > > > * config/arm/mve.md (mve_uqrshll_sat_di): Correct > > > > > the predicate > > > > > and constraint of all the operands. > > > > > (mve_sqrshrl_sat_di): Likewise. > > > > > (mve_uqrshl_si): Likewise. > > > > > (mve_sqrshr_si): Likewise. > > > > > (mve_uqshll_di): Likewise. > > > > > (mve_urshrl_di): Likewise. > > > > > (mve_uqshl_si): Likewise. > > > > > (mve_urshr_si): Likewise. > > > > > (mve_sqshl_si): Likewise. > > > > > (mve_srshr_si): Likewise. > > > > > (mve_srshrl_di): Likewise. > > > > > (mve_sqshll_di): Likewise. > > > > > * config/arm/predicates.md (arm_low_register_operand): Define. > > > > > > > > > > gcc/testsuite/ > > > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c: New test. > > > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c: Likewise. > > > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c: Likewise. > > > > > * gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c: Likewise. > > > > > > > > > > > These new tests fails to link on arm-linux-gnueabi* targets with no > > > m-profile > > > multilibs: >
[PATCH][GCC-10 Backport] arm: Fix the failing mve scalar shift execution tests.
Hello, In GCC testsuite the MVE scalar shift execution tests (mve_scalar_shifts[1-4].c) are failings because of executing them on target hardware which doesn't support MVE instructions. This patch restricts those tests to execute only on target hardware that support MVE instructions. Regression tested on arm-none-eabi and found no regressions. Ok for GCC-10 branch? Thanks, Srinath. 2020-06-18 Srinath Parvathaneni gcc/ * doc/sourcebuild.texi (arm_v8_1m_mve_fp_ok): Add item. (arm_mve_hw): Likewise. gcc/testsuite/ * gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c: Modify. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c: Likewise. * gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c: Likewise. * lib/target-supports.exp (check_effective_target_arm_mve_hw): Define. (cherry picked from commit 99abb146fd0923ebda2c7e7681adb18e6798a90c) ### Attachment also inlined for ease of reply### diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 240d6e4b08e2ed1a3742c4e82b98d284b9774216..563d29506e725562c593bbf6a067c06111f84f0e 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1915,6 +1915,15 @@ ARM target supports options to generate instructions from ARMv8.1-M with the M-Profile Vector Extension (MVE). Some multilibs may be incompatible with these options. +@item arm_v8_1m_mve_fp_ok +ARM target supports options to generate instructions from ARMv8.1-M with +the Half-precision floating-point instructions (HP), Floating-point Extension +(FP) along with M-Profile Vector Extension (MVE). Some multilibs may be +incompatible with these options. + +@item arm_mve_hw +Test system supports executing MVE instructions. + @item arm_v8m_main_cde ARM target supports options to generate instructions from ARMv8-M with the Custom Datapath Extension (CDE). Some multilibs may be incompatible diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c index e1c136e7f302c1824f0b00b5e7bc468ff5fcfe27..db2335fc76ed3204fcfc033ef6777dc85dbd465f 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts1.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-require-effective-target arm_mve_hw } */ /* { dg-options "-O2" } */ /* { dg-add-options arm_v8_1m_mve } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c index 0b5a8edb15849913d4f2849ab86decb286692386..5a63d385cb7e7d18345974421035f3090f409554 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts2.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-require-effective-target arm_mve_hw } */ /* { dg-options "-O2" } */ /* { dg-add-options arm_v8_1m_mve } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c index 7e3da54f5e62467e2cdaabd85df3db127f608802..f0d5ee32f829c482cbeb4f1fb1f714e01b43da87 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts3.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-require-effective-target arm_mve_hw } */ /* { dg-options "-O2" } */ /* { dg-add-options arm_v8_1m_mve } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c index 8bee12f7fdfaef51daf7e2f5d3c1e284115d2649..283742fcf982b6cf2746e7745a3c5258e75d3982 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_scalar_shifts4.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-require-effective-target arm_v8_1m_mve_ok } */ +/* { dg-require-effective-target arm_mve_hw } */ /* { dg-options "-O2" } */ /* { dg-add-options arm_v8_1m_mve } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 094bb5218aaebb02762696dc2c5a2d6742cb7c6c..eb3ef095357e6e17b4aa34b698f8b19f8a038498 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -4625,6 +4625,24 @@ proc check_effective_target_arm_cmse_ok {} { } "-mcmse"]; } +# Return 1 if the target supports executing MVE instructions, 0 +# otherwise. + +proc check_effective_target_arm_mve_hw {} { +return [check_runtim
[PATCH][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hello, This patch is part of MVE ACLE intrinsics framework. This patches add support to update (read/write) the APSR (Application Program Status Register) register and FPSCR (Floating-point Status and Control Register) register for MVE. This patch also enables thumb2 mov RTL patterns for MVE. Please refer to Arm reference manual [1] for more details. [1] https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914 Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath gcc/ChangeLog: 2019-11-11 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Add check to not allow TARGET_HAVE_MVE for this pattern. (thumb2_cmse_entry_return): Add TARGET_HAVE_MVE check to update APSR register. * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define. (VUNSPEC_GET_FPSCR): Remove. * config/arm/vfp.md (thumb2_movhi_vfp): Add TARGET_HAVE_MVE check. (thumb2_movhi_fp16): Add TARGET_HAVE_MVE check. (thumb2_movsi_vfp): Add TARGET_HAVE_MVE check. (movdi_vfp): Add TARGET_HAVE_MVE check. (thumb2_movdf_vfp): Add TARGET_HAVE_MVE check. (thumb2_movsfcc_vfp): Add TARGET_HAVE_MVE check. (thumb2_movdfcc_vfp): Add TARGET_HAVE_MVE check. (push_multi_vfp): Add TARGET_HAVE_MVE check. (set_fpscr): Add TARGET_HAVE_MVE check. (get_fpscr): Add TARGET_HAVE_MVE check. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md index 809461a25da5a8058a8afce972dea0d3131effc0..81afd8fcdc1b0a82493dc0758bce16fa9e5fde20 100644 --- a/gcc/config/arm/thumb2.md +++ b/gcc/config/arm/thumb2.md @@ -435,10 +435,10 @@ (define_insn "*cmovsi_insn" [(set (match_operand:SI 0 "arm_general_register_operand" "=r,r,r,r,r,r,r") (if_then_else:SI -(match_operator 1 "arm_comparison_operator" - [(match_operand 2 "cc_register" "") (const_int 0)]) -(match_operand:SI 3 "arm_reg_or_m1_or_1" "r, r,UM, r,U1,UM,U1") -(match_operand:SI 4 "arm_reg_or_m1_or_1" "r,UM, r,U1, r,UM,U1")))] + (match_operator 1 "arm_comparison_operator" +[(match_operand 2 "cc_register" "") (const_int 0)]) + (match_operand:SI 3 "arm_reg_or_m1_or_1" "r, r,UM, r,U1,UM,U1") + (match_operand:SI 4 "arm_reg_or_m1_or_1" "r,UM, r,U1, r,UM,U1")))] "TARGET_THUMB2 && TARGET_COND_ARITH && (!((operands[3] == const1_rtx && operands[4] == constm1_rtx) || (operands[3] == constm1_rtx && operands[4] == const1_rtx)))" @@ -540,7 +540,7 @@ [(match_operand 4 "cc_register" "") (const_int 0)]) (match_operand:SF 1 "s_register_operand" "0,r") (match_operand:SF 2 "s_register_operand" "r,0")))] - "TARGET_THUMB2 && TARGET_SOFT_FLOAT" + "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE" "@ it\\t%D3\;mov%D3\\t%0, %2 it\\t%d3\;mov%d3\\t%0, %1" @@ -1226,7 +1226,7 @@ ; added to clear the APSR and potentially the FPSCR if VFP is available, so ; we adapt the length accordingly. (set (attr "length") - (if_then_else (match_test "TARGET_HARD_FLOAT") + (if_then_else (match_test "TARGET_HARD_FLOAT || TARGET_HAVE_MVE") (const_int 34) (const_int 8))) ; We do not support predicate execution of returns from cmse_nonsecure_entry diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md index b3b4f8ee3e2d1bdad968a9dd8ccbc72ded274f48..ac7fe7d0af19f1965356d47d8327e24d410b99bd 100644 --- a/gcc/config/arm/unspecs.md +++ b/gcc/config/arm/unspecs.md @@ -170,6 +170,7 @@ UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt TORC instruction. UNSPEC_TORVSC; Used by the intrinsic form of the iWMMXt TORVSC instruction. UNSPEC_TEXTRC; Used by the intrinsic form of the iWMMXt TEXTRC instruction. + UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content. ]) @@ -216,7 +217,6 @@ VUNSPEC_SLX ; Represent a store-register-release-exclusive. VUNSPEC_LDA ; Represent a store-register-acquire. VUNSPEC_STL ; Represent a store-register-release. - VUNSPEC_GET_FPSCR; Represent fetch of FPSCR content. VUNSPEC_SET_FPSCR; Represent assign of FPSCR content. VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing. VUNSPEC_CDP ; Represent the coprocessor cdp instruction. diff --git a/gcc/config/arm/
[PATCH][ARM][GCC][3/1x]: MVE intrinsics with unary operand.
Hello, This patch supports following MVE ACLE intrinsics with unary operand. vdupq_n_s8, vdupq_n_s16, vdupq_n_s32, vabsq_s8, vabsq_s16, vabsq_s32, vclsq_s8, vclsq_s16, vclsq_s32, vclzq_s8, vclzq_s16, vclzq_s32, vnegq_s8, vnegq_s16, vnegq_s32, vaddlvq_s32, vaddvq_s8, vaddvq_s16, vaddvq_s32, vmovlbq_s8, vmovlbq_s16, vmovltq_s8, vmovltq_s16, vmvnq_s8, vmvnq_s16, vmvnq_s32, vrev16q_s8, vrev32q_s8, vrev32q_s16, vqabsq_s8, vqabsq_s16, vqabsq_s32, vqnegq_s8, vqnegq_s16, vqnegq_s32, vcvtaq_s16_f16, vcvtaq_s32_f32, vcvtnq_s16_f16, vcvtnq_s32_f32, vcvtpq_s16_f16, vcvtpq_s32_f32, vcvtmq_s16_f16, vcvtmq_s32_f32, vmvnq_u8, vmvnq_u16, vmvnq_u32, vdupq_n_u8, vdupq_n_u16, vdupq_n_u32, vclzq_u8, vclzq_u16, vclzq_u32, vaddvq_u8, vaddvq_u16, vaddvq_u32, vrev32q_u8, vrev32q_u16, vmovltq_u8, vmovltq_u16, vmovlbq_u8, vmovlbq_u16, vrev16q_u8, vaddlvq_u32, vcvtpq_u16_f16, vcvtpq_u32_f32, vcvtnq_u16_f16, vcvtmq_u16_f16, vcvtmq_u32_f32, vcvtaq_u16_f16, vcvtaq_u32_f32, vdupq_n, vabsq, vclsq, vclzq, vnegq, vaddlvq, vaddvq, vmovlbq, vmovltq, vmvnq, vrev16q, vrev32q, vqabsq, vqnegq. A new register class "EVEN_REGS" which allows only even registers is added in this patch. The new constraint "e" allows only reigsters of EVEN_REGS class. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm.h (enum reg_class): Define new class EVEN_REGS. * config/arm/arm_mve.h (vdupq_n_s8): Define macro. (vdupq_n_s16): Likewise. (vdupq_n_s32): Likewise. (vabsq_s8): Likewise. (vabsq_s16): Likewise. (vabsq_s32): Likewise. (vclsq_s8): Likewise. (vclsq_s16): Likewise. (vclsq_s32): Likewise. (vclzq_s8): Likewise. (vclzq_s16): Likewise. (vclzq_s32): Likewise. (vnegq_s8): Likewise. (vnegq_s16): Likewise. (vnegq_s32): Likewise. (vaddlvq_s32): Likewise. (vaddvq_s8): Likewise. (vaddvq_s16): Likewise. (vaddvq_s32): Likewise. (vmovlbq_s8): Likewise. (vmovlbq_s16): Likewise. (vmovltq_s8): Likewise. (vmovltq_s16): Likewise. (vmvnq_s8): Likewise. (vmvnq_s16): Likewise. (vmvnq_s32): Likewise. (vrev16q_s8): Likewise. (vrev32q_s8): Likewise. (vrev32q_s16): Likewise. (vqabsq_s8): Likewise. (vqabsq_s16): Likewise. (vqabsq_s32): Likewise. (vqnegq_s8): Likewise. (vqnegq_s16): Likewise. (vqnegq_s32): Likewise. (vcvtaq_s16_f16): Likewise. (vcvtaq_s32_f32): Likewise. (vcvtnq_s16_f16): Likewise. (vcvtnq_s32_f32): Likewise. (vcvtpq_s16_f16): Likewise. (vcvtpq_s32_f32): Likewise. (vcvtmq_s16_f16): Likewise. (vcvtmq_s32_f32): Likewise. (vmvnq_u8): Likewise. (vmvnq_u16): Likewise. (vmvnq_u32): Likewise. (vdupq_n_u8): Likewise. (vdupq_n_u16): Likewise. (vdupq_n_u32): Likewise. (vclzq_u8): Likewise. (vclzq_u16): Likewise. (vclzq_u32): Likewise. (vaddvq_u8): Likewise. (vaddvq_u16): Likewise. (vaddvq_u32): Likewise. (vrev32q_u8): Likewise. (vrev32q_u16): Likewise. (vmovltq_u8): Likewise. (vmovltq_u16): Likewise. (vmovlbq_u8): Likewise. (vmovlbq_u16): Likewise. (vrev16q_u8): Likewise. (vaddlvq_u32): Likewise. (vcvtpq_u16_f16): Likewise. (vcvtpq_u32_f32): Likewise. (vcvtnq_u16_f16): Likewise. (vcvtmq_u16_f16): Likewise. (vcvtmq_u32_f32): Likewise. (vcvtaq_u16_f16): Likewise. (vcvtaq_u32_f32): Likewise. (__arm_vdupq_n_s8): Define intrinsic. (__arm_vdupq_n_s16): Likewise. (__arm_vdupq_n_s32): Likewise. (__arm_vabsq_s8): Likewise. (__arm_vabsq_s16): Likewise. (__arm_vabsq_s32): Likewise. (__arm_vclsq_s8): Likewise. (__arm_vclsq_s16): Likewise. (__arm_vclsq_s32): Likewise. (__arm_vclzq_s8): Likewise. (__arm_vclzq_s16): Likewise. (__arm_vclzq_s32): Likewise. (__arm_vnegq_s8): Likewise. (__arm_vnegq_s16): Likewise. (__arm_vnegq_s32): Likewise. (__arm_vaddlvq_s32): Likewise. (__arm_vaddvq_s8): Likewise. (__arm_vaddvq_s16): Likewise. (__arm_vaddvq_s32): Likewise. (__arm_vmovlbq_s8): Likewise. (__arm_vmovlbq_s16): Likewise. (__arm_vmovltq_s8): Likewise. (__arm_vmovltq_s16): Likewise. (__arm_vmvnq_s8): Likewise. (__arm_vmvnq_s16): Likewise. (__arm_
[PATCH][ARM][GCC][3/x]: MVE ACLE intrinsics framework patch.
Hello, This patch is part of MVE ACLE intrinsics framework. The patch supports the use of emulation for the double-precision arithmetic operations for MVE. This changes are to support the MVE ACLE intrinsics which operates on vector floating point arithmetic operations. Please refer to Arm reference manual [1] for more details. [1] https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914 Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-11 Andre Vieira Srinath Parvathaneni * config/arm/arm.c (arm_libcall_uses_aapcs_base): Modify function to add emulator calls for dobule precision arithmetic operations for MVE. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 6faed76206b93c1a9dea048e2f693dc16ee58072..358b2638b65a2007d1c7e8062844b67682597f45 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -5658,9 +5658,25 @@ arm_libcall_uses_aapcs_base (const_rtx libcall) /* Values from double-precision helper functions are returned in core registers if the selected core only supports single-precision arithmetic, even if we are using the hard-float ABI. The same is -true for single-precision helpers, but we will never be using the -hard-float ABI on a CPU which doesn't support single-precision -operations in hardware. */ +true for single-precision helpers except in case of MVE, because in +MVE we will be using the hard-float ABI on a CPU which doesn't support +single-precision operations in hardware. In MVE the following check +enables use of emulation for the double-precision arithmetic +operations. */ + if (TARGET_HAVE_MVE) + { + add_libcall (libcall_htab, optab_libfunc (add_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sdiv_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (smul_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (neg_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sub_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (eq_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (lt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (le_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (ge_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (gt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (unord_optab, SFmode)); + } add_libcall (libcall_htab, optab_libfunc (add_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (sdiv_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (smul_optab, DFmode)); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 6faed76206b93c1a9dea048e2f693dc16ee58072..358b2638b65a2007d1c7e8062844b67682597f45 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -5658,9 +5658,25 @@ arm_libcall_uses_aapcs_base (const_rtx libcall) /* Values from double-precision helper functions are returned in core registers if the selected core only supports single-precision arithmetic, even if we are using the hard-float ABI. The same is -true for single-precision helpers, but we will never be using the -hard-float ABI on a CPU which doesn't support single-precision -operations in hardware. */ +true for single-precision helpers except in case of MVE, because in +MVE we will be using the hard-float ABI on a CPU which doesn't support +single-precision operations in hardware. In MVE the following check +enables use of emulation for the double-precision arithmetic +operations. */ + if (TARGET_HAVE_MVE) + { + add_libcall (libcall_htab, optab_libfunc (add_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sdiv_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (smul_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (neg_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (sub_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (eq_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (lt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (le_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (ge_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (gt_optab, SFmode)); + add_libcall (libcall_htab, optab_libfunc (unord_optab, SFmode)); + } add_libcall (libcall_htab, optab_libfunc (add_optab, DFmode)); add_libcall (libcall_htab, optab_libfunc (sdiv_optab, DFmode)); add_l
[PATCH][ARM][GCC][4/1x]: MVE intrinsics with unary operand.
Hello, This patch supports following MVE ACLE intrinsics with unary operand. vctp16q, vctp32q, vctp64q, vctp8q, vpnot. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics There are few conflicts in defining the machine registers, resolved by re-ordering VPR_REGNUM, APSRQ_REGNUM and APSRGE_REGNUM. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-12 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (hi_UP): Define mode. * config/arm/arm.h (IS_VPR_REGNUM): Move. * config/arm/arm.md (VPR_REGNUM): Define before APSRQ_REGNUM. (APSRQ_REGNUM): Modify. (APSRGE_REGNUM): Modify. * config/arm/arm_mve.h (vctp16q): Define macro. (vctp32q): Likewise. (vctp64q): Likewise. (vctp8q): Likewise. (vpnot): Likewise. (__arm_vctp16q): Define intrinsic. (__arm_vctp32q): Likewise. (__arm_vctp64q): Likewise. (__arm_vctp8q): Likewise. (__arm_vpnot): Likewise. * config/arm/arm_mve_builtins.def (UNOP_UNONE_UNONE): Use builtin qualifier. * config/arm/mve.md (mve_vctpqhi): Define RTL pattern. (mve_vpnothi): Likewise. gcc/testsuite/ChangeLog: 2019-11-12 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vctp16q.c: New test. * gcc.target/arm/mve/intrinsics/vctp32q.c: Likewise. * gcc.target/arm/mve/intrinsics/vctp64q.c: Likewise. * gcc.target/arm/mve/intrinsics/vctp8q.c: Likewise. * gcc.target/arm/mve/intrinsics/vpnot.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 21b213d8e1bc99a3946f15e97161e01d73832799..cd82aa159089c288607e240de02a85dcbb134a14 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -387,6 +387,7 @@ arm_set_sat_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define oi_UP E_OImode #define hf_UP E_HFmode #define si_UP E_SImode +#define hi_UPE_HImode #define void_UP E_VOIDmode #define UP(X) X##_UP diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 485db72f05f16ca389227289a35c232dc982bf9d..95ec7963a57a1a5652a0a9dc30391a0ce6348242 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -955,6 +955,9 @@ extern int arm_arch_cmse; #define IS_IWMMXT_GR_REGNUM(REGNUM) \ (((REGNUM) >= FIRST_IWMMXT_GR_REGNUM) && ((REGNUM) <= LAST_IWMMXT_GR_REGNUM)) +#define IS_VPR_REGNUM(REGNUM) \ + ((REGNUM) == VPR_REGNUM) + /* Base register for access to local variables of the function. */ #define FRAME_POINTER_REGNUM 102 @@ -999,7 +1002,7 @@ extern int arm_arch_cmse; && (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1)) /* The number of hard registers is 16 ARM + 1 CC + 1 SFP + 1 AFP - + 1 APSRQ + 1 APSRGE + 1 VPR. */ + +1 VPR + 1 APSRQ + 1 APSRGE. */ /* Intel Wireless MMX Technology registers add 16 + 4 more. */ /* VFP (VFP3) adds 32 (64) + 1 VFPCC. */ #define FIRST_PSEUDO_REGISTER 107 @@ -1101,13 +1104,10 @@ extern int arm_regs_in_sequence[]; /* Registers not for general use. */\ CC_REGNUM, VFPCC_REGNUM, \ FRAME_POINTER_REGNUM, ARG_POINTER_REGNUM,\ - SP_REGNUM, PC_REGNUM, APSRQ_REGNUM, APSRGE_REGNUM, \ - VPR_REGNUM \ + SP_REGNUM, PC_REGNUM, VPR_REGNUM, APSRQ_REGNUM,\ + APSRGE_REGNUM\ } -#define IS_VPR_REGNUM(REGNUM) \ - ((REGNUM) == VPR_REGNUM) - /* Use different register alloc ordering for Thumb. */ #define ADJUST_REG_ALLOC_ORDER arm_order_regs_for_local_alloc () diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 689baa0b0ff63ef90f47d2fd844cb98c9a1457a0..2a90482a873f8250a3b2b1dec141669f55e0c58b 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -39,9 +39,9 @@ (LAST_ARM_REGNUM 15) ; (CC_REGNUM 100) ; Condition code pseudo register (VFPCC_REGNUM101) ; VFP Condition code pseudo register - (APSRQ_REGNUM104) ; Q bit pseudo register - (APSRGE_REGNUM 105) ; GE bits pseudo register - (VPR_REGNUM 106) ; Vector Predication Register - MVE register. + (VPR_REGNUM 104) ; Vector Predication Register - MVE register. + (APSRQ_REGNUM105) ; Q bit pseudo register + (APSRGE_REGNUM 106) ; GE bits pseudo register ] ) ;; 3rd operand to select_dominance_cc_mode diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 1d357180ba9ddb26347b55cde625903bdb09eef6..c8d9b6471634725cea9bab3f9fa145810b506938 100644 --- a/gcc/confi
[PATCH][ARM][GCC][2/1x]: MVE intrinsics with unary operand.
Hello, This patch supports following MVE ACLE intrinsics with unary operand. vmvnq_n_s16, vmvnq_n_s32, vrev64q_s8, vrev64q_s16, vrev64q_s32, vcvtq_s16_f16, vcvtq_s32_f32, vrev64q_u8, vrev64q_u16, vrev64q_u32, vmvnq_n_u16, vmvnq_n_u32, vcvtq_u16_f16, vcvtq_u32_f32, vrev64q. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (UNOP_SNONE_SNONE_QUALIFIERS): Define. (UNOP_SNONE_NONE_QUALIFIERS): Likewise. (UNOP_SNONE_IMM_QUALIFIERS): Likewise. (UNOP_UNONE_NONE_QUALIFIERS): Likewise. (UNOP_UNONE_UNONE_QUALIFIERS): Likewise. (UNOP_UNONE_IMM_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vmvnq_n_s16): Define macro. (vmvnq_n_s32): Likewise. (vrev64q_s8): Likewise. (vrev64q_s16): Likewise. (vrev64q_s32): Likewise. (vcvtq_s16_f16): Likewise. (vcvtq_s32_f32): Likewise. (vrev64q_u8): Likewise. (vrev64q_u16): Likewise. (vrev64q_u32): Likewise. (vmvnq_n_u16): Likewise. (vmvnq_n_u32): Likewise. (vcvtq_u16_f16): Likewise. (vcvtq_u32_f32): Likewise. (__arm_vmvnq_n_s16): Define intrinsic. (__arm_vmvnq_n_s32): Likewise. (__arm_vrev64q_s8): Likewise. (__arm_vrev64q_s16): Likewise. (__arm_vrev64q_s32): Likewise. (__arm_vrev64q_u8): Likewise. (__arm_vrev64q_u16): Likewise. (__arm_vrev64q_u32): Likewise. (__arm_vmvnq_n_u16): Likewise. (__arm_vmvnq_n_u32): Likewise. (__arm_vcvtq_s16_f16): Likewise. (__arm_vcvtq_s32_f32): Likewise. (__arm_vcvtq_u16_f16): Likewise. (__arm_vcvtq_u32_f32): Likewise. (vrev64q): Define polymorphic variant. * config/arm/arm_mve_builtins.def (UNOP_SNONE_SNONE): Use it. (UNOP_SNONE_NONE): Likewise. (UNOP_SNONE_IMM): Likewise. (UNOP_UNONE_UNONE): Likewise. (UNOP_UNONE_NONE): Likewise. (UNOP_UNONE_IMM): Likewise. * config/arm/mve.md (mve_vrev64q_): Define RTL pattern. (mve_vcvtq_from_f_): Likewise. (mve_vmvnq_n_): Likewise. gcc/testsuite/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vcvtq_s16_f16.c: New test. * gcc.target/arm/mve/intrinsics/vcvtq_s32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_u16_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_u32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vmvnq_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 2fee417fe6585f457edd4cf96655366b1d6bd1a0..21b213d8e1bc99a3946f15e97161e01d73832799 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -313,6 +313,42 @@ arm_unop_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define UNOP_NONE_UNONE_QUALIFIERS \ (arm_unop_none_unone_qualifiers) +static enum arm_type_qualifiers +arm_unop_snone_snone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none }; +#define UNOP_SNONE_SNONE_QUALIFIERS \ + (arm_unop_snone_snone_qualifiers) + +static enum arm_type_qualifiers +arm_unop_snone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none }; +#define UNOP_SNONE_NONE_QUALIFIERS \ + (arm_unop_snone_none_qualifiers) + +static enum arm_type_qualifiers +arm_unop_snone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_immediate }; +#define UNOP_SNONE_IMM_QUALIFIERS \ + (arm_unop_snone_imm_qualifiers) + +static enum arm_type_qualifiers +arm_unop_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_none }; +#define UNOP_UNONE_NONE_QUALIFIERS \ + (arm_unop_unone_none_qualifiers) + +static enum arm_type_qualifiers +arm_unop_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned }; +#define UNOP_UNONE_UNONE_QUALIFIERS
[PATCH][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.
Hello, This patch creates the required framework for MVE ACLE intrinsics. The following changes are done in this patch to support MVE ACLE intrinsics. Header file arm_mve.h is added to source code, which contains the definitions of MVE ACLE intrinsics and different data types used in MVE. Machine description file mve.md is also added which contains the RTL patterns defined for MVE. A new reigster "p0" is added which is used in by MVE predicated patterns. A new register class "VPR_REG" is added and its contents are defined in REG_CLASS_CONTENTS. The vec-common.md file is modified to support the standard move patterns. The prefix of neon functions which are also used by MVE is changed from "neon_" to "simd_". eg: neon_immediate_valid_for_move changed to simd_immediate_valid_for_move. In the patch standard patterns mve_move, mve_store and move_load for MVE are added and neon.md and vfp.md files are modified to support this common patterns. Please refer to Arm reference manual [1] for more details. [1] https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914 Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath gcc/ChangeLog: 2019-11-11 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config.gcc (arm_mve.h): Add header file. * config/arm/aout.h (p0): Add new register name. * config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define. (ARM_BUILTIN_NEON_LANE_CHECK): Remove. (arm_init_simd_builtin_types): Add TARGET_HAVE_MVE check. (arm_init_neon_builtins): Move a check to arm_init_builtins function. (arm_init_builtins): Move a check from arm_init_neon_builtins function. (mve_dereference_pointer): Add new function. (arm_expand_builtin_args): Add TARGET_HAVE_MVE check. (arm_expand_neon_builtin): Move a check to arm_expand_builtin function. (arm_expand_builtin): Move a check from arm_expand_neon_builtin function. * config/arm/arm-c.c (arm_cpu_builtins): Define macros for MVE. * config/arm/arm-modes.def (INT_MODE): Add three new integer modes. * config/arm/arm-protos.h (neon_immediate_valid_for_move): Rename function. (simd_immediate_valid_for_move): Rename neon_immediate_valid_for_move function. * config/arm/arm.c (arm_options_perform_arch_sanity_checks):Enable mve isa bit. (use_return_insn): Add TARGET_HAVE_MVE check. (aapcs_vfp_allocate): Add TARGET_HAVE_MVE check. (aapcs_vfp_allocate_return_reg): Add TARGET_HAVE_MVE check. (thumb2_legitimate_address_p): Add TARGET_HAVE_MVE check. (arm_rtx_costs_internal): Add TARGET_HAVE_MVE check. (neon_valid_immediate): Rename to simd_valid_immediate. (simd_valid_immediate): Rename from neon_valid_immediate. (neon_immediate_valid_for_move): Rename to simd_immediate_valid_for_move. (simd_immediate_valid_for_move): Rename from neon_immediate_valid_for_move. (neon_immediate_valid_for_logic): Modify call to neon_valid_immediate function. (neon_make_constant): Modify call to neon_valid_immediate function. (neon_vector_mem_operand): Add TARGET_HAVE_MVE check. (output_move_neon): Add TARGET_HAVE_MVE check. (arm_compute_frame_layout): Add TARGET_HAVE_MVE check. (arm_save_coproc_regs): Add TARGET_HAVE_MVE check. (arm_print_operand): Add case 'E' to print memory operands. (arm_print_operand_address): Add TARGET_HAVE_MVE check. (arm_hard_regno_mode_ok): Add TARGET_HAVE_MVE check. (arm_modes_tieable_p): Add TARGET_HAVE_MVE check. (arm_regno_class): Add VPR_REGNUM check. (arm_expand_epilogue_apcs_frame): Add TARGET_HAVE_MVE check. (arm_expand_epilogue): Add TARGET_HAVE_MVE check. (arm_vector_mode_supported_p): Add TARGET_HAVE_MVE check for MVE vector modes. (arm_array_mode_supported_p): Add TARGET_HAVE_MVE check. (arm_conditional_register_usage): For TARGET_HAVE_MVE enable VPR register. * config/arm/arm.h (IS_VPR_REGNUM): Macro to check for VPR register. (FIRST_PSEUDO_REGISTER): Modify. (VALID_MVE_MODE): Define. (VALID_MVE_SI_MODE): Define. (VALID_MVE_SF_MODE): Define. (VALID_MVE_STRUCT_MODE): Define. (REG_ALLOC_ORDER): Add VPR_REGNUM entry. (enum reg_class): Add VPR_REG entry. (REG_CLASS_NAMES): Add VPR_REG entry. * config/arm/arm.md (VPR_REGNUM): Define. (arm_movsf_soft_insn): Add TARGET_HAVE_MVE check to not allow MVE. (vfp_pop_multiple_with_writeback): Add TARGET_HAVE_MVE check to allow writeback. (include "mve.md"): Include mve.md file. * config/arm/arm_mve.h: New file. * config/arm/constraints.md (Up): Define.
[PATCH][ARM][GCC][4/x]: MVE ACLE vector interleaving store intrinsics.
Hello, This patch supports MVE ACLE intrinsics vst4q_s8, vst4q_s16, vst4q_s32, vst4q_u8, vst4q_u16, vst4q_u32, vst4q_f16 and vst4q_f32. In this patch arm_mve_builtins.def file is added to the source code in which the builtins for MVE ACLE intrinsics are defined using builtin qualifiers. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-11-12 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (CF): Define mve_builtin_data. (VAR1): Define. (ARM_BUILTIN_MVE_PATTERN_START): Define. (arm_init_mve_builtins): Define function. (arm_init_builtins): Add TARGET_HAVE_MVE check. (arm_expand_builtin_1): Check the range of fcode. (arm_expand_mve_builtin): Define function to expand MVE builtins. (arm_expand_builtin): Check the range of fcode. * config/arm/arm_mve.h (__ARM_FEATURE_MVE): Define MVE floating point types. (__ARM_MVE_PRESERVE_USER_NAMESPACE): Define to protect user namespace. (vst4q_s8): Define macro. (vst4q_s16): Likewise. (vst4q_s32): Likewise. (vst4q_u8): Likewise. (vst4q_u16): Likewise. (vst4q_u32): Likewise. (vst4q_f16): Likewise. (vst4q_f32): Likewise. (__arm_vst4q_s8): Define inline builtin. (__arm_vst4q_s16): Likewise. (__arm_vst4q_s32): Likewise. (__arm_vst4q_u8): Likewise. (__arm_vst4q_u16): Likewise. (__arm_vst4q_u32): Likewise. (__arm_vst4q_f16): Likewise. (__arm_vst4q_f32): Likewise. (__ARM_mve_typeid): Define macro with MVE types. (__ARM_mve_coerce): Define macro with _Generic feature. (vst4q): Define polymorphic variant for different vst4q builtins. * config/arm/arm_mve_builtins.def: New file. * config/arm/mve.md (MVE_VLD_ST): Define iterator. (unspec): Define unspec. (mve_vst4q): Define RTL pattern. * config/arm/t-arm (arm.o): Add entry for arm_mve_builtins.def. (arm-builtins.o): Likewise. gcc/testsuite/ChangeLog: 2019-11-12 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vst4q_f16.c: New test. * gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index d4cb0ea3deb49b10266d1620c85e243ed34aee4d..a9f76971ef310118bf7edea6a8dd3de1da46b46b 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -401,6 +401,13 @@ static arm_builtin_datum neon_builtin_data[] = }; #undef CF +#define CF(N,X) CODE_FOR_mve_##N##X +static arm_builtin_datum mve_builtin_data[] = +{ +#include "arm_mve_builtins.def" +}; + +#undef CF #undef VAR1 #define VAR1(T, N, A) \ {#N, UP (A), CODE_FOR_arm_##N, 0, T##_QUALIFIERS}, @@ -705,6 +712,13 @@ enum arm_builtins #include "arm_acle_builtins.def" + ARM_BUILTIN_MVE_BASE, + +#undef VAR1 +#define VAR1(T, N, X) \ + ARM_BUILTIN_MVE_##N##X, +#include "arm_mve_builtins.def" + ARM_BUILTIN_MAX }; @@ -714,6 +728,9 @@ enum arm_builtins #define ARM_BUILTIN_NEON_PATTERN_START \ (ARM_BUILTIN_NEON_BASE + 1) +#define ARM_BUILTIN_MVE_PATTERN_START \ + (ARM_BUILTIN_MVE_BASE + 1) + #define ARM_BUILTIN_ACLE_PATTERN_START \ (ARM_BUILTIN_ACLE_BASE + 1) @@ -1219,6 +1236,22 @@ arm_init_acle_builtins (void) } } +/* Set up all the MVE builtins mentioned in arm_mve_builtins.def file. */ +static void +arm_init_mve_builtins (void) +{ + volatile unsigned int i, fcode = ARM_BUILTIN_MVE_PATTERN_START; + + arm_init_simd_builtin_scalar_types (); + arm_init_simd_builtin_types (); + + for (i = 0; i < ARRAY_SIZE (mve_builtin_data); i++, fcode++) +{ + arm_builtin_datum *d = &mve_builtin_data[i]; + arm_init_builtin (fcode, d, "__builtin_mve"); +} +} + /* Set up all the NEON builtins, even builtins for instructions that are not in the current target ISA to allow the user to compile particular modules with different target specific options that differ from the command line @@ -1961,8 +1994,10 @@ arm_init_builtins (void) = add_builtin_funct
[PATCH][ARM][GCC][1/1x]: Patch to support MVE ACLE intrinsics with unary operand.
Hello, This patch supports MVE ACLE intrinsics vcvtq_f16_s16, vcvtq_f32_s32, vcvtq_f16_u16, vcvtq_f32_u32n vrndxq_f16, vrndxq_f32, vrndq_f16, vrndq_f32, vrndpq_f16, vrndpq_f32, vrndnq_f16, vrndnq_f32, vrndmq_f16, vrndmq_f32, vrndaq_f16, vrndaq_f32, vrev64q_f16, vrev64q_f32, vnegq_f16, vnegq_f32, vdupq_n_f16, vdupq_n_f32, vabsq_f16, vabsq_f32, vrev32q_f16, vcvttq_f32_f16, vcvtbq_f32_f16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-17 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (UNOP_NONE_NONE_QUALIFIERS): Define macro. (UNOP_NONE_SNONE_QUALIFIERS): Likewise. (UNOP_NONE_UNONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vrndxq_f16): Define macro. (vrndxq_f32): Likewise. (vrndq_f16) Likewise.: (vrndq_f32): Likewise. (vrndpq_f16): Likewise. (vrndpq_f32): Likewise. (vrndnq_f16): Likewise. (vrndnq_f32): Likewise. (vrndmq_f16): Likewise. (vrndmq_f32): Likewise. (vrndaq_f16): Likewise. (vrndaq_f32): Likewise. (vrev64q_f16): Likewise. (vrev64q_f32): Likewise. (vnegq_f16): Likewise. (vnegq_f32): Likewise. (vdupq_n_f16): Likewise. (vdupq_n_f32): Likewise. (vabsq_f16): Likewise. (vabsq_f32): Likewise. (vrev32q_f16): Likewise. (vcvttq_f32_f16): Likewise. (vcvtbq_f32_f16): Likewise. (vcvtq_f16_s16): Likewise. (vcvtq_f32_s32): Likewise. (vcvtq_f16_u16): Likewise. (vcvtq_f32_u32): Likewise. (__arm_vrndxq_f16): Define intrinsic. (__arm_vrndxq_f32): Likewise. (__arm_vrndq_f16): Likewise. (__arm_vrndq_f32): Likewise. (__arm_vrndpq_f16): Likewise. (__arm_vrndpq_f32): Likewise. (__arm_vrndnq_f16): Likewise. (__arm_vrndnq_f32): Likewise. (__arm_vrndmq_f16): Likewise. (__arm_vrndmq_f32): Likewise. (__arm_vrndaq_f16): Likewise. (__arm_vrndaq_f32): Likewise. (__arm_vrev64q_f16): Likewise. (__arm_vrev64q_f32): Likewise. (__arm_vnegq_f16): Likewise. (__arm_vnegq_f32): Likewise. (__arm_vdupq_n_f16): Likewise. (__arm_vdupq_n_f32): Likewise. (__arm_vabsq_f16): Likewise. (__arm_vabsq_f32): Likewise. (__arm_vrev32q_f16): Likewise. (__arm_vcvttq_f32_f16): Likewise. (__arm_vcvtbq_f32_f16): Likewise. (__arm_vcvtq_f16_s16): Likewise. (__arm_vcvtq_f32_s32): Likewise. (__arm_vcvtq_f16_u16): Likewise. (__arm_vcvtq_f32_u32): Likewise. (vrndxq): Define polymorphic variants. (vrndq): Likewise. (vrndpq): Likewise. (vrndnq): Likewise. (vrndmq): Likewise. (vrndaq): Likewise. (vrev64q): Likewise. (vnegq): Likewise. (vabsq): Likewise. (vrev32q): Likewise. (vcvtbq_f32): Likewise. (vcvttq_f32): Likewise. (vcvtq): Likewise. * config/arm/arm_mve_builtins.def (VAR2): Define. (VAR1): Define. * config/arm/mve.md (mve_vrndxq_f): Add RTL pattern. (mve_vrndq_f): Likewise. (mve_vrndpq_f): Likewise. (mve_vrndnq_f): Likewise. (mve_vrndmq_f): Likewise. (mve_vrndaq_f): Likewise. (mve_vrev64q_f): Likewise. (mve_vnegq_f): Likewise. (mve_vdupq_n_f): Likewise. (mve_vabsq_f): Likewise. (mve_vrev32q_fv8hf): Likewise. (mve_vcvttq_f32_f16v4sf): Likewise. (mve_vcvtbq_f32_f16v4sf): Likewise. (mve_vcvtq_to_f_): Likewise. gcc/testsuite/ChangeLog: 2019-10-17 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vabsq_f16.c: New test. * gcc.target/arm/mve/intrinsics/vabsq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtbq_f32_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f16_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f16_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f32_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_f32_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvttq_f32_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdupq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vdupq_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vnegq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vnegq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev32q_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vrev64q_f16.c: Likewise. * gcc.target/arm/mve/intrinsics
[PATCH][ARM][GCC][1/2x]: MVE intrinsics with binary operands.
Hello, This patch supports following MVE ACLE intrinsics with binary operand. vsubq_n_f16, vsubq_n_f32, vbrsrq_n_f16, vbrsrq_n_f32, vcvtq_n_f16_s16, vcvtq_n_f32_s32, vcvtq_n_f16_u16, vcvtq_n_f32_u32, vcreateq_f16, vcreateq_f32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraint "Rd" is added, which checks the constant is with in the range of 1 to 16. Also a new predicate "mve_imm_16" is added, to check the the matching constraint Rd. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (BINOP_NONE_NONE_NONE_QUALIFIERS): Define qualifier for binary operands. (BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vsubq_n_f16): Define macro. (vsubq_n_f32): Likewise. (vbrsrq_n_f16): Likewise. (vbrsrq_n_f32): Likewise. (vcvtq_n_f16_s16): Likewise. (vcvtq_n_f32_s32): Likewise. (vcvtq_n_f16_u16): Likewise. (vcvtq_n_f32_u32): Likewise. (vcreateq_f16): Likewise. (vcreateq_f32): Likewise. (__arm_vsubq_n_f16): Define intrinsic. (__arm_vsubq_n_f32): Likewise. (__arm_vbrsrq_n_f16): Likewise. (__arm_vbrsrq_n_f32): Likewise. (__arm_vcvtq_n_f16_s16): Likewise. (__arm_vcvtq_n_f32_s32): Likewise. (__arm_vcvtq_n_f16_u16): Likewise. (__arm_vcvtq_n_f32_u32): Likewise. (__arm_vcreateq_f16): Likewise. (__arm_vcreateq_f32): Likewise. (vsubq): Define polymorphic variant. (vbrsrq): Likewise. (vcvtq_n): Likewise. * config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_NONE_QUALIFIERS): Use it. (BINOP_NONE_NONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_IMM_QUALIFIERS): Likewise. (BINOP_NONE_UNONE_UNONE_QUALIFIERS): Likewise. * config/arm/constraints.md (Rd): Define constraint to check constant is in the range of 1 to 16. * config/arm/mve.md (mve_vsubq_n_f): Define RTL pattern. mve_vbrsrq_n_f: Likewise. mve_vcvtq_n_to_f_: Likewise. mve_vcreateq_f: Likewise. * config/arm/predicates.md (mve_imm_16): Define predicate to check the matching constraint Rd. gcc/testsuite/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vbrsrq_n_f16.c: New test. * gcc.target/arm/mve/intrinsics/vbrsrq_n_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f16_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f16_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f32_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_f32_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vsubq_n_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vsubq_n_f32.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index cd82aa159089c288607e240de02a85dcbb134a14..c2dad057d1365914477c64d559aa1fd1c32bbf19 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -349,6 +349,30 @@ arm_unop_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define UNOP_UNONE_IMM_QUALIFIERS \ (arm_unop_unone_imm_qualifiers) +static enum arm_type_qualifiers +arm_binop_none_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none, qualifier_none }; +#define BINOP_NONE_NONE_NONE_QUALIFIERS \ + (arm_binop_none_none_none_qualifiers) + +static enum arm_type_qualifiers +arm_binop_none_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none, qualifier_immediate }; +#define BINOP_NONE_NONE_IMM_QUALIFIERS \ + (arm_binop_none_none_imm_qualifiers) + +static enum arm_type_qualifiers +arm_binop_none_unone_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_unsigned, qualifier_immediate }; +#define BINOP_NONE_UNONE_IMM_QUALIFIERS \ + (arm_binop_none_unone_imm_qualifiers) + +static enum arm_type_qualifiers +arm_binop_none_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_unsigned, qualifier_unsigned }; +#define BINOP_NONE_UNONE_UNONE_QUALIFIERS \ + (arm_binop_none_unone_unone_qualifiers) + /* End of Qualifier for MVE builtins. */ /* void ([T element type] *, T, immediate). */ diff --git a/gcc/co
[PATCH][ARM][GCC][2/2x]: MVE intrinsics with binary operands.
Hello, This patch supports following MVE ACLE intrinsics with binary operands. vcvtq_n_s16_f16, vcvtq_n_s32_f32, vcvtq_n_u16_f16, vcvtq_n_u32_f32, vcreateq_u8, vcreateq_u16, vcreateq_u32, vcreateq_u64, vcreateq_s8, vcreateq_s16, vcreateq_s32, vcreateq_s64, vshrq_n_s8, vshrq_n_s16, vshrq_n_s32, vshrq_n_u8, vshrq_n_u16, vshrq_n_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Rb" and "Rf" are added, which checks the constant is with in the range of 1 to 8 and 1 to 32 respectively. Also a new predicates "mve_imm_8" and "mve_imm_32" are added, to check the the matching constraint Rb and Rf respectively. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (BINOP_UNONE_UNONE_IMM_QUALIFIERS): Define qualifier for binary operands. (BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vcvtq_n_s16_f16): Define macro. (vcvtq_n_s32_f32): Likewise. (vcvtq_n_u16_f16): Likewise. (vcvtq_n_u32_f32): Likewise. (vcreateq_u8): Likewise. (vcreateq_u16): Likewise. (vcreateq_u32): Likewise. (vcreateq_u64): Likewise. (vcreateq_s8): Likewise. (vcreateq_s16): Likewise. (vcreateq_s32): Likewise. (vcreateq_s64): Likewise. (vshrq_n_s8): Likewise. (vshrq_n_s16): Likewise. (vshrq_n_s32): Likewise. (vshrq_n_u8): Likewise. (vshrq_n_u16): Likewise. (vshrq_n_u32): Likewise. (__arm_vcreateq_u8): Define intrinsic. (__arm_vcreateq_u16): Likewise. (__arm_vcreateq_u32): Likewise. (__arm_vcreateq_u64): Likewise. (__arm_vcreateq_s8): Likewise. (__arm_vcreateq_s16): Likewise. (__arm_vcreateq_s32): Likewise. (__arm_vcreateq_s64): Likewise. (__arm_vshrq_n_s8): Likewise. (__arm_vshrq_n_s16): Likewise. (__arm_vshrq_n_s32): Likewise. (__arm_vshrq_n_u8): Likewise. (__arm_vshrq_n_u16): Likewise. (__arm_vshrq_n_u32): Likewise. (__arm_vcvtq_n_s16_f16): Likewise. (__arm_vcvtq_n_s32_f32): Likewise. (__arm_vcvtq_n_u16_f16): Likewise. (__arm_vcvtq_n_u32_f32): Likewise. (vshrq_n): Define polymorphic variant. * config/arm/arm_mve_builtins.def (BINOP_UNONE_UNONE_IMM_QUALIFIERS): Use it. (BINOP_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (BINOP_UNONE_NONE_IMM_QUALIFIERS): Likewise. * config/arm/constraints.md (Rb): Define constraint to check constant is in the range of 1 to 8. (Rf): Define constraint to check constant is in the range of 1 to 32. * config/arm/mve.md (mve_vcreateq_): Define RTL pattern. (mve_vshrq_n_): Likewise. (mve_vcvtq_n_from_f_): Likewise. * config/arm/predicates.md (mve_imm_8): Define predicate to check the matching constraint Rb. (mve_imm_32): Define predicate to check the matching constraint Rf. gcc/testsuite/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vcreateq_s16.c: New test. * gcc.target/arm/mve/intrinsics/vcreateq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_s64.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u64.c: Likewise. * gcc.target/arm/mve/intrinsics/vcreateq_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_s16_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_s32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_u16_f16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcvtq_n_u32_f32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshrq_n_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index c2dad057d1365914477c64d559aa1fd1c32bbf19..ec56dbcdd2bb1de696142186955d75570772 100644 --- a/gcc/config/arm/arm-builtins.c +++ b
[PATCH][ARM][GCC][3/2x]: MVE intrinsics with binary operands.
Hello, This patch supports following MVE ACLE intrinsics with binary operands. vaddlvq_p_s32, vaddlvq_p_u32, vcmpneq_s8, vcmpneq_s16, vcmpneq_s32, vcmpneq_u8, vcmpneq_u16, vcmpneq_u32, vshlq_s8, vshlq_s16, vshlq_s32, vshlq_u8, vshlq_u16, vshlq_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (BINOP_NONE_NONE_UNONE_QUALIFIERS): Define qualifier for binary operands. (BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise. (BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vaddlvq_p_s32): Define macro. (vaddlvq_p_u32): Likewise. (vcmpneq_s8): Likewise. (vcmpneq_s16): Likewise. (vcmpneq_s32): Likewise. (vcmpneq_u8): Likewise. (vcmpneq_u16): Likewise. (vcmpneq_u32): Likewise. (vshlq_s8): Likewise. (vshlq_s16): Likewise. (vshlq_s32): Likewise. (vshlq_u8): Likewise. (vshlq_u16): Likewise. (vshlq_u32): Likewise. (__arm_vaddlvq_p_s32): Define intrinsic. (__arm_vaddlvq_p_u32): Likewise. (__arm_vcmpneq_s8): Likewise. (__arm_vcmpneq_s16): Likewise. (__arm_vcmpneq_s32): Likewise. (__arm_vcmpneq_u8): Likewise. (__arm_vcmpneq_u16): Likewise. (__arm_vcmpneq_u32): Likewise. (__arm_vshlq_s8): Likewise. (__arm_vshlq_s16): Likewise. (__arm_vshlq_s32): Likewise. (__arm_vshlq_u8): Likewise. (__arm_vshlq_u16): Likewise. (__arm_vshlq_u32): Likewise. (vaddlvq_p): Define polymorphic variant. (vcmpneq): Likewise. (vshlq): Likewise. * config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_UNONE_QUALIFIERS): Use it. (BINOP_UNONE_NONE_NONE_QUALIFIERS): Likewise. (BINOP_UNONE_UNONE_NONE_QUALIFIERS): Likewise. * config/arm/mve.md (mve_vaddlvq_p_v4si): Define RTL pattern. (mve_vcmpneq_): Likewise. (mve_vshlq_): Likewise. gcc/testsuite/ChangeLog: 2019-10-21 Andre Vieira Mihail Ionescu Srinath Parvathaneni * gcc.target/arm/mve/intrinsics/vaddlvq_p_s32.c: New test. * gcc.target/arm/mve/intrinsics/vaddlvq_p_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vcmpneq_u8.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_s16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_s32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_s8.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_u16.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_u32.c: Likewise. * gcc.target/arm/mve/intrinsics/vshlq_u8.c: Likewise. ### Attachment also inlined for ease of reply### diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index ec56dbcdd2bb1de696142186955d75570772..73a0a3070bda2e35f3e400994dbb6ccfb3043f76 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -391,6 +391,24 @@ arm_binop_unone_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS] #define BINOP_UNONE_NONE_IMM_QUALIFIERS \ (arm_binop_unone_none_imm_qualifiers) +static enum arm_type_qualifiers +arm_binop_none_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_none, qualifier_none, qualifier_unsigned }; +#define BINOP_NONE_NONE_UNONE_QUALIFIERS \ + (arm_binop_none_none_unone_qualifiers) + +static enum arm_type_qualifiers +arm_binop_unone_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_none, qualifier_none }; +#define BINOP_UNONE_NONE_NONE_QUALIFIERS \ + (arm_binop_unone_none_none_qualifiers) + +static enum arm_type_qualifiers +arm_binop_unone_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_none }; +#define BINOP_UNONE_UNONE_NONE_QUALIFIERS \ + (arm_binop_unone_unone_none_qualifiers) + /* End of Qualifier for MVE builtins. */ /* void ([T element type] *, T, immediate). */ diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 0318a2cd8c3cdea430a32506e75a5aceb4d3a475..0140b0cb38ccee5f9caa7570627bdc010e158e71 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -225,6 +225,20 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t; #define vshrq_n_u8(__a
[PATCH][ARM][GCC][5/2x]: MVE intrinsics with binary operands.
Hello, This patch supports following MVE ACLE intrinsics with binary operands. vqmovntq_u16, vqmovnbq_u16, vmulltq_poly_p8, vmullbq_poly_p8, vmovntq_u16, vmovnbq_u16, vmlaldavxq_u16, vmlaldavq_u16, vqmovuntq_s16, vqmovunbq_s16, vshlltq_n_u8, vshllbq_n_u8, vorrq_n_u16, vbicq_n_u16, vcmpneq_n_f16, vcmpneq_f16, vcmpltq_n_f16, vcmpltq_f16, vcmpleq_n_f16, vcmpleq_f16, vcmpgtq_n_f16, vcmpgtq_f16, vcmpgeq_n_f16, vcmpgeq_f16, vcmpeqq_n_f16, vcmpeqq_f16, vsubq_f16, vqmovntq_s16, vqmovnbq_s16, vqdmulltq_s16, vqdmulltq_n_s16, vqdmullbq_s16, vqdmullbq_n_s16, vorrq_f16, vornq_f16, vmulq_n_f16, vmulq_f16, vmovntq_s16, vmovnbq_s16, vmlsldavxq_s16, vmlsldavq_s16, vmlaldavxq_s16, vmlaldavq_s16, vminnmvq_f16, vminnmq_f16, vminnmavq_f16, vminnmaq_f16, vmaxnmvq_f16, vmaxnmq_f16, vmaxnmavq_f16, vmaxnmaq_f16, veorq_f16, vcmulq_rot90_f16, vcmulq_rot270_f16, vcmulq_rot180_f16, vcmulq_f16, vcaddq_rot90_f16, vcaddq_rot270_f16, vbicq_f16, vandq_f16, vaddq_n_f16, vabdq_f16, vshlltq_n_s8, vshllbq_n_s8, vorrq_n_s16, vbicq_n_s16, vqmovntq_u32, vqmovnbq_u32, vmulltq_poly_p16, vmullbq_poly_p16, vmovntq_u32, vmovnbq_u32, vmlaldavxq_u32, vmlaldavq_u32, vqmovuntq_s32, vqmovunbq_s32, vshlltq_n_u16, vshllbq_n_u16, vorrq_n_u32, vbicq_n_u32, vcmpneq_n_f32, vcmpneq_f32, vcmpltq_n_f32, vcmpltq_f32, vcmpleq_n_f32, vcmpleq_f32, vcmpgtq_n_f32, vcmpgtq_f32, vcmpgeq_n_f32, vcmpgeq_f32, vcmpeqq_n_f32, vcmpeqq_f32, vsubq_f32, vqmovntq_s32, vqmovnbq_s32, vqdmulltq_s32, vqdmulltq_n_s32, vqdmullbq_s32, vqdmullbq_n_s32, vorrq_f32, vornq_f32, vmulq_n_f32, vmulq_f32, vmovntq_s32, vmovnbq_s32, vmlsldavxq_s32, vmlsldavq_s32, vmlaldavxq_s32, vmlaldavq_s32, vminnmvq_f32, vminnmq_f32, vminnmavq_f32, vminnmaq_f32, vmaxnmvq_f32, vmaxnmq_f32, vmaxnmavq_f32, vmaxnmaq_f32, veorq_f32, vcmulq_rot90_f32, vcmulq_rot270_f32, vcmulq_rot180_f32, vcmulq_f32, vcaddq_rot90_f32, vcaddq_rot270_f32, vbicq_f32, vandq_f32, vaddq_n_f32, vabdq_f32, vshlltq_n_s16, vshllbq_n_s16, vorrq_n_s32, vbicq_n_s32, vrmlaldavhq_u32, vctp8q_m, vctp64q_m, vctp32q_m, vctp16q_m, vaddlvaq_u32, vrmlsldavhxq_s32, vrmlsldavhq_s32, vrmlaldavhxq_s32, vrmlaldavhq_s32, vcvttq_f16_f32, vcvtbq_f16_f32, vaddlvaq_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics The above intrinsics are defined using the already defined builtin qualifiers BINOP_NONE_NONE_IMM, BINOP_NONE_NONE_NONE, BINOP_UNONE_NONE_NONE, BINOP_UNONE_UNONE_IMM, BINOP_UNONE_UNONE_NONE, BINOP_UNONE_UNONE_UNONE. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vqmovntq_u16): Define macro. (vqmovnbq_u16): Likewise. (vmulltq_poly_p8): Likewise. (vmullbq_poly_p8): Likewise. (vmovntq_u16): Likewise. (vmovnbq_u16): Likewise. (vmlaldavxq_u16): Likewise. (vmlaldavq_u16): Likewise. (vqmovuntq_s16): Likewise. (vqmovunbq_s16): Likewise. (vshlltq_n_u8): Likewise. (vshllbq_n_u8): Likewise. (vorrq_n_u16): Likewise. (vbicq_n_u16): Likewise. (vcmpneq_n_f16): Likewise. (vcmpneq_f16): Likewise. (vcmpltq_n_f16): Likewise. (vcmpltq_f16): Likewise. (vcmpleq_n_f16): Likewise. (vcmpleq_f16): Likewise. (vcmpgtq_n_f16): Likewise. (vcmpgtq_f16): Likewise. (vcmpgeq_n_f16): Likewise. (vcmpgeq_f16): Likewise. (vcmpeqq_n_f16): Likewise. (vcmpeqq_f16): Likewise. (vsubq_f16): Likewise. (vqmovntq_s16): Likewise. (vqmovnbq_s16): Likewise. (vqdmulltq_s16): Likewise. (vqdmulltq_n_s16): Likewise. (vqdmullbq_s16): Likewise. (vqdmullbq_n_s16): Likewise. (vorrq_f16): Likewise. (vornq_f16): Likewise. (vmulq_n_f16): Likewise. (vmulq_f16): Likewise. (vmovntq_s16): Likewise. (vmovnbq_s16): Likewise. (vmlsldavxq_s16): Likewise. (vmlsldavq_s16): Likewise. (vmlaldavxq_s16): Likewise. (vmlaldavq_s16): Likewise. (vminnmvq_f16): Likewise. (vminnmq_f16): Likewise. (vminnmavq_f16): Likewise. (vminnmaq_f16): Likewise. (vmaxnmvq_f16): Likewise. (vmaxnmq_f16): Likewise. (vmaxnmavq_f16): Likewise. (vmaxnmaq_f16): Likewise. (veorq_f16): Likewise. (vcmulq_rot90_f16): Likewise. (vcmulq_rot270_f16): Likewise. (vcmulq_rot180_f16): Likewise. (vcmulq_f16): Likewise. (vcaddq_rot90_f16): Likewise. (vcaddq_rot270_f16): Likewise. (vbicq_f16): Likewise. (vandq_f16): Likewise. (vaddq_n_f16): Likewise. (vabdq_f16): Likewise. (vshlltq_n_s8): Likewise. (vshllbq_n_s8): Likewise
[PATCH][ARM][GCC][4/2x]: MVE intrinsics with binary operands.
-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Ra" and "Rg" are added. Ra checks the constant is with in the range of 0 to 7 where as Rg checks that the constant is one among 1, 2, 4 and 8. Also a new predicates "mve_imm_7" and "mve_imm_selective_upto_8" are added, to check the the matching constraint Ra and Rg respectively. The above intrinsics are defined using the already defined builtin qualifiers BINOP_NONE_NONE_IMM, BINOP_NONE_NONE_NONE, BINOP_NONE_NONE_UNONE, BINOP_UNONE_NONE_IMM, BINOP_UNONE_NONE_NONE, BINOP_UNONE_UNONE_IMM, BINOP_UNONE_UNONE_NONE, BINOP_UNONE_UNONE_UNONE. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vsubq_u8): Define macro. (vsubq_n_u8): Likewise. (vrmulhq_u8): Likewise. (vrhaddq_u8): Likewise. (vqsubq_u8): Likewise. (vqsubq_n_u8): Likewise. (vqaddq_u8): Likewise. (vqaddq_n_u8): Likewise. (vorrq_u8): Likewise. (vornq_u8): Likewise. (vmulq_u8): Likewise. (vmulq_n_u8): Likewise. (vmulltq_int_u8): Likewise. (vmullbq_int_u8): Likewise. (vmulhq_u8): Likewise. (vmladavq_u8): Likewise. (vminvq_u8): Likewise. (vminq_u8): Likewise. (vmaxvq_u8): Likewise. (vmaxq_u8): Likewise. (vhsubq_u8): Likewise. (vhsubq_n_u8): Likewise. (vhaddq_u8): Likewise. (vhaddq_n_u8): Likewise. (veorq_u8): Likewise. (vcmpneq_n_u8): Likewise. (vcmphiq_u8): Likewise. (vcmphiq_n_u8): Likewise. (vcmpeqq_u8): Likewise. (vcmpeqq_n_u8): Likewise. (vcmpcsq_u8): Likewise. (vcmpcsq_n_u8): Likewise. (vcaddq_rot90_u8): Likewise. (vcaddq_rot270_u8): Likewise. (vbicq_u8): Likewise. (vandq_u8): Likewise. (vaddvq_p_u8): Likewise. (vaddvaq_u8): Likewise. (vaddq_n_u8): Likewise. (vabdq_u8): Likewise. (vshlq_r_u8): Likewise. (vrshlq_u8): Likewise. (vrshlq_n_u8): Likewise. (vqshlq_u8): Likewise. (vqshlq_r_u8): Likewise. (vqrshlq_u8): Likewise. (vqrshlq_n_u8): Likewise. (vminavq_s8): Likewise. (vminaq_s8): Likewise. (vmaxavq_s8): Likewise. (vmaxaq_s8): Likewise. (vbrsrq_n_u8): Likewise. (vshlq_n_u8): Likewise. (vrshrq_n_u8): Likewise. (vqshlq_n_u8): Likewise. (vcmpneq_n_s8): Likewise. (vcmpltq_s8): Likewise. (vcmpltq_n_s8): Likewise. (vcmpleq_s8): Likewise. (vcmpleq_n_s8): Likewise. (vcmpgtq_s8): Likewise. (vcmpgtq_n_s8): Likewise. (vcmpgeq_s8): Likewise. (vcmpgeq_n_s8): Likewise. (vcmpeqq_s8): Likewise. (vcmpeqq_n_s8): Likewise. (vqshluq_n_s8): Likewise. (vaddvq_p_s8): Likewise. (vsubq_s8): Likewise. (vsubq_n_s8): Likewise. (vshlq_r_s8): Likewise. (vrshlq_s8): Likewise. (vrshlq_n_s8): Likewise. (vrmulhq_s8): Likewise. (vrhaddq_s8): Likewise. (vqsubq_s8): Likewise. (vqsubq_n_s8): Likewise. (vqshlq_s8): Likewise. (vqshlq_r_s8): Likewise. (vqrshlq_s8): Likewise. (vqrshlq_n_s8): Likewise. (vqrdmulhq_s8): Likewise. (vqrdmulhq_n_s8): Likewise. (vqdmulhq_s8): Likewise. (vqdmulhq_n_s8): Likewise. (vqaddq_s8): Likewise. (vqaddq_n_s8): Likewise. (vorrq_s8): Likewise. (vornq_s8): Likewise. (vmulq_s8): Likewise. (vmulq_n_s8): Likewise. (vmulltq_int_s8): Likewise. (vmullbq_int_s8): Likewise. (vmulhq_s8): Likewise. (vmlsdavxq_s8): Likewise. (vmlsdavq_s8): Likewise. (vmladavxq_s8): Likewise. (vmladavq_s8): Likewise. (vminvq_s8): Likewise. (vminq_s8): Likewise. (vmaxvq_s8): Likewise. (vmaxq_s8): Likewise. (vhsubq_s8): Likewise. (vhsubq_n_s8): Likewise. (vhcaddq_rot90_s8): Likewise. (vhcaddq_rot270_s8): Likewise. (vhaddq_s8): Likewise. (vhaddq_n_s8): Likewise. (veorq_s8): Likewise. (vcaddq_rot90_s8): Likewise. (vcaddq_rot270_s8): Likewise. (vbrsrq_n_s8): Likewise. (vbicq_s8): Likewise. (vandq_s8): Likewise. (vaddvaq_s8): Likewise. (vaddq_n_s8): Likewise. (vabdq_s8): Likewise. (vshlq_n_s8): Likewise. (vrshrq_n_s8): Likewise. (vqshlq_n_s8): Likewise. (vsubq_u16): Likewise. (vsubq_n_u16): Likewise. (vrmulhq_u16): Likewise. (vrhaddq_u16): Likewise. (vqsubq_u16): Likewise. (vqsubq_n_u16): Likewise. (vqaddq_u16): Li
[PATCH][ARM][GCC][3/3x]: MVE intrinsics with ternary operands.
Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vrmlaldavhaxq_s32, vrmlsldavhaq_s32, vrmlsldavhaxq_s32, vaddlvaq_p_s32, vcvtbq_m_f16_f32, vcvtbq_m_f32_f16, vcvttq_m_f16_f32, vcvttq_m_f32_f16, vrev16q_m_s8, vrev32q_m_f16, vrmlaldavhq_p_s32, vrmlaldavhxq_p_s32, vrmlsldavhq_p_s32, vrmlsldavhxq_p_s32, vaddlvaq_p_u32, vrev16q_m_u8, vrmlaldavhq_p_u32, vmvnq_m_n_s16, vorrq_m_n_s16, vqrshrntq_n_s16, vqshrnbq_n_s16, vqshrntq_n_s16, vrshrnbq_n_s16, vrshrntq_n_s16, vshrnbq_n_s16, vshrntq_n_s16, vcmlaq_f16, vcmlaq_rot180_f16, vcmlaq_rot270_f16, vcmlaq_rot90_f16, vfmaq_f16, vfmaq_n_f16, vfmasq_n_f16, vfmsq_f16, vmlaldavaq_s16, vmlaldavaxq_s16, vmlsldavaq_s16, vmlsldavaxq_s16, vabsq_m_f16, vcvtmq_m_s16_f16, vcvtnq_m_s16_f16, vcvtpq_m_s16_f16, vcvtq_m_s16_f16, vdupq_m_n_f16, vmaxnmaq_m_f16, vmaxnmavq_p_f16, vmaxnmvq_p_f16, vminnmaq_m_f16, vminnmavq_p_f16, vminnmvq_p_f16, vmlaldavq_p_s16, vmlaldavxq_p_s16, vmlsldavq_p_s16, vmlsldavxq_p_s16, vmovlbq_m_s8, vmovltq_m_s8, vmovnbq_m_s16, vmovntq_m_s16, vnegq_m_f16, vpselq_f16, vqmovnbq_m_s16, vqmovntq_m_s16, vrev32q_m_s8, vrev64q_m_f16, vrndaq_m_f16, vrndmq_m_f16, vrndnq_m_f16, vrndpq_m_f16, vrndq_m_f16, vrndxq_m_f16, vcmpeqq_m_n_f16, vcmpgeq_m_f16, vcmpgeq_m_n_f16, vcmpgtq_m_f16, vcmpgtq_m_n_f16, vcmpleq_m_f16, vcmpleq_m_n_f16, vcmpltq_m_f16, vcmpltq_m_n_f16, vcmpneq_m_f16, vcmpneq_m_n_f16, vmvnq_m_n_u16, vorrq_m_n_u16, vqrshruntq_n_s16, vqshrunbq_n_s16, vqshruntq_n_s16, vcvtmq_m_u16_f16, vcvtnq_m_u16_f16, vcvtpq_m_u16_f16, vcvtq_m_u16_f16, vqmovunbq_m_s16, vqmovuntq_m_s16, vqrshrntq_n_u16, vqshrnbq_n_u16, vqshrntq_n_u16, vrshrnbq_n_u16, vrshrntq_n_u16, vshrnbq_n_u16, vshrntq_n_u16, vmlaldavaq_u16, vmlaldavaxq_u16, vmlaldavq_p_u16, vmlaldavxq_p_u16, vmovlbq_m_u8, vmovltq_m_u8, vmovnbq_m_u16, vmovntq_m_u16, vqmovnbq_m_u16, vqmovntq_m_u16, vrev32q_m_u8, vmvnq_m_n_s32, vorrq_m_n_s32, vqrshrntq_n_s32, vqshrnbq_n_s32, vqshrntq_n_s32, vrshrnbq_n_s32, vrshrntq_n_s32, vshrnbq_n_s32, vshrntq_n_s32, vcmlaq_f32, vcmlaq_rot180_f32, vcmlaq_rot270_f32, vcmlaq_rot90_f32, vfmaq_f32, vfmaq_n_f32, vfmasq_n_f32, vfmsq_f32, vmlaldavaq_s32, vmlaldavaxq_s32, vmlsldavaq_s32, vmlsldavaxq_s32, vabsq_m_f32, vcvtmq_m_s32_f32, vcvtnq_m_s32_f32, vcvtpq_m_s32_f32, vcvtq_m_s32_f32, vdupq_m_n_f32, vmaxnmaq_m_f32, vmaxnmavq_p_f32, vmaxnmvq_p_f32, vminnmaq_m_f32, vminnmavq_p_f32, vminnmvq_p_f32, vmlaldavq_p_s32, vmlaldavxq_p_s32, vmlsldavq_p_s32, vmlsldavxq_p_s32, vmovlbq_m_s16, vmovltq_m_s16, vmovnbq_m_s32, vmovntq_m_s32, vnegq_m_f32, vpselq_f32, vqmovnbq_m_s32, vqmovntq_m_s32, vrev32q_m_s16, vrev64q_m_f32, vrndaq_m_f32, vrndmq_m_f32, vrndnq_m_f32, vrndpq_m_f32, vrndq_m_f32, vrndxq_m_f32, vcmpeqq_m_n_f32, vcmpgeq_m_f32, vcmpgeq_m_n_f32, vcmpgtq_m_f32, vcmpgtq_m_n_f32, vcmpleq_m_f32, vcmpleq_m_n_f32, vcmpltq_m_f32, vcmpltq_m_n_f32, vcmpneq_m_f32, vcmpneq_m_n_f32, vmvnq_m_n_u32, vorrq_m_n_u32, vqrshruntq_n_s32, vqshrunbq_n_s32, vqshruntq_n_s32, vcvtmq_m_u32_f32, vcvtnq_m_u32_f32, vcvtpq_m_u32_f32, vcvtq_m_u32_f32, vqmovunbq_m_s32, vqmovuntq_m_s32, vqrshrntq_n_u32, vqshrnbq_n_u32, vqshrntq_n_u32, vrshrnbq_n_u32, vrshrntq_n_u32, vshrnbq_n_u32, vshrntq_n_u32, vmlaldavaq_u32, vmlaldavaxq_u32, vmlaldavq_p_u32, vmlaldavxq_p_u32, vmovlbq_m_u16, vmovltq_m_u16, vmovnbq_m_u32, vmovntq_m_u32, vqmovnbq_m_u32, vqmovntq_m_u32, vrev32q_m_u16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-29 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vrmlaldavhaxq_s32): Define macro. (vrmlsldavhaq_s32): Likewise. (vrmlsldavhaxq_s32): Likewise. (vaddlvaq_p_s32): Likewise. (vcvtbq_m_f16_f32): Likewise. (vcvtbq_m_f32_f16): Likewise. (vcvttq_m_f16_f32): Likewise. (vcvttq_m_f32_f16): Likewise. (vrev16q_m_s8): Likewise. (vrev32q_m_f16): Likewise. (vrmlaldavhq_p_s32): Likewise. (vrmlaldavhxq_p_s32): Likewise. (vrmlsldavhq_p_s32): Likewise. (vrmlsldavhxq_p_s32): Likewise. (vaddlvaq_p_u32): Likewise. (vrev16q_m_u8): Likewise. (vrmlaldavhq_p_u32): Likewise. (vmvnq_m_n_s16): Likewise. (vorrq_m_n_s16): Likewise. (vqrshrntq_n_s16): Likewise. (vqshrnbq_n_s16): Likewise. (vqshrntq_n_s16): Likewise. (vrshrnbq_n_s16): Likewise. (vrshrntq_n_s16): Likewise. (vshrnbq_n_s16): Likewise. (vshrntq_n_s16): Likewise. (vcmlaq_f16): Likewise. (vcmlaq_rot180_f16): Likewise. (vcmlaq_rot270_f16): Likewise. (vcmlaq_rot90_f16): Likewise. (vfmaq_f16): Likewise. (vfmaq_n_f16): Likewise. (vfmasq_n_f16): Likewise. (vfmsq_f16
[PATCH][ARM][GCC][2/3x]: MVE intrinsics with ternary operands.
Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vpselq_u8, vpselq_s8, vrev64q_m_u8, vqrdmlashq_n_u8, vqrdmlahq_n_u8, vqdmlahq_n_u8, vmvnq_m_u8, vmlasq_n_u8, vmlaq_n_u8, vmladavq_p_u8, vmladavaq_u8, vminvq_p_u8, vmaxvq_p_u8, vdupq_m_n_u8, vcmpneq_m_u8, vcmpneq_m_n_u8, vcmphiq_m_u8, vcmphiq_m_n_u8, vcmpeqq_m_u8, vcmpeqq_m_n_u8, vcmpcsq_m_u8, vcmpcsq_m_n_u8, vclzq_m_u8, vaddvaq_p_u8, vsriq_n_u8, vsliq_n_u8, vshlq_m_r_u8, vrshlq_m_n_u8, vqshlq_m_r_u8, vqrshlq_m_n_u8, vminavq_p_s8, vminaq_m_s8, vmaxavq_p_s8, vmaxaq_m_s8, vcmpneq_m_s8, vcmpneq_m_n_s8, vcmpltq_m_s8, vcmpltq_m_n_s8, vcmpleq_m_s8, vcmpleq_m_n_s8, vcmpgtq_m_s8, vcmpgtq_m_n_s8, vcmpgeq_m_s8, vcmpgeq_m_n_s8, vcmpeqq_m_s8, vcmpeqq_m_n_s8, vshlq_m_r_s8, vrshlq_m_n_s8, vrev64q_m_s8, vqshlq_m_r_s8, vqrshlq_m_n_s8, vqnegq_m_s8, vqabsq_m_s8, vnegq_m_s8, vmvnq_m_s8, vmlsdavxq_p_s8, vmlsdavq_p_s8, vmladavxq_p_s8, vmladavq_p_s8, vminvq_p_s8, vmaxvq_p_s8, vdupq_m_n_s8, vclzq_m_s8, vclsq_m_s8, vaddvaq_p_s8, vabsq_m_s8, vqrdmlsdhxq_s8, vqrdmlsdhq_s8, vqrdmlashq_n_s8, vqrdmlahq_n_s8, vqrdmladhxq_s8, vqrdmladhq_s8, vqdmlsdhxq_s8, vqdmlsdhq_s8, vqdmlahq_n_s8, vqdmladhxq_s8, vqdmladhq_s8, vmlsdavaxq_s8, vmlsdavaq_s8, vmlasq_n_s8, vmlaq_n_s8, vmladavaxq_s8, vmladavaq_s8, vsriq_n_s8, vsliq_n_s8, vpselq_u16, vpselq_s16, vrev64q_m_u16, vqrdmlashq_n_u16, vqrdmlahq_n_u16, vqdmlahq_n_u16, vmvnq_m_u16, vmlasq_n_u16, vmlaq_n_u16, vmladavq_p_u16, vmladavaq_u16, vminvq_p_u16, vmaxvq_p_u16, vdupq_m_n_u16, vcmpneq_m_u16, vcmpneq_m_n_u16, vcmphiq_m_u16, vcmphiq_m_n_u16, vcmpeqq_m_u16, vcmpeqq_m_n_u16, vcmpcsq_m_u16, vcmpcsq_m_n_u16, vclzq_m_u16, vaddvaq_p_u16, vsriq_n_u16, vsliq_n_u16, vshlq_m_r_u16, vrshlq_m_n_u16, vqshlq_m_r_u16, vqrshlq_m_n_u16, vminavq_p_s16, vminaq_m_s16, vmaxavq_p_s16, vmaxaq_m_s16, vcmpneq_m_s16, vcmpneq_m_n_s16, vcmpltq_m_s16, vcmpltq_m_n_s16, vcmpleq_m_s16, vcmpleq_m_n_s16, vcmpgtq_m_s16, vcmpgtq_m_n_s16, vcmpgeq_m_s16, vcmpgeq_m_n_s16, vcmpeqq_m_s16, vcmpeqq_m_n_s16, vshlq_m_r_s16, vrshlq_m_n_s16, vrev64q_m_s16, vqshlq_m_r_s16, vqrshlq_m_n_s16, vqnegq_m_s16, vqabsq_m_s16, vnegq_m_s16, vmvnq_m_s16, vmlsdavxq_p_s16, vmlsdavq_p_s16, vmladavxq_p_s16, vmladavq_p_s16, vminvq_p_s16, vmaxvq_p_s16, vdupq_m_n_s16, vclzq_m_s16, vclsq_m_s16, vaddvaq_p_s16, vabsq_m_s16, vqrdmlsdhxq_s16, vqrdmlsdhq_s16, vqrdmlashq_n_s16, vqrdmlahq_n_s16, vqrdmladhxq_s16, vqrdmladhq_s16, vqdmlsdhxq_s16, vqdmlsdhq_s16, vqdmlahq_n_s16, vqdmladhxq_s16, vqdmladhq_s16, vmlsdavaxq_s16, vmlsdavaq_s16, vmlasq_n_s16, vmlaq_n_s16, vmladavaxq_s16, vmladavaq_s16, vsriq_n_s16, vsliq_n_s16, vpselq_u32, vpselq_s32, vrev64q_m_u32, vqrdmlashq_n_u32, vqrdmlahq_n_u32, vqdmlahq_n_u32, vmvnq_m_u32, vmlasq_n_u32, vmlaq_n_u32, vmladavq_p_u32, vmladavaq_u32, vminvq_p_u32, vmaxvq_p_u32, vdupq_m_n_u32, vcmpneq_m_u32, vcmpneq_m_n_u32, vcmphiq_m_u32, vcmphiq_m_n_u32, vcmpeqq_m_u32, vcmpeqq_m_n_u32, vcmpcsq_m_u32, vcmpcsq_m_n_u32, vclzq_m_u32, vaddvaq_p_u32, vsriq_n_u32, vsliq_n_u32, vshlq_m_r_u32, vrshlq_m_n_u32, vqshlq_m_r_u32, vqrshlq_m_n_u32, vminavq_p_s32, vminaq_m_s32, vmaxavq_p_s32, vmaxaq_m_s32, vcmpneq_m_s32, vcmpneq_m_n_s32, vcmpltq_m_s32, vcmpltq_m_n_s32, vcmpleq_m_s32, vcmpleq_m_n_s32, vcmpgtq_m_s32, vcmpgtq_m_n_s32, vcmpgeq_m_s32, vcmpgeq_m_n_s32, vcmpeqq_m_s32, vcmpeqq_m_n_s32, vshlq_m_r_s32, vrshlq_m_n_s32, vrev64q_m_s32, vqshlq_m_r_s32, vqrshlq_m_n_s32, vqnegq_m_s32, vqabsq_m_s32, vnegq_m_s32, vmvnq_m_s32, vmlsdavxq_p_s32, vmlsdavq_p_s32, vmladavxq_p_s32, vmladavq_p_s32, vminvq_p_s32, vmaxvq_p_s32, vdupq_m_n_s32, vclzq_m_s32, vclsq_m_s32, vaddvaq_p_s32, vabsq_m_s32, vqrdmlsdhxq_s32, vqrdmlsdhq_s32, vqrdmlashq_n_s32, vqrdmlahq_n_s32, vqrdmladhxq_s32, vqrdmladhq_s32, vqdmlsdhxq_s32, vqdmlsdhq_s32, vqdmlahq_n_s32, vqdmladhxq_s32, vqdmladhq_s32, vmlsdavaxq_s32, vmlsdavaq_s32, vmlasq_n_s32, vmlaq_n_s32, vmladavaxq_s32, vmladavaq_s32, vsriq_n_s32, vsliq_n_s32, vpselq_u64, vpselq_s64. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics In this patch new constraints "Rc" and "Re" are added, which checks the constant is with in the range of 0 to 15 and 0 to 31 respectively. Also a new predicates "mve_imm_15" and "mve_imm_31" are added, to check the the matching constraint Rc and Re respectively. Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-25 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vpselq_u8): Define macro. (vpselq_s8): Likewise. (vrev64q_m_u8): Likewise. (vqrdmlashq_n_u8): Likewise. (vqrdmlahq_n_u8): Likewise. (vqdmlahq_n_u8): Likewise. (vmvnq_m_u8): Likewise. (vmlasq_n_u8): Likewise. (vmlaq_n_u8): Likewise. (vmladavq_p_u8): Likewise. (vmladavaq_u8): Likewise. (vmin
[PATCH][ARM][GCC][3/4x]: MVE intrinsics with quaternary operands.
Hello, This patch supports following MVE ACLE intrinsics with quaternary operands. vmlaldavaq_p_s16, vmlaldavaq_p_s32, vmlaldavaq_p_u16, vmlaldavaq_p_u32, vmlaldavaxq_p_s16, vmlaldavaxq_p_s32, vmlaldavaxq_p_u16, vmlaldavaxq_p_u32, vmlsldavaq_p_s16, vmlsldavaq_p_s32, vmlsldavaxq_p_s16, vmlsldavaxq_p_s32, vmullbq_poly_m_p16, vmullbq_poly_m_p8, vmulltq_poly_m_p16, vmulltq_poly_m_p8, vqdmullbq_m_n_s16, vqdmullbq_m_n_s32, vqdmullbq_m_s16, vqdmullbq_m_s32, vqdmulltq_m_n_s16, vqdmulltq_m_n_s32, vqdmulltq_m_s16, vqdmulltq_m_s32, vqrshrnbq_m_n_s16, vqrshrnbq_m_n_s32, vqrshrnbq_m_n_u16, vqrshrnbq_m_n_u32, vqrshrntq_m_n_s16, vqrshrntq_m_n_s32, vqrshrntq_m_n_u16, vqrshrntq_m_n_u32, vqrshrunbq_m_n_s16, vqrshrunbq_m_n_s32, vqrshruntq_m_n_s16, vqrshruntq_m_n_s32, vqshrnbq_m_n_s16, vqshrnbq_m_n_s32, vqshrnbq_m_n_u16, vqshrnbq_m_n_u32, vqshrntq_m_n_s16, vqshrntq_m_n_s32, vqshrntq_m_n_u16, vqshrntq_m_n_u32, vqshrunbq_m_n_s16, vqshrunbq_m_n_s32, vqshruntq_m_n_s16, vqshruntq_m_n_s32, vrmlaldavhaq_p_s32, vrmlaldavhaq_p_u32, vrmlaldavhaxq_p_s32, vrmlsldavhaq_p_s32, vrmlsldavhaxq_p_s32, vrshrnbq_m_n_s16, vrshrnbq_m_n_s32, vrshrnbq_m_n_u16, vrshrnbq_m_n_u32, vrshrntq_m_n_s16, vrshrntq_m_n_s32, vrshrntq_m_n_u16, vrshrntq_m_n_u32, vshllbq_m_n_s16, vshllbq_m_n_s8, vshllbq_m_n_u16, vshllbq_m_n_u8, vshlltq_m_n_s16, vshlltq_m_n_s8, vshlltq_m_n_u16, vshlltq_m_n_u8, vshrnbq_m_n_s16, vshrnbq_m_n_s32, vshrnbq_m_n_u16, vshrnbq_m_n_u32, vshrntq_m_n_s16, vshrntq_m_n_s32, vshrntq_m_n_u16, vshrntq_m_n_u32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-31 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-protos.h (arm_mve_immediate_check): * config/arm/arm.c (arm_mve_immediate_check): Define fuction to check mode and interger value. * config/arm/arm_mve.h (vmlaldavaq_p_s32): Define macro. (vmlaldavaq_p_s16): Likewise. (vmlaldavaq_p_u32): Likewise. (vmlaldavaq_p_u16): Likewise. (vmlaldavaxq_p_s32): Likewise. (vmlaldavaxq_p_s16): Likewise. (vmlaldavaxq_p_u32): Likewise. (vmlaldavaxq_p_u16): Likewise. (vmlsldavaq_p_s32): Likewise. (vmlsldavaq_p_s16): Likewise. (vmlsldavaxq_p_s32): Likewise. (vmlsldavaxq_p_s16): Likewise. (vmullbq_poly_m_p8): Likewise. (vmullbq_poly_m_p16): Likewise. (vmulltq_poly_m_p8): Likewise. (vmulltq_poly_m_p16): Likewise. (vqdmullbq_m_n_s32): Likewise. (vqdmullbq_m_n_s16): Likewise. (vqdmullbq_m_s32): Likewise. (vqdmullbq_m_s16): Likewise. (vqdmulltq_m_n_s32): Likewise. (vqdmulltq_m_n_s16): Likewise. (vqdmulltq_m_s32): Likewise. (vqdmulltq_m_s16): Likewise. (vqrshrnbq_m_n_s32): Likewise. (vqrshrnbq_m_n_s16): Likewise. (vqrshrnbq_m_n_u32): Likewise. (vqrshrnbq_m_n_u16): Likewise. (vqrshrntq_m_n_s32): Likewise. (vqrshrntq_m_n_s16): Likewise. (vqrshrntq_m_n_u32): Likewise. (vqrshrntq_m_n_u16): Likewise. (vqrshrunbq_m_n_s32): Likewise. (vqrshrunbq_m_n_s16): Likewise. (vqrshruntq_m_n_s32): Likewise. (vqrshruntq_m_n_s16): Likewise. (vqshrnbq_m_n_s32): Likewise. (vqshrnbq_m_n_s16): Likewise. (vqshrnbq_m_n_u32): Likewise. (vqshrnbq_m_n_u16): Likewise. (vqshrntq_m_n_s32): Likewise. (vqshrntq_m_n_s16): Likewise. (vqshrntq_m_n_u32): Likewise. (vqshrntq_m_n_u16): Likewise. (vqshrunbq_m_n_s32): Likewise. (vqshrunbq_m_n_s16): Likewise. (vqshruntq_m_n_s32): Likewise. (vqshruntq_m_n_s16): Likewise. (vrmlaldavhaq_p_s32): Likewise. (vrmlaldavhaq_p_u32): Likewise. (vrmlaldavhaxq_p_s32): Likewise. (vrmlsldavhaq_p_s32): Likewise. (vrmlsldavhaxq_p_s32): Likewise. (vrshrnbq_m_n_s32): Likewise. (vrshrnbq_m_n_s16): Likewise. (vrshrnbq_m_n_u32): Likewise. (vrshrnbq_m_n_u16): Likewise. (vrshrntq_m_n_s32): Likewise. (vrshrntq_m_n_s16): Likewise. (vrshrntq_m_n_u32): Likewise. (vrshrntq_m_n_u16): Likewise. (vshllbq_m_n_s8): Likewise. (vshllbq_m_n_s16): Likewise. (vshllbq_m_n_u8): Likewise. (vshllbq_m_n_u16): Likewise. (vshlltq_m_n_s8): Likewise. (vshlltq_m_n_s16): Likewise. (vshlltq_m_n_u8): Likewise. (vshlltq_m_n_u16): Likewise. (vshrnbq_m_n_s32): Likewise. (vshrnbq_m_n_s16): Likewise. (vshrnbq_m_n_u32): Likewise. (vshrnbq_m_n_u16): Likewise. (vshrntq_m_n_s32): Likewise. (vshrntq_m_n_s16): Likewise. (vshrntq_m_n_u32
[PATCH][ARM][GCC][4/4x]: MVE intrinsics with quaternary operands.
Hello, This patch supports following MVE ACLE intrinsics with quaternary operands. vabdq_m_f32, vabdq_m_f16, vaddq_m_f32, vaddq_m_f16, vaddq_m_n_f32, vaddq_m_n_f16, vandq_m_f32, vandq_m_f16, vbicq_m_f32, vbicq_m_f16, vbrsrq_m_n_f32, vbrsrq_m_n_f16, vcaddq_rot270_m_f32, vcaddq_rot270_m_f16, vcaddq_rot90_m_f32, vcaddq_rot90_m_f16, vcmlaq_m_f32, vcmlaq_m_f16, vcmlaq_rot180_m_f32, vcmlaq_rot180_m_f16, vcmlaq_rot270_m_f32, vcmlaq_rot270_m_f16, vcmlaq_rot90_m_f32, vcmlaq_rot90_m_f16, vcmulq_m_f32, vcmulq_m_f16, vcmulq_rot180_m_f32, vcmulq_rot180_m_f16, vcmulq_rot270_m_f32, vcmulq_rot270_m_f16, vcmulq_rot90_m_f32, vcmulq_rot90_m_f16, vcvtq_m_n_s32_f32, vcvtq_m_n_s16_f16, vcvtq_m_n_u32_f32, vcvtq_m_n_u16_f16, veorq_m_f32, veorq_m_f16, vfmaq_m_f32, vfmaq_m_f16, vfmaq_m_n_f32, vfmaq_m_n_f16, vfmasq_m_n_f32, vfmasq_m_n_f16, vfmsq_m_f32, vfmsq_m_f16, vmaxnmq_m_f32, vmaxnmq_m_f16, vminnmq_m_f32, vminnmq_m_f16, vmulq_m_f32, vmulq_m_f16, vmulq_m_n_f32, vmulq_m_n_f16, vornq_m_f32, vornq_m_f16, vorrq_m_f32, vorrq_m_f16, vsubq_m_f32, vsubq_m_f16, vsubq_m_n_f32, vsubq_m_n_f16. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-31 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vabdq_m_f32): Define macro. (vabdq_m_f16): Likewise. (vaddq_m_f32): Likewise. (vaddq_m_f16): Likewise. (vaddq_m_n_f32): Likewise. (vaddq_m_n_f16): Likewise. (vandq_m_f32): Likewise. (vandq_m_f16): Likewise. (vbicq_m_f32): Likewise. (vbicq_m_f16): Likewise. (vbrsrq_m_n_f32): Likewise. (vbrsrq_m_n_f16): Likewise. (vcaddq_rot270_m_f32): Likewise. (vcaddq_rot270_m_f16): Likewise. (vcaddq_rot90_m_f32): Likewise. (vcaddq_rot90_m_f16): Likewise. (vcmlaq_m_f32): Likewise. (vcmlaq_m_f16): Likewise. (vcmlaq_rot180_m_f32): Likewise. (vcmlaq_rot180_m_f16): Likewise. (vcmlaq_rot270_m_f32): Likewise. (vcmlaq_rot270_m_f16): Likewise. (vcmlaq_rot90_m_f32): Likewise. (vcmlaq_rot90_m_f16): Likewise. (vcmulq_m_f32): Likewise. (vcmulq_m_f16): Likewise. (vcmulq_rot180_m_f32): Likewise. (vcmulq_rot180_m_f16): Likewise. (vcmulq_rot270_m_f32): Likewise. (vcmulq_rot270_m_f16): Likewise. (vcmulq_rot90_m_f32): Likewise. (vcmulq_rot90_m_f16): Likewise. (vcvtq_m_n_s32_f32): Likewise. (vcvtq_m_n_s16_f16): Likewise. (vcvtq_m_n_u32_f32): Likewise. (vcvtq_m_n_u16_f16): Likewise. (veorq_m_f32): Likewise. (veorq_m_f16): Likewise. (vfmaq_m_f32): Likewise. (vfmaq_m_f16): Likewise. (vfmaq_m_n_f32): Likewise. (vfmaq_m_n_f16): Likewise. (vfmasq_m_n_f32): Likewise. (vfmasq_m_n_f16): Likewise. (vfmsq_m_f32): Likewise. (vfmsq_m_f16): Likewise. (vmaxnmq_m_f32): Likewise. (vmaxnmq_m_f16): Likewise. (vminnmq_m_f32): Likewise. (vminnmq_m_f16): Likewise. (vmulq_m_f32): Likewise. (vmulq_m_f16): Likewise. (vmulq_m_n_f32): Likewise. (vmulq_m_n_f16): Likewise. (vornq_m_f32): Likewise. (vornq_m_f16): Likewise. (vorrq_m_f32): Likewise. (vorrq_m_f16): Likewise. (vsubq_m_f32): Likewise. (vsubq_m_f16): Likewise. (vsubq_m_n_f32): Likewise. (vsubq_m_n_f16): Likewise. (__attribute__): Likewise. (__arm_vabdq_m_f32): Likewise. (__arm_vabdq_m_f16): Likewise. (__arm_vaddq_m_f32): Likewise. (__arm_vaddq_m_f16): Likewise. (__arm_vaddq_m_n_f32): Likewise. (__arm_vaddq_m_n_f16): Likewise. (__arm_vandq_m_f32): Likewise. (__arm_vandq_m_f16): Likewise. (__arm_vbicq_m_f32): Likewise. (__arm_vbicq_m_f16): Likewise. (__arm_vbrsrq_m_n_f32): Likewise. (__arm_vbrsrq_m_n_f16): Likewise. (__arm_vcaddq_rot270_m_f32): Likewise. (__arm_vcaddq_rot270_m_f16): Likewise. (__arm_vcaddq_rot90_m_f32): Likewise. (__arm_vcaddq_rot90_m_f16): Likewise. (__arm_vcmlaq_m_f32): Likewise. (__arm_vcmlaq_m_f16): Likewise. (__arm_vcmlaq_rot180_m_f32): Likewise. (__arm_vcmlaq_rot180_m_f16): Likewise. (__arm_vcmlaq_rot270_m_f32): Likewise. (__arm_vcmlaq_rot270_m_f16): Likewise. (__arm_vcmlaq_rot90_m_f32): Likewise. (__arm_vcmlaq_rot90_m_f16): Likewise. (__arm_vcmulq_m_f32): Likewise. (__arm_vcmulq_m_f16): Likewise. (__arm_vcmulq_rot180_m_f32): Define intrinsic. (__arm_vcmulq_rot180_m_f16): Likewise
[PATCH][ARM][GCC][1/3x]: MVE intrinsics with ternary operands.
Hello, This patch supports following MVE ACLE intrinsics with ternary operands. vabavq_s8, vabavq_s16, vabavq_s32, vbicq_m_n_s16, vbicq_m_n_s32, vbicq_m_n_u16, vbicq_m_n_u32, vcmpeqq_m_f16, vcmpeqq_m_f32, vcvtaq_m_s16_f16, vcvtaq_m_u16_f16, vcvtaq_m_s32_f32, vcvtaq_m_u32_f32, vcvtq_m_f16_s16, vcvtq_m_f16_u16, vcvtq_m_f32_s32, vcvtq_m_f32_u32, vqrshrnbq_n_s16, vqrshrnbq_n_u16, vqrshrnbq_n_s32, vqrshrnbq_n_u32, vqrshrunbq_n_s16, vqrshrunbq_n_s32, vrmlaldavhaq_s32, vrmlaldavhaq_u32, vshlcq_s8, vshlcq_u8, vshlcq_s16, vshlcq_u16, vshlcq_s32, vshlcq_u32, vabavq_s8, vabavq_s16, vabavq_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-23 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Define qualifier for ternary operands. (TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_IMM_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (TERNOP_NONE_NONE_NONE_NONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vabavq_s8): Define macro. (vabavq_s16): Likewise. (vabavq_s32): Likewise. (vbicq_m_n_s16): Likewise. (vbicq_m_n_s32): Likewise. (vbicq_m_n_u16): Likewise. (vbicq_m_n_u32): Likewise. (vcmpeqq_m_f16): Likewise. (vcmpeqq_m_f32): Likewise. (vcvtaq_m_s16_f16): Likewise. (vcvtaq_m_u16_f16): Likewise. (vcvtaq_m_s32_f32): Likewise. (vcvtaq_m_u32_f32): Likewise. (vcvtq_m_f16_s16): Likewise. (vcvtq_m_f16_u16): Likewise. (vcvtq_m_f32_s32): Likewise. (vcvtq_m_f32_u32): Likewise. (vqrshrnbq_n_s16): Likewise. (vqrshrnbq_n_u16): Likewise. (vqrshrnbq_n_s32): Likewise. (vqrshrnbq_n_u32): Likewise. (vqrshrunbq_n_s16): Likewise. (vqrshrunbq_n_s32): Likewise. (vrmlaldavhaq_s32): Likewise. (vrmlaldavhaq_u32): Likewise. (vshlcq_s8): Likewise. (vshlcq_u8): Likewise. (vshlcq_s16): Likewise. (vshlcq_u16): Likewise. (vshlcq_s32): Likewise. (vshlcq_u32): Likewise. (vabavq_u8): Likewise. (vabavq_u16): Likewise. (vabavq_u32): Likewise. (__arm_vabavq_s8): Define intrinsic. (__arm_vabavq_s16): Likewise. (__arm_vabavq_s32): Likewise. (__arm_vabavq_u8): Likewise. (__arm_vabavq_u16): Likewise. (__arm_vabavq_u32): Likewise. (__arm_vbicq_m_n_s16): Likewise. (__arm_vbicq_m_n_s32): Likewise. (__arm_vbicq_m_n_u16): Likewise. (__arm_vbicq_m_n_u32): Likewise. (__arm_vqrshrnbq_n_s16): Likewise. (__arm_vqrshrnbq_n_u16): Likewise. (__arm_vqrshrnbq_n_s32): Likewise. (__arm_vqrshrnbq_n_u32): Likewise. (__arm_vqrshrunbq_n_s16): Likewise. (__arm_vqrshrunbq_n_s32): Likewise. (__arm_vrmlaldavhaq_s32): Likewise. (__arm_vrmlaldavhaq_u32): Likewise. (__arm_vshlcq_s8): Likewise. (__arm_vshlcq_u8): Likewise. (__arm_vshlcq_s16): Likewise. (__arm_vshlcq_u16): Likewise. (__arm_vshlcq_s32): Likewise. (__arm_vshlcq_u32): Likewise. (__arm_vcmpeqq_m_f16): Likewise. (__arm_vcmpeqq_m_f32): Likewise. (__arm_vcvtaq_m_s16_f16): Likewise. (__arm_vcvtaq_m_u16_f16): Likewise. (__arm_vcvtaq_m_s32_f32): Likewise. (__arm_vcvtaq_m_u32_f32): Likewise. (__arm_vcvtq_m_f16_s16): Likewise. (__arm_vcvtq_m_f16_u16): Likewise. (__arm_vcvtq_m_f32_s32): Likewise. (__arm_vcvtq_m_f32_u32): Likewise. (vcvtaq_m): Define polymorphic variant. (vcvtq_m): Likewise. (vabavq): Likewise. (vshlcq): Likewise. (vbicq_m_n): Likewise. (vqrshrnbq_n): Likewise. (vqrshrunbq_n): Likewise. * config/arm/arm_mve_builtins.def (TERNOP_UNONE_UNONE_UNONE_IMM_QUALIFIERS): Use the builtin qualifer. (TERNOP_UNONE_UNONE_NONE_NONE_QUALIFIERS): Likewise. (TERNOP_UNONE_NONE_UNONE_IMM_QUALIFIERS): Likewise
[PATCH][ARM][GCC][2/4x]: MVE intrinsics with quaternary operands.
details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-30 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm_mve.h (vabdq_m_s8): Define macro. (vabdq_m_s32): Likewise. (vabdq_m_s16): Likewise. (vabdq_m_u8): Likewise. (vabdq_m_u32): Likewise. (vabdq_m_u16): Likewise. (vaddq_m_n_s8): Likewise. (vaddq_m_n_s32): Likewise. (vaddq_m_n_s16): Likewise. (vaddq_m_n_u8): Likewise. (vaddq_m_n_u32): Likewise. (vaddq_m_n_u16): Likewise. (vaddq_m_s8): Likewise. (vaddq_m_s32): Likewise. (vaddq_m_s16): Likewise. (vaddq_m_u8): Likewise. (vaddq_m_u32): Likewise. (vaddq_m_u16): Likewise. (vandq_m_s8): Likewise. (vandq_m_s32): Likewise. (vandq_m_s16): Likewise. (vandq_m_u8): Likewise. (vandq_m_u32): Likewise. (vandq_m_u16): Likewise. (vbicq_m_s8): Likewise. (vbicq_m_s32): Likewise. (vbicq_m_s16): Likewise. (vbicq_m_u8): Likewise. (vbicq_m_u32): Likewise. (vbicq_m_u16): Likewise. (vbrsrq_m_n_s8): Likewise. (vbrsrq_m_n_s32): Likewise. (vbrsrq_m_n_s16): Likewise. (vbrsrq_m_n_u8): Likewise. (vbrsrq_m_n_u32): Likewise. (vbrsrq_m_n_u16): Likewise. (vcaddq_rot270_m_s8): Likewise. (vcaddq_rot270_m_s32): Likewise. (vcaddq_rot270_m_s16): Likewise. (vcaddq_rot270_m_u8): Likewise. (vcaddq_rot270_m_u32): Likewise. (vcaddq_rot270_m_u16): Likewise. (vcaddq_rot90_m_s8): Likewise. (vcaddq_rot90_m_s32): Likewise. (vcaddq_rot90_m_s16): Likewise. (vcaddq_rot90_m_u8): Likewise. (vcaddq_rot90_m_u32): Likewise. (vcaddq_rot90_m_u16): Likewise. (veorq_m_s8): Likewise. (veorq_m_s32): Likewise. (veorq_m_s16): Likewise. (veorq_m_u8): Likewise. (veorq_m_u32): Likewise. (veorq_m_u16): Likewise. (vhaddq_m_n_s8): Likewise. (vhaddq_m_n_s32): Likewise. (vhaddq_m_n_s16): Likewise. (vhaddq_m_n_u8): Likewise. (vhaddq_m_n_u32): Likewise. (vhaddq_m_n_u16): Likewise. (vhaddq_m_s8): Likewise. (vhaddq_m_s32): Likewise. (vhaddq_m_s16): Likewise. (vhaddq_m_u8): Likewise. (vhaddq_m_u32): Likewise. (vhaddq_m_u16): Likewise. (vhcaddq_rot270_m_s8): Likewise. (vhcaddq_rot270_m_s32): Likewise. (vhcaddq_rot270_m_s16): Likewise. (vhcaddq_rot90_m_s8): Likewise. (vhcaddq_rot90_m_s32): Likewise. (vhcaddq_rot90_m_s16): Likewise. (vhsubq_m_n_s8): Likewise. (vhsubq_m_n_s32): Likewise. (vhsubq_m_n_s16): Likewise. (vhsubq_m_n_u8): Likewise. (vhsubq_m_n_u32): Likewise. (vhsubq_m_n_u16): Likewise. (vhsubq_m_s8): Likewise. (vhsubq_m_s32): Likewise. (vhsubq_m_s16): Likewise. (vhsubq_m_u8): Likewise. (vhsubq_m_u32): Likewise. (vhsubq_m_u16): Likewise. (vmaxq_m_s8): Likewise. (vmaxq_m_s32): Likewise. (vmaxq_m_s16): Likewise. (vmaxq_m_u8): Likewise. (vmaxq_m_u32): Likewise. (vmaxq_m_u16): Likewise. (vminq_m_s8): Likewise. (vminq_m_s32): Likewise. (vminq_m_s16): Likewise. (vminq_m_u8): Likewise. (vminq_m_u32): Likewise. (vminq_m_u16): Likewise. (vmladavaq_p_s8): Likewise. (vmladavaq_p_s32): Likewise. (vmladavaq_p_s16): Likewise. (vmladavaq_p_u8): Likewise. (vmladavaq_p_u32): Likewise. (vmladavaq_p_u16): Likewise. (vmladavaxq_p_s8): Likewise. (vmladavaxq_p_s32): Likewise. (vmladavaxq_p_s16): Likewise. (vmlaq_m_n_s8): Likewise. (vmlaq_m_n_s32): Likewise. (vmlaq_m_n_s16): Likewise. (vmlaq_m_n_u8): Likewise. (vmlaq_m_n_u32): Likewise. (vmlaq_m_n_u16): Likewise. (vmlasq_m_n_s8): Likewise. (vmlasq_m_n_s32): Likewise. (vmlasq_m_n_s16): Likewise. (vmlasq_m_n_u8): Likewise. (vmlasq_m_n_u32): Likewise. (vmlasq_m_n_u16): Likewise. (vmlsdavaq_p_s8): Likewise. (vmlsdavaq_p_s32): Likewise. (vmlsdavaq_p_s16): Likewise. (vmlsdavaxq_p_s8): Likewise. (vmlsdavaxq_p_s32): Likewise. (vmlsdavaxq_p_s16): Likewise. (vmulhq_m_s8): Likewise. (vmulhq_m_s32): Likewise. (vmulhq_m_s16): Likewise. (vmulhq_m_u8): Likewise. (vmulhq_m_u32): Likewise. (vmulhq_m_u16): Likewise. (vmullbq_int_m_s8): Likewise. (vmullbq_int_m_s32): Likewise. (vmullbq_int_m_s16): Likewise
[PATCH][ARM][GCC][1/4x]: MVE intrinsics with quaternary operands.
Hello, This patch supports following MVE ACLE intrinsics with quaternary operands. vsriq_m_n_s8, vsubq_m_s8, vsubq_x_s8, vcvtq_m_n_f16_u16, vcvtq_x_n_f16_u16, vqshluq_m_n_s8, vabavq_p_s8, vsriq_m_n_u8, vshlq_m_u8, vshlq_x_u8, vsubq_m_u8, vsubq_x_u8, vabavq_p_u8, vshlq_m_s8, vshlq_x_s8, vcvtq_m_n_f16_s16, vcvtq_x_n_f16_s16, vsriq_m_n_s16, vsubq_m_s16, vsubq_x_s16, vcvtq_m_n_f32_u32, vcvtq_x_n_f32_u32, vqshluq_m_n_s16, vabavq_p_s16, vsriq_m_n_u16, vshlq_m_u16, vshlq_x_u16, vsubq_m_u16, vsubq_x_u16, vabavq_p_u16, vshlq_m_s16, vshlq_x_s16, vcvtq_m_n_f32_s32, vcvtq_x_n_f32_s32, vsriq_m_n_s32, vsubq_m_s32, vsubq_x_s32, vqshluq_m_n_s32, vabavq_p_s32, vsriq_m_n_u32, vshlq_m_u32, vshlq_x_u32, vsubq_m_u32, vsubq_x_u32, vabavq_p_u32, vshlq_m_s32, vshlq_x_s32. Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details. [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics Regression tested on arm-none-eabi and found no regressions. Ok for trunk? Thanks, Srinath. gcc/ChangeLog: 2019-10-29 Andre Vieira Mihail Ionescu Srinath Parvathaneni * config/arm/arm-builtins.c (QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Define builtin qualifier. (QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise. * config/arm/arm_mve.h (vsriq_m_n_s8): Define macro. (vsubq_m_s8): Likewise. (vcvtq_m_n_f16_u16): Likewise. (vqshluq_m_n_s8): Likewise. (vabavq_p_s8): Likewise. (vsriq_m_n_u8): Likewise. (vshlq_m_u8): Likewise. (vsubq_m_u8): Likewise. (vabavq_p_u8): Likewise. (vshlq_m_s8): Likewise. (vcvtq_m_n_f16_s16): Likewise. (vsriq_m_n_s16): Likewise. (vsubq_m_s16): Likewise. (vcvtq_m_n_f32_u32): Likewise. (vqshluq_m_n_s16): Likewise. (vabavq_p_s16): Likewise. (vsriq_m_n_u16): Likewise. (vshlq_m_u16): Likewise. (vsubq_m_u16): Likewise. (vabavq_p_u16): Likewise. (vshlq_m_s16): Likewise. (vcvtq_m_n_f32_s32): Likewise. (vsriq_m_n_s32): Likewise. (vsubq_m_s32): Likewise. (vqshluq_m_n_s32): Likewise. (vabavq_p_s32): Likewise. (vsriq_m_n_u32): Likewise. (vshlq_m_u32): Likewise. (vsubq_m_u32): Likewise. (vabavq_p_u32): Likewise. (vshlq_m_s32): Likewise. (__arm_vsriq_m_n_s8): Define intrinsic. (__arm_vsubq_m_s8): Likewise. (__arm_vqshluq_m_n_s8): Likewise. (__arm_vabavq_p_s8): Likewise. (__arm_vsriq_m_n_u8): Likewise. (__arm_vshlq_m_u8): Likewise. (__arm_vsubq_m_u8): Likewise. (__arm_vabavq_p_u8): Likewise. (__arm_vshlq_m_s8): Likewise. (__arm_vsriq_m_n_s16): Likewise. (__arm_vsubq_m_s16): Likewise. (__arm_vqshluq_m_n_s16): Likewise. (__arm_vabavq_p_s16): Likewise. (__arm_vsriq_m_n_u16): Likewise. (__arm_vshlq_m_u16): Likewise. (__arm_vsubq_m_u16): Likewise. (__arm_vabavq_p_u16): Likewise. (__arm_vshlq_m_s16): Likewise. (__arm_vsriq_m_n_s32): Likewise. (__arm_vsubq_m_s32): Likewise. (__arm_vqshluq_m_n_s32): Likewise. (__arm_vabavq_p_s32): Likewise. (__arm_vsriq_m_n_u32): Likewise. (__arm_vshlq_m_u32): Likewise. (__arm_vsubq_m_u32): Likewise. (__arm_vabavq_p_u32): Likewise. (__arm_vshlq_m_s32): Likewise. (__arm_vcvtq_m_n_f16_u16): Likewise. (__arm_vcvtq_m_n_f16_s16): Likewise. (__arm_vcvtq_m_n_f32_u32): Likewise. (__arm_vcvtq_m_n_f32_s32): Likewise. (vcvtq_m_n): Define polymorphic variant. (vqshluq_m_n): Likewise. (vshlq_m): Likewise. (vsriq_m_n): Likewise. (vsubq_m): Likewise. (vabavq_p): Likewise. * config/arm/arm_mve_builtins.def (QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Use builtin qualifier. (QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Likewise. (QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Likewise. * config/arm/mve.md (VABAVQ_P): Define iterator. (VSHLQ_M): Likewise. (VSRIQ_M_N): Likewise