Hi Kyrill,

I have rebased this patch; please commit the following patch on my behalf.
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541826.html

Regards,
SRI.

________________________________
From: Gcc-patches <gcc-patches-boun...@gcc.gnu.org> on behalf of Srinath Parvathaneni <srinath.parvathan...@arm.com>
Sent: 16 March 2020 10:54
To: Kyrill Tkachov <kyrylo.tkac...@foss.arm.com>; gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.

Hi Kyrill,

> Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the other patches in this series)

I have bootstrapped this patch on arm-none-linux-gnueabihf and found no issues.
There is a problem with git commit rights; could you commit this patch on my behalf?

Regards,
SRI.

________________________________
From: Kyrill Tkachov <kyrylo.tkac...@foss.arm.com>
Sent: 12 March 2020 11:16
To: Srinath Parvathaneni <srinath.parvathan...@arm.com>; gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.

Hi Srinath,

On 3/10/20 6:19 PM, Srinath Parvathaneni wrote:
> Hello Kyrill,
>
> This patch addresses all the comments in patch version v2.
> (version v2)
> https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html
>
> ####
>
>
> Hello,
>
> This patch is part of the MVE ACLE intrinsics framework.
> This patch adds support to update (read/write) the APSR (Application
> Program Status Register) and FPSCR (Floating-point Status and Control
> Register) registers for MVE.
> This patch also enables thumb2 mov RTL patterns for MVE.
>
> A new feature bit vfp_base is added. This bit is enabled for all VFP,
> MVE and MVE with floating point extensions. This bit is used to enable
> the macro TARGET_VFP_BASE. Previously, all the VFP instructions, RTL
> patterns, status and control registers were guarded by TARGET_HARD_FLOAT.
> This patch changes that: the instructions, RTL patterns, status and
> control registers common to MVE and VFP are now guarded by the
> TARGET_VFP_BASE macro.
>
> The RTL patterns set_fpscr and get_fpscr are updated to use
> VFPCC_REGNUM because a few MVE intrinsics
> set/get the carry bit of the FPSCR register.
>
> Please refer to the Arm reference manual [1] for more details.
> [1] https://developer.arm.com/docs/ddi0553/latest
>
> Regression tested on targets arm-none-eabi and armeb-none-eabi and
> found no regressions.
>
> Ok for trunk?

Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the other patches in this series)

Thanks,
Kyrill

>
> Thanks,
> Srinath
> gcc/ChangeLog:
>
> 2020-03-06  Andre Vieira  <andre.simoesdiasvie...@arm.com>
>             Mihail Ionescu  <mihail.ione...@arm.com>
>             Srinath Parvathaneni  <srinath.parvathan...@arm.com>
>
>     * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base
>     feature bit is on and -mfpu=auto is passed as compiler option, do not
>     generate error on not finding any match fpu. Because in this case fpu
>     is not required.
>     * config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is
>     enabled for MVE and also for all VFP extensions.
>     (VFPv2): Modify fgroup to enable vfp_base feature bit whenever VFPv2
>     is enabled.
>     (MVE): Define fgroup to enable feature bits mve, vfp_base and
>     armv7em.
>     (MVE_FP): Define fgroup to enable feature bits is fgroup MVE and FPv5
>     along with feature bits mve_float.
>     (mve): Modify add options in armv8.1-m.main arch for MVE.
>     (mve.fp): Modify add options in armv8.1-m.main arch for MVE with
>     floating point.
>     * config/arm/arm.c (use_return_insn): Replace the
>     check with TARGET_VFP_BASE.
>     (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
>     TARGET_VFP_BASE.
>     (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT ||
>     TARGET_HAVE_MVE" with TARGET_VFP_BASE, to allow cost calculations
>     for copies in MVE as well.
>     (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
>     TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE
>     as well.
>     (arm_compute_frame_layout): Likewise.
> (arm_save_coproc_regs): Likewise. > (arm_fixed_condition_code_regs): Modify to enable using > VFPCC_REGNUM > in MVE as well. > (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || > TARGET_HAVE_MVE" > with equivalent macro TARGET_VFP_BASE. > (arm_expand_epilogue_apcs_frame): Likewise. > (arm_expand_epilogue): Likewise. > (arm_conditional_register_usage): Likewise. > (arm_declare_function_name): Add check to skip printing .fpu > directive > in assembly file when TARGET_VFP_BASE is enabled and > fpu_to_print is > "softvfp". > * config/arm/arm.h (TARGET_VFP_BASE): Define. > * config/arm/arm.md (arch): Add "mve" to arch. > (eq_attr "arch" "mve"): Enable on TARGET_HAVE_MVE is true. > (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT > || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE. > * config/arm/constraints.md (Uf): Define to allow modification > to FPCCR > in MVE. > * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify > target guard > to not allow for MVE. > * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Move to volatile > unspecs > enum. > (VUNSPEC_GET_FPSCR): Define. > * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR > and VMRS > instructions which move to general-purpose Register from > Floating-point > Special register and vice-versa. > (thumb2_movhi_fp16): Likewise. > (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions > along > with MCR and MRC instructions which set and get Floating-point > Status > and Control Register (FPSCR). > (movdi_vfp): Modify pattern to enable Single-precision scalar > float move > in MVE. > (thumb2_movdf_vfp): Modify pattern to enable Double-precision > scalar > float move patterns in MVE. > (thumb2_movsfcc_vfp): Modify pattern to enable single float > conditional > code move patterns of VFP also in MVE by adding > TARGET_VFP_BASE check. 
> (thumb2_movdfcc_vfp): Modify pattern to enable double float > conditional > code move patterns of VFP also in MVE by adding > TARGET_VFP_BASE check. > (push_multi_vfp): Add support to use VFP VPUSH pattern for MVE > by adding > TARGET_VFP_BASE check. > (set_fpscr): Add support to set FPSCR register for MVE. Modify > pattern > using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR > register. > (get_fpscr): Add support to get FPSCR register for MVE. Modify > pattern > using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR > register. > > gcc/testsuite/ChangeLog: > > 2020-03-06 Srinath Parvathaneni <srinath.parvathan...@arm.com> > > * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test. > * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise. > * gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise. > * gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise. > * gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise. > > > ############### Attachment also inlined for ease of reply > ############### > > > diff --git a/gcc/common/config/arm/arm-common.c > b/gcc/common/config/arm/arm-common.c > index > 30a2a1deb864ee22d48cebb08247176640524955..83cc68009ac16a89ab5515f19d4eb84f595e33f1 > 100644 > --- a/gcc/common/config/arm/arm-common.c > +++ b/gcc/common/config/arm/arm-common.c > @@ -1009,7 +1009,8 @@ arm_asm_auto_mfpu (int argc, const char **argv) > } > } > > - gcc_assert (i != TARGET_FPU_auto); > + gcc_assert (i != TARGET_FPU_auto > + || bitmap_bit_p (arm_active_target.isa, > isa_bit_vfp_base)); > } > > auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu=")); > diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in > index > 96f584da325172bd1460251e2de0ad679589d312..77b43090d69a599d8806cfcc02037e1bbed6e7a1 > 100644 > --- a/gcc/config/arm/arm-cpus.in > +++ b/gcc/config/arm/arm-cpus.in > @@ -135,6 +135,10 @@ define feature armv8_1m_main > # Floating point and Neon extensions. > # VFPv1 is not supported in GCC. 
> > +# This feature bit is enabled for all VFP, MVE and > +# MVE with floating point extensions. > +define feature vfp_base > + > # Vector floating point v2. > define feature vfpv2 > > @@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL > ALL_SIMD_EXTERNAL > > # List of all FPU bits to strip out if -mfpu is used to override the > # default. fp16 is deliberately missing from this list. > -define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl > ALL_SIMD_INTERNAL > +define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5 > fp16conv fp_dbl ALL_SIMD_INTERNAL > # Similarly, but including fp16 and other extensions that aren't part of > # -mfpu support. > define fgroup ALL_FPU_EXTERNAL fp16 bf16 > @@ -279,10 +283,12 @@ define fgroup ARMv8r ARMv8a > define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main > > # Useful combinations. > -define fgroup VFPv2 vfpv2 > +define fgroup VFPv2 vfp_base vfpv2 > define fgroup VFPv3 VFPv2 vfpv3 > define fgroup VFPv4 VFPv3 vfpv4 fp16conv > define fgroup FPv5 VFPv4 fpv5 > +define fgroup MVE mve vfp_base armv7em > +define fgroup MVE_FP MVE FPv5 fp16 mve_float > > define fgroup FP_DBL fp_dbl > define fgroup FP_D32 FP_DBL fp_d32 > @@ -699,8 +705,8 @@ begin arch armv8.1-m.main > option fp add FPv5 fp16 > option fp.dp add FPv5 FP_DBL fp16 > option nofp remove ALL_FP > - option mve add mve armv7em > - option mve.fp add mve FPv5 fp16 mve_float armv7em > + option mve add MVE > + option mve.fp add MVE_FP > end arch armv8.1-m.main > > begin arch iwmmxt > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > index > a0283ed62c8047fe1ccbbb9b639ad34771fe46c2..c7453412959f23bf25c2052b4e0bb6a95faf3163 > 100644 > --- a/gcc/config/arm/arm.h > +++ b/gcc/config/arm/arm.h > @@ -334,6 +334,19 @@ emission of floating point pcs attributes. */ > isa_bit_mve_float) \ > && !TARGET_GENERAL_REGS_ONLY) > > +/* MVE have few common instructions as VFP, like VLDM alias VPOP, > VLDR, VSTM > + alia VPUSH, VSTR and VMOV, VMSR and VMRS. 
In the same manner it > updates few > + registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and > MVFR2. All > + the VFP instructions, RTL patterns and register are guarded by > + TARGET_HARD_FLOAT. But the common instructions, RTL pattern and > registers > + between MVE and VFP will be guarded by the following macro > TARGET_VFP_BASE > + hereafter. */ > + > +#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \ > + && bitmap_bit_p (arm_active_target.isa, \ > + isa_bit_vfp_base) \ > + && !TARGET_GENERAL_REGS_ONLY) > + > /* Nonzero if integer division instructions supported. */ > #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \ > || (TARGET_THUMB && arm_arch_thumb_hwdiv)) > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index > c769104a93746cd7c02b46b82f1a8f8057b9ae62..b40904a40e0979af4285fdbd85bfae55abea25dd > 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling) > > /* Can't be done if any of the VFP regs are pushed, > since this also requires an insn. */ > - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE) > + if (TARGET_VFP_BASE) > for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++) > if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p > (regno)) > return 0; > @@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool > is_double) > return false; > > return (TARGET_32BIT && TARGET_HARD_FLOAT && > - (TARGET_VFP_DOUBLE || !is_double)); > + (TARGET_VFP_DOUBLE || !is_double)); > } > > /* Return true if an argument whose type is TYPE, or mode is MODE, is > @@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode, > rtx index, int strict_p) > > /* ??? Combine arm and thumb2 coprocessor addressing modes. */ > /* Standard coprocessor addressing modes. 
*/ > - if (TARGET_HARD_FLOAT > + if (TARGET_VFP_BASE > && (mode == SFmode || mode == DFmode)) > return (code == CONST_INT && INTVAL (index) < 1024 > /* Thumb-2 allows only > -256 index range for it's core > register > @@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code > code, enum rtx_code outer_code, > /* Assume that most copies can be done with a single insn, > unless we don't have HW FP, in which case everything > larger than word mode will require two insns. */ > - *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE) > + *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE > && GET_MODE_SIZE (mode) > 4) > || mode == DImode) > ? 2 : 1); > @@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void) > > saved = 0; > /* Space for saved VFP registers. */ > - if (TARGET_HARD_FLOAT) > + if (TARGET_VFP_BASE) > { > count = 0; > for (regno = FIRST_VFP_REGNUM; > @@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void) > func_type = arm_current_func_type (); > /* Space for saved VFP registers. */ > if (! IS_VOLATILE (func_type) > - && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)) > + && TARGET_VFP_BASE) > saved += arm_get_vfp_saved_size (); > > /* Allocate space for saving/restoring FPCXTNS in Armv8.1-M > Mainline > @@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void) > saved_size += 8; > } > > - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE) > + if (TARGET_VFP_BASE) > { > start_reg = FIRST_VFP_REGNUM; > > @@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int > *p1, unsigned int *p2) > return false; > > *p1 = CC_REGNUM; > - *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM; > + *p2 = TARGET_VFP_BASE ? 
VFPCC_REGNUM : INVALID_REGNUM; > return true; > } > > @@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno, > machine_mode mode) > { > if (GET_MODE_CLASS (mode) == MODE_CC) > return (regno == CC_REGNUM > - || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) > + || (TARGET_VFP_BASE > && regno == VFPCC_REGNUM)); > > if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC) > @@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno, > machine_mode mode) > start of an even numbered register pair. */ > return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM); > > - if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno)) > + if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno)) > { > if (mode == DFmode) > return VFP_REGNO_OK_FOR_DOUBLE (regno); > @@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool > really_return) > floats_from_frame += 4; > } > > - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE) > + if (TARGET_VFP_BASE) > { > int start_reg; > rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM); > @@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return) > } > } > > - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE) > + if (TARGET_VFP_BASE) > { > /* Generate VFP register multi-pop. */ > int end_reg = LAST_VFP_REGNUM + 1; > @@ -29699,7 +29699,7 @@ arm_conditional_register_usage (void) > if (TARGET_THUMB1) > fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1; > > - if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)) > + if (TARGET_32BIT && TARGET_VFP_BASE) > { > /* VFPv3 registers are disabled when earlier VFP > versions are selected due to the definition of > @@ -32478,7 +32478,8 @@ arm_declare_function_name (FILE *stream, const > char *name, tree decl) > = TARGET_SOFT_FLOAT > ? 
"softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa); > > - if (fpu_to_print != arm_last_printed_arch_string) > + if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE) > + && (fpu_to_print != arm_last_printed_arch_string)) > { > asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ()); > arm_last_printed_fpu_string = fpu_to_print; > diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md > index > 8f8c91d5fe146ed64cd4eb5450f04b3cf0c0ed18..5387f972f5a864a153873f21b9423d28446daefc > 100644 > --- a/gcc/config/arm/arm.md > +++ b/gcc/config/arm/arm.md > @@ -134,7 +134,7 @@ > ; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M > ; Baseline. This attribute is used to compute attribute "enabled", > ; use type "any" to enable an alternative in all cases. > -(define_attr "arch" > "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon" > +(define_attr "arch" > "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve" > (const_string "any")) > > (define_attr "arch_enabled" "no,yes" > @@ -188,6 +188,10 @@ > (and (eq_attr "arch" "neon") > (match_test "TARGET_NEON")) > (const_string "yes") > + > + (and (eq_attr "arch" "mve") > + (match_test "TARGET_HAVE_MVE")) > + (const_string "yes") > ] > > (const_string "no"))) > @@ -11758,7 +11762,7 @@ > (match_operand:SI 2 "const_int_I_operand" "I"))) > (set (match_operand:DF 3 "vfp_hard_register_operand" "") > (mem:DF (match_dup 1)))])] > - "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)" > + "TARGET_32BIT && TARGET_VFP_BASE" > "* > { > int num_regs = XVECLEN (operands[0], 0); > diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md > index > a12de97cdaab589e0c8704b408ac4c329def416d..bf8f4ff1e5d2d6132d0afdd05255cc697c54159d > 100644 > --- a/gcc/config/arm/constraints.md > +++ b/gcc/config/arm/constraints.md > @@ -38,7 +38,7 @@ > ;; in all states: Pf, Pg > > ;; The following memory constraints have been used: > -;; in 
ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up > +;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf > ;; in ARM state: Uq > ;; in Thumb state: Uu, Uw > ;; in all states: Q > @@ -46,6 +46,9 @@ > (define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS" > "MVE VPR register") > > +(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS" > + "MVE FPCCR register") > + > (define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS" > "The VFP registers @code{s0}-@code{s31}.") > > diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md > index > b0d3bd1cf1c484927e6ac6522bc30f0f089291c7..793f67068687a60abf94c230e5485a1eb2eca6a0 > 100644 > --- a/gcc/config/arm/thumb2.md > +++ b/gcc/config/arm/thumb2.md > @@ -517,7 +517,7 @@ > [(match_operand 4 "cc_register" "") > (const_int 0)]) > (match_operand:SF 1 "s_register_operand" "0,r") > (match_operand:SF 2 "s_register_operand" > "r,0")))] > - "TARGET_THUMB2 && TARGET_SOFT_FLOAT" > + "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE" > "@ > it\\t%D3\;mov%D3\\t%0, %2 > it\\t%d3\;mov%d3\\t%0, %1" > diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md > index > f0b1f465de4b63d624510783576700519044717d..e76609f79418af38b70746336dd43592a1dc8713 > 100644 > --- a/gcc/config/arm/unspecs.md > +++ b/gcc/config/arm/unspecs.md > @@ -170,6 +170,7 @@ > UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt > TORC instruction. > UNSPEC_TORVSC ; Used by the intrinsic form of the > iWMMXt TORVSC instruction. > UNSPEC_TEXTRC ; Used by the intrinsic form of the > iWMMXt TEXTRC instruction. > + UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content. > ]) > > > @@ -216,7 +217,6 @@ > VUNSPEC_SLX ; Represent a store-register-release-exclusive. > VUNSPEC_LDA ; Represent a store-register-acquire. > VUNSPEC_STL ; Represent a store-register-release. > - VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content. > VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content. 
> VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing. > VUNSPEC_CDP ; Represent the coprocessor cdp instruction. > diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md > index > ab16a6b0eac822b4e1a1ae4dcbe39491a82cc9fe..eb6ae7bea7927c666f36219797d54c0127001bc1 > 100644 > --- a/gcc/config/arm/vfp.md > +++ b/gcc/config/arm/vfp.md > @@ -74,10 +74,10 @@ > (define_insn "*thumb2_movhi_vfp" > [(set > (match_operand:HI 0 "nonimmediate_operand" > - "=rk, r, l, r, m, r, *t, r, *t") > + "=rk, r, l, r, m, r, *t, r, *t, Up, r") > (match_operand:HI 1 "general_operand" > - "rk, I, Py, n, r, m, r, *t, *t"))] > - "TARGET_THUMB2 && TARGET_HARD_FLOAT > + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))] > + "TARGET_THUMB2 && TARGET_VFP_BASE > && !TARGET_VFP_FP16INST > && (register_operand (operands[0], HImode) > || register_operand (operands[1], HImode))" > @@ -99,20 +99,24 @@ > return "vmov%?\t%0, %1\t%@ int"; > case 8: > return "vmov%?.f32\t%0, %1\t%@ int"; > + case 9: > + return "vmsr%?\t P0, %1\t@ movhi"; > + case 10: > + return "vmrs%?\t %0, P0\t@ movhi"; > default: > gcc_unreachable (); > } > } > [(set_attr "predicable" "yes") > (set_attr "predicable_short_it" > - "yes, no, yes, no, no, no, no, no, no") > + "yes, no, yes, no, no, no, no, no, no, no, no") > (set_attr "type" > "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\ > - f_mcr, f_mrc, fmov") > - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *") > - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *") > - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *") > - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")] > + f_mcr, f_mrc, fmov, mve_move, mve_move") > + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve") > + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *") > + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *") > + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")] > ) > > ;; Patterns for HI moves which provide more data transfer > instructions when FP16 > @@ -170,10 
+174,10 @@ > (define_insn "*thumb2_movhi_fp16" > [(set > (match_operand:HI 0 "nonimmediate_operand" > - "=rk, r, l, r, m, r, *t, r, *t") > + "=rk, r, l, r, m, r, *t, r, *t, Up, r") > (match_operand:HI 1 "general_operand" > - "rk, I, Py, n, r, m, r, *t, *t"))] > - "TARGET_THUMB2 && TARGET_VFP_FP16INST > + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))] > + "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE) > && (register_operand (operands[0], HImode) > || register_operand (operands[1], HImode))" > { > @@ -194,21 +198,25 @@ > return "vmov.f16\t%0, %1\t%@ int"; > case 8: > return "vmov%?.f32\t%0, %1\t%@ int"; > + case 9: > + return "vmsr%?\tP0, %1\t%@ movhi"; > + case 10: > + return "vmrs%?\t%0, P0\t%@ movhi"; > default: > gcc_unreachable (); > } > } > [(set_attr "predicable" > - "yes, yes, yes, yes, yes, yes, no, no, yes") > + "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes") > (set_attr "predicable_short_it" > - "yes, no, yes, no, no, no, no, no, no") > + "yes, no, yes, no, no, no, no, no, no, no, no") > (set_attr "type" > "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\ > - f_mcr, f_mrc, fmov") > - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *") > - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *") > - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *") > - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")] > + f_mcr, f_mrc, fmov, mve_move, mve_move") > + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve") > + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *") > + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *") > + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")] > ) > > ;; SImode moves > @@ -258,9 +266,11 @@ > ;; is chosen with length 2 when the instruction is predicated for > ;; arm_restrict_it. 
> (define_insn "*thumb2_movsi_vfp" > - [(set (match_operand:SI 0 "nonimmediate_operand" > "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t, *Uv") > - (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r, > r,*t,*t,*UvTu,*t"))] > - "TARGET_THUMB2 && TARGET_HARD_FLOAT > + [(set (match_operand:SI 0 "nonimmediate_operand" > "=rk,r,l,r,r,l,*hk,m,*m,*t,\ > + r,*t,*t,*Uv, Up, r,Uf,r") > + (match_operand:SI 1 "general_operand" > "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\ > + *t,*UvTu,*t, r, Up,r,Uf"))] > + "TARGET_THUMB2 && TARGET_VFP_BASE > && ( s_register_operand (operands[0], SImode) > || s_register_operand (operands[1], SImode))" > "* > @@ -275,30 +285,44 @@ > case 4: > return \"movw%?\\t%0, %1\"; > case 5: > + case 6: > /* Cannot load it directly, split to load it via MOV / MOVT. */ > if (!MEM_P (operands[1]) && arm_disable_literal_pool) > return \"#\"; > return \"ldr%?\\t%0, %1\"; > - case 6: > - return \"str%?\\t%1, %0\"; > case 7: > - return \"vmov%?\\t%0, %1\\t%@ int\"; > case 8: > - return \"vmov%?\\t%0, %1\\t%@ int\"; > + return \"str%?\\t%1, %0\"; > case 9: > + return \"vmov%?\\t%0, %1\\t%@ int\"; > + case 10: > + return \"vmov%?\\t%0, %1\\t%@ int\"; > + case 11: > return \"vmov%?.f32\\t%0, %1\\t%@ int\"; > - case 10: case 11: > + case 12: case 13: > return output_move_vfp (operands); > + case 14: > + return \"vmsr\\t P0, %1\"; > + case 15: > + return \"vmrs\\t %0, P0\"; > + case 16: > + return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\"; > + case 17: > + return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\"; > default: > gcc_unreachable (); > } > " > [(set_attr "predicable" "yes") > - (set_attr "predicable_short_it" > "yes,no,yes,no,no,no,no,no,no,no,no,no") > - (set_attr "type" > "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores") > - (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4") > - (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*") > - (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")] > + (set_attr "predicable_short_it" 
> "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\ > + no,no,no,no,no") > + (set_attr "type" > "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\ > + store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\ > + mve_move,mrs,mrs") > + (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4") > + (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*") > + (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve") > + (set_attr "neg_pool_range" "*,*,*,*,*, 0, > 0,*,*,*,*,*,1008,*,*,*,*,*")] > ) > > > @@ -306,12 +330,12 @@ > > (define_insn "*movdi_vfp" > [(set (match_operand:DI 0 "nonimmediate_di_operand" > "=r,r,r,r,r,r,m,w,!r,w,w, Uv") > - (match_operand:DI 1 "di_operand" > "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))] > - "TARGET_32BIT && TARGET_HARD_FLOAT > + (match_operand:DI 1 "di_operand" > "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))] > + "TARGET_32BIT && TARGET_VFP_BASE > && ( register_operand (operands[0], DImode) > || register_operand (operands[1], DImode)) > - && !(TARGET_NEON && CONST_INT_P (operands[1]) > - && simd_immediate_valid_for_move (operands[1], DImode, NULL, > NULL))" > + && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1]) > + && simd_immediate_valid_for_move (operands[1], DImode, NULL, > NULL))" > "* > switch (which_alternative) > { > @@ -333,7 +357,7 @@ > case 8: > return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\"; > case 9: > - if (TARGET_VFP_SINGLE) > + if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE) > return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0, > %p1\\t%@ int\"; > else > return \"vmov%?.f64\\t%P0, %P1\\t%@ int\"; > @@ -390,9 +414,15 @@ > case 6: /* S register from immediate. */ > return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\"; > case 7: /* S register from memory. */ > - return \"vld1.16\\t{%z0}, %A1\"; > + if (TARGET_HAVE_MVE) > + return \"vldr.16\\t%0, %A1\"; > + else > + return \"vld1.16\\t{%z0}, %A1\"; > case 8: /* Memory from S register. 
*/ > - return \"vst1.16\\t{%z1}, %A0\"; > + if (TARGET_HAVE_MVE) > + return \"vstr.16\\t%1, %A0\"; > + else > + return \"vst1.16\\t{%z1}, %A0\"; > case 9: /* ARM register from constant. */ > { > long bits; > @@ -593,7 +623,7 @@ > (define_insn "*thumb2_movsf_vfp" > [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t ,Uv,r > ,m,t,r") > (match_operand:SF 1 "hard_sf_operand" " ?r,t,Dv,UvHa,t, > mHa,r,t,r"))] > - "TARGET_THUMB2 && TARGET_HARD_FLOAT > + "TARGET_THUMB2 && TARGET_VFP_BASE > && ( s_register_operand (operands[0], SFmode) > || s_register_operand (operands[1], SFmode))" > "* > @@ -682,7 +712,7 @@ > (define_insn "*thumb2_movdf_vfp" > [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w > ,w,w ,Uv,r ,m,w,r") > (match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w, > mHa,r, w,r"))] > - "TARGET_THUMB2 && TARGET_HARD_FLOAT > + "TARGET_THUMB2 && TARGET_VFP_BASE > && ( register_operand (operands[0], DFmode) > || register_operand (operands[1], DFmode))" > "* > @@ -760,7 +790,7 @@ > [(match_operand 4 "cc_register" "") (const_int 0)]) > (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t") > (match_operand:SF 2 "s_register_operand" > "t,0,t,?r,0,?r,t,0,t")))] > - "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it" > + "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it" > "@ > it\\t%D3\;vmov%D3.f32\\t%0, %2 > it\\t%d3\;vmov%d3.f32\\t%0, %1 > @@ -806,7 +836,8 @@ > [(match_operand 4 "cc_register" "") (const_int 0)]) > (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w") > (match_operand:DF 2 "s_register_operand" > "w,0,w,?r,0,?r,w,0,w")))] > - "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE && > !arm_restrict_it" > + "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE > + && !arm_restrict_it" > "@ > it\\t%D3\;vmov%D3.f64\\t%P0, %P2 > it\\t%d3\;vmov%d3.f64\\t%P0, %P1 > @@ -1977,7 +2008,7 @@ > [(set (match_operand:BLK 0 "memory_operand" "=m") > (unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")] > 
UNSPEC_PUSH_MULT))])] > - "TARGET_32BIT && TARGET_HARD_FLOAT" > + "TARGET_32BIT && TARGET_VFP_BASE" > "* return vfp_output_vstmd (operands);" > [(set_attr "type" "f_stored")] > ) > @@ -2065,16 +2096,18 @@ > > ;; Write Floating-point Status and Control Register. > (define_insn "set_fpscr" > - [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] > VUNSPEC_SET_FPSCR)] > - "TARGET_HARD_FLOAT" > + [(set (reg:SI VFPCC_REGNUM) > + (unspec_volatile:SI > + [(match_operand:SI 0 "register_operand" "r")] > VUNSPEC_SET_FPSCR))] > + "TARGET_VFP_BASE" > "mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR" > [(set_attr "type" "mrs")]) > > ;; Read Floating-point Status and Control Register. > (define_insn "get_fpscr" > [(set (match_operand:SI 0 "register_operand" "=r") > - (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))] > - "TARGET_HARD_FLOAT" > + (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))] > + "TARGET_VFP_BASE" > "mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR" > [(set_attr "type" "mrs")]) > > diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..17ba616c041378b88463cb7ef150b70b2e7b95ad > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */ > +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp > -mfloat-abi=hard -mthumb" } */ > + > +#include "arm_mve.h" > + > +int8x16_t > +foo1 (int8x16_t value) > +{ > + int8x16_t b = value; > + return b; > +} > + > +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */ > diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..7b877c4a90c506343d6b4edb750ba06ce3d7a68d > --- /dev/null > 
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */ > +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp > -mfloat-abi=softfp -mthumb" } */ > + > +#include "arm_mve.h" > + > +int8x16_t > +foo1 (int8x16_t value) > +{ > + int8x16_t b = value; > + return b; > +} > + > +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */ > diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..85fbb5767edc3c25ceb4d6da780d47afa1ee416c > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ > +/* { dg-additional-options "-march=armv8.1-m.main+mve > -mfloat-abi=hard -mthumb" } */ > + > +#include "arm_mve.h" > + > +int8x16_t > +foo1 (int8x16_t value) > +{ > + int8x16_t b = value; > + return b; > +} > + > +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */ > diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c > new file mode 100644 > index > 0000000000000000000000000000000000000000..23b3683ae559b3f7bf6c3ad11c4070ad2ddb9387 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ > +/* { dg-additional-options "-march=armv8.1-m.main+mve > -mfloat-abi=softfp -mthumb" } */ > + > +#include "arm_mve.h" > + > +int8x16_t > +foo1 (int8x16_t value) > +{ > + int8x16_t b = value; > + return b; > +} > + > +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */ > diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c > b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c > new file 
mode 100644 > index > 0000000000000000000000000000000000000000..8f7fa348d130e8456d5300ac25821fd96f9d5a97 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_v8_1m_mve_ok } */ > +/* { dg-additional-options "-march=armv8.1-m.main+mve > -mfloat-abi=soft -mthumb" } */ > + > +int > +foo1 (int value) > +{ > + int b = value; > + return b; > +} > + > +/* { dg-final { scan-assembler "\.fpu softvfp" } } */ >
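
For readers following the thread, the two pieces of logic the new tests exercise (the TARGET_VFP_BASE guard and the skipped ".fpu softvfp" directive) can be modelled in a few lines of plain C. This is only a sketch, not GCC code: struct target_model, model_vfp_base and model_should_print_fpu are hypothetical names, and comparing against the last printed FPU name is an assumption about the intent of the check in arm_declare_function_name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's target state; not the real definitions.  */
enum float_abi { FLOAT_ABI_SOFT, FLOAT_ABI_SOFTFP, FLOAT_ABI_HARD };

struct target_model
{
  enum float_abi abi;      /* models arm_float_abi */
  bool isa_vfp_base;       /* models isa_bit_vfp_base in the active ISA */
  bool general_regs_only;  /* models TARGET_GENERAL_REGS_ONLY */
};

/* Same shape as the TARGET_VFP_BASE macro added to arm.h by the patch:
   not the soft float ABI, vfp_base bit set, general-regs-only off.  */
static bool
model_vfp_base (const struct target_model *t)
{
  return t->abi != FLOAT_ABI_SOFT
	 && t->isa_vfp_base
	 && !t->general_regs_only;
}

/* Models the new condition in arm_declare_function_name: do not emit
   ".fpu softvfp" when the VFP base is available, and otherwise emit the
   directive only when the name differs from the last one printed
   (assumption: the comparison is meant to be against the last FPU name).  */
static bool
model_should_print_fpu (const struct target_model *t,
			const char *fpu_to_print,
			const char *last_printed)
{
  assert (fpu_to_print != NULL && last_printed != NULL);

  bool softvfp_with_base
    = strcmp (fpu_to_print, "softvfp") == 0 && model_vfp_base (t);

  return !softvfp_with_base && strcmp (fpu_to_print, last_printed) != 0;
}
```

Under this model, a target with the vfp_base bit and -mfloat-abi=hard suppresses ".fpu softvfp" (the mve_fpu1.c expectation), while the same target with -mfloat-abi=soft still prints it (the mve_fpu3.c expectation).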