Re: [PATCH 4/6] [ARC] Add peephole rules to combine store/loads into double store/loads
Thank you for your review. Please find attached a new respin patch with
your feedback in.

Please let me know if it is ok,
Claudiu

From 4ff7d8419783eceeffbaf27df017d0a93c3af942 Mon Sep 17 00:00:00 2001
From: Claudiu Zissulescu
Date: Thu, 9 Aug 2018 14:29:05 +0300
Subject: [PATCH] [ARC] Add peephole rules to combine store/loads into double
 store/loads

Simple peephole rules which combine multiple ld/st instructions into
64-bit load/store instructions. They only apply to architectures that
have the double load/store option enabled.

gcc/
        Claudiu Zissulescu

        * config/arc/arc-protos.h (gen_operands_ldd_std): Add.
        * config/arc/arc.c (operands_ok_ldd_std): New function.
        (mem_ok_for_ldd_std): Likewise.
        (gen_operands_ldd_std): Likewise.
        * config/arc/arc.md: Add peephole2 rules for std/ldd.
---
 gcc/config/arc/arc-protos.h |   1 +
 gcc/config/arc/arc.c        | 161
 gcc/config/arc/arc.md       |  69
 3 files changed, 231 insertions(+)

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index 24bea6e1efb..55f8ed4c643 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -46,6 +46,7 @@ extern int arc_return_address_register (unsigned int);
 extern unsigned int arc_compute_function_type (struct function *);
 extern bool arc_is_uncached_mem_p (rtx);
 extern bool arc_lra_p (void);
+extern bool gen_operands_ldd_std (rtx *operands, bool load, bool commute);
 #endif /* RTX_CODE */
 
 extern unsigned int arc_compute_frame_size (int);
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 18dd0de6af7..daf785dbdb8 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -10803,6 +10803,167 @@ arc_cannot_substitute_mem_equiv_p (rtx)
   return true;
 }
 
+/* Checks whether the operands are valid for use in an LDD/STD
+   instruction. Assumes that RT and RT2 are REG. This is guaranteed
+   by the patterns. Assumes that the address in the base register RN
+   is word aligned. Pattern guarantees that both memory accesses use
+   the same base register, the offsets are constants within the range,
+   and the gap between the offsets is 4. If reload has completed,
+   check that the registers are legal. */
+
+static bool
+operands_ok_ldd_std (rtx rt, rtx rt2, HOST_WIDE_INT offset)
+{
+  unsigned int t, t2;
+
+  if (!reload_completed)
+    return true;
+
+  if (!(SMALL_INT_RANGE (offset, (GET_MODE_SIZE (DImode) - 1) & (~0x03),
+                         (offset & (GET_MODE_SIZE (DImode) - 1) & 3
+                          ? 0 : -(-GET_MODE_SIZE (DImode) | (~0x03)) >> 1))))
+    return false;
+
+  t = REGNO (rt);
+  t2 = REGNO (rt2);
+
+  if ((t2 == PROGRAM_COUNTER_REGNO)
+      || (t % 2 != 0) /* First destination register is not even. */
+      || (t2 != t + 1))
+    return false;
+
+  return true;
+}
+
+/* Helper for gen_operands_ldd_std. Returns true iff the memory
+   operand MEM's address contains an immediate offset from the base
+   register and has no side effects, in which case it sets BASE and
+   OFFSET accordingly. */
+
+static bool
+mem_ok_for_ldd_std (rtx mem, rtx *base, rtx *offset)
+{
+  rtx addr;
+
+  gcc_assert (base != NULL && offset != NULL);
+
+  /* TODO: Handle more general memory operand patterns, such as
+     PRE_DEC and PRE_INC. */
+
+  if (side_effects_p (mem))
+    return false;
+
+  /* Can't deal with subregs. */
+  if (GET_CODE (mem) == SUBREG)
+    return false;
+
+  gcc_assert (MEM_P (mem));
+
+  *offset = const0_rtx;
+
+  addr = XEXP (mem, 0);
+
+  /* If addr isn't valid for DImode, then we can't handle it. */
+  if (!arc_legitimate_address_p (DImode, addr,
+                                 reload_in_progress || reload_completed))
+    return false;
+
+  if (REG_P (addr))
+    {
+      *base = addr;
+      return true;
+    }
+  else if (GET_CODE (addr) == PLUS || GET_CODE (addr) == MINUS)
+    {
+      *base = XEXP (addr, 0);
+      *offset = XEXP (addr, 1);
+      return (REG_P (*base) && CONST_INT_P (*offset));
+    }
+
+  return false;
+}
+
+/* Called from peephole2 to replace two word-size accesses with a
+   single LDD/STD instruction. Returns true iff we can generate a new
+   instruction sequence. That is, both accesses use the same base
+   register and the gap between constant offsets is 4. OPERANDS are
+   the operands found by the peephole matcher; OPERANDS[0,1] are
+   register operands, and OPERANDS[2,3] are the corresponding memory
+   operands. LOAD indicates whether the access is load or store. */
+
+bool
+gen_operands_ldd_std (rtx *operands, bool load, bool commute)
+{
+  int i, gap;
+  HOST_WIDE_INT offsets[2], offset;
+  int nops = 2;
+  rtx cur_base, cur_offset, tmp;
+  rtx base = NULL_RTX;
+
+  /* Check that the memory references are immediate offsets from the
+     same base register. Extract the base register, the destination
+     registers, and the corresponding memory offsets. */
+  for (i = 0; i < nops; i++)
+    {
+      if (!mem_ok_for_ldd_std (operands[nops+i], &cur_base, &cur_offset))
+        return false;
+
+
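
The comments in the patch above state when two word-sized accesses may be merged: both use the same base register, the constant offsets are 4 apart, the first register is even, and the second register is the next one up. A minimal standalone sketch of that rule follows. It assumes a hypothetical `struct word_access` in place of GCC's rtx operands; the struct, its fields, and `can_pair_ldd_std` are illustrative names and not part of the patch. The offset-range check, the program-counter check, and the COMMUTE handling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one word-sized access (not GCC's rtx):
   the data register number, the base register number, and the
   constant byte offset from the base.  */
struct word_access
{
  unsigned int regno;
  unsigned int base_regno;
  long offset;
};

/* Return true if A followed by B could become one 64-bit access
   under the rules quoted from the patch comments: same base
   register, offsets exactly 4 bytes apart, an even first register,
   and the second register being the first plus one.  */
bool
can_pair_ldd_std (const struct word_access *a, const struct word_access *b)
{
  assert (a != NULL && b != NULL);

  if (a->base_regno != b->base_regno)
    return false;            /* Different base registers.  */
  if (b->offset - a->offset != 4)
    return false;            /* Not adjacent words.  */
  if (a->regno % 2 != 0)
    return false;            /* First register is not even.  */
  return b->regno == a->regno + 1;
}
```

With these assumptions, an access to register 2 at offset 8 and one to register 3 at offset 12 from the same base would pair, while starting at register 3 or leaving a gap of 8 would not.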
Re: [PATCH 6/6] [ARC] Handle store cacheline hazard.
> I'm not a fan of this approach. I'd rather the comment explain what
> problem was found and patched, and why displaying a warning is not
> appropriate. The commented out code just leaves me asking ... why?
>

Having the warning here breaks a number of builds, such as the Linux kernel build. On the other hand, users were curious whether the locking sequence was common or not. I'll remove the commented-out warning for clarity, and I will provide the curious users with a patch that turns the warning back on for their needs.

Thanks,
Claudiu
Re: [PATCH 3/6] [ARC] Add BI/BIH instruction support.
Thank you all for your review. The patch is pushed with your input in. //Claudiu
Re: [PATCH 3/6] [ARC] Add BI/BIH instruction support.
Committed with your feedback in. Thank you, Claudiu
Re: [PATCH 1/6] [ARC] Remove non standard functions calls.
Thank you for your review. Patch pushed, Claudiu
Re: [PATCH 2/6] [ARC] Cleanup TLS implementation.
Committed with your feedback in. Thank you, Claudiu
Re: [PATCH 6/6] [ARC] Handle store cacheline hazard.
Committed with feedback in. Thank you, Claudiu
Re: [PATCH 4/6] [ARC] Add peephole rules to combine store/loads into double store/loads
PING.

On Wed, 2018-10-31 at 10:33 +0200, claz...@gmail.com wrote:
> Thank you for your review. Please find attached a new respin patch
> with your feedback in.
>
> Please let me know if it is ok,
> Claudiu
[wwwdocs] [committed] Add ARC news
Hi,

I've just committed the attached patch containing the news for the ARC backend.

Thank you,
Claudiu

? backends.html.~1.82.~
? wwwdocs_arc.patch
Index: backends.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.82
diff -r1.82 backends.html
73c73
< arc| B b gi
---
> arc| B b gia
Index: gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.26
diff -r1.26 changes.html
148c148,154
<
---
> ARC
>
> LRA is now on by default for the ARC target. This can be
> controlled by -mlra.
> Add support for frame code-density and branch-and-index
> instructions.
>
Re: [PATCH] [ARC] Cleanup, fix and set LRA default.
Thank you all for your review. I have pushed the patch with the suggested mods. I also made a new patch (and pushed) for wwwdocs. Claudiu
Re: [PATCH 1/2] [ARC] Fix and refurbish the interrupts.
Hi Jeff,

Please find attached the updated patch. What is new:
- mailing list feedback is taken into account.
- some comments are updated.
- a new test is added.
- the ARC AUX registers used by ZOL (hardware loop) and FPX (a custom
  floating point implementation) are saved before fp-register.
- the millicode optimization is not used by ISR.

Thank you,
Claudiu

From d22368681b7aab4bef4b5c32a9a472808f2c16cd Mon Sep 17 00:00:00 2001
From: Claudiu Zissulescu
Date: Fri, 17 May 2019 14:48:17 +0300
Subject: [PATCH] [ARC] Fix and refurbish the interrupts.

When entering an interrupt, not only the call-saved registers need to
be placed on the stack but also the call-clobbered ones. Moreover, the
ARC700 return-from-interrupt instruction needs to be rtie, the same as
on ARCv2 CPUs, while the ARC6xx family uses the j.f [ilinkX]
instruction. Additionally, we need to save the state of the ZOL
machinery, namely the lp_count, lp_end and lp_start registers. For
architectures which use extension registers (i.e., HS48) we need to
save/restore them as well.

gcc/
-xx-xx  Claudiu Zissulescu

        * config/arc/arc-protos.h (arc_output_function_epilogue): Delete
        declaration.
        (arc_compute_frame_size): Millicode is disabled when compiling ISR.
        (arc_return_address_register): Likewise.
        (arc_compute_function_type): Likewise.
        (arc_compute_frame_size): Likewise.
        (secondary_reload_info): Likewise.
        (arc_get_unalign): Likewise.
        (arc_can_use_return_insn): Declare.
        * config/arc/arc.c (AUX_LP_START): Define.
        (AUX_LP_END): Likewise.
        (arc_frame_info): Update gmask member to 64-bit datum.
        (GMASK_LEN): Update.
        (arc_compute_function_type): Make it static, move it forward.
        (arc_must_save_register): Update, consider the extra regs.
        (arc_compute_millicode_save_restore_regs): Update to use the 64 bit
        gmask.
        (arc_compute_frame_size): Likewise.
        (arc_enter_leave_p): Likewise.
        (arc_save_callee_saves): Likewise.
        (arc_restore_callee_saves): Likewise.
        (arc_save_callee_enter): Likewise.
        (arc_restore_callee_leave): Likewise.
        (arc_save_callee_milli): Likewise.
        (arc_restore_callee_milli): Likewise.
        (arc_expand_prologue): Add new interrupt handling.
        (arc_return_address_register): Make it static, move it forward.
        (arc_expand_epilogue): Add new interrupt handling.
        (arc_get_unalign): Delete.
        (arc_epilogue_uses): Make sure we do not remove the extra
        saved/restored registers when interrupt.
        (arc_can_use_return_insn): New function.
        (push_reg): Likewise.
        (pop_reg): Likewise.
        (arc_save_callee_saves): Add ZOL and FPX aux registers saving
        procedures.
        (arc_restore_callee_saves): Likewise, but restoring.
        * config/arc/arc.md (VUNSPEC_ARC_ARC600_RTIE): Define.
        (R33_REG): Likewise.
        (R34_REG): Likewise.
        (R35_REG): Likewise.
        (R36_REG): Likewise.
        (R37_REG): Likewise.
        (R38_REG): Likewise.
        (R39_REG): Likewise.
        (R45_REG): Likewise.
        (R46_REG): Likewise.
        (R47_REG): Likewise.
        (R48_REG): Likewise.
        (R49_REG): Likewise.
        (R50_REG): Likewise.
        (R51_REG): Likewise.
        (R52_REG): Likewise.
        (R53_REG): Likewise.
        (R54_REG): Likewise.
        (R55_REG): Likewise.
        (R56_REG): Likewise.
        (R58_REG): Likewise.
        (type): Add rtie attribute.
        (in_call_delay_slot): Use RETURN_ADDR_REGNUM.
        (movsi_insn): Accept moves to lp_count.
        (rtie): Update pattern.
        (simple_return): Simplify it, don't use this pattern as a return
        from an interrupt.
        (arc600_rtie): New pattern.
        (p_return_i): Clean up.
        (return): Likewise.
        * config/arc/builtins.def (rtie): Only available for non ARC6xx
        family CPUs.
        * config/arc/predicates.md (move_src_operand): Consider lp_count
        as a register.

gcc/testsuite
-xx-xx  Claudiu Zissulescu

        * gcc.target/arc/arc.exp (check_effective_target_accregs): New
        predicate.
        * gcc.target/arc/builtin_special.c: Update test.
        * gcc.target/arc/interrupt-1.c: Likewise.
        * gcc.target/arc/interrupt-10.c: New test.
        * gcc.target/arc/interrupt-11.c: Likewise.
        * gcc.target/arc/interrupt-12.c: Likewise.
---
 gcc/config/arc/arc-protos.h                   |   7 +-
 gcc/config/arc/arc.c                          | 741 +++---
 gcc/config/arc/arc.md                         | 139 ++--
 gcc/config/arc/builtins.def                   |   2 +-
 gcc/config/arc/predicates.md                  |   2 +
 gcc/testsuite/gcc.target/arc/arc.exp          |  18 +
 .../gcc.target/arc/builtin_special.c          |   2 +
 gcc/testsuite/gcc.target/arc/interrupt-1.c    |   4 +-
 gcc/testsuite/gcc.target/arc/interrupt-10.c   |  36 +
 gcc/testsuite/gcc.target/arc/interrupt-11.c   |  16 +
 gcc/testsuite/gcc.target/arc/interrupt-12.c   |  16 +
 11 files changed, 628 insertions(+), 355 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-10.c
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-11.c
 create mode 100644 gcc/testsuite/gcc.target/arc/interrupt-12.c

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index f501bc30ee7..0c9f422827d 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -25,7 +25,6
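
The ChangeLog for this patch widens the gmask member of arc_frame_info to a 64-bit datum and defines register numbers up to R58_REG. A minimal sketch of why the wider type matters, assuming gmask holds one bit per register number (an assumption here, since the arc.c hunk is not shown in this message); `gmask_add` and `gmask_has` are hypothetical helper names and not functions from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the patch): return GMASK with the
   bit for register REGNO set.  Register numbers of 64 and above do
   not fit and leave the mask unchanged.  */
uint64_t
gmask_add (uint64_t gmask, unsigned int regno)
{
  if (regno >= 64)
    return gmask;
  return gmask | (UINT64_C (1) << regno);
}

/* Hypothetical helper (not from the patch): return true if the bit
   for register REGNO is set in GMASK.  */
bool
gmask_has (uint64_t gmask, unsigned int regno)
{
  if (regno >= 64)
    return false;
  return (gmask & (UINT64_C (1) << regno)) != 0;
}
```

Under that assumption, `gmask_has (gmask_add (0, 58), 58)` holds, whereas truncating the same mask to `uint32_t` yields 0, so a 32-bit mask could not record a register such as r58.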