Hi Jeff, Originally, I had the scheduler barrier as you suggested. However, there were some user cases when an ISR messed up with SP register leading to errors. As a solution was to add barriers on either part of frame operations. However, I would need to recheck the original rationale of that issue, as it may not be the case with newer cores.
On the other hand, I found a small error of the current patch when one is having the fno-omit-frame-pointer option in the order how some auxiliary registers are saved. What will be the best solution having in mind that I haven't pushed this patch to the mainline yet: - to have the current patch re-spin? - to have a separate patch for the new issue. Thanks, Claudiu On Wed, Jul 3, 2019 at 2:15 AM Jeff Law <l...@redhat.com> wrote: > > On 6/28/19 7:39 AM, Claudiu Zissulescu wrote: > > When entering an interrupt, not only the call save registers needs to > > be place on stack but also the call clobbers one. More over, the > > ARC700 return from interrupt instruction needs to be rtie, the same > > like ARCv2 CPUs. While the ARC6xx family uses j.f [ilinkX] > > instruction. Additionally, we need to save the state of the ZOL > > machinery, namely the lp_count, lp_end and lp_start registers. For > > architectures which are using extension registers (i.e., HS48) we need > > to save/restore them as well. > > > > Ok to apply? > > Claudiu > > > > gcc/ > > xxxx-xx-xx Claudiu Zissulescu <claz...@synopsys.com> > > > > * config/arc/arc-protos.h (arc_output_function_epilogue): Delete > > declaration. > > (arc_return_address_register): Likewise. > > (arc_compute_function_type): Likewise. > > (arc_compute_frame_size): Likewise. > > (secondary_reload_info): Likewise. > > (arc_get_unalign): Likewise. > > (arc_can_use_return_insn): Declare. > > * config/arc/arc.c (AUX_LP_START): Define > > (AUX_LP_END): Likewise. > > (arc_frame_info): Update gmask member to 64-bit datum. > > (GMASK_LEN): Update. > > (arc_compute_function_type): Make it static, move it forward. > > (arc_must_save_register): Update, consider the extra regs. > > (arc_compute_millicode_save_restore_regs): Update to use the 64 > > bit gmask. > > (arc_compute_frame_size): Likewise. > > (arc_enter_leave_p): Likewise. > > (arc_save_callee_saves): Likewise. > > (arc_restore_callee_saves): Likewise. > > (arc_save_callee_enter): Likewise. > > (arc_restore_callee_leave): Likewise. > > (arc_save_callee_milli): Likewise. > > (arc_restore_callee_milli): Likewise. > > (arc_expand_prologue): Add new interrupt handling. > > (arc_return_address_register): Make it static, move it forward. > > (arc_expand_epilogue): Add new interrupt handling. > > (arc_get_unalign): Delete. > > (arc_epilogue_uses): Make sure we do not remove the extra > > saved/restored registers when interrupt. > > (arc_can_use_return_insn): New function. > > * config/arc/arc.md (VUNSPEC_ARC_ARC600_RTIE): Define. > > (R33_REG): Likewise. > > (R34_REG): Likewise. > > (R35_REG): Likewise. > > (R36_REG): Likewise. > > (R37_REG): Likewise. > > (R38_REG): Likewise. > > (R39_REG): Likewise. > > (R45_REG): Likewise. > > (R46_REG): Likewise. > > (R47_REG): Likewise. > > (R48_REG): Likewise. > > (R49_REG): Likewise. > > (R50_REG): Likewise. > > (R51_REG): Likewise. > > (R52_REG): Likewise. > > (R53_REG): Likewise. > > (R54_REG): Likewise. > > (R55_REG): Likewise. > > (R56_REG): Likewise. > > (R58_REG): Likewise. > > (type): Add rtie attribute. > > (in_call_delay_slot): Use RETURN_ADDR_REGNUM. > > (movsi_insn): Accept moves to lp_count. > > (rtie): Update pattern. > > (simple_return): Simplify it, don't use this pattern as a return > > from an interrupt. > > (arc600_rtie): New pattern. > > (p_return_i): Clean up. > > (return): Likewise. > > * config/arc/builtins.def (rtie): Only available for non ARC6xx > > family CPUs. > > * config/arc/predicates.md (move_src_operand): Consider lp_count > > as a register. > > > > gcc/testsuite > > xxxx-xx-xx Claudiu Zissulescu <claz...@synopsys.com> > > > > * gcc.target/arc/arc.exp (check_effective_target_accregs): New > > predicate. > > * gcc.target/arc/builtin_special.c: Update test/ > > * gcc.target/arc/interrupt-1.c: Likewise. > > * gcc.target/arc/interrupt-10.c: New test. > > * gcc.target/arc/interrupt-11.c: Likewise. > > --- > > diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c > > index 4fec8fd8929..910c8d2ad30 100644 > > --- a/gcc/config/arc/arc.c > > +++ b/gcc/config/arc/arc.c > > @@ -206,6 +206,13 @@ static int rgf_banked_register_count; > > this to be no less than the 1/p */ > > #define MAX_INSNS_SKIPPED 3 > > > > +/* ZOL control registers. */ > > +#define AUX_LP_START 0x02 > > +#define AUX_LP_END 0x03 > > + > > +/* FPX AUX registers. */ > > +#define AUX_DPFP_START 0x301 > > + > > /* A nop is needed between a 4 byte insn that sets the condition codes and > > a branch that uses them (the same isn't true for an 8 byte insn that > > sets > > the condition codes). Set by arc_ccfsm_advance. Used by > > @@ -298,6 +305,131 @@ static bool arc_use_by_pieces_infrastructure_p > > (unsigned HOST_WIDE_INT, > > /* Globally visible information about currently selected cpu. */ > > const arc_cpu_t *arc_selected_cpu; > > > > +/* Traditionally, we push saved registers first in the prologue, > > + then we allocate the rest of the frame - and reverse in the epilogue. > > + This has still its merits for ease of debugging, or saving code size > > + or even execution time if the stack frame is so large that some accesses > > + can't be encoded anymore with offsets in the instruction code when using > > + a different scheme. > > + Also, it would be a good starting point if we got instructions to help > > + with register save/restore. > > + > > + However, often stack frames are small, and the pushing / popping has > > + some costs: > > + - the stack modification prevents a lot of scheduling. > > + - frame allocation / deallocation needs extra instructions. > > + - unless we know that we compile ARC700 user code, we need to put > > + a memory barrier after frame allocation / before deallocation to > > + prevent interrupts clobbering our data in the frame. > Note that you need a scheduler barrier before frame deallocation, but > not after frame allocation. At least that's the general case. > > > > + In particular, we don't have any such guarantees for library > > functions, > > + which tend to, on the other hand, to have small frames.> + > > + Thus, for small frames, we'd like to use a different scheme: > > + - The frame is allocated in full with the first prologue instruction, > > + and deallocated in full with the last epilogue instruction. > > + Thus, the instructions in-betwen can be freely scheduled. > s/betwen/between/ > > > > + - If the function has no outgoing arguments on the stack, we can > > allocate > > + one register save slot at the top of the stack. This register can > > then > > + be saved simultanously with frame allocation, and restored with > s/simultanously/simultaneously/ > > > > OK with the nits fixed. > > jeff