Hi Jeff,

Originally, I had the scheduler barrier as you suggested. However,
there were some user cases when an ISR messed up with SP register
leading to errors. As a solution was to add barriers on either part of
frame operations. However, I would need to recheck the original
rationale of that issue, as it may not be the case with newer cores.

On the other hand, I found a small error of the current patch when one
is having the fno-omit-frame-pointer option in the order how some
auxiliary registers are saved. What will be the best solution having
in mind that I haven't pushed this patch to the mainline yet:
- to have the current patch re-spin?
- to have a separate patch for the new issue.

Thanks,
Claudiu

On Wed, Jul 3, 2019 at 2:15 AM Jeff Law <l...@redhat.com> wrote:
>
> On 6/28/19 7:39 AM, Claudiu Zissulescu wrote:
> > When entering an interrupt, not only the call save registers needs to
> > be place on stack but also the call clobbers one. More over, the
> > ARC700 return from interrupt instruction needs to be rtie, the same
> > like ARCv2 CPUs. While the ARC6xx family uses j.f [ilinkX]
> > instruction. Additionally, we need to save the state of the ZOL
> > machinery, namely the lp_count, lp_end and lp_start registers. For
> > architectures which are using extension registers (i.e., HS48) we need
> > to save/restore them as well.
> >
> > Ok to apply?
> > Claudiu
> >
> > gcc/
> > xxxx-xx-xx  Claudiu Zissulescu  <claz...@synopsys.com>
> >
> >       * config/arc/arc-protos.h (arc_output_function_epilogue): Delete
> >       declaration.
> >       (arc_return_address_register): Likewise.
> >       (arc_compute_function_type): Likewise.
> >       (arc_compute_frame_size): Likewise.
> >       (secondary_reload_info): Likewise.
> >       (arc_get_unalign): Likewise.
> >       (arc_can_use_return_insn): Declare.
> >       * config/arc/arc.c (AUX_LP_START): Define
> >       (AUX_LP_END): Likewise.
> >       (arc_frame_info): Update gmask member to 64-bit datum.
> >       (GMASK_LEN): Update.
> >       (arc_compute_function_type): Make it static, move it forward.
> >       (arc_must_save_register): Update, consider the extra regs.
> >       (arc_compute_millicode_save_restore_regs): Update to use the 64
> >       bit gmask.
> >       (arc_compute_frame_size): Likewise.
> >       (arc_enter_leave_p): Likewise.
> >       (arc_save_callee_saves): Likewise.
> >       (arc_restore_callee_saves): Likewise.
> >       (arc_save_callee_enter): Likewise.
> >       (arc_restore_callee_leave): Likewise.
> >       (arc_save_callee_milli): Likewise.
> >       (arc_restore_callee_milli): Likewise.
> >       (arc_expand_prologue): Add new interrupt handling.
> >       (arc_return_address_register): Make it static, move it forward.
> >       (arc_expand_epilogue): Add new interrupt handling.
> >       (arc_get_unalign): Delete.
> >       (arc_epilogue_uses): Make sure we do not remove the extra
> >       saved/restored registers when interrupt.
> >       (arc_can_use_return_insn): New function.
> >       * config/arc/arc.md (VUNSPEC_ARC_ARC600_RTIE): Define.
> >       (R33_REG): Likewise.
> >       (R34_REG): Likewise.
> >       (R35_REG): Likewise.
> >       (R36_REG): Likewise.
> >       (R37_REG): Likewise.
> >       (R38_REG): Likewise.
> >       (R39_REG): Likewise.
> >       (R45_REG): Likewise.
> >       (R46_REG): Likewise.
> >       (R47_REG): Likewise.
> >       (R48_REG): Likewise.
> >       (R49_REG): Likewise.
> >       (R50_REG): Likewise.
> >       (R51_REG): Likewise.
> >       (R52_REG): Likewise.
> >       (R53_REG): Likewise.
> >       (R54_REG): Likewise.
> >       (R55_REG): Likewise.
> >       (R56_REG): Likewise.
> >       (R58_REG): Likewise.
> >       (type): Add rtie attribute.
> >       (in_call_delay_slot): Use RETURN_ADDR_REGNUM.
> >       (movsi_insn): Accept moves to lp_count.
> >       (rtie): Update pattern.
> >       (simple_return): Simplify it, don't use this pattern as a return
> >       from an interrupt.
> >       (arc600_rtie): New pattern.
> >       (p_return_i): Clean up.
> >       (return): Likewise.
> >       * config/arc/builtins.def (rtie): Only available for non ARC6xx
> >       family CPUs.
> >       * config/arc/predicates.md (move_src_operand): Consider lp_count
> >       as a register.
> >
> > gcc/testsuite
> > xxxx-xx-xx  Claudiu Zissulescu  <claz...@synopsys.com>
> >
> >       * gcc.target/arc/arc.exp (check_effective_target_accregs): New
> >       predicate.
> >       * gcc.target/arc/builtin_special.c: Update test/
> >       * gcc.target/arc/interrupt-1.c: Likewise.
> >       * gcc.target/arc/interrupt-10.c: New test.
> >       * gcc.target/arc/interrupt-11.c: Likewise.
> > ---
> > diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
> > index 4fec8fd8929..910c8d2ad30 100644
> > --- a/gcc/config/arc/arc.c
> > +++ b/gcc/config/arc/arc.c
> > @@ -206,6 +206,13 @@ static int rgf_banked_register_count;
> >     this to be no less than the 1/p  */
> >  #define MAX_INSNS_SKIPPED 3
> >
> > +/* ZOL control registers.  */
> > +#define AUX_LP_START 0x02
> > +#define AUX_LP_END 0x03
> > +
> > +/* FPX AUX registers.  */
> > +#define AUX_DPFP_START 0x301
> > +
> >  /* A nop is needed between a 4 byte insn that sets the condition codes and
> >     a branch that uses them (the same isn't true for an 8 byte insn that 
> > sets
> >     the condition codes).  Set by arc_ccfsm_advance.  Used by
> > @@ -298,6 +305,131 @@ static bool arc_use_by_pieces_infrastructure_p 
> > (unsigned HOST_WIDE_INT,
> >  /* Globally visible information about currently selected cpu.  */
> >  const arc_cpu_t *arc_selected_cpu;
> >
> > +/* Traditionally, we push saved registers first in the prologue,
> > +   then we allocate the rest of the frame - and reverse in the epilogue.
> > +   This has still its merits for ease of debugging, or saving code size
> > +   or even execution time if the stack frame is so large that some accesses
> > +   can't be encoded anymore with offsets in the instruction code when using
> > +   a different scheme.
> > +   Also, it would be a good starting point if we got instructions to help
> > +   with register save/restore.
> > +
> > +   However, often stack frames are small, and the pushing / popping has
> > +   some costs:
> > +   - the stack modification prevents a lot of scheduling.
> > +   - frame allocation / deallocation needs extra instructions.
> > +   - unless we know that we compile ARC700 user code, we need to put
> > +     a memory barrier after frame allocation / before deallocation to
> > +     prevent interrupts clobbering our data in the frame.
> Note that you need a scheduler barrier before frame deallocation, but
> not after frame allocation.  At least that's the general case.
>
>
> > +     In particular, we don't have any such guarantees for library 
> > functions,
> > +     which tend to, on the other hand, to have small frames.> +
> > +   Thus, for small frames, we'd like to use a different scheme:
> > +   - The frame is allocated in full with the first prologue instruction,
> > +     and deallocated in full with the last epilogue instruction.
> > +     Thus, the instructions in-betwen can be freely scheduled.
> s/betwen/between/
>
>
> > +   - If the function has no outgoing arguments on the stack, we can 
> > allocate
> > +     one register save slot at the top of the stack.  This register can 
> > then
> > +     be saved simultanously with frame allocation, and restored with
> s/simultanously/simultaneously/
>
>
>
> OK with the nits fixed.
>
> jeff

Reply via email to