On Mon, Mar 25, 2019 at 6:48 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Tue, Mar 12, 2019 at 10:50 AM Richard Sandiford
> > <richard.sandif...@arm.com> wrote:
> >>
> >> Steve Ellcey <sell...@marvell.com> writes:
> >> > Richard,
> >> >
> >> > I don't necessarily disagree with anything in your comments and long
> >> > term I think that is the right direction, but I wonder if that level of
> >> > change is appropriate for GCC Stage 4 which is where we are now.  Your
> >> > changes would require fixes in shared code, whereas setting
> >> > REG_ALLOC_ORDER only affects Aarch64 and seems like a safer change.
> >>
> >> I guess it's weighing invasiveness in terms of lines of code/location of
> >> code vs. invasiveness in terms of the effect the code has.  Changing
> >> REG_ALLOC_ORDER affects all functinos that use significant numbers of
> >> SIMD registers, which could have unexpected knock-on effects like
> >> performance regressions or exposing latent bugs.  It would also make
> >> the RA try V24 before V16 for callers of vector PCS functions,
> >> even though they should prefer V16.
> >>
> >> Keeping the change specific to callees that use the new vector PCS
> >> feature seems more conservative from that point of view.
> >>
> >> > I am not sure how long it would take me to implement something along
> >> > the lines of what you are suggesting.
> >>
> >> OK, in that case, I thought I'd give a go.
> >>
> >> This patch adds a new target hook that says which registers a function
> >> can use without saving them in the prologue and restoring them in the
> >> epilogue.  By default this is call_used_reg_set.
> >>
> >> The patch only makes IRA use the hook, and in that context it's just
> >> an (important) optimisation.  But I think we need the same thing for
> >> passes like regrename, where it'll be a correctness issue.  I'll follow
> >> on with patches there if this one is OK.
> >>
> >> Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
> >> OK to install?
> >
> > So why is call_used_reg_set not updated appropriately?  As you say
> > if you introduce sth else it becomes a correctness issue to use that "else"
> > rather than call_used_reg_set which was OK up until now.  I wonder how
> > other targets with a per-function different ABI handle this situation.  Or 
> > are
> > you the first and others at most affect calling conventions here?  That is,
> > it looks like making (parts of?) this_taret_hard_regs per function or
> > re-computed at set_cfun time.  There's alrady a hook for that btw,
> > targetm.set_current_function ().  Why not change call_used_reg_set there?
>
> The problem is that call_used_reg_set is a global that describes the
> base ABI.  It's used both for what the current function can use without
> saving and restoring, and for what the target of a call insn might clobber.
> Most uses are for the latter.
>
> The global set still needs to describe the base ABI, since the default
> assumption is that a call insn uses the base ABI unless something says
> otherwise.  TARGET_HARD_REGNO_CALL_PART_CLOBBERED can be used for call
> insns that use other ABIs, so it handles the case in which the current
> function is the caller.  But there's no hook for handling other ABIs
> when the current function is the callee.
>
> So the purpose of the new hook is to split out what the current function
> needs to save and restore as a separate bit of data that doesn't affect
> call insns.  Since it's an rtl-level thing, I thought it'd be better
> to put it in crtl (i.e. the rtl info about the current function).
> That way we only need to set it once at the start of the rtl passes,
> rather than every time we change cfun via targetm.set_current_function.
> (Well, longer-term, we might want to set it after RA too, so that the
> scratch registers only include what the prologue and epilogue actually
> save & restore.)
>
> Targets can already use targetm.hard_regno_scratch_ok and HARD_REGNO_RENAME_OK
> to stop the post-RA passes using a particular register.  I guess that's
> been enough for what targets have needed till now, and we could use them
> for the correctness thing I mentioned above.  But that wouldn't solve the
> RA problem.  Making the information directly available seems better than
> adding more hacks around it.

Fair enough.  I'm not too deep into this but it seems to me this is an
area that needs proper documentation and a well-designed extension - both
don't seem like a good fit for stage 4.

Richard.

> Thanks,
> Richard

Reply via email to