Richard Biener <richard.guent...@gmail.com> writes:
> On Tue, Mar 12, 2019 at 10:50 AM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Steve Ellcey <sell...@marvell.com> writes:
>> > Richard,
>> >
>> > I don't necessarily disagree with anything in your comments and long
>> > term I think that is the right direction, but I wonder if that level of
>> > change is appropriate for GCC Stage 4 which is where we are now.  Your
>> > changes would require fixes in shared code, whereas setting
>> > REG_ALLOC_ORDER only affects Aarch64 and seems like a safer change.
>>
>> I guess it's weighing invasiveness in terms of lines of code/location of
>> code vs. invasiveness in terms of the effect the code has.  Changing
>> REG_ALLOC_ORDER affects all functinos that use significant numbers of
>> SIMD registers, which could have unexpected knock-on effects like
>> performance regressions or exposing latent bugs.  It would also make
>> the RA try V24 before V16 for callers of vector PCS functions,
>> even though they should prefer V16.
>>
>> Keeping the change specific to callees that use the new vector PCS
>> feature seems more conservative from that point of view.
>>
>> > I am not sure how long it would take me to implement something along
>> > the lines of what you are suggesting.
>>
>> OK, in that case, I thought I'd give a go.
>>
>> This patch adds a new target hook that says which registers a function
>> can use without saving them in the prologue and restoring them in the
>> epilogue.  By default this is call_used_reg_set.
>>
>> The patch only makes IRA use the hook, and in that context it's just
>> an (important) optimisation.  But I think we need the same thing for
>> passes like regrename, where it'll be a correctness issue.  I'll follow
>> on with patches there if this one is OK.
>>
>> Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
>> OK to install?
>
> So why is call_used_reg_set not updated appropriately?  As you say
> if you introduce sth else it becomes a correctness issue to use that "else"
> rather than call_used_reg_set which was OK up until now.  I wonder how
> other targets with a per-function different ABI handle this situation.  Or are
> you the first and others at most affect calling conventions here?  That is,
> it looks like making (parts of?) this_taret_hard_regs per function or
> re-computed at set_cfun time.  There's alrady a hook for that btw,
> targetm.set_current_function ().  Why not change call_used_reg_set there?

The problem is that call_used_reg_set is a global that describes the
base ABI.  It's used both for what the current function can use without
saving and restoring, and for what the target of a call insn might clobber.
Most uses are for the latter.

The global set still needs to describe the base ABI, since the default
assumption is that a call insn uses the base ABI unless something says
otherwise.  TARGET_HARD_REGNO_CALL_PART_CLOBBERED can be used for call
insns that use other ABIs, so it handles the case in which the current
function is the caller.  But there's no hook for handling other ABIs
when the current function is the callee.

So the purpose of the new hook is to split out what the current function
needs to save and restore as a separate bit of data that doesn't affect
call insns.  Since it's an rtl-level thing, I thought it'd be better
to put it in crtl (i.e. the rtl info about the current function).
That way we only need to set it once at the start of the rtl passes,
rather than every time we change cfun via targetm.set_current_function.
(Well, longer-term, we might want to set it after RA too, so that the
scratch registers only include what the prologue and epilogue actually
save & restore.)

Targets can already use targetm.hard_regno_scratch_ok and HARD_REGNO_RENAME_OK
to stop the post-RA passes using a particular register.  I guess that's
been enough for what targets have needed till now, and we could use them
for the correctness thing I mentioned above.  But that wouldn't solve the
RA problem.  Making the information directly available seems better than
adding more hacks around it.

Thanks,
Richard

Reply via email to