Richard,

I don't necessarily disagree with anything in your comments and long
term I think that is the right direction, but I wonder if that level of
change is appropriate for GCC Stage 4 which is where we are now.  Your
changes would require fixes in shared code, whereas setting
REG_ALLOC_ORDER only affects Aarch64 and seems like a safer change.
I am not sure how long it would take me to implement something along
the lines of what you are suggesting.

Steve Ellcey
sell...@marvell.com


On Sat, 2019-03-09 at 08:03 +0000, Richard Sandiford wrote:

> Steve Ellcey <sell...@marvell.com> writes:
> > This is a patch to fix the register allocation in SIMD functions.  In
> > normal functions registers V16 to V31 are all caller saved.  In SIMD
> > functions V16 to V23 are callee saved and V24 to V31 are caller saved.
> > This means that SIMD functions should use V24 to V31 before V16 to V23
> > in order to avoid some saves and restores.
> > 
> > My fix for this is to define REG_ALLOC_ORDER.  Right now it is not
> > defined on Aarch64, so I just defined it as all the registers in order
> > except for putting V24 to V31 before V16 to V23.  This fixes the
> > register allocation in SIMD functions.  It also changes the register
> > allocation order in regular functions but since all the registers (V16
> > to V31) are caller saved in that case, it doesn't matter.  We could use
> > ADJUST_REG_ALLOC_ORDER to only affect SIMD functions but I did not see
> > any reason to do that.
> 
> REG_ALLOC_ORDER shouldn't really be needed for testcases like the ones
> in the PR.  Like you say, we don't currently need it to handle the
> equivalent situation for the standard ABI.
> 
> I think the problem is that the RA is still using the register sets
> for the default ABI when evaluating the prologue/epilogue cost of using
> a hard register.  E.g. see calculate_saved_nregs.
> 
> Maybe one fix would be to add an equivalent of call_used_reg_set to
> rtl_data.  By default it would be the same as call_used_reg_set,
> but the target could have an opportunity to change it.  Then code like
> calculate_saved_nregs should use the new set to find out what registers
> the current function can use without spilling.
> 
> This would also be useful for targets that implement interrupt handler
> attributes.
> 
> It would be good to add the testcase in the PR to the testsuite,
> with a scan-assembler to check for spills.
> 
> > diff --git a/gcc/config/aarch64/aarch64.h
> > b/gcc/config/aarch64/aarch64.h
> > index 7bd3bf5..d3723ff 100644
> > --- a/gcc/config/aarch64/aarch64.h
> > +++ b/gcc/config/aarch64/aarch64.h
> > @@ -328,7 +328,9 @@ extern unsigned aarch64_architecture_version;
> >     ZR              zero register, encoded as X/R31 elsewhere
> >  
> >     32 x 128-bit floating-point/vector registers
> > -   V16-V31 Caller-saved (temporary) registers
> > +   V24-V31 Caller-saved (temporary) registers
> > +   V16-V23 Caller-saved (temporary) registers in most functions;
> > +           Callee-saved in SIMD functions.
> >     V8-V15  Callee-saved registers
> >     V0-V7   Parameter/result registers
> 
> Probably better as s/SIMD/vector PCS/.  The hunk above is OK with
> that
> change, independently of the rest.
> 
> Thanks,
> Richard

Reply via email to