On Fri, Mar 22, 2019 at 05:35:02PM +0000, James Greenhalgh wrote: > On Mon, Mar 11, 2019 at 04:10:15PM +0000, Steve Ellcey wrote: > > Richard, > > > > I don't necessarily disagree with anything in your comments and long > > term I think that is the right direction, but I wonder if that level of > > change is appropriate for GCC Stage 4 which is where we are now. Your > > changes would require fixes in shared code, whereas setting > > REG_ALLOC_ORDER only affects Aarch64 and seems like a safer change. > > I am not sure how long it would take me to implement something along > > the lines of what you are suggesting. > > I'll leave it to Richard to decide, but your workaround seems like the > right level of risk for this time of the release. I'd be happy taking it.
Excuse me, I missed the fork in this thread (didn't hit my "AArch64" filter). I'll try to take a look at Richard's approach early next week. Thanks, James > > Thanks, > James > > > > > On Sat, 2019-03-09 at 08:03 +0000, Richard Sandiford wrote: > > > > > Steve Ellcey <sell...@marvell.com> writes: > > > > This is a patch to fix the register allocation in SIMD functions. In > > > > normal functions registers V16 to V31 are all caller saved. In SIMD > > > > functions V16 to V23 are callee saved and V24 to V31 are caller saved. > > > > This means that SIMD functions should use V24 to V31 before V16 to V23 > > > > in order to avoid some saves and restores. > > > > > > > > My fix for this is to define REG_ALLOC_ORDER. Right now it is not > > > > defined on Aarch64, so I just defined it as all the registers in order > > > > except for putting V24 to V31 before V16 to V23. This fixes the > > > > register allocation in SIMD functions. It also changes the register > > > > allocation order in regular functions but since all the registers (V16 > > > > to V31) are caller saved in that case, it doesn't matter. We could use > > > > ADJUST_REG_ALLOC_ORDER to only affect SIMD functions but I did not see > > > > any reason to do that. > > > > > > REG_ALLOC_ORDER shouldn't really be needed for testcases like the ones > > > in the PR. Like you say, we don't currently need it to handle the > > > equivalent situation for the standard ABI. > > > > > > I think the problem is that the RA is still using the register sets > > > for the default ABI when evaluating the prologue/epilogue cost of using > > > a hard register. E.g. see calculate_saved_nregs. > > > > > > Maybe one fix would be to add an equivalent of call_used_reg_set to > > > rtl_data. By default it would be the same as call_used_reg_set, > > > but the target could have an opportunity to change it. Then code like > > > calculate_saved_nregs should use the new set to find out what registers > > > the current function can use without spilling. > > > > > > This would also be useful for targets that implement interrupt handler > > > attributes. > > > > > > It would be good to add the testcase in the PR to the testsuite, > > > with a scan-assembler to check for spills. > > > > > > > diff --git a/gcc/config/aarch64/aarch64.h > > > > b/gcc/config/aarch64/aarch64.h > > > > index 7bd3bf5..d3723ff 100644 > > > > --- a/gcc/config/aarch64/aarch64.h > > > > +++ b/gcc/config/aarch64/aarch64.h > > > > @@ -328,7 +328,9 @@ extern unsigned aarch64_architecture_version; > > > > ZR zero register, encoded as X/R31 elsewhere > > > > > > > > 32 x 128-bit floating-point/vector registers > > > > - V16-V31 Caller-saved (temporary) registers > > > > + V24-V31 Caller-saved (temporary) registers > > > > + V16-V23 Caller-saved (temporary) registers in most functions; > > > > + Callee-saved in SIMD functions. > > > > V8-V15 Callee-saved registers > > > > V0-V7 Parameter/result registers > > > > > > Probably better as s/SIMD/vector PCS/. The hunk above is OK with > > > that > > > change, independently of the rest. > > > > > > Thanks, > > > Richard