On Sun, Oct 20, 2013 at 10:48:08PM -0400, Vladimir Makarov wrote:
> On 13-10-18 11:26 AM, David Edelsohn wrote:
> >On Thu, Oct 3, 2013 at 5:02 PM, Vladimir Makarov <[email protected]> wrote:
> >>The following patch permits today trunk to use LRA for ppc by default.
> >>To switch it off -mno-lra can be used.
> >>
> >>The patch was bootstrapped on ppc64. GCC testsuite does not have
> >>regressions too (in comparison with reload). The change in rs6000.md is
> >>for fix LRA failure on a recently added ppc test.
> >Vlad,
> >
> >I have not forgotten this patch. We are trying to figure out the right
> >timeframe to make this change. The patch does affect performance --
> >both positively and negatively; most are in the noise but not all. And
> >there still are some SPEC benchmarks that fail to build with the
> >patch, at least in Mike's tests. And Mike is implementing some patches
> >to utilize reload to improve use of VSX registers, which would need to
> >be mirrored in LRA for the equivalent functionality.
> Thanks for informing me, David.
>
> I am ready to work on any LRA ppc issues when it will be in the
> trunk. It would be easier for me to work on LRA ppc if the patch is
> committed to the trunk and of course LRA is used as non-default
> local RA.
>
> I don't know what Mike is doing on reload to use VSX registers. I
> guess it is usage of VSX regs as spilled locations for GENERAL regs
> instead of memory. If it is so, it is 2 day work to add this
> functionality in LRA (as it already has analogous functionality for
> Intel processors and that gave a nice SPECFP2000 improvement for
> them) and probably more work on resolving issues especially as I
> have no power8.

I would say let's add -mlra, but make the default OFF for the time
being. We can always switch the default later.

Vladimir, I thought I included you in the list when I gave status. The
big thing is that several of the SPEC 2006 benchmarks don't work in
32-bit mode, and I get a lot of Fortran errors, again in 32-bit mode.
I also saw some decimal floating point problems.

What I'm doing is adding secondary reload support so that, up until
reload time, we can represent VSX addresses as reg+offset, and then in
secondary reload create the addition instructions that put the offset
in a base register. I haven't made any changes to the machine
independent portions of the compiler. As long as IRA uses the secondary
reload interface, it should be ok. However, right now, I need to focus
most of my attention on getting the secondary reload support to work.

One thing that I've asked for before, but to remind you: I really,
really wish secondary reload could allocate two scratch registers if it
is given an insn that takes 4 arguments. Right now, I'm allocating a
TFmode scratch, since that gives 2 registers, but future changes will
want TFmode to go into a single vector register, and I will need to
create another type, like V4DI, that does take 2 registers.

The case where this is needed is moving an item that occupies 2 GPRs
into a VSX register, such as moving 128-bit items in 64-bit mode, or
64-bit items in 32-bit mode. I need two scratch registers to move the
halves into, and then I will do the combine operation.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: [email protected], phone: +1 (978) 899-4797
