------- Comment #7 from meissner at linux dot vnet dot ibm dot com  2010-03-22 22:24 -------
Subject: Re: Powerpc generates worse code for -mvsx on gromacs even though there are no VSX instructions used

On Mon, Mar 22, 2010 at 10:20:21PM -0000, vmakarov at redhat dot com wrote:
>
> ------- Comment #6 from vmakarov at redhat dot com  2010-03-22 22:20 -------
> (In reply to comment #4)
> > FWIW, I seem to get considerably worse code from mainline than you -- for
> > -O3 -ffast-math -mcpu=power7 -mvsx -maltivec I get 140 stfs and 192 lfs
> > insns (compared to 117 & 139 respectively that you reported).
>
> I suspect the difference is because Mike calculated only stfs/lfs and you
> stfs(x)/lfs(x).  But maybe I am wrong.

I only calculated the stores and loads to the stack, i.e.

    egrep '(stfs|lfs).*\(1\)'

since I was just looking for the spills.

> > Just for fun, I ran the same code through a ppc compiler with the LRS
> > code from reload-v2 and get 133:178 stfs/lfs insns, so that code clearly
> > is helping, but it's not enough to offset the badness shown by IRA.
> >
> > I couldn't reconcile how -fno-ira-share-spill-slots would be changing the
> > number of load/store insns, so I poked at that a bit.
>
> Yes, I cannot understand that too.

Note, while -fno-ira-share-spill-slots has fewer spills, I just measured the
results on the machine, and the time spent is pretty much the same.

> > -fno-ira-share-spill-slots twiddles whether or not a pseudo which gets
> > assigned a hard reg is put into live_throughout or dead_or_set_p in the
> > reload chain structures, which in turn changes what pseudos get reassigned
> > hard regs during reload.  This is a somewhat odd effect and should be
> > investigated further.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43413
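
For anyone wanting to reproduce the spill counts discussed above, here is a minimal Python sketch of the same filter as the egrep pattern: it counts only stfs/lfs instructions whose address is based on r1 (the stack pointer), so indexed forms such as stfsx and loads through other base registers are ignored. The function name `count_stack_spills` and the sample assembly are made up for illustration; they are not taken from the gromacs output.

```python
import re
from collections import Counter

# Same pattern as the egrep in the comment: an stfs or lfs
# followed somewhere on the line by a "(1)" base-register operand.
SPILL_RE = re.compile(r"(stfs|lfs).*\(1\)")

def count_stack_spills(asm_text):
    """Count stfs/lfs lines that address memory through r1 (hypothetical helper)."""
    counts = Counter()
    for line in asm_text.splitlines():
        match = SPILL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Illustrative input only, not real compiler output.
sample = """\
	stfs 0,64(1)
	lfs 12,64(1)
	lfs 1,8(9)
	stfsx 2,3,4
	lfs 13,72(1)
"""

counts = count_stack_spills(sample)
print(counts["stfs"], counts["lfs"])
```

Counting stores and loads separately gives the same kind of stfs:lfs pair quoted in the thread. Note that this assumes the assembly uses bare register numbers; output that spells the stack pointer as `%r1` would need a different pattern.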