------- Comment #4 from law at redhat dot com 2010-03-22 19:49 -------
FWIW, I seem to get considerably worse code from mainline than you -- for -O3 -ffast-math -mcpu=power7 -mvsx -maltivec I get 140 stfs and 192 lfs insns (compared to the 117 and 139, respectively, that you reported).
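
For anyone wanting to compare counts like these, here is a minimal sketch of one way to tally the insns from the compiler's assembly output (e.g. from -S). It is not necessarily how the numbers in this report were gathered; the function name count_mnemonics is made up for illustration, and it assumes one instruction per line with '#' starting a comment. It matches the mnemonic exactly, so indexed or update forms such as lfsx or stfsu are not counted.

```python
def count_mnemonics(asm_text, mnemonics=("stfs", "lfs")):
    """Count exact occurrences of the given mnemonics in assembly text.

    Assumes one instruction per line and '#' as the comment character;
    directives (leading '.') and label-only lines are skipped.
    """
    counts = {m: 0 for m in mnemonics}
    for line in asm_text.splitlines():
        # Drop any trailing comment and surrounding whitespace.
        code = line.split("#", 1)[0].strip()
        # Skip blank lines, assembler directives, and labels.
        if not code or code.startswith(".") or code.endswith(":"):
            continue
        mnemonic = code.split()[0]
        if mnemonic in counts:
            counts[mnemonic] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (the file name is whatever the -S output was written to):
    #   python count_insns.py < output.s
    print(count_mnemonics(sys.stdin.read()))
```

Running it over the output of two compilers (or two flag sets) gives the stfs/lfs pair to compare.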

Just for fun, I ran the same code through a ppc compiler with the LRS code from reload-v2 and got 133:178 stfs/lfs insns, so that code clearly is helping, but it's not enough to offset the badness shown by IRA.

I couldn't reconcile how -fno-ira-share-spill-slots would be changing the number of load/store insns, so I poked at that a bit. -fno-ira-share-spill-slots twiddles whether or not a pseudo that gets assigned a hard reg is put into live_throughout or dead_or_set_p in the reload chain structures, which in turn changes which pseudos get reassigned hard regs during reload. This is a somewhat odd effect and should be investigated further.

-- 

law at redhat dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Powerpc generates worse code|Powerpc generates worse code
                   |for -mvsx on gromacs even   |for -mvsx on gromacs even
                   |though there are no VSX     |though there are no VSX
                   |instructions used           |instructions used

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43413