On Fri, Nov 18, 2016 at 05:07:21PM -0600, Segher Boessenkool wrote:
> On Fri, Nov 18, 2016 at 05:52:12PM -0500, Michael Meissner wrote:
> > On Fri, Nov 18, 2016 at 04:43:40PM -0600, Segher Boessenkool wrote:
> > > Could you also test with reload please?  Just LE is enough I guess.
> > > We'd like to keep reload working for GCC 7 at least, and these cost
> > > prefixes tend to break mov patterns :-/
> >
> > Argh, I guess you are right, but then if reload doesn't work, I will likely
> > submit the patch where there are three different movdi's (one for 32-bit
> > without the change, one for 64-bit with reload, and one for 64-bit with
> > lra).
>
> I would prefer not to do that.
>
> Let's hope it just works :-)

I did test it over the weekend.  29 of the 30 spec 2006 benchmarks currently
build with reload (gamess fails).  The same 29 build and run with the new
patch.

Like the patch under LRA, there are no regressions in performance, and one FP
benchmark is faster.  Under LRA, sphinx3 is 2.5% faster (compared to LRA
without the patch).  Under reload, sphinx3 is roughly the same performance,
but calculix is 3.8% faster.

Can I apply the patch?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797