------- Comment #13 from dwarak dot rajagopal at amd dot com 2009-02-06 22:35 -------
> The patch makes GCC to generate movaps load followed by addps. On Core 2 it
> speeds up the testcase from 7s to 6.2s so I guess it works as expected.
>
> The same however does not reproduce on AMD box and I am not sure if it is just
> coincidence here or if really core preffer to split read-execute SSE
> operations (it is not recommended by the manual).

fyi, AMD (amdfam10) prefers load-execute rather than having separate load and
execute instructions.

--

dwarak dot rajagopal at amd dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dwarak dot rajagopal at amd
                   |                            |dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38824
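For readers following along, here is a minimal sketch of the kind of code where the two instruction forms discussed above can show up. It is not the testcase from this bug; the function name `add4_into` is hypothetical and only used for illustration. The comment lists the two sequences a compiler might choose for the load and add.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics: _mm_load_ps, _mm_add_ps, _mm_store_ps */

/* Hypothetical helper (not from the bug's testcase): acc[0..3] += src[0..3].
 *
 * For the load of src feeding the add, the compiler may emit either
 *
 *     movaps (%rsi), %xmm1      ; separate load ...
 *     addps  %xmm1, %xmm0       ; ... then execute
 *
 * or the load-execute form
 *
 *     addps  (%rsi), %xmm0      ; load folded into the add
 *
 * Which one is chosen depends on the GCC version and tuning options.
 */
void add4_into(float *acc, const float *src)
{
    /* Both pointers must be 16-byte aligned for the aligned SSE accesses. */
    assert(((uintptr_t)acc & 15) == 0);
    assert(((uintptr_t)src & 15) == 0);

    __m128 a = _mm_load_ps(acc);   /* aligned load of the accumulator */
    __m128 v = _mm_load_ps(src);   /* aligned load that may be folded into addps */
    _mm_store_ps(acc, _mm_add_ps(a, v));
}
```

To see which form is generated, one can compile with something like `gcc -O2 -S -mtune=core2` and `gcc -O2 -S -mtune=amdfam10` and compare the assembly. Whether the two outputs actually differ depends on the compiler version and on whether the patch under discussion is applied.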