------- Comment #13 from dwarak dot rajagopal at amd dot com 2009-02-06 22:35 -------
> The patch makes GCC generate a movaps load followed by addps. On Core 2 it
> speeds up the testcase from 7s to 6.2s, so I guess it works as expected.
>
> The same, however, does not reproduce on an AMD box, and I am not sure
> whether it is just a coincidence here or whether Core 2 really prefers
> split read-execute SSE operations (it is not recommended by the manual).
FYI, AMD (amdfam10) prefers the load-execute form (for example, addps with a
memory source operand) over separate load and execute instructions.
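
To make the two forms concrete, here is a minimal sketch (not the testcase
from this bug). The helper name add4_ps is hypothetical and the assembly in
the comments is illustrative only; which form a given GCC version actually
emits depends on the tuning in effect and is not guaranteed.

```c
#include <xmmintrin.h>  /* SSE intrinsics; x86 / x86-64 only */

/* Hypothetical helper: out[i] = a[i] + b[i] for four packed floats.
 * All three pointers are assumed to be 16-byte aligned.
 *
 * The source spells out the load and the add separately, but the compiler
 * is free to pick either encoding for the second operand:
 *
 *   split form:          movaps (b), %xmm1
 *                        addps  %xmm1, %xmm0
 *
 *   load-execute form:   addps  (b), %xmm0
 */
static void add4_ps(float *out, const float *a, const float *b)
{
    __m128 va = _mm_load_ps(a);        /* aligned load of a[0..3] */
    __m128 vb = _mm_load_ps(b);        /* may be folded into the add */
    _mm_store_ps(out, _mm_add_ps(va, vb));
}
```

One way to see what a particular compiler does is to build this with
"gcc -O2 -S" under different -mtune values (for example core2 and amdfam10)
and compare the generated assembly.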
--
dwarak dot rajagopal at amd dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dwarak dot rajagopal at amd
| |dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38824