------- Comment #7 from hubicka at ucw dot cz  2009-01-15 01:49 -------
Subject: Re:  [4.4 regression] performance regression of sse code from 4.2/4.3

I guess the main difference here is that a load + addps pair generates 2
uops, while a mov + addps with a memory operand generates 3, since the move
has to go through the queue.  I will try changing the testcase to fit in
cache to see whether an AMD machine reproduces it too.
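
A cache-resident variant of such a test could look roughly like the sketch
below. This is not the testcase attached to the bug; the names N, a, b,
add_arrays and cache_resident_test are made up for illustration, and the
4 KiB array size is only assumed to fit in L1 on the machines in question.
Whether GCC emits a separate load followed by a register addps, or a
register move plus an addps with a memory operand, depends on the compiler
version and flags, so the generated assembly (gcc -O2 -msse -S) has to be
checked by hand.

```c
#include <xmmintrin.h>

/* Assumption: 1024 floats (4 KiB) per array is small enough to stay in L1. */
#define N 1024

static float a[N] __attribute__((aligned(16)));
static float b[N] __attribute__((aligned(16)));

/* dst[i] += src[i], four floats per iteration. n must be a multiple of 4
   and both pointers must be 16-byte aligned. */
static void add_arrays(float *dst, const float *src, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 x = _mm_load_ps(dst + i);  /* aligned load of the accumulator */
        __m128 y = _mm_load_ps(src + i);  /* compiler may fold this load into addps */
        _mm_store_ps(dst + i, _mm_add_ps(x, y));
    }
}

/* Initialise the arrays, run the kernel reps times over the same small
   working set, and return the last element so the work is not optimised
   away. */
static float cache_resident_test(int reps)
{
    for (int i = 0; i < N; i++) {
        a[i] = 1.0f;
        b[i] = (float)i;
    }
    for (int r = 0; r < reps; r++)
        add_arrays(b, a, N);
    return b[N - 1];
}
```

Timing cache_resident_test with a large reps value under each compiler
version would keep memory bandwidth out of the comparison, so any remaining
difference is more likely to come from the instruction sequence itself.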

Honza


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38824
