On Wed, 23 Nov 2011, Leith Bade wrote:
> I have been hand optimising a loop that GCC 4.6 was not able to vectorise.
> I have been keeping an eye on the assembly output of this loop and
> have noticed GCC inserting unnecessary MOVAPS instructions.
Yes, there are several bugzilla entries showing that
Hi,
I have been hand optimising a loop that GCC 4.6 was not able to vectorise.
I have been keeping an eye on the assembly output of this loop and
have noticed GCC inserting unnecessary MOVAPS instructions.
It only happens when I use the same variable for both inputs to an SSE intrinsic.
E.g.:
__x
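
For readers without the original code to hand, here is a minimal sketch of the pattern being described: one variable passed as both inputs to an SSE intrinsic inside a hand-vectorised loop. The function name `sum_of_squares` and the loop itself are assumptions made up for illustration; they are not the code from this post.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ps, ... */

/* Hypothetical example, not the loop from the original post:
   sums the squares of n floats, four at a time. */
float sum_of_squares(const float *data, size_t n)
{
    assert(n % 4 == 0);  /* this sketch handles whole vectors only */

    __m128 acc = _mm_setzero_ps();
    for (size_t i = 0; i < n; i += 4) {
        __m128 v = _mm_loadu_ps(data + i);  /* unaligned load of 4 floats */
        /* The same variable is used for both inputs of the intrinsic. */
        acc = _mm_add_ps(acc, _mm_mul_ps(v, v));
    }

    /* Horizontal sum of the four accumulator lanes. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Compiling something like this with `gcc -O2 -msse -S` and looking for a MOVAPS ahead of the MULPS is one way to check for the extra register copy. Whether a particular GCC version emits one for this exact sketch is not guaranteed.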