------- Comment #3 from gael dot guennebaud at gmail dot com 2009-06-24 10:53
-------
There is a compilable example attached to comment #1.
Furthermore, I can reproduce the problem with gcc 4.1.3, 4.2.4, 4.3.2, and
4.4.0, so I don't think it is a duplicate of PR40141.
FYI, in the meantime I work around the issue using inline assembly:
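#include <xmmintrin.h>  /* __m128 */

inline __m128 ploadu(const float* from)
{
  __m128 res;
  /* Load from[0..1] (64 bits) into the low half of res; the unused [dummy]
     operand tells gcc that from[1] is read as well. */
  asm("movsd %[from0], %[r]"
      : [r] "=x" (res) : [from0] "m" (*from), [dummy] "m" (*(from+1)) );
  /* Load from[2..3] into the high half of res, keeping the low half. */
  asm("movhps %[from2], %[r]"
      : [r] "+x" (res) : [from2] "m" (*(from+2)), [dummy] "m" (*(from+3)) );
  return res;
}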
but that's not as portable as intrinsics.
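For comparison, here is a minimal sketch of what an intrinsics-only unaligned load could look like, assuming a plain _mm_loadu_ps is acceptable for the use case. It is not necessarily the code from the attachment to comment #1, and the names ploadu_intrin and ploadu_roundtrip_ok are made up for illustration only.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_loadu_ps, _mm_storeu_ps

// Hypothetical name: loads four floats from a possibly unaligned address
// using only the standard SSE intrinsic, no inline assembly.
inline __m128 ploadu_intrin(const float* from)
{
    return _mm_loadu_ps(from);
}

// Hypothetical helper: stores the loaded vector back to memory and checks
// that all four lanes equal from[0..3].
inline bool ploadu_roundtrip_ok(const float* from)
{
    float out[4];
    _mm_storeu_ps(out, ploadu_intrin(from));
    return out[0] == from[0] && out[1] == from[1]
        && out[2] == from[2] && out[3] == from[3];
}
```

Calling ploadu_roundtrip_ok on a pointer offset by one float from the start of an array gives a quick check of the unaligned case on a given compiler version.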
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40537