------- Comment #3 from gael dot guennebaud at gmail dot com 2009-06-24 10:53 -------
There is a compilable example attached to comment #1.
Furthermore, I can reproduce the problem with gcc 4.1.3, 4.2.4, 4.3.2, and 4.4.0, so I don't think it is a duplicate of PR40141.

FYI, in the meantime I work around the issue using inline assembly:

    #include <xmmintrin.h>  /* __m128 */

    inline __m128 ploadu(const float* from) {
      __m128 res;
      /* Load from[0..1] into the low half of res; the dummy operand
         tells the compiler that from[1] is read as well. */
      asm("movsd %[from0], %[r]" : [r] "=x" (res) : [from0] "m" (*from), [dummy] "m" (*(from+1)) );
      /* Load from[2..3] into the high half of res. */
      asm("movhps %[from2], %[r]" : [r] "+x" (res) : [from2] "m" (*(from+2)), [dummy] "m" (*(from+3)) );
      return res;
    }

but that's not as portable as intrinsics.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40537