Compiling the following function on x86 with options -O -fomit-frame-pointer
-msse -S seems to result in erroneous code.
#include <xmmintrin.h>
__m128 bug(__m128 a, __m128 b) {
    __m128 c = _mm_sub_ps(a, b);
    return _mm_move_ss(c, a);
}
This is what the function in the resulting .s file looks like:
.file "gccbug.c"
.text
.align 2
.globl __Z3bugU8__vectorfS_
.def __Z3bugU8__vectorfS_; .scl 2; .type 32; .endef
__Z3bugU8__vectorfS_:
subss %xmm1, %xmm0
ret
According to
http://www.intel.com/software/products/compilers/clin/docs/ug_cpp/comm1030.htm
_mm_move_ss passes the upper three values of its first argument through
unchanged, but the generated code doesn't even calculate those values. I had
expected the code to look more like this:
movaps %xmm0, %xmm2
subps %xmm1, %xmm0
movss %xmm2, %xmm0
I hope that turned out right; I'm not experienced with AT&T syntax. Is this a
bug, or have I misunderstood something? I get similar results with 3.3.4.
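
To make the expected result concrete, here is a minimal sketch of a scalar
model of what I believe bug() should compute, plus a check that compares it
with the intrinsics. The names expected_bug and intrinsics_match_model are
made up for this illustration, and the check loads its inputs from arrays
rather than taking __m128 arguments, so it may not trigger the same code path
as the function above.

```c
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Scalar model (hypothetical helper): element 0 is taken from a,
 * elements 1..3 are a[i] - b[i], matching _mm_move_ss(_mm_sub_ps(a, b), a). */
static void expected_bug(const float a[4], const float b[4], float out[4])
{
    out[0] = a[0];
    for (int i = 1; i < 4; ++i)
        out[i] = a[i] - b[i];
}

/* Hypothetical check: returns 1 if the intrinsics agree with the scalar
 * model, 0 otherwise. Without SSE support there is nothing to compare,
 * so it returns 1. */
static int intrinsics_match_model(const float a[4], const float b[4])
{
    float want[4];
    expected_bug(a, b, want);
#if defined(__SSE__)
    float got[4];
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __m128 c = _mm_sub_ps(va, vb);           /* full four-wide subtract */
    _mm_storeu_ps(got, _mm_move_ss(c, va));  /* low element replaced by a[0] */
    return memcmp(got, want, sizeof want) == 0;
#else
    (void)want;
    return 1;
#endif
}
```

With a = {1, 2, 3, 4} and b = {0.5, 0.5, 0.5, 0.5}, the model gives
{1, 1.5, 2.5, 3.5}, whereas the single subss above only computes the lowest
element.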
Torgeir
--
Summary: _mm_move_ss SSE intrinsic causes erroneous
Product: gcc
Version: 3.4.2
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: torgeihe at stud dot ntnu dot no
CC: gcc-bugs at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18503