Testcase:
#include <emmintrin.h>
__m128i foo;
int main()
{
    foo = _mm_xor_si128(_mm_setzero_si128(), foo);
    return 0;
}
Resulting Assembly (with -O3):
pxor %xmm0, %xmm0
xorl %eax, %eax
pxor foo(%rip), %xmm0
movdqa %xmm0, foo(%rip)
ret
Expected Result:
Since x XOR 0 == x for any value x, GCC's static evaluation step should
eliminate the pxor instructions altogether.
Likewise, a call to _mm_xor_si128 with two constant operands should be
evaluated at compile time, and a call that would leave the destination
register unchanged should be eliminated.
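The two cases described above can be written as a small companion test. This is only a sketch: the helper names xor_constants, xor_with_zero and first_lane are hypothetical and not part of the original test case, and the constants are arbitrary. Ideally the first function would compile to a single constant load and the second to a plain register move.

```c
#include <emmintrin.h>
#include <stdint.h>

/* Hypothetical case 1: both operands are compile-time constants, so the
 * whole XOR could be evaluated at compile time. */
static __m128i xor_constants(void)
{
    const __m128i a = _mm_set1_epi32(0x0F0F0F0F);
    const __m128i b = _mm_set1_epi32(0x00FF00FF);
    return _mm_xor_si128(a, b);
}

/* Hypothetical case 2: XOR with zero is the identity, so no pxor should
 * be needed at all. */
static __m128i xor_with_zero(__m128i x)
{
    return _mm_xor_si128(_mm_setzero_si128(), x);
}

/* Extracts the lowest 32-bit lane so the result can be checked. */
static uint32_t first_lane(__m128i v)
{
    return (uint32_t)_mm_cvtsi128_si32(v);
}
```

Inspecting the output of gcc -O3 -S for these functions shows whether any pxor remains.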
Rationale:
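a) Removing unnecessary code is always worthwhile.
b) SSE2 provides only signed greater-than compares (pcmpgt*), so a correct
unsigned integer compare is typically implemented by first flipping the sign
bit of both operands (pxor with 0x80000000). If that pxor were evaluated
statically for constant operands, all unsigned compares against constant
values would improve.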
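The technique from point b) can be sketched as follows. The helper names cmpgt_epu32 and gt_const_mask are hypothetical and chosen for illustration only. The pxor applied to the constant operand is the one that could be folded at compile time.

```c
#include <emmintrin.h>
#include <stdint.h>

/* Hypothetical helper: per-lane unsigned a > b on four 32-bit lanes.
 * XOR with 0x80000000 flips the sign bit, which maps unsigned order onto
 * signed order, so the signed compare pcmpgtd gives the unsigned result. */
static __m128i cmpgt_epu32(__m128i a, __m128i b)
{
    const __m128i bias = _mm_set1_epi32(INT32_MIN); /* 0x80000000 per lane */
    return _mm_cmpgt_epi32(_mm_xor_si128(a, bias), _mm_xor_si128(b, bias));
}

/* Hypothetical helper: returns a 4-bit mask with bit i set when
 * v[i] > limit (unsigned). When limit is a constant, the XOR applied to it
 * is a candidate for static evaluation. */
static int gt_const_mask(const uint32_t v[4], uint32_t limit)
{
    __m128i a = _mm_loadu_si128((const __m128i *)v);
    __m128i b = _mm_set1_epi32((int)limit);
    return _mm_movemask_ps(_mm_castsi128_ps(cmpgt_epu32(a, b)));
}
```

With a constant limit, the biased constant could be precomputed by the compiler, leaving a single pxor for the variable operand.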
--
Summary: [missed optimization] static evaluation of SSE intrinsics (pxor)
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: kretz at kde dot org
GCC build triplet: x86_64-unknown-linux-gnu
GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45739