http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54400
Bug #: 54400
Summary: recognize haddpd
Classification: Unclassified
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: [email protected]
ReportedBy: [email protected]
Target: x86_64-linux-gnu
Hello,
for this program:
#include <x86intrin.h>
double f(__m128d v){return v[1]+v[0];}
gcc -O3 -msse4 (same with -Os) generates:
movapd %xmm0, %xmm2
unpckhpd %xmm2, %xmm2
movapd %xmm2, %xmm1
addsd %xmm0, %xmm1
movapd %xmm1, %xmm0
(yes, the number of mov instructions is a bit high...)
Looking at the x86 backend, it can expand reduc_splus_v2df and
__builtin_ia32_haddpd, but it doesn't provide any pattern that could be
recognized from generic code like the above. hsubpd has even less support.
It seems to me that, considering only the low part of the result of haddpd, the
pattern should be small enough to be matched:
  (plus (vec_select (match_operand 1) const_a)
        (vec_select (match_dup 1) const_b))
where a and b are 0 and 1 in any order.
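
For comparison, here is a minimal sketch of the two source forms side by side: the generic subscript sum from the report, and an explicit haddpd via the SSE3 intrinsic _mm_hadd_pd, which is what one has to write by hand as long as the pattern above is not recognized. The names v2df, sum_subscript, sum_hadd and self_check are made up for this illustration and are not taken from GCC. The intrinsic path is only compiled when SSE3 is enabled (e.g. -msse3 or -msse4); otherwise the sketch falls back to the plain sum.

```c
#include <assert.h>
#ifdef __SSE3__
#include <pmmintrin.h>   /* _mm_hadd_pd (SSE3) */
#endif

/* Generic GCC vector type with the same layout as __m128d
   (illustrative name, an assumption of this sketch). */
typedef double v2df __attribute__((vector_size(16)));

/* The form from the report: sum of both lanes via subscripts. */
double sum_subscript(v2df v)
{
    return v[1] + v[0];
}

/* The same sum written with an explicit haddpd. Only the low lane of
   the haddpd result is used, which holds v[0] + v[1]. */
double sum_hadd(v2df v)
{
#ifdef __SSE3__
    __m128d x = (__m128d)v;
    return _mm_cvtsd_f64(_mm_hadd_pd(x, x));
#else
    /* Fallback when SSE3 is not enabled at compile time. */
    return v[0] + v[1];
#endif
}

/* Both forms must agree; the values are chosen to be exact in binary. */
void self_check(void)
{
    v2df v = { 1.5, 2.25 };
    assert(sum_subscript(v) == 3.75);
    assert(sum_hadd(v) == sum_subscript(v));
}
```

Compiling both functions with -O3 -msse4 -S and comparing the assembly shows whether the subscript form is turned into haddpd or still goes through unpckhpd and addsd.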