We do not generate vector instructions for the following code on x86_64
(and on x86 with -msse3):

typedef short mmxw  __attribute__ ((vector_size(8)));
typedef int   mmxdw __attribute__ ((vector_size(8)));
mmxdw dw;
mmxw w;

void test()
{
  w+=w;
  dw= (mmxdw)w;
}
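For anyone reading the assembly quoted next: its two movabsq constants are 0x8000800080008000 and 0x7fff7fff7fff7fff written as signed decimals, which suggests the addition is being emulated lane by lane in a 64-bit integer register instead of using the vector unit. A minimal sketch of that kind of emulation follows. It is only an illustration of the technique, not GCC's lowering code, and the names add4x16_swar, add4x16_ref and add4x16_selfcheck are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add four 16-bit lanes packed into one 64-bit
   word without letting a carry cross from one lane into the next. */
static uint64_t add4x16_swar(uint64_t a, uint64_t b)
{
    const uint64_t high = 0x8000800080008000ULL; /* top bit of each lane */
    const uint64_t low  = 0x7fff7fff7fff7fffULL; /* low 15 bits of each lane */

    /* Add the low 15 bits of each lane; any carry stops at bit 15. */
    uint64_t sum = (a & low) + (b & low);

    /* Fold the top bits back in with XOR, which discards the carry out. */
    return sum ^ ((a ^ b) & high);
}

/* Reference version: unpack each lane, add with 16-bit wraparound, repack. */
static uint64_t add4x16_ref(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t x = (uint16_t)(a >> (16 * i));
        uint16_t y = (uint16_t)(b >> (16 * i));
        uint16_t s = (uint16_t)(x + y);
        r |= (uint64_t)s << (16 * i);
    }
    return r;
}

/* Small self-check comparing the two versions on edge-case lanes. */
static void add4x16_selfcheck(void)
{
    uint64_t w = 0x80007fffffff0001ULL;
    assert(add4x16_swar(w, w) == add4x16_ref(w, w));
    assert(add4x16_swar(w, w) == 0x0000fffefffe0002ULL);
}
```

With both operands equal, as in w+=w, the (a ^ b) term is zero, so what remains is a mask, an add and an XOR. That is far more work than the single psllw that 3.4 emitted.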
The code comes from PR 14552, but we no longer use the vector unit for the
addition, so we produce a lot of crappy code:

        movq    w(%rip), %xmm0
        movabsq $-9223231297218904064, %rax
        movq    %xmm0, -8(%rsp)
        movq    -8(%rsp), %rsi
        movq    %rsi, %rcx
        movq    %rsi, %rdx
        xorq    %rsi, %rcx
        andq    %rax, %rcx
        movabsq $9223231297218904063, %rax
        andq    %rax, %rdx
        addq    %rdx, %rdx
        xorq    %rdx, %rcx
        movq    %rcx, -16(%rsp)
        movq    -16(%rsp), %xmm0
        movq    %xmm0, w(%rip)
        movq    %xmm0, dw(%rip)
        ret

Compared to what we got in 3.4:

test:
        movq    w, %mm1
        psllw   $1, %mm1
        movq    %mm1, w
        movq    w, %mm0
        movq    %mm0, dw
        ret

--
Summary: [4.0 Regression] missed optimization with size of 8 vectors
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Keywords: missed-optimization, ssemmx
Severity: normal
Priority: P2
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pinskia at gcc dot gnu dot org
CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: x86_64-*-*
OtherBugsDependingOnThis: 14552

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19391