http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57601
            Bug ID: 57601
           Summary: Vector lowering could use larger modes
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
            Target: x86_64-linux-gnu

typedef int vec __attribute__((vector_size(2*sizeof(int))));
vec f(vec a, vec b){
  return a-b;
}

    vmovq    %xmm0, %rcx
    vmovq    %xmm1, %rdx
    movl     %ecx, %eax
    shrq     $32, %rcx
    subl     %edx, %eax
    shrq     $32, %rdx
    subl     %edx, %ecx
    vmovd    %eax, %xmm2
    vpinsrd  $1, %ecx, %xmm2, %xmm0

(with -Ofast -mavx2) whereas if I change the size to 4, I get:

    vpsubd   %xmm1, %xmm0, %xmm0

which seems valid to me even for size 2.

It is not clear to me how to model that at tree level, maybe it would be easier
to just implement V2SI operations in the backend?
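To illustrate why the single vpsubd looks valid for the 2-element case, here is a rough source-level sketch of the idea: copy each 8-byte vector into the low half of a 16-byte one, subtract in the wider type, and keep only the low half of the result. The names vec2, vec4, sub_direct and sub_widened are made up for this illustration and are not part of GCC; this is not how the lowering pass is implemented, and whether a given GCC version compiles the widened form to a single vpsubd has not been checked here.

```c
#include <string.h>

/* Hypothetical type names for this sketch (GNU vector extensions). */
typedef int vec2 __attribute__((vector_size(2 * sizeof(int))));
typedef int vec4 __attribute__((vector_size(4 * sizeof(int))));

/* The operation from the report: a plain 2-element subtraction. */
vec2 sub_direct(vec2 a, vec2 b)
{
    return a - b;
}

/* The same subtraction done in a 4-element vector.
   The two upper lanes are zero-filled and their result is discarded. */
vec2 sub_widened(vec2 a, vec2 b)
{
    vec4 wa = {0, 0, 0, 0};
    vec4 wb = {0, 0, 0, 0};
    vec2 r;

    memcpy(&wa, &a, sizeof a);   /* a into the low 8 bytes of wa */
    memcpy(&wb, &b, sizeof b);   /* b into the low 8 bytes of wb */
    wa = wa - wb;                /* one 4-lane subtraction */
    memcpy(&r, &wa, sizeof r);   /* keep only the low 8 bytes */
    return r;
}
```

Both functions should return the same two lanes for any inputs, since lane-wise subtraction in the low half does not depend on what the upper lanes contain.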