http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499
Bug #: 51499
Summary: vectorizer missing simple case
Classification: Unclassified
Product: gcc
Version: 4.6.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: [email protected]
ReportedBy: [email protected]
The SSE vectorizer seems to miss one of the simplest cases:
#include <cstdio>
#include <cstdlib>

double loop(double a, size_t n){
    // initialise differently so compiler doesn't simplify
    double sum1=0.1, sum2=0.2, sum3=0.3, sum4=0.4, sum5=0.5, sum6=0.6;
    for(size_t i=0; i<n; i++){
        sum1+=a; sum2+=a; sum3+=a; sum4+=a; sum5+=a; sum6+=a;
    }
    return sum1+sum2+sum3+sum4+sum5+sum6-2.1-6.0*a*n;
}

int main(int argc, char** argv) {
    size_t n=1000000;
    double a=1.1;
    printf("res=%f\n", loop(a,n));
    return EXIT_SUCCESS;
}
g++-4.6.2 -Wall -O2 -ftree-vectorize -ftree-vectorizer-verbose=2 test.cpp
test.cpp:7: note: not vectorized: unsupported use in stmt.
test.cpp:4: note: vectorized 0 loops in function.
The loop body compiles to six scalar addsd instructions, whereas
vectorization should have given us three packed addpd instructions
(two doubles per 128-bit register):
.L3:
        addq    $1, %rax
        addsd   %xmm0, %xmm6
        cmpq    %rdi, %rax
        addsd   %xmm0, %xmm5
        addsd   %xmm0, %xmm4
        addsd   %xmm0, %xmm3
        addsd   %xmm0, %xmm2
        addsd   %xmm0, %xmm1
        jne     .L3