http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499
--- Comment #7 from fb.programming at gmail dot com 2011-12-11 14:55:13 UTC ---
(In reply to comment #5)
> > (3) If I change all double's into float's in the code above it seems to
> I think you are looking at the scalar epilogue. The number of iterations is
> unknown, so we need an epilogue loop for the case that number of iterations is
> not a multiple of 4.
Yes you're right. Sorry about that, my mistake.
> > (1) In this case it should work without -funsafe-math-optimizations but
> > it doesn't. gcc 4.7 requires -fno-signed-zeros -fno-trapping-math
> > -fassociative-math to make it work.
> >
>
> It's reduction, when we vectorize we change the order of computation. In order
> to be able to do that for floating point we need flag_associative_math.
In some cases it might be necessary, but not here:
    sum1 += a;
    sum2 += a;
gives exactly the same result as
    (sum1, sum2) += (a, a);
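To make that claim concrete, here is a minimal sketch (not taken from the bug's test case) that compares the two forms bit for bit. It assumes GCC's generic vector extension (`vector_size`) is available; the names `v2df`, `sum_scalar`, `sum_lanes` and `sums_agree` are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension (assumed available): two doubles, lane by lane. */
typedef double v2df __attribute__((vector_size(16)));

/* Scalar form: two independent accumulators. */
void sum_scalar(const double *a, int n, double out[2])
{
    double sum1 = 0.0, sum2 = 0.0;
    for (int i = 0; i < n; i++) {
        sum1 += a[i];
        sum2 += a[i];
    }
    out[0] = sum1;
    out[1] = sum2;
}

/* Vector form: (sum1, sum2) += (a, a). Each lane adds in the same order
   as the corresponding scalar accumulator. */
void sum_lanes(const double *a, int n, double out[2])
{
    v2df acc = {0.0, 0.0};
    for (int i = 0; i < n; i++) {
        v2df v = {a[i], a[i]};
        acc += v;
    }
    out[0] = acc[0];
    out[1] = acc[1];
}

/* Returns 1 if both forms produce bitwise identical results. */
int sums_agree(const double *a, int n)
{
    double s[2], v[2];
    assert(a != NULL && n >= 0);
    sum_scalar(a, n, s);
    sum_lanes(a, n, v);
    return memcmp(s, v, sizeof s) == 0;
}
```

No addition is reordered within a lane, which is why I would not expect -fassociative-math to be needed for this pattern.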
Let's take a more applied example, say calculating the sum of 1/i:
    double harmon(int n) {
        double sum = 0.0;
        for (int i = 1; i <= n; i++) {
            sum += 1.0 / i;
        }
        return sum;
    }
This requires reordering of the sum to be vectorized, so in this case
I agree we need -funsafe-math-optimizations.
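For reference, a small sketch of why reordering a single floating-point sum is unsafe in general. The function names and input values are only illustrative.

```c
#include <assert.h>

/* Sum three values left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Same values, regrouped: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}

/* With a = 1e20, b = -1e20, c = 1.0 the two groupings differ:
   the 1.0 is absorbed when it is added to -1e20 first. */
void check_reordering(void)
{
    assert(sum_left(1e20, -1e20, 1.0) == 1.0);
    assert(sum_right(1e20, -1e20, 1.0) == 0.0);
}
```

Splitting one accumulator across vector lanes regroups the additions in this way, so the flag makes sense for the single-sum loop.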
However, one could manually split the sum:
    #include <assert.h>

    double harmon(int n) {
        assert(n % 2 == 0);
        double sum1 = 0.0, sum2 = 0.0;
        for (int i = 1; i < n; i += 2) {
            sum1 += 1.0 / i;
            sum2 += 1.0 / (i + 1);
        }
        return sum1 + sum2;
    }
and now I'd expect the compiler to vectorize this without
-funsafe-math-optimizations, since vectorizing it keeps the order of additions
within each accumulator and so doesn't change any computational results:
(sum1, sum2) += (1.0/i, 1.0/(i+1));
I can attach a test case with that example if that'd be useful?
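A rough sketch of what such a test case might look like, assuming GCC's generic vector extension (`vector_size`) is available. The names `v2df`, `harmon_split`, `harmon_lanes` and `harmon_results_match` are hypothetical, and the hand-written lane version only stands in for what the vectorizer could emit.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension (assumed available): two doubles, lane by lane. */
typedef double v2df __attribute__((vector_size(16)));

/* Manually split sum (n must be even). */
double harmon_split(int n)
{
    assert(n % 2 == 0);
    double sum1 = 0.0, sum2 = 0.0;
    for (int i = 1; i < n; i += 2) {
        sum1 += 1.0 / i;
        sum2 += 1.0 / (i + 1);
    }
    return sum1 + sum2;
}

/* Hand-vectorized form: (sum1, sum2) += (1.0/i, 1.0/(i+1)). */
double harmon_lanes(int n)
{
    assert(n % 2 == 0);
    const v2df one = {1.0, 1.0};
    v2df acc = {0.0, 0.0};
    for (int i = 1; i < n; i += 2) {
        v2df d = {(double)i, (double)(i + 1)};
        acc += one / d;   /* lane-wise divide, then lane-wise add */
    }
    return acc[0] + acc[1];
}

/* Returns 1 if both versions give bitwise identical results for this n. */
int harmon_results_match(int n)
{
    double s = harmon_split(n);
    double v = harmon_lanes(n);
    return memcmp(&s, &v, sizeof s) == 0;
}
```

Both versions perform the same divisions and the same additions in the same order per accumulator, so the comparison should hold without any of the unsafe math flags.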