http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51499

--- Comment #7 from fb.programming at gmail dot com 2011-12-11 14:55:13 UTC ---
(In reply to comment #5)

> > (3) If I change all double's into float's in the code above it seems to

> I think you are looking at the scalar epilogue. The number of iterations is
> unknown, so we need an epilogue loop for the case that number of iterations is
> not a multiple of 4.

Yes, you're right. Sorry about that, my mistake.


> > (1) In this case it should work without -funsafe-math-optimizations but
> >     it doesn't. gcc 4.7 requires -fno-signed-zeros -fno-trapping-math
> >    -fassociative-math to make it work.
> > 
> 
> It's reduction, when we vectorize we change the order of computation. In order
> to be able to do that for floating point we need flag_associative_math.
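
To illustrate why reordering matters for floating point in general, here is a minimal sketch. The function names are hypothetical, and it assumes IEEE 754 doubles evaluated without excess precision (FLT_EVAL_METHOD == 0, e.g. SSE2 on x86-64) and no -ffast-math:

```c
#include <assert.h>

/* Hypothetical helpers: the same three operands, summed in two orders. */
static double sum_left_to_right(double a, double b, double c)
{
    return (a + b) + c;
}

static double sum_right_to_left(double a, double b, double c)
{
    return a + (b + c);
}

/* Near 1e16 the spacing between adjacent doubles is 2, so adding 1.0
   twice is absorbed by rounding, while adding 2.0 once is exact. */
static void check_reordering_matters(void)
{
    assert(sum_left_to_right(1e16, 1.0, 1.0) !=
           sum_right_to_left(1e16, 1.0, 1.0));
}
```

A reduction over a single accumulator has to regroup the additions like this when it is vectorized, which is where the need for the flag comes from.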

In some cases it might be necessary, but not here:

 sum1+=a;
 sum2+=a;

gives exactly the same result as

 (sum1, sum2) += (a, a);
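
The equivalence can be checked with a small sketch. The helper names are hypothetical, the "vector" is just a two-element array standing in for a SIMD register, and the comparison assumes IEEE 754 doubles without excess precision:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two independent scalar accumulators, as in the original loop. */
static void accumulate_scalar(const double *a, size_t n, double out[2])
{
    assert(a != NULL || n == 0);
    double sum1 = 0.0, sum2 = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum1 += a[i];
        sum2 += a[i];
    }
    out[0] = sum1;
    out[1] = sum2;
}

/* The same work written as one 2-lane add per iteration.
   Each lane sees exactly the additions, in exactly the order,
   that the matching scalar accumulator saw. */
static void accumulate_lanes(const double *a, size_t n, double out[2])
{
    assert(a != NULL || n == 0);
    double sum[2] = {0.0, 0.0};
    for (size_t i = 0; i < n; i++) {
        const double v[2] = {a[i], a[i]};
        for (int lane = 0; lane < 2; lane++)
            sum[lane] += v[lane];
    }
    out[0] = sum[0];
    out[1] = sum[1];
}

/* Returns 1 when both formulations agree bit for bit. */
static int lanes_match_scalar(const double *a, size_t n)
{
    double s[2], v[2];
    accumulate_scalar(a, n, s);
    accumulate_lanes(a, n, v);
    return memcmp(s, v, sizeof s) == 0;
}
```

No addition is regrouped here, only paired up with an unrelated one, so no associativity assumption is involved.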

Let's take a more applied example, say calculating the sum of 1/i:

   double harmon(int n) {
      double sum=0.0;
      /* sums 1/1 + 1/2 + ... + 1/n, strictly in order */
      for(int i=1; i<=n; i++){
         sum += 1.0/i;
      }
      return sum;
   }

Vectorizing this loop requires reordering the sum, so in this case
I agree we need -funsafe-math-optimizations.
However, one could split the sum manually:

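   #include <assert.h>

   double harmon(int n) {
      assert(n%2==0);   /* the loop consumes terms in pairs */
      double sum1=0.0, sum2=0.0;
      for(int i=1; i<n; i+=2){
         sum1 += 1.0/i;       /* odd terms:  1/1, 1/3, ... */
         sum2 += 1.0/(i+1);   /* even terms: 1/2, 1/4, ... */
      }
      return sum1+sum2;
   }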

and now I'd expect the compiler to vectorize this without
-funsafe-math-optimizations, since doing so doesn't change any
computed result:

         (sum1, sum2) += (1.0/i, 1.0/(i+1));

I can attach a test case with that example if that would be useful.
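
A possible shape for such a test case, as a sketch only (this is not an actual attachment; harmon_split, harmon_lanes and same_bits are hypothetical names, and the bitwise comparison assumes IEEE 754 doubles without excess precision):

```c
#include <assert.h>
#include <string.h>

/* Manually split sum: two independent accumulators. */
static double harmon_split(int n)
{
    assert(n % 2 == 0);
    double sum1 = 0.0, sum2 = 0.0;
    for (int i = 1; i < n; i += 2) {
        sum1 += 1.0 / i;
        sum2 += 1.0 / (i + 1);
    }
    return sum1 + sum2;
}

/* The same loop written as one 2-lane update per iteration,
   i.e. (sum1, sum2) += (1.0/i, 1.0/(i+1)). */
static double harmon_lanes(int n)
{
    assert(n % 2 == 0);
    double sum[2] = {0.0, 0.0};
    for (int i = 1; i < n; i += 2) {
        const double term[2] = {1.0 / i, 1.0 / (i + 1)};
        for (int lane = 0; lane < 2; lane++)
            sum[lane] += term[lane];
    }
    return sum[0] + sum[1];
}

/* Bitwise comparison, stricter than == (it distinguishes +0.0 from -0.0). */
static int same_bits(double x, double y)
{
    return memcmp(&x, &y, sizeof x) == 0;
}
```

Checking same_bits(harmon_split(n), harmon_lanes(n)) for a range of even n exercises the claim that the lane-wise form performs the same additions in the same order per accumulator.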
