[Bug target/27827] [4.0/4.1 Regression] gcc 4 produces worse x87 code on all platforms than gcc 3

paolo dot bonzini at lu dot unisi dot ch Thu, 10 Aug 2006 07:29:06 -0700


------- Comment #61 from paolo dot bonzini at lu dot unisi dot ch  2006-08-10 
14:28 -------
Subject: Re:  [4.0/4.1 Regression] gcc 4 produces worse
 x87 code on all platforms than gcc 3



> Making vectorization depend on a flag that says it is allowed to violate IEEE
> is therefore a killer for me (and most knowledgable fp guys).  This is ironic,
> since vectorization of sums (as in GEMM) is usually implemented as scalar
> expansion on the accumulators
>   
In the case of GCC, the vectorizer performs the transformation that Dorit
explained.  The result may not be IEEE-compliant if zeros are involved and
you expect to see a particular sign for the zero.
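
A minimal sketch of how extra accumulators can flip the sign of a zero.
The function names, the two-lane split and the +0.0 initialization of the
lanes are assumptions made for illustration; this is a hand-written model,
not a description of what GCC's vectorizer actually emits.

```c
#include <math.h>
#include <stddef.h>

/* Strict left-to-right sum, in the order the source code spells out. */
double sum_in_order(const double *a, size_t n, double init)
{
    double s = init;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical model of a 2-lane reduction: each lane starts at +0.0,
 * and the lanes are folded into the initial value at the end. */
double sum_two_lanes(const double *a, size_t n, double init)
{
    double acc0 = 0.0, acc1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += a[i];       /* (+0) + (-0) is +0 in round-to-nearest */
        acc1 += a[i + 1];
    }
    double s = init + (acc0 + acc1);
    for (; i < n; i++)      /* scalar tail for odd n */
        s += a[i];
    return s;
}
```

With init = -0.0 and an array holding only -0.0, sum_in_order() returns
-0.0 while sum_two_lanes() returns +0.0.  The two results compare equal
with ==, but signbit() tells them apart.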
> and this not only produces an IEEE-compliant answer
>   
The IEEE standard mandates particular rules for performing operations on 
infinities, NaNs, signed zeros, denormals, ...  The C standard, by 
mandating no reassociation, ensures that you don't mess with NaNs, 
infinities, and signed zeros.  As soon as you perform reassociation, 
there is *no way* you can be sure that you get IEEE-compliant math.

The sign of a zero can decide between an infinity and a NaN:

  +Inf + (1 / +0) = +Inf, but +Inf + (1 / -0) = +Inf + (-Inf) = NaN.

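
The same identity as a small C sketch, assuming IEEE 754 doubles and the
default non-trapping floating-point environment.  The helper name is
hypothetical.

```c
#include <math.h>

/* Returns +Inf + (1 / zero).  The volatile copy keeps the division at
 * run time so the compiler does not fold or warn about it. */
double inf_plus_reciprocal(double zero)
{
    volatile double z = zero;
    return INFINITY + 1.0 / z;
}
```

inf_plus_reciprocal(0.0) is an infinity and inf_plus_reciprocal(-0.0) is
a NaN, even though 0.0 == -0.0 holds.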
> but it is *more* accurate for almost all data.
http://citeseer.ist.psu.edu/589698.html is an example of a paper that 
shows FP code that avoids accuracy problems.  Any kind of reassociation 
will break that code, and lower its accuracy.  That's why reassociation 
is an "unsafe" math optimization.
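
As an illustration of code that depends on the exact order of operations,
here is a sketch of compensated (Kahan) summation.  It is a well-known
stand-in chosen for this example and not necessarily the algorithm from
the paper above.  It assumes arithmetic is evaluated in plain double
precision; x87 excess precision can change the naive result.

```c
#include <stddef.h>

/* Plain accumulation: small terms can be lost to rounding. */
double naive_sum(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Kahan summation: c carries the rounding error of each addition. */
double kahan_sum(const double *x, size_t n)
{
    double sum = 0.0;
    double c = 0.0;
    for (size_t i = 0; i < n; i++) {
        double y = x[i] - c;
        double t = sum + y;
        /* Algebraically (t - sum) - y is zero.  In floating point it is
         * the error of the addition above, so a compiler that
         * reassociates may simplify it to 0 and turn this into
         * naive_sum(). */
        c = (t - sum) - y;
        sum = t;
    }
    return sum;
}
```

For the input 1.0 followed by ten copies of 1e-16, naive_sum() returns
exactly 1.0 under plain double evaluation, because each 1e-16 is below
half an ulp of 1.0.  kahan_sum() returns a value greater than 1.0.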

If you want something like a -freassociate-fp flag, open an enhancement PR
and somebody might be more than happy to separate reassociation from the
other effects of -funsafe-math-optimizations.

(Independently of this, you should also open a separate PR for ATLAS
vectorization, because that would be neither a regression nor specific
to x87.) :-)

Paolo


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827

[Bug target/27827] [4.0/4.1 Regression] gcc 4 produces worse x87 code on all platforms than gcc 3
