------- Comment #61 from paolo dot bonzini at lu dot unisi dot ch 2006-08-10 14:28 -------
Subject: Re: [4.0/4.1 Regression] gcc 4 produces worse x87 code on all platforms than gcc 3
> Making vectorization depend on a flag that says it is allowed to violate IEEE
> is therefore a killer for me (and most knowledgable fp guys). This is ironic,
> since vectorization of sums (as in GEMM) is usually implemented as scalar
> expansion on the accumulators

In case of GCC, it performs the transformation that Dorit explained. It may not produce an IEEE-compliant answer if there are zeros and you expect to see a particular sign for the zero.

> and this not only produces an IEEE-compliant answer

The IEEE standard mandates particular rules for performing operations on infinities, NaNs, signed zeros, denormals, ... The C standard, by mandating no reassociation, ensures that you don't mess with NaNs, infinities, and signed zeros. As soon as you perform reassociation, there is *no way* you can be sure that you get IEEE-compliant math.

  +Inf + (1 / +0) = Inf
  +Inf + (1 / -0) = NaN

> but it is *more* accurate for almost all data.

http://citeseer.ist.psu.edu/589698.html is an example of a paper that shows FP code that avoids accuracy problems. Any kind of reassociation will break that code, and lower its accuracy. That's why reassociation is an "unsafe" math optimization.

If you want a -freassociate-fp math, open an enhancement PR and somebody might be more than happy to separate reassociation from the other effects of -funsafe-math-optimizations.

(Independent of this, you should also open a separate PR for ATLAS vectorization, because that would not be a regression and would not be on x87) :-)

Paolo

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827
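
The signed-zero and Inf/NaN points in this comment can be illustrated with a small C sketch. It is not GCC's actual vectorizer output: sum_two_lanes is a hypothetical, hand-written stand-in for a reduction split across two accumulators, and all function names here are made up for the example. It assumes IEEE 754 doubles with round-to-nearest and a build without -ffast-math or -funsafe-math-optimizations.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Strict source-order sum: ((init + a[0]) + a[1]) + ...
 * This is the association the C standard requires. */
double sum_in_order(double init, const double *a, int n)
{
    assert(n == 0 || a != NULL);
    double s = init;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical reassociated sum: two independent accumulators that are
 * combined at the end, roughly what scalar expansion of the accumulator
 * looks like.  The second lane has to start from +0.0. */
double sum_two_lanes(double init, const double *a, int n)
{
    assert(n == 0 || a != NULL);
    double lane0 = init;
    double lane1 = 0.0;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        lane0 += a[i];
        lane1 += a[i + 1];
    }
    if (i < n)              /* leftover element when n is odd */
        lane0 += a[i];
    return lane0 + lane1;
}

/* Prints both sums and the value of +Inf + 1/sum for an all -0.0 input. */
void show_signed_zero_case(void)
{
    const double a[4] = { -0.0, -0.0, -0.0, -0.0 };
    double strict = sum_in_order(-0.0, a, 4);   /* -0 + -0 stays -0 */
    double lanes  = sum_two_lanes(-0.0, a, 4);  /* +0 + -0 gives +0 */

    printf("in order : sign bit %d, Inf + 1/sum = %g\n",
           signbit(strict) != 0, INFINITY + 1.0 / strict);
    printf("two lanes: sign bit %d, Inf + 1/sum = %g\n",
           signbit(lanes) != 0, INFINITY + 1.0 / lanes);
}
```

Calling show_signed_zero_case() from main shows the two sums differing only in the sign of zero, which is then enough to turn +Inf + (1 / sum) from NaN into Inf. With other inputs the two functions may agree, so this shows only that reassociation can change the result, not how often it does.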