[Bug target/27827] [4.0/4.1 Regression] gcc 4 produces worse x87 code on all platforms than gcc 3

whaley at cs dot utsa dot edu Thu, 10 Aug 2006 07:08:37 -0700


------- Comment #60 from whaley at cs dot utsa dot edu  2006-08-10 14:08 -------
Paolo,


Thanks for the explanation of what -funsafe-math-optimizations is presently
doing.

>You are also confusing -funsafe-math-optimizations with -ffast-math.

No, what I'm doing is reading the man page (the closest thing to a contract
between gcc and me on what it is doing with my code):
|      -funsafe-math-optimizations
|          Allow optimizations for floating-point arithmetic that (a) assume
|          that arguments and results are valid and (b) may violate IEEE or
|          ANSI standards.

As a library provider, I *must* be able to reassure my users that I have done
nothing to violate the IEEE fp standard, and the (b) in this statement prevents
me from using this flag.  (Don't get me wrong: hardware violates the standard
in plenty of ways, but typically in ways the scientists on those platforms
understand well, and in the less important parts of the standard.)  I can't
even use it after verifying that no optimization has hurt the present code,
because an optimization that violates IEEE could be added at a later date, or
could kick in on a system that I'm not testing on (e.g., on some systems, the
flag could cause 3DNow! vectorization).

>Rules are determined by the language standards.  I believe that C
>mandates no reassociation; Fortran allows reassociation unless explicit
>parentheses are present in the source, but this is not (yet) implemented
>by GCC.

Precisely my point.  There are *lots* of C rules that a fp guy couldn't give a
crap about (for certain types of fp kernels), but IEEE is pretty much
inviolate.  Since this flag conflates language violations (don't care) with
IEEE violations (catastrophic), I can't use it.  I cannot stress enough just
how important IEEE is: it is the only contract that tells us what it means to
do a flop, and gives us any way of understanding what our answer will be.
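
To make the reassociation point concrete, here is a minimal sketch in C (the
function names and test values are made up for illustration, not taken from any
library).  Each individual add below is a correctly rounded IEEE operation; the
two functions differ only in the order of the adds, which is exactly what the C
rule quoted above forbids the compiler from changing on its own:

```c
/* Hypothetical helpers: the same three values, summed in two orders.
 * The volatile temporary forces the intermediate result to be rounded
 * to double, so x87 extended precision cannot hide the effect. */

/* (a + b) + c, i.e. the order written in the source. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* a + (b + c), i.e. the reassociated order. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e30, b = -1e30, c = 1.0, sum_left returns 1.0 and sum_right returns
0.0, since the 1.0 is absorbed when it is added to -1e30 first.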

Making vectorization depend on a flag that says it is allowed to violate IEEE
is therefore a killer for me (and most knowledgeable fp guys).  This is ironic,
since vectorization of sums (as in GEMM) is usually implemented as scalar
expansion on the accumulators, and this not only produces an IEEE-compliant
answer, but one that is *more* accurate for almost all data.
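
For anyone not familiar with the term, scalar expansion on the accumulators
looks roughly like the sketch below.  This is only an illustration: the
function names, the dot-product kernel, and the choice of four accumulators are
assumptions made for the example, not code from ATLAS or gcc.  Every multiply
and add is still an ordinary IEEE operation; only the order in which the
partial sums are combined differs from the single-accumulator loop:

```c
#include <stddef.h>

/* Single accumulator: every add depends on the previous one, so the
 * loop cannot be spread across vector lanes without reordering. */
double dot_scalar(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Scalar expansion: four independent partial sums (one per hypothetical
 * vector lane), combined once after the loop. */
double dot_expanded(const double *x, const double *y, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    /* Leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += x[i] * y[i];

    return (s0 + s1) + (s2 + s3);
}
```

Each partial sum accumulates roughly a quarter of the terms, which is the
intuition behind the accuracy claim: the running totals stay smaller relative
to the terms being added than a single long chain of adds would.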

Thanks,
Clint


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827

[Bug target/27827] [4.0/4.1 Regression] gcc 4 produces worse x87 code on all platforms than gcc 3
