------- Comment #58 from whaley at cs dot utsa dot edu 2006-08-09 23:01 ------- Andrew,
> Except for the fact IEEE compliant fp does not allow for reordering at all
> except in some small cases. For an example is (a + b) + (-a) is not the same
> as (a + (-a)) + b, so reordering will invalid IEEE fp for larger a and small b.
> Yes maybe we should split out the option for unsafe math fp op for reordering
> but that is different issue.

Thanks for the response, but I believe you are conflating two issues (as is this flag, which is why this is bad news). Getting different answers to the question "what is this sum" does not ruin IEEE compliance. I am referring to IEEE 754, which is a standard set of rules for storage and arithmetic for floating point (fp) on modern hardware. I am unaware of there being any rules on compilation, i.e. whether reorderings are allowed is beyond the standard. Rather, it is a set of rules that specifies, for floating point operations (flops), how rounding must be done, how overflow/underflow must be handled, etc. Perhaps there is another IEEE standard concerning compilation that you are referring to?

Now of course, floating point arithmetic in general (and IEEE-compliant fp in particular) is not associative, so indeed (a+b+c) need not equal (c+b+a). However, both sequences are valid answers to "what are these 3 things summed up", and both are IEEE compliant if each addition is compliant.

What non-IEEE means is that the individual flops are no longer IEEE compliant. This means that overflow may not be handled, or exceptional conditions may cause unknown results (e.g., divide by zero), and indeed we have no way at all of knowing what an fp add even means. An example of a non-IEEE optimization is using 3DNow! vectorization, because 3DNow! does not follow the IEEE standard (for instance, it handles overflow only by saturation, which violates the standard). SSE (unless you turn IEEE compliance off manually) is IEEE compliant, and this is why you see computational guys like myself using it, and not using 3DNow!.
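To make the distinction concrete, here is a small sketch in Python (whose floats are IEEE doubles on the usual platforms). The values of a and b and the helper correctly_rounded_add are only my illustration, not anything taken from gcc or from the standard. It checks each individual addition against exact rational arithmetic, on the assumption that CPython's float() of a Fraction rounds to nearest:

```python
from fractions import Fraction


def correctly_rounded_add(x: float, y: float) -> bool:
    """True if x + y equals the exact sum rounded to the nearest double."""
    exact = Fraction(x) + Fraction(y)  # exact rational arithmetic, no rounding
    return float(exact) == x + y       # assumed: float() rounds to nearest


# Illustrative values only: a large a and a small b.
a, b = 1e20, 1.0

left = (a + b) + (-a)    # b is absorbed when it is added to a
right = (a + (-a)) + b   # a cancels first, so b survives

# Every individual addition used by the two orderings.
steps = [(a, b), (a + b, -a), (a, -a), (a + (-a), b)]
all_rounded = all(correctly_rounded_add(x, y) for x, y in steps)

print(left, right, all_rounded)
```

The two orderings give 0.0 and 1.0 respectively, yet every single add in both is the correctly rounded result, which is the sense in which both answers are IEEE compliant.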
To a computational scientist, non-IEEE is catastrophic, and "may change the answer" is not. "May change the answer" in this case simply means that I've got a different ordering, which is also a valid IEEE fp answer, and indeed may be a "better" answer than the original ordering (depending on the data; no way to know this w/o looking at the data). Non-IEEE means that I have no way of knowing what kind of rounding was done, how the flop was done, if underflow (or gradual underflow!) occurred, etc. It is for this reason that optimizations which are non-IEEE are a killer for computational scientists, and reorders are no big deal. In the first case you have no idea what has happened with the data, and in the second you have an IEEE-compliant answer, which has known properties.

It has been my experience that most compiler people (and I have some experience there, as I got my PhD in compilation) are more concerned with integer work, and thus are not experts on fp computation. I've done fp computational work for the majority of my research for the last decade, so I thought I might be able to provide useful input to bridge the camps, so to speak. In this case, I think that by lumping "cause different IEEE-compliant answers" in with "use non-IEEE arithmetic" you are preventing all serious fp users from utilizing the optimizations. Since vectorization is of great importance on modern machines, this is bad news.

Obviously, I may be wrong in what I say, but if reordering makes something non-IEEE I'm going to have some students mad at me for teaching them the wrong stuff :)

Has this made my point any clearer, or do you still think I am wrong? If I'm wrong, maybe you can point to the part of the IEEE standard that discusses orderings violating the standard (as opposed to the well-known fact that all implemented fp arithmetic is non-associative)?
After you do this, I'll have to dig up my copy of the thing, which I don't think I've seen in the last 2 years (but I did scope out some of the books that cover it, and didn't find anything about compilation).

Thanks,
Clint

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827