[Bug target/30255] New: register spills in x87 unit need to be 80-bit, not 64

whaley at cs dot utsa dot edu Mon, 18 Dec 2006 12:08:42 -0800

Hi,

I am aware that gcc attempts to avoid any reordering of floating-point
operations by default, as this leads to slightly different answers on different
runs.  There appears to be a similar problem on the x87, where from my
assembly-diving, I believe I've established that when a register spill is
required, gcc only stores to the precision of the computation (e.g., 64 bits for
double precision).  On the x87 unit, this therefore introduces an unpredictable
(in the sense that the source does not have a store with its implicit round,
but the executable does) round operation in the middle of the computation. 
This unasked-for round operation has the exact same effect as reordering two fp
computations (e.g., it introduces an epsilon error).  This means that not only do
you have differing answers where you don't expect them, but theoretically, the
80-bit x87 could produce less accurate results than true 32 or 64-bit (though
this would almost never happen in practice, as it would require massive
spilling).
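
The extra round can be modeled in portable C without relying on what any
particular gcc version actually spills.  The sketch below is an illustration,
not gcc's generated code: it assumes long double is the 80-bit x87 format (as
on x86 Linux), uses a long double variable to stand in for an x87 register,
and uses a store to a volatile double to stand in for a 64-bit spill slot.  The
names round_to_double, sum3_no_spill and sum3_spill64 are made up for this
example.

```c
#include <float.h>

/* Hypothetical helper: store x to a 64-bit memory slot and reload it.
   The store rounds x to double precision, as a 64-bit spill would. */
static double round_to_double(long double x)
{
    volatile double slot = (double)x;
    return slot;
}

/* a + b + c with the partial sum held in extended precision throughout;
   the only round to double is the final store the source asks for. */
double sum3_no_spill(double a, double b, double c)
{
    long double acc = (long double)a + b;
    acc += c;
    return round_to_double(acc);
}

/* Same sum, but the partial result makes a round trip through a 64-bit
   slot in the middle, modeling a spill the source never asked for. */
double sum3_spill64(double a, double b, double c)
{
    long double acc = (long double)a + b;
    acc = round_to_double(acc);   /* the extra, unrequested round */
    acc += c;
    return round_to_double(acc);
}
```

With a = 1.0 and b = c = 0x1p-53 (half an ulp of 1.0 in double), the unspilled
sum carries 1 + 2^-53 exactly in the extended accumulator and ends at
1 + 2^-52, while the spilled version rounds 1 + 2^-53 back to 1.0 mid-way and
ends at 1.0.  Where long double is no wider than double, the two functions
agree.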


It came to my attention because a user of my ATLAS library noted that ATLAS
failed to produce a true symmetric matrix when C = A * transpose(A) was taken. 
If there are no reorderings, the lower triangle of C should exactly match the
upper triangle.  When using gcc 4.2.0 20060807 (experimental) a register spill
is introduced in the calculation of a 4x1 sub-block of C.  The spill only
affects C[0], and that element gets an additional round that other elements
do not, leading to a slightly non-symmetric matrix.
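
A much-reduced model of that symmetry failure, for illustration only: this is
not ATLAS code and not the 4x1 kernel described above.  It assumes long double
is wider than double (80-bit on x86 Linux), and it simulates the spill with an
explicit store to a volatile double.  The names store64, dot and
is_symmetric_2x2, and the spill_after parameter, are hypothetical.

```c
#include <float.h>

#define K 3  /* inner dimension of the toy problem */

/* Hypothetical stand-in for a 64-bit spill slot: store and reload. */
static double store64(long double x)
{
    volatile double slot = (double)x;
    return slot;
}

/* Dot product of two length-K rows, accumulated in extended precision.
   If spill_after is in [0, K), the accumulator is rounded to 64 bits
   right after that term; pass -1 for no simulated spill. */
double dot(const double *x, const double *y, int spill_after)
{
    long double acc = 0.0L;
    for (int k = 0; k < K; k++) {
        acc += (long double)x[k] * y[k];
        if (k == spill_after)
            acc = store64(acc);   /* simulated spill and reload */
    }
    return store64(acc);          /* the store the algorithm asks for */
}

/* For C = A * transpose(A) with a 2xK A, compare the two off-diagonal
   elements.  Only the upper one is computed with the simulated spill.
   Returns 1 if they match exactly, 0 otherwise. */
int is_symmetric_2x2(const double A[2][K], int spill_after)
{
    double c01 = dot(A[0], A[1], spill_after);
    double c10 = dot(A[1], A[0], -1);
    return c01 == c10;
}
```

For A = {{1.0, 0x1p-53, 0x1p-53}, {1.0, 1.0, 1.0}}, both elements are the same
sum of the same products in the same order, so they match when neither is
spilled.  Passing spill_after = 1 rounds 1 + 2^-53 down to 1.0 in one of them
only, and the two triangles no longer agree.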

Note that the problem is not stores in the algorithm causing rounding (which is
inevitable), but stores unpredictably introduced into the algorithm by gcc.

A complete fix for this problem is to always do 80-bit register spills for the
x87, regardless of the data type of the final calculation, and thus avoid the
unpredictable round steps.

In order to get the problem, you need code that has a spill, and that depends on
getting the same answer from one spilled and one unspilled redundant calculation.
I have a test case that does so for the above experimental gcc, but not for
gcc 4.1.1 20060525 (Red Hat 4.1.1-1), since this earlier one doesn't inject a
spill in the right place.  I have not tried on various other compiler versions,
because I figure this is a general policy, and if I have figured the problem
right, you can confirm easily how many bits you spill from the x87.

If you are interested in making the x87 produce the same answer in this case,
and it is helpful, I can certainly post my tester that demonstrates the
problem.  I don't want to go through the trouble if the answer is either
"confirmed, not going to be fixed", or "confirmed, see how it would cause the
error, will fix".

Let me know,
Clint


-- 
           Summary: register spills in x87 unit need to be 80-bit, not 64
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: whaley at cs dot utsa dot edu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30255
