On Sat, Nov 23, 2013 at 5:48 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>> Thanks for looking at this. I am a real newcomer to 387, and it took me a
>> long time perusing the Intel doc, as well as glibc sources, to come up with
>> that. The "reference" implementation of these FPU functions, the one I am
>> confident in, is that in config/fpu-glibc.h: i.e., the functions in
>> config/fpu-i387.h should have the same effect as config/fpu-glibc.h on
>> i386/x86_64 hardware.
>>
>> I’ll reply to your comments, but in some cases I was not sure exactly what
>> you were saying… thanks for your help, and patience!
>>
>>
>>> @@ -136,16 +165,54 @@ set_fpu (void)
>>>        __asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (cw_sse));
>>>
>>>        /* The SSE exception masks are shifted by 7 bits. */
>>> -      cw_sse |= _FPU_MASK_ALL << 7;
>>> -      cw_sse &= ~(excepts << 7);
>>> -
>>> -      /* Clear stalled exception flags. */
>>> -      cw_sse &= ~_FPU_EX_ALL;
>>>
>>> You have to clear stalled SSE exceptions here. Their flags are in LSB
>>> bits, so their position is different than the position of exception
>>> mask bits in the control word.
>>
>> So, if I get you right, I should restore the "cw_sse &= ~_FPU_EX_ALL", which
>> I had mistakenly removed.
>> But I’m looking at glibc-2.18/sysdeps/x86_64/fpu/feenablxcpt.c and
>> fedisblxcpt.c, and it doesn’t seem to be done there.
>
> The idea was that since control word is changed, status word should be
> cleared. But since stalled flags won't raise an exception, it actually
> doesn't matter, although it looks nicer in a debugger. However, if you
> remove SSE clear, you should also remove fnclex from x87 code.

Actually, I was wrong. You have to clear stalled flags.
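For illustration, the MXCSR bit layout referred to in the quoted hunk (flags in the six low bits, mask bits shifted left by 7) can be sketched as plain bit manipulation. This is only a sketch, not the libgfortran code: the names sketch_enable_traps, SKETCH_EX_ALL and SKETCH_MASK_SHIFT are made up for this example, and the function works on an ordinary integer instead of reading or writing the real register.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only.  */
#define SKETCH_EX_ALL      0x3fu  /* six exception flag bits, bits 0..5 */
#define SKETCH_MASK_SHIFT  7      /* mask bits sit at bits 7..12 */

/* Return the MXCSR value that traps on EXCEPTS only, given the current
   MXCSR value.  Rounding control and the other bits are left untouched.  */
uint32_t
sketch_enable_traps (uint32_t mxcsr, uint32_t excepts)
{
  excepts &= SKETCH_EX_ALL;

  /* Mask every exception, then unmask the requested ones.  */
  mxcsr |= SKETCH_EX_ALL << SKETCH_MASK_SHIFT;
  mxcsr &= ~(excepts << SKETCH_MASK_SHIFT);

  /* Clear the stalled exception flags in the low bits.  */
  mxcsr &= ~SKETCH_EX_ALL;

  return mxcsr;
}
```

With the power-on default 0x1f80 plus a stalled divide-by-zero flag (0x1f84), asking for the divide-by-zero trap (0x04) gives 0x1d80: mask bit 9 is cleared and the flag in bit 2 is gone.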
Please consider the following test:

--cut here--
#include <fenv.h>

int
main (void)
{
  unsigned short cw;

  /* Raise FE_DIVBYZERO */
  volatile float d = 0.0f;
  volatile float r = 1.0f / d;

#if GLIBC
  feenableexcept (FE_DIVBYZERO);
#else
  __asm__ __volatile__ ("fstcw\t%0" : "=m" (cw));
  cw |= FE_ALL_EXCEPT;
  cw &= ~FE_DIVBYZERO;
  __asm__ __volatile__ ("fnclex\n\tfldcw\t%0" : : "m" (cw));
#endif

  /* Raise FE_INEXACT */
  d = 3.0f;
  r = 1.0f / d;

  return 0;
}
--cut here--

The test (compiled with -m32 or -mfpmath=387 due to x87 assembly, and
linked against -lm) will generate an erroneous exception when -DGLIBC is
added to the compile flags. So, it looks to me that glibc has a bug here.

Oh, and contrary to claims in glibc sources, the above test raises
only the FE_INEXACT exception.

I have added Joseph to Cc due to glibc issues.

Uros.