I'm working with a rather large multi-threaded application that, after running for several hours, will often raise a floating point exception (specifically a divide-by-zero exception). I've managed to catch the signal and use some home-grown code to examine the current state of the faulting thread as well as all of the others, but sadly this debug output has not revealed anything useful. It appears as though the crash is happening in the pthread library when it tries to execute a floating point store instruction (0x8114 listed below).
00008108 <__pthread_timedsuspend_new>:
    8108: 7d 80 00 26   mfcr r12
    810c: 94 21 fc 90   stwu r1,-880(r1)
    8110: 7c 08 02 a6   mflr r0
    8114: d9 c1 02 e0   stfd f14,736(r1)   <--- SIGFPE happens here

After looking over the 32-bit PPC Programming Environments Manual, it seems to me that a floating point operation that results in +/- infinity will disable the FPU and record the cause in the FPSCR. The next time a floating point instruction is attempted, we get an exception because the FPU is disabled, and that is when the SIGFPE is actually raised. If so, the faulting address above says nothing about where the bad division really happened.

Unfortunately this tells me that I have to start digging through the source code to find the programming error, but I'm hoping I'm wrong about this. Any advice or tips?

Thanks in advance,
-andy
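For context, a SIGFPE handler of the kind described above might look roughly like the sketch below. This is not the actual home-grown code from the application. The names fpe_code_name, format_hex, on_sigfpe and install_sigfpe_handler are hypothetical, and the sketch assumes a POSIX system where sigaction with SA_SIGINFO is available. It only reports the si_code and si_addr fields that come with the signal; it makes no attempt to dump other threads.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: translate the si_code that accompanies SIGFPE
 * into a readable label. */
const char *fpe_code_name(int code)
{
    switch (code) {
    case FPE_INTDIV: return "FPE_INTDIV: integer divide by zero";
    case FPE_INTOVF: return "FPE_INTOVF: integer overflow";
    case FPE_FLTDIV: return "FPE_FLTDIV: floating-point divide by zero";
    case FPE_FLTOVF: return "FPE_FLTOVF: floating-point overflow";
    case FPE_FLTUND: return "FPE_FLTUND: floating-point underflow";
    case FPE_FLTRES: return "FPE_FLTRES: floating-point inexact result";
    case FPE_FLTINV: return "FPE_FLTINV: invalid floating-point operation";
    case FPE_FLTSUB: return "FPE_FLTSUB: subscript out of range";
    default:         return "unknown si_code";
    }
}

/* Hypothetical helper: write value as "0x..." into out and return the
 * length. Avoids printf, which is not async-signal-safe. out must hold
 * at least 2 * sizeof(uintptr_t) + 3 bytes. */
size_t format_hex(uintptr_t value, char *out)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof value];
    size_t n = 0, len = 0;

    do {                              /* collect digits, lowest first */
        tmp[n++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);

    out[len++] = '0';
    out[len++] = 'x';
    while (n > 0)                     /* emit digits, highest first */
        out[len++] = tmp[--n];
    out[len] = '\0';
    return len;
}

/* Write to stderr; there is nothing useful to do if the write fails. */
static void emit(const char *s, size_t n)
{
    if (write(STDERR_FILENO, s, n) < 0) {
        /* ignored on purpose */
    }
}

/* Report why and where SIGFPE was raised, then exit. */
static void on_sigfpe(int sig, siginfo_t *info, void *context)
{
    char addr[2 * sizeof(uintptr_t) + 3];
    const char *name = fpe_code_name(info->si_code);
    size_t len = format_hex((uintptr_t)info->si_addr, addr);

    (void)sig;
    (void)context;
    emit("SIGFPE ", 7);
    emit(name, strlen(name));
    emit(" at ", 4);
    emit(addr, len);
    emit("\n", 1);
    _exit(1);
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigfpe_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_sigfpe;
    sa.sa_flags = SA_SIGINFO;
    return sigaction(SIGFPE, &sa, NULL);
}
```

Calling install_sigfpe_handler() once early in main is enough, since signal dispositions are shared by all threads in a process. The si_code value at least distinguishes an integer divide by zero from a floating point one. Whether si_addr points at the instruction that actually divided by zero, or only at a later floating point instruction such as the stfd above, depends on how the platform reports the exception, and this sketch cannot settle that.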
