Hey folks. I'm having some trouble with a process compiled with gcc 3.3.6. This code is pretty complex and has several features that are not typically in use because they involve non-production test cases.
The problem is that I'm getting core dumps (SEGV) that appear to come from this code when I know it shouldn't be in the execution path. The code in question is switched on only by a command line argument, and the process is managed by a parent process that monitors and manages its execution, reporting crashes and restarting it if necessary.

Here's my environment: gcc 3.3.6 built on SunOS 5.8 sun4u sparc SUNW,Ultra-60; the app is built on the same platform and executed on SunOS 5.8 sun4u sparc SUNW,UltraSPARC-IIi-cEngine. The entire codebase is written in C, and is compiled as follows:

    /usr/local/gcc-3.3.6/bin/gcc -ggdb -g3 -Wall -D_REENTRANT -Wno-multichar -Wno-unused-function -D_SOLARIS -DUSE_DEV_POLL -mcpu=ultrasparc -O2 -DTIMING=1 -DDB_TIMING=1 -Icommon/include -I/opt/oracle/8.1.7/include -I/opt/oracle/8.1.7/rdbms/public -c -o store.o store.c

These problems have popped up time and again over the last 6 years, going as far back as gcc 2.95, but gdb has never been able to tell me any more than where the problem came from (the Solaris pstack utility always agrees with gdb). The crashes only show up over longer execution times, and only after some thousands or even millions of transactions. The application is supposed to provide 99.97% availability, so having this happen 12 times over the course of a weekend is a bit concerning. Sometimes a build will prove wonderfully stable, but then a very small code change made to tweak some behavior will completely destabilize it.

Recently, I added a handler to catch segfaults and bus errors to try to extract more info through the ucontext interface. I am able to get a little explicit detail, but not much new information. The problem with this approach is that it doesn't preserve the originating stack as well.

At this point, I'm at a loss as to where to start. This is a pretty important codebase (to my employer, anyway) and the frequency of these inexplicable problems is starting to cause some concern. Any suggestions as to where to go next?
If I've forgotten any potentially useful information please don't hesitate to request it. Please CC me directly, as I am not on the dev list.

Thanks for your time.
Lou
-- 
Louis LeBlanc [EMAIL PROTECTED]
Fully Funded Hobbyist, KeySlapper Extrordinaire :þ
http://www.keyslapper.net Ô¿Ô¬
Key fingerprint = C5E7 4762 F071 CE3B ED51 4FB8 AF85 A2FE 80C8 D9A2

Flugg's Law: When you need to knock on wood is when you realize that the world is composed of vinyl, naugahyde and aluminum.