I originally posted this on the java list, but it was suggested that I post it here to see if somebody can help.
My previous related posts to the java list (I was originally thinking I had two separate problems):

http://gcc.gnu.org/ml/java/2005-11/msg00230.html
http://gcc.gnu.org/ml/java/2005-11/msg00229.html

I wonder if someone reasonably familiar with the unwinder can have a look at the hacks documented below, and tell me whether they indicate a bug, or conversely whether they provide a clue as to why the interpreter and static-built executables aren't working for me, even though other people seem to have no problem. Thanks.

I now have a simple program like the following one working as static binaries with recent gcc (this program crashed without my hacks).

  public class Nothing {
    public static void main (String[] args) { }
  }

More complex programs also now work, including AWT with xlib, and an example that throws and catches an Exception (demonstrating, I think, that the unwinder sometimes works in my environment).

The GIJ interpreter (which is of course NOT statically linked) still aborts, and since I don't get any symbols in its backtrace, I'm not sure how to pursue that.

My programs work fine when built as dynamic executables even without the hacks. The interpreter works fine with the SJLJ unwinder (without the hacks), but static executables don't.
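For reference, a throw-and-catch test of the kind mentioned above can be as small as the sketch below. The class name ThrowCatch, the method names, and the message string are assumptions for illustration, not the original test program. The exception is thrown from a separate method so that the unwinder has to cross at least one frame before it reaches the handler.

```java
// Hypothetical stand-in for the throw-and-catch test; not the original program.
public class ThrowCatch {
    // Throw from a nested call so unwinding has to cross a frame boundary.
    static void thrower() throws Exception {
        throw new Exception("test");
    }

    // Returns the message of the caught exception, or null if nothing was thrown.
    static String run() {
        try {
            thrower();
            return null;
        } catch (Exception e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println("caught: " + run());
    }
}
```

If the unwinder works, run() returns the message instead of the program aborting in _Jv_Throw. A static build would presumably be something like "gcj --main=ThrowCatch -static -o throwcatch ThrowCatch.java"; that command line is an assumption, since the exact build commands are not shown in this post.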
I'm using this software:

  gij (GNU libgcj) version 4.1.0 20051116 (experimental)
    (revision 107090 from SVN, checked out 2005-11-16)
  gcj (GCC) 4.1.0 20051116 (experimental) (same sources)
  Linux version 2.6.8.1-10mdk (cpu is Pentium 4)
  binutils-2.15.90.0.3-1mdk
  glibc-2.3.3-21mdk

My original gcc configuration, which works fine with gcc-4_0-branch from July:

  ../gcc/configure --prefix=/var/local/gcc/tip_20051115 \
    --mandir=/var/local/gcc/man --infodir=/var/local/gcc/info \
    --enable-shared --enable-threads=posix --disable-checking \
    --host=i386-redhat-linux --enable-java-awt=xlib --enable-libgcj \
    --enable-languages=c,c++,java --with-system-zlib \
    --enable-__cxa_atexit

For my SJLJ unwinder tests, I added --enable-sjlj-exceptions

Here are the hacks I did to get the static builds working. Do they offer any clues?

HACK #1:

The first hack addresses this crash:

#9020 <signal handler called>
#9021 0x080921db in _Unwind_IteratePhdrCallback (info=0xbffff040, size=32, ptr=0xbffff094) at ../../gcc/gcc/unwind-dw2-fde-glibc.c:262

Index: unwind-dw2-fde-glibc.c
===================================================================
--- unwind-dw2-fde-glibc.c (revision 107090)
+++ unwind-dw2-fde-glibc.c (working copy)
@@ -257,7 +257,7 @@ _Unwind_IteratePhdrCallback (struct dl_p
   if (size >= sizeof (struct ext_dl_phdr_info))
     {
-      if (last_cache_entry != NULL)
+      if (prev_cache_entry != NULL && last_cache_entry != NULL)
         {
           prev_cache_entry->link = last_cache_entry->link;
           last_cache_entry->link = frame_hdr_cache_head;
[end of hack diff]

So prev_cache_entry is obviously null sometimes, presumably because this thing in the same file "found an unused entry":

  last_cache_entry = cache_entry;
  /* Exit early if we found an unused entry.  */
  if ((cache_entry->pc_low | cache_entry->pc_high) == 0)
    break;
  if (cache_entry->link != NULL)
    prev_cache_entry = cache_entry;

Looking at the changes to unwind-dw2-fde-glibc.c, I see that the parts of the code I've shown here were structured differently in the 4.0 branch (which works just fine for me with static builds). Maybe that's a clue.

HACK #2:

The first hack got the unwinder to return to _Jv_Throw rather than crashing, so I was able to get an error code: _URC_END_OF_STACK. That leads to an abort in _Jv_Throw. I did another backtrace (that's about all I can do - gdb hangs up running these executables for some reason, so I'm stuck with core dumps).

#0  0x08325081 in kill ()
#1  0x0830d1aa in __pthread_raise ()
#2  0x08325368 in abort ()
#3  0x0809dc9d in _Jv_Throw (value=0x4d828) at ../../../gcc/libjava/exception.cc:114
#4  0x08093537 in catch_segv (_dummy=Could not find the frame base for "catch_segv".
) at ../../../gcc/libjava/prims.cc:152
#5  <signal handler called>
#6  0x080b6a65 in _Jv_FreeMethodCache () at ../../../gcc/libjava/java/lang/natClass.cc:941
#7  0x080bd279 in java::lang::Thread::finish_ (this=0x61f18) at ../../../gcc/libjava/java/lang/natThread.cc:219
#8  0x080943a2 in _Jv_RunMain (vm_args=0x0, klass=0x851d660, name=0x0, argc=1, argv=0xbffff8a4, is_jar=false) at ../../../gcc/libjava/prims.cc:1386
#9  0x080944f8 in _Jv_RunMain (klass=0x851d660, name=0x0, argc=1, argv=0xbffff8a4, is_jar=false) at ../../../gcc/libjava/prims.cc:1397
#10 0x0809452b in JvRunMain (klass=0x851d660, argc=1, argv=0xbffff8a4) at ../../../gcc/libjava/prims.cc:1403
#11 0x08048235 in main ()

Strange: natClass.cc:941 is just "if (method_cache != NULL)", which shouldn't be able to cause a segv. I'm guessing the error is actually in _Jv_Free. So I just commented out all the innards of the _Jv_FreeMethodCache function, and now my simple test cases work.
Like so:

  void
  _Jv_FreeMethodCache ()
  {/*
  #ifdef HAVE_TLS
    if (method_cache != NULL)
      {
        _Jv_Free(method_cache);
        method_cache = NULL;
      }
  #endif // HAVE_TLS
  */}

I'm sure this function is important, and removing it is a bad idea, but it prevents the abort in the static build case. That let me confirm that no more segv's happen after this one - the programs now terminate normally.

Again, I'm wondering whether this is a clue as to what's going on. Any insights would be greatly appreciated.