On Wed, Aug 04, 2010 at 11:11:24PM +0200, Mark Kettenis wrote:
> > Date: Tue, 3 Aug 2010 21:16:46 -0700
> > From: patrick keshishian <sids...@boxsoft.com>
> >
> > On Tue, Aug 03, 2010 at 09:39:19PM -0400, Brad wrote:
> > > On Tuesday 03 August 2010 21:34:53 patrick keshishian wrote:
> > > > On Tue, Aug 03, 2010 at 11:05:16AM +0200, Mark Kettenis wrote:
> > > > > > X-Authentication-Warning: roppongi.boxsoft.com: sidster set sender
> > > > > > to sids...@boxsoft.com using -f
> > > > > > Date: Tue, 3 Aug 2010 01:48:21 -0700
> > > > > > From: patrick keshishian <sids...@boxsoft.com>
> > > > > >
> > > > > > Greetings,
> > > > > >
> > > > > > I'm moving this discussion [1] from misc@ over here as
> > > > > > suggested by a openbsd developer.
> > > > > >
> > > > > > Summary: Noticed with a few snapshots that Xorg kept
> > > > > > crashing. Per Matthieu Herrb I built xenocara with
> > > > > > debug to get better info, but that made Xorg not crash.
> > > > > > Next building firefox-3.6.8 from ports kept crashing.
> > > > > >
> > > > > > I wrote a small program that called dlopen(3)/dlsym(3)
> > > > > > and it would exit with status code 20. Same sources
> > > > > > compiled on a 4.7 i386 ran as expected. Matthieu pointed
> > > > > > out that the problem was identified and fixed in a recent
> > > > > > commit [2].
> > > > > >
> > > > > > I rebuild kernel and userland after an update (picking
> > > > > > aforementioned commit). This solved neither ffox's nor
> > > > > > my test app's issue.
> > > > >
> > > > > The fix is in the compiler. Since ld.so gets built before the
> > > > > compiler, you'll actually have to rebuild userland twice for it to
> > > > > pick up the fix.
> > > >
> > > > Rebuild of userland for the second time results in the same
> > > > behavior: Random successes amongst many failures (=status 20
> > > > exits).
> > > >
> > > > Anything else I should try?
> > > > --patrick
> > >
> > > Try upgrading via a snapshot and go from there.
> >
> > Same result.
>
> Well, all regression tests in the tree succeed. Can you show us your
> test application that reveals the problem?

Sorry for not getting back to this earlier; was out of reach
of a computer.

This last one is my fuckup. My test app works with the snapshot
from the 3rd. I'm rebuilding firefox ATM and should know for
sure in 8 or so hours.

--patrick

For reference.

// so_loader.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#include <dlfcn.h>
#include <err.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int (*foo)(char const *);

#define FOOSTR	"This is a dlopen test program"
#define SOFILE	"./so_cfncs.so"

int
main(int argc, char *argv[])
{
	void	*dh;

	dh = dlopen(SOFILE, RTLD_NOW);
	fprintf(stderr, "dlopen(%s, RTLD_NOW) ret:%p\n", SOFILE, dh);
	if (NULL == dh)
		errx(1, "dlopen(%s): %s", SOFILE, dlerror());

	foo = dlsym(dh, "foo_strlen");
	if (NULL == foo) {
		fprintf(stderr, "Failed looking up foo_strlen\n");
		goto out;
	}

	printf("foo_strlen(%s) ret:%d\n", FOOSTR, foo(FOOSTR));

out:
	dlclose(dh);
	exit(0);
}

// so_cfncs.c
#include <sys/types.h>
#include <string.h>

int
foo_strlen(char const *s)
{
	return strlen(s);
}
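
Since the failure described earlier in the thread was intermittent ("random successes amongst many failures"), one way to quantify it is to run the test binary many times and tally the exit statuses. The following is only a sketch and not part of the original test case: the names run_once(), tally_exits() and ABNORMAL are made up for illustration, and it assumes a POSIX system with fork(2), execv(3) and waitpid(2).

```c
/* exit_tally.c: hypothetical helper, not part of the original test case */
#include <sys/types.h>
#include <sys/wait.h>

#include <unistd.h>

#define ABNORMAL	256	/* slot for runs that did not exit normally */

/*
 * Run argv[0] once with the given argument vector.  Returns the exit
 * status (0-255), or -1 if fork/waitpid failed or the child was
 * terminated by a signal.
 */
int
run_once(char *const argv[])
{
	pid_t	pid;
	int	status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execv(argv[0], argv);
		_exit(127);	/* only reached if execv failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
 * Run the program `runs` times and count each exit status.  The caller
 * supplies counts[] with ABNORMAL + 1 slots, already zeroed.
 */
void
tally_exits(char *const argv[], int runs, int counts[])
{
	int	i, rc;

	for (i = 0; i < runs; i++) {
		rc = run_once(argv);
		counts[rc < 0 ? ABNORMAL : rc]++;
	}
}
```

A small main() could pass an argument vector such as { "./so_loader", NULL } (assuming that is the name the loader above was built under) and then compare counts[0] against counts[20] to see how often the status 20 exit shows up.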