Hello!

On Tue, Nov 30, 2010 at 12:07:05PM +0100, I wrote:
> On Fri, Nov 26, 2010 at 01:22:05AM +0100, I wrote:
> > Should refs have simply been initialized to zero (as the zero value is
> > noneffective, and we'll set the ss->thread, etc. values later on)?
>
> At the moment, I don't have the time to analyze this further, but I'll
> simply give this glibc code change a try, and re-run the GCC testsuite
> afterwards.

Hung again; at another position (understandably), but with the same
symptoms as before.  (I'm assuming that `fork' isn't linked statically
into some relevant binary, but as GDB shows the shared library's version,
I think I'm fine.)

Here is a program to make this same thing happen in 30 minutes instead of
the testsuite's one or two days.

    $ ./fork_forever
    [...]
    1817: 33
    1818: 37
    1819: 34
    1820: 35
    1821: 36
    1822: 38
    1823: 35
    1824: 36
    1825: 34
    1826: 37
    [hangs]

The GDB backtrace looks very much like the one on
<http://www.bddebian.com/~hurd-web/open_issues/fork_mach_port_mod_refs_ekern_urefs_owerflow/>.

Oh, and an interesting piece of maths: 1826 seconds * 35.5 forks per
second (roughly) = 65536.  So I guess that we're ``simply'' leaking
something with every fork call...  I'll try to find some time to go
hunting.

Regards,
 Thomas

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/wait.h>

#define CHILD_EXIT 42

int
main(int argc, char * argv[])
{
  pid_t child, pid;
  int status;
  time_t starttime, lasttime, nowtime;
  unsigned long int n_forks = 0;

  starttime = lasttime = time (NULL);
  while (1)
    {
      child = fork();
      if (child == -1)
        {
          perror("fork");
          exit(EXIT_FAILURE);
        }
      if (child == 0)
        _exit(CHILD_EXIT);

      pid = waitpid(child, &status, 0);
      if (pid == -1
          || !WIFEXITED(status)
          || WEXITSTATUS(status) != CHILD_EXIT)
        {
          perror("waitpid");
          exit(EXIT_FAILURE);
        }

      n_forks++;

      nowtime = time (NULL);
      if (lasttime != nowtime)
        {
          printf ("%u: %lu\n",
                  (unsigned int) (nowtime - starttime), n_forks);
          n_forks = 0;
          lasttime = nowtime;
        }
    }

  return 0;
}