Hello!

On Tue, Nov 30, 2010 at 12:07:05PM +0100, I wrote:
> On Fri, Nov 26, 2010 at 01:22:05AM +0100, I wrote:
> > Should refs have simply been initialized to zero (as the zero value is
> > noneffective, and we'll set the ss->thread, etc. values later on)?
>
> At the moment, I don't have the time to analyze this further, but I'll
> simply give this glibc code change a try, and re-run the GCC testsuite
> afterwards.

Hung again; at another position (understandably), but with the same
symptoms as before. (I'm assuming that `fork' isn't linked into some
relevant binary statically, but as GDB shows the shared library's
version, I think I'm fine.)
Here is a program to make this same thing happen in 30 minutes instead of
the testsuite's one or two days.
$ ./fork_forever
[...]
1817: 33
1818: 37
1819: 34
1820: 35
1821: 36
1822: 38
1823: 35
1824: 36
1825: 34
1826: 37
[hangs]
The GDB backtrace looks very much like the one on
<http://www.bddebian.com/~hurd-web/open_issues/fork_mach_port_mod_refs_ekern_urefs_owerflow/>.
Oh, and an interesting piece of maths: 1826 * 35.5 (roughly) = 65536. So I
guess that we're ``simply'' leaking something with every fork call...
I'll try to find some time to go hunting.
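
If the 65536 guess is right, the hang should show up at a fixed cumulative
fork count rather than after a fixed time. Here is a rough sketch of a
helper that keeps a running total instead of a per-second rate. The name
count_forks and the 1024-fork reporting interval are made up for
illustration, and this is only an assumption about how one might check
the guess, not something I have run on the Hurd.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: fork and reap up to LIMIT children, one at a
   time, and return how many fork/waitpid rounds completed.  */
static unsigned long
count_forks (unsigned long limit)
{
  unsigned long total;

  for (total = 0; total < limit; total++)
    {
      int status;
      pid_t child = fork ();

      if (child == -1)
        {
          perror ("fork");
          break;
        }
      if (child == 0)
        _exit (0);              /* Child: leave at once, no stdio flush.  */

      if (waitpid (child, &status, 0) == -1)
        {
          perror ("waitpid");
          break;
        }

      /* Report and flush the running total every 1024 forks, so the
         last line printed before a hang brackets the failing count.  */
      if ((total + 1) % 1024 == 0)
        {
          printf ("%lu forks so far\n", total + 1);
          fflush (stdout);
        }
    }

  return total;
}
```

Calling count_forks (70000) from main and watching where the output stops
would show whether the hang lands just short of 65536 forks.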
Regards,
Thomas
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/wait.h>

/* Exit status the child reports, so that the parent can verify it.  */
#define CHILD_EXIT 42

int
main(int argc, char * argv[])
{
  pid_t child, pid;
  int status;
  time_t starttime, lasttime, nowtime;
  unsigned long int n_forks = 0;

  starttime = lasttime = time (NULL);
  while (1)
    {
      child = fork();
      if (child == -1)
        {
          perror("fork");
          exit(EXIT_FAILURE);
        }
      if (child == 0)
        _exit(CHILD_EXIT);

      pid = waitpid(child, &status, 0);
      if (pid == -1)
        {
          perror("waitpid");
          exit(EXIT_FAILURE);
        }
      /* errno is not meaningful here, so don't use perror.  */
      if (!WIFEXITED(status)
          || WEXITSTATUS(status) != CHILD_EXIT)
        {
          fprintf(stderr, "unexpected child status: %d\n", status);
          exit(EXIT_FAILURE);
        }

      /* Once per second, print the elapsed seconds and the number of
         forks completed during that second.  */
      n_forks++;
      nowtime = time (NULL);
      if (lasttime != nowtime)
        {
          printf ("%u: %lu\n", (unsigned int) (nowtime - starttime), n_forks);
          n_forks = 0;
          lasttime = nowtime;
        }
    }
  return 0;
}
