Hi, Samuel -

On Sun, Jan 06, 2008 at 03:13:46PM +0000, Samuel Thibault wrote:
> Hello,
>
> I've dug a bit, since I've got an administration website which allows me
> to reproduce the bug quite reliably.
>
> Benjamin A. Okopnik, le Mon 03 Jul 2006 11:26:34 -0400, a écrit :
> > ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig -icanon -echo
> > ...}) = 0
> > rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
> > pipe([5, 7]) = 0
> > pipe([9, 10]) = 0
> > clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> > child_tidptr=0xb7bd0928) = 4518
> > --- SIGCHLD (Child exited) @ 0 (0) ---
> > waitpid(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 127}], WNOHANG) = 4518
> > waitpid(-1, 0xbfa3e120, WNOHANG) = -1 ECHILD (No child processes)
> > rt_sigaction(SIGCHLD, {0x804ba60, [], SA_RESTART}, {0x804ba60, [],
> > SA_RESTART}, 8) = 0
> > sigreturn() = ? (mask now [])
> > close(7) = 0
> > fcntl64(5, F_GETFL) = 0 (flags O_RDONLY)
> > fstat64(5, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
> > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> > 0xb7f26000
> > _llseek(5, 0, 0xbfa3e33c, SEEK_CUR) = -1 ESPIPE (Illegal seek)
> > close(9) = 0
> > fcntl64(10, F_GETFL) = 0x1 (flags O_WRONLY)
> > fstat64(10, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
> > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
> > 0xb7f25000
> > _llseek(10, 0, 0xbfa3e33c, SEEK_CUR) = -1 ESPIPE (Illegal seek)
> > write(10, "foo\n", 4) = -1 EPIPE (Broken pipe)
> > --- SIGPIPE (Broken pipe) @ 0 (0) ---
> > close(5) = 0
> > munmap(0xb7f26000, 4096) = 0
> > write(10, "foo\n", 4) = -1 EPIPE (Broken pipe)
> > close(10) = 0
> > munmap(0xb7f25000, 4096) = 0
> > kill(4518, SIGKILL) = -1 ESRCH (No such process)
> > rt_sigaction(SIGPIPE, {0x804c3c0, [], SA_RESTART}, {0x804c3c0, [],
> > SA_RESTART}, 8) = 0
> > sigreturn() = ? (mask now [])
> > --- SIGPIPE (Broken pipe) @ 0 (0) ---
> > rt_sigaction(SIGPIPE, {0x804c3c0, [], SA_RESTART}, {0x804c3c0, [],
> > SA_RESTART}, 8) = 0
> > sigreturn() = ? (mask now [])
> > --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> > +++ killed by SIGSEGV +++
>
> Note that at this point the segfault happens in malloc called by putenv
> (which itself is called by the / command).
>
> I've run this through gdb with handle SIGPIPE nopass, and then I
> wouldn't get the segfault. Digging a bit in the SIGPIPE handler showed
> me that it calls init_migemo(), which itself calls fclose(), which
> is not safe since that function is not in the list of signal-safe
> functions. I commented these fclose() calls, and now I can't reproduce
> the bug any more. I'll keep that "fixed" version of w3m for some more
> long-term testing, but I really think the problem is here: I guess that
> fclose() frees something, so that it may corrupt the heap, thus the
> segfault on the next malloc (which happens to be due to searching the
> page). So the solution is probably to have the signal handler just set
> a variable and move the call to init_migemo into the main stream of
> instruction.
I'd wonder what's going to be left open as a result of those two
"fclose()" calls not happening. Is there a signal-safe way of releasing
those handles? I'd hate to see you create more problems by fixing this
one. :)

Regards,
-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *