On Tue, 2007-02-27 at 13:54 -0500, Justin Pryzby wrote:
> On Tue, Feb 27, 2007 at 02:52:40PM +0100, Label Sarl wrote:
> > Hi,
> >
> > I am experiencing a futex freeze too, in a self-made program.
> How did you find this bug? :)

Very simply: the program that freezes is a TCP server. Whenever the
problem occurs, the service is out of order ...
Even the load-balancer program that wraps instances of this faulty
server cannot detect the freeze (because of the backlog argument of
listen(2)).

>
> Do you mind if I bounce your message there?

Please do.

>
> > 1) I have an unfreezing technique that uses ptrace(2) to:
> >    - attach to a frozen process,
> >    - get the syscall number (checking that it is sys_futex),
> >    - get the arguments of sys_futex(2) to check that it is a WAIT
> >      and get the address being waited on,
> >    - poke a 0 at that address.
> >    All of that unfreezes the process.
> >    Do you want the code of my survfutex program?
> Sure, this would be interesting.

Here it is, in the attached file survfutex.c.
It is simply compiled (on RedHat FC4) with:
    cc survfutex.c -o survfutex
>
> > 2) Now, I am tracking down the cause and have some information.
> >    It appears to occur (in my code) during the execution of a
> >    signal handler set for SIGCLD.
> >    When the father catches the SIGCLD, it performs some actions
> >    like fprintf(3) and free(3).
> Did you know that only some functions are considered safe for signal
> handlers?  See signal(2).  free and printf are [implicitly] considered
> unsafe, which (I think) can explain your problem.

Yes, events proved it, even if it was not so years ago ...
(I have been working with Linux since 1994 and more generally on Unix
systems since 1983, but I know "time goes and OS design changes ...")
But I still think that I am not alone in experiencing such an
implementation bug.

>
> >    Of course, it does not occur systematically (that would have been
> >    too simple).
> >    When I remove fprintf() and free() from the signal handler, the
> >    father process never freezes.
> >    My supposition is that sys_futex() is called from glibc (maybe
> >    from brk(2) and some other syscalls).
> >    I have two kinds of workaround for the bug.
> >    The first consists in:
> >    - blocking SIGCLD,
> >    - at some point in the code, checking whether a SIGCLD is pending;
> >      if so, performing the wait(2) and the other actions outside of a
> >      signal handler, then removing the pending condition.
> >    The second consists in:
> >    - performing the wait(2) from the signal handler and moving the
> >      struct from a list of "alive sons" to a list of dead ones.
> >      This is done with a simple linked-list technique.
> >    - from outside the signal handler, checking whether some sons are
> >      dead; if so, performing the rest of the actions (fprintf() to
> >      log the death with the exit status, free() of the struct
> >      recording the son).
> Both of these seem like accepted ways of handling signals.

Yes, that is the point. I was just asking: since when has the
prohibition of "unsafe" functions been declared? I surely missed
something ...

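The two workarounds quoted above can be sketched as follows. This is only an illustration under stated assumptions, not the code of the program discussed in this thread: every name in it (spawn_son, block_sigchld, poll_sigchld, on_sigchld, report_dead_sons, MAX_DEAD) is hypothetical, a fixed-size table stands in for the linked lists, and the first workaround assumes the Linux behaviour where a blocked SIGCHLD stays pending even with the default disposition.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Demo helper (hypothetical): fork a son that exits at once with `code`. */
pid_t spawn_son(int code)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(code);
    return pid;
}

/* ---- First workaround: SIGCHLD stays blocked, the main loop polls it ---- */

int block_sigchld(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Runs in normal context, so stdio is allowed. Returns the sons reaped. */
int poll_sigchld(FILE *out)
{
    sigset_t set;
    int sig, status, n = 0;
    pid_t pid;

    if (sigpending(&set) < 0 || sigismember(&set, SIGCHLD) != 1)
        return 0;
    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    sigwait(&set, &sig);                /* clears the pending SIGCHLD */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        fprintf(out, "son %d ended, status 0x%x\n", (int)pid, (unsigned)status);
        n++;
    }
    return n;
}

/* ---- Second workaround: the handler only reaps and records ---- */

#define MAX_DEAD 64                     /* hypothetical capacity */
struct dead_son { pid_t pid; int status; };
static struct dead_son dead[MAX_DEAD];
static volatile sig_atomic_t ndead = 0;

/* Handler: waitpid(2) is async-signal-safe; no stdio, no malloc/free. */
static void on_sigchld(int sig)
{
    int saved_errno = errno, status;
    pid_t pid;

    (void)sig;
    while (ndead < MAX_DEAD && (pid = waitpid(-1, &status, WNOHANG)) > 0) {
        dead[ndead].pid = pid;
        dead[ndead].status = status;
        ndead++;
    }
    errno = saved_errno;
}

int install_sigchld_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Main loop side: log the recorded deaths with SIGCHLD held off. */
int report_dead_sons(FILE *out)
{
    sigset_t set, old;
    int i, n;

    sigemptyset(&set);
    sigaddset(&set, SIGCHLD);
    sigprocmask(SIG_BLOCK, &set, &old);
    n = ndead;
    for (i = 0; i < n; i++)
        fprintf(out, "son %d ended, status 0x%x\n",
                (int)dead[i].pid, (unsigned)dead[i].status);
    ndead = 0;
    on_sigchld(SIGCHLD);                /* pick up sons left when the table was full */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return n;
}
```

A program would use one workaround or the other, not both: either call block_sigchld() once and poll_sigchld() from the main loop, or call install_sigchld_handler() once and report_dead_sons() from the main loop. In both cases fprintf() (and any free()) runs only in normal context, never inside the handler.
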
I am also asking whether those functions are unsafe in themselves or
because of the syscalls they make (write(2), brk(2), ...).

>
> > One last thing: I suppose this is a general bug that does not freeze
> > only xmms.
> I wonder if xmms (and/or firefox, which has also done this to me) have a
> non-safe signal handler too?

That may be it too.
Are you still experiencing such freezes?
If so, you may try "survfutex -p {pid}" to check it, and then
"survfutex -p {pid} -u" to try unfreezing the process.
Let me know if it works for you ...

I also suspect some race condition due to a signal being blocked at a bad
moment (a coincidence of fork(), exit() and memory locking/unlocking
(SHM?)).
Unfortunately, I am not a specialist in the use of sys_futex and so
cannot prove it.
But I am interested in any further progress in tracking this bug.
Feel free to pass this on to anyone who can help us solve it.

Cheers
-Rogers

>
> Cheers
> Justin
>

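On the question of whether fprintf(3) and free(3) are unsafe in themselves or because of the syscalls underneath: POSIX lists write() among the async-signal-safe functions, but not fprintf() or free(). That suggests the risk lies in the library's own state (stdio buffers, allocator locks) and not in the system calls. A handler that interrupts code holding such a lock and then asks for the same lock could wait forever, which would be consistent with the futex WAIT described above. The sketch below logs from a handler using write(2) only; safe_log, on_signal, log_fd and install_logging_handler are hypothetical names, not part of survfutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: emit "<label><value>\n" on fd with a single write(2).
 * It touches no stdio buffer and no heap, so it takes no library lock. */
void safe_log(int fd, const char *label, long value)
{
    char buf[160], digits[24];
    size_t n = 0, d = 0;
    unsigned long u = (value < 0) ? 0UL - (unsigned long)value
                                  : (unsigned long)value;

    /* Copy the label, leaving room for sign, digits and newline. */
    while (*label != '\0' && n < sizeof(buf) - 26)
        buf[n++] = *label++;
    if (value < 0)
        buf[n++] = '-';
    /* Produce the decimal digits in reverse, then copy them in order. */
    do {
        digits[d++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (d > 0)
        buf[n++] = digits[--d];
    buf[n++] = '\n';
    if (write(fd, buf, n) < 0) {
        /* nothing safe to do about it inside a handler */
    }
}

/* Descriptor used by the handler; stderr by default. */
static volatile sig_atomic_t log_fd = STDERR_FILENO;

/* Handler: logs the signal number without stdio, preserving errno. */
static void on_signal(int sig)
{
    int saved_errno = errno;

    safe_log(log_fd, "caught signal ", sig);
    errno = saved_errno;
}

int install_logging_handler(int signo)
{
    struct sigaction sa;

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}
```

For example, install_logging_handler(SIGCHLD) would make each SIGCHLD produce one line on stderr without going through stdio. Memory that the handler would otherwise free() still has to be released from the main loop.
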
/*
 * AUTHOR: Rogers VEBER.
 * Survey of processes that may freeze in a sys_futex(2) WAIT.
 *
 * MAIN actions that can be performed:
 *   simply type survfutex -p {pid}
 *   Will check the process and tell if it is frozen by sys_futex().
 *   Add the -u option to unlock the process.
 *   Again, it works perfectly on my programs by POKING a 0 at
 *   the wait_on address.
 *   Add -s millisec to define a period between two checks.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <errno.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* Offset of register r in the USER area of the traced process.
 * The register names used below (orig_eax, ebx, ecx) are i386 ones. */
#define URegs(r) (&(((struct user*)0)->regs.r))

static volatile pid_t gpid = (pid_t)0;  /* pid currently attached, 0 if none */
static int log_lvl = 0;
#define log_stream stdout

/*
 * Print a time-stamped message.
 * lvl >= 0: shown when log_lvl >= lvl.
 * lvl <  0: shown only when log_lvl < -lvl (summary for low log levels).
 */
static void trace(int lvl,const char *msg,...)
{
    char dh[32];
    va_list args;
    time_t ct;

    if( lvl < 0 ) {
        if( log_lvl >= -lvl ) { return; }
    } else {
        if( log_lvl < lvl ) { return; }
    }
    time(&ct);
    strftime(dh,sizeof(dh),"%Y/%m/%d %T",localtime(&ct) );
    fprintf(log_stream,"%s ",dh);
    va_start(args,msg);
    vfprintf(log_stream,msg,args);
    va_end(args);
    fflush(log_stream);
}

static void Usage(char *cmd)
{
    fprintf(stderr,"Usage: %s options \n",cmd);
    fprintf(stderr,"where options may be :\n");
    fprintf(stderr," -p pid to specify the PID to trace (default is PPID)\n");
    fprintf(stderr," -l log_lvl to specify log level (default 0)\n");
    fprintf(stderr," -d to detach process from father\n");
    fprintf(stderr," -u to unblock process using POKEDATA\n");
    fprintf(stderr," -w msec to wait msec millisecs before processing\n");
    fprintf(stderr," -s msec to wait msec millisecs between each test\n");
}

/* Wait until the traced process pid reports a stop. */
static int synchropid(int pid)
{
    int cpid,status;

    status = 0;
    while( (cpid=wait4(-1,&status,WUNTRACED,(struct rusage*)0)) != pid ) {
        if( cpid < 0 ) { return -1; }
    }
    return 0;
}

static void sleepmsec(int msec)
{
    struct timespec req,rem;

    if( msec <= 0 ) { return; }
    req.tv_sec = msec / 1000;
    req.tv_nsec = (msec % 1000) * 1000000L;  /* milliseconds to nanoseconds */
    trace(3,"Waiting %d.%03d s\n",
          (int)req.tv_sec,(int)(req.tv_nsec / 1000000L));
    while( nanosleep(&req,&rem) < 0 ) {
        if( errno == EINTR ) {
            req = rem;
            trace(3,"Interrupted: still wait %d.%03d s\n",
                  (int)req.tv_sec,(int)(req.tv_nsec / 1000000L));
            continue;
        }
        fprintf(stderr,"nanosleep(): errno %d\n",errno);
        break;
    }
    trace(3,"wait done !\n");
}

static int action(int pid,int ublock)
{
    long l;
    unsigned long sc_nr,sc_av[8];

    // Fetch the system call number from orig_eax.
    sc_nr = ptrace(PTRACE_PEEKUSER,pid,URegs(orig_eax),0);
    if( sc_nr != SYS_futex ) {
        trace(2,"Syscall number 0x%lx\n",sc_nr);
        return 2;
    }
    trace(0,"Syscall number 0x%lx FUTEX !\n",sc_nr);
    // The second argument of the futex call (ecx) is the operation.
    sc_av[1] = ptrace(PTRACE_PEEKUSER,pid,URegs(ecx),0);
    if( sc_av[1] != FUTEX_WAIT ) { return 2; }
    /*
     * To unblock the process, just get the first arg of the futex call
     * and write 0 to it using ptrace(POKEDATA).
     */
    if( ublock != 0 ) {
        sc_av[0] = ptrace(PTRACE_PEEKUSER,pid,URegs(ebx),0);
        l = ptrace(PTRACE_PEEKDATA,pid,sc_av[0],0);
        trace(1,"Poking DATA 0x%lx (%lx)\n",sc_av[0],l);
        l = ptrace(PTRACE_POKEDATA,pid,sc_av[0],0);
        trace(1,"Poking done %ld\n",l);
        trace(-1,"Poking DATA 0 at 0x%lx done st %ld\n",sc_av[0],l);
    }
    return 0;
}

static int doCheck(int pid,int ublock)
{
    int r;

    if( ptrace(PTRACE_ATTACH,pid,0,0) < 0 ) {
        if( errno == ESRCH ) {
            fprintf(stderr,"Pid %d not here\n",pid);
            return -1;
        }
        fprintf(stderr,"PTRACE_ATTACH pid %d error <%d>\n",pid,errno);
        return -1;
    }
    synchropid(pid);
    trace(1,"attached to pid %d!\n",pid);
    if( ptrace(PTRACE_SYSCALL,pid,0,0) < 0 ) {
        fprintf(stderr,"PTRACE_SYSCALL pid %d error <%d>\n",pid,errno);
        r = 1;
    } else {
        synchropid(pid);
        r = action(pid,ublock);
    }
    if( ptrace(PTRACE_DETACH,pid,0,0) < 0 ) {
        fprintf(stderr,"PTRACE_DETACH pid %d error <%d>\n",pid,errno);
        return -1;
    } else {
        trace(1,"detached from pid %d!\n",pid);
    }
    return r;
}

/*
 * Signal handler: detach from the traced process and quit.
 * Uses write(2) and _exit(2) only; trace() (stdio) and exit(3) are
 * not safe to call from a signal handler.
 */
static void End(int sig)
{
    static const char msg[] = "Terminated by a signal\n";

    (void)sig;
    if( write(STDOUT_FILENO,msg,sizeof(msg)-1) < 0 ) { /* nothing to do */ }
    if( gpid != (pid_t)0 ) { ptrace(PTRACE_DETACH,gpid,0,0); }
    _exit(0);
}

int main(int ac,char *av[])
{
    int i,r;
    int pid=0, detach=0, ublock=0, waitmsec=0, loopmsec=0;

    for(i=1;i<ac;++i) {
        if( av[i][0] == '-' ) {
            switch( av[i][1] ) {
            case 'h':
                Usage(av[0]);
                exit(0);
            case 'u':
                ublock = 1;
                break;
            case 'd':
                detach = 1;
                break;
            case 'p':
                if( ++i >= ac ) {
                    fprintf(stderr,
                            "option -p requires a pid as argument\n");
                    exit(1);
                }
                pid = atoi(av[i]);
                if( pid <= 0 ) {
                    fprintf(stderr,"Illegal pid \"%s\"\n", av[i]);
                    exit(1);
                }
                break;
            case 'l':
                if( ++i >= ac ) {
                    fprintf(stderr,
                            "option -l requires a loglevel as argument\n");
                    exit(1);
                }
                log_lvl = atoi(av[i]);
                if( log_lvl < 0 ) {
                    fprintf(stderr, "Illegal loglevel \"%s\"\n", av[i]);
                    exit(1);
                }
                break;
            case 's':
                if( ++i >= ac ) {
                    fprintf(stderr,
                            "option -s requires a msec as argument\n");
                    exit(1);
                }
                loopmsec = atoi(av[i]);
                if( loopmsec <= 0 ) {
                    fprintf(stderr, "Illegal loopmsec \"%s\"\n", av[i]);
                    exit(1);
                }
                break;
            case 'w':
                if( ++i >= ac ) {
                    fprintf(stderr,
                            "option -w requires a msec as argument\n");
                    exit(1);
                }
                waitmsec = atoi(av[i]);
                if( waitmsec <= 0 ) {
                    fprintf(stderr, "Illegal waitmsec \"%s\"\n", av[i]);
                    exit(1);
                }
                break;
            default:
                fprintf(stderr,"Illegal option \"%s\"\n",av[i]);
                exit(1);
            }
        }
    }
    if( pid <= 0 ) { pid = getppid(); }
    trace(2,"chkfutex-%d (chck pid %d) detach %d ublock %d waitmsec %d\n",
          getpid(),pid,detach,ublock,waitmsec);
    if( detach != 0 ) // Detach from father process.
    {
        trace(2,"Detaching from father\n");
        if( fork() != 0 ) { exit(0); }
        trace(2,"operating from son (pid %d)\n",getpid());
    }
    if( waitmsec > 0 ) { sleepmsec(waitmsec); }
    signal(SIGINT,End);
    signal(SIGTERM,End);
    signal(SIGHUP,End);
    for(;;) {
        gpid = pid;
        r = doCheck(pid,ublock);
        gpid = (pid_t)0;
        if( r < 0 || loopmsec <= 0 ) { break; }
        sleepmsec(loopmsec);
    }
    exit(r);
}