On Sunday 23 October 2005 15:45, Jason Stubbs wrote:
> One question; is the waitpid(x,0) necessary in the case where SIGKILL
> wasn't sent? Is waitpid(x,os.WNOHANG) enough to clean up the zombie when
> SIGTERM succeeds? If so, the waitpid(x,0) could be indented into the "if
> not timeout:" block.

While working out why the previous patch didn't work and where the problem
was, I found that waitpid(x,os.WNOHANG) is enough. I further found that the
except block can be removed if the waitpid(x,os.WNOHANG)[1] == 0 check is
effectively changed to waitpid(x,os.WNOHANG) == (0,0).

Anyway, it seems that tee doesn't dump its remaining buffer when it receives
a SIGTERM. This patch adds a 0.5 second wait before sending the SIGTERM and
a 0.5 second wait between the SIGTERM and the SIGKILL. The process status is
checked every 0.01 seconds, so the waiting is essentially transparent.

P.S. Are the mailing lists slow, or have I been kicked off (and thus will
never receive an answer)?

--
Jason Stubbs

Index: pym/portage_exec.py
===================================================================
--- pym/portage_exec.py	(revision 2150)
+++ pym/portage_exec.py	(working copy)
@@ -4,7 +4,7 @@
 # $Id: /var/cvsroot/gentoo-src/portage/pym/portage_exec.py,v 1.13.2.4 2005/04/17 09:01:56 jstubbs Exp $
 
 
-import os,types,atexit,string,stat
+import os,types,atexit,string,stat,time
 import signal
 import portage_data
 import portage_util
@@ -180,15 +180,19 @@
 		retval=os.waitpid(mypid[-1],0)[1]
 		if retval != 0:
 			for x in mypid[0:-1]:
-				try:
-					os.kill(x,signal.SIGTERM)
-					if os.waitpid(x,os.WNOHANG)[1] == 0:
-						# feisty bugger, still alive.
-						os.kill(x,signal.SIGKILL)
+				for sig in (signal.SIGTERM, signal.SIGKILL):
+					timeout = 50
+					while timeout:
+						if os.waitpid(x, os.WNOHANG) != (0,0):
+							break
+						time.sleep(0.01)
+						timeout -= 1
+					if not timeout:
+						os.kill(x,sig)
+					else:
+						break
+				if not timeout:
 					os.waitpid(x,0)
-				except OSError, oe:
-					if oe.errno not in (10,3):
-						raise oe
 
 			# at this point we've killed all other kid pids generated via this call.
 			# return now.
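
For anyone who wants to try the wait-then-escalate behaviour outside of
portage, here is a minimal standalone sketch of the same idea in Python 3
syntax. It is not the code from the patch: the function name
reap_with_escalation, its grace and poll_interval parameters, and the
deadline-based loop (instead of a countdown of 50 polls) are assumptions
made purely for illustration.

```python
import os
import signal
import subprocess
import sys
import time


def reap_with_escalation(pid, grace=0.5, poll_interval=0.01):
    """Wait for child `pid` to exit, escalating SIGTERM -> SIGKILL.

    Hypothetical helper, not part of portage. Returns (status, sent):
    the raw wait status from os.waitpid and the list of signals sent.
    """
    sent = []
    for sig in (signal.SIGTERM, signal.SIGKILL):
        deadline = time.monotonic() + grace
        while time.monotonic() < deadline:
            # With WNOHANG, (0, 0) means the child has not exited yet.
            reaped, status = os.waitpid(pid, os.WNOHANG)
            if reaped != 0:
                # Child exited and has been reaped; no zombie remains.
                return status, sent
            time.sleep(poll_interval)
        # Grace period expired while the child was still running: escalate.
        os.kill(pid, sig)
        sent.append(sig)
    # Both signals were sent; block until the child is actually reaped.
    _, status = os.waitpid(pid, 0)
    return status, sent


if __name__ == "__main__":
    # Demo with a child that just sleeps, so it has to be signalled.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    status, sent = reap_with_escalation(child.pid)
    print("signals sent:", [s.name for s in sent])
    print("killed by signal:", os.WIFSIGNALED(status))
```

As in the patch, a child that exits by itself within the first grace period
is reaped without receiving any signal, and the blocking waitpid(pid, 0) is
only reached once SIGKILL has been sent.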