I am trying to write a function to limit a command's (wall-clock) execution time, but damned if I can eliminate all of the races.
Here is my current iteration of the function:

     1 timed_run() {
     2     local SLEEP_TIME=$1
     3     shift
     4
     5     set +o monitor
     6
     7     # start command running
     8     # NOTE: if you put the & after the bash -c "" the & will background
     9     #       setsid which causes a race trying to kill the PGID that setsid tries
    10     #       to create
    11     setsid bash -c "$@" &
    12     local child_pid=$!
    13     # wait for setsid to have done its work
    14     while ! kill -0 -$child_pid 2>/dev/null; do
    15         sleep 1
    16     done
    17
    18     #
    19     # start the watchdog
    20     # - use setsid to put all of the children of the watchdog into their
    21     #   own process group so that we can use kill -<pgid> to kill them
    22     #   all off
    27     setsid bash -c "sleep $SLEEP_TIME
    28     echo \"$1 was killed due to a $SLEEP_TIME second timeout expiry\"
    29     kill -TERM -$child_pid 2>/dev/null
    30     sleep 5
    41     kill -KILL -$child_pid 2>/dev/null" &
    42     local dog_pid=$!
    43     # wait for setsid to have done its work
    44     while ! kill -0 -$dog_pid 2>/dev/null; do
    45         sleep 1
    46     done
    47
    48     # waiting on the command will end when either it completes or the
    49     # watchdog kills it
    50     wait $child_pid
    51
    52     # status of the child command will be set to 143 if the process had
    53     # to be killed due to timeout
    54     child_status=${PIPESTATUS[0]}
    55
    56     # kill off the watchdog
    57     kill -TERM -$dog_pid
    58
    59     # return the command's status back to the caller
    60     return $child_status
    61 }

Some notes:

* the setsids are needed to allow kill to kill off a whole group of
  processes
* set +o monitor is to disable job control notification messages (i.e.
  Killed, Done, etc.)
* the kill -0 calls are needed to ensure that setsid has finished its
  work (i.e. created a new process group) and that a subsequent
  kill ... -<pgid> will not be called before the setsid is done

All of that still leaves one race. The process (group) that setsid creates could be done (and gone) before the kill -0 is called.
But if I eliminate the kill -0 then I can't be sure that setsid has created the process group before I call kill on it to kill off the processes.

I have tried an alternate solution where I can be sure that setsid has done its work:

     1 timed_run() {
     2     local SLEEP_TIME=$1
     3     shift
     4
     5     set +o monitor
     6
     7     # start command running
     8     # NOTE: if you put the & after the bash -c "" the & will background
     9     #       setsid which causes a race trying to kill the PGID that setsid tries
    10     #       to create
    11     setsid bash -c "($@) &"
    12     local child_pid=???
    17
    18     #
    19     # start the watchdog
    20     # - use setsid to put all of the children of the watchdog into their
    21     #   own process group so that we can use kill -<pgid> to kill them
    22     #   all off
    27     setsid bash -c "(sleep $SLEEP_TIME
    28     echo \"$1 was killed due to a $SLEEP_TIME second timeout expiry\"
    29     kill -TERM -$child_pid 2>/dev/null
    30     sleep 5
    41     kill -KILL -$child_pid 2>/dev/null) &"
    42     local dog_pid=???
    47
    48     # waiting on the command will end when either it completes or the
    49     # watchdog kills it
    50     wait $child_pid
    51
    52     # status of the child command will be set to 143 if the process had
    53     # to be killed due to timeout
    54     child_status=${PIPESTATUS[0]}
    55
    56     # kill off the watchdog
    57     kill -TERM -$dog_pid
    58
    59     # return the command's status back to the caller
    60     return $child_status
    61 }

Notice that I put the & *inside* the bash commands so that the bash command is backgrounded and not setsid (as in the previous iteration). That allows me to get rid of the kill -0 races on lines 14 and 44, but I lose the ability to learn the pgids that setsid creates, which I need for the kills on lines 29/41 and 57.

Now, the pgids that I want to know are the pids of the two bash commands that the setsids create.
So I could almost do:

     1 timed_run() {
     2     local SLEEP_TIME=$1
     3     shift
     4
     5     set +o monitor
     6
     7     # start command running
     8     # NOTE: if you put the & after the bash -c "" the & will background
     9     #       setsid which causes a race trying to kill the PGID that setsid tries
    10     #       to create
    11     child_pid=$(setsid bash -c "echo \$\$; ($@) &")
    17
    18     #
    19     # start the watchdog
    20     # - use setsid to put all of the children of the watchdog into their
    21     #   own process group so that we can use kill -<pgid> to kill them
    22     #   all off
    27     dog_pid=$(setsid bash -c "(sleep $SLEEP_TIME
    28     echo \"$1 was killed due to a $SLEEP_TIME second timeout expiry\"
    29     kill -TERM -$child_pid 2>/dev/null
    30     sleep 5
    41     kill -KILL -$child_pid 2>/dev/null) &")
    47
    48     # waiting on the command will end when either it completes or the
    49     # watchdog kills it
    50     wait $child_pid
    51
    52     # status of the child command will be set to 143 if the process had
    53     # to be killed due to timeout
    54     child_status=${PIPESTATUS[0]}
    55
    56     # kill off the watchdog
    57     kill -TERM -$dog_pid
    58
    59     # return the command's status back to the caller
    60     return $child_status
    61 }

Except that I want whatever stdout/stderr the command in $@ on line 11 produces to stay on stdout/stderr so that timed_run()'s caller can collect it. I can probably send the echo on line 28 to stderr, which would solve the problem for that one. I am still left with the first issue.

Any ideas?

Thanx,
b.
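
For reference, one way the "echo the pid" idea on line 11 might keep the command's stdout intact is to report the pid over a spare file descriptor instead of stdout. The following is only a minimal sketch under some assumptions: start_in_group and STARTED_PGID are hypothetical names that do not appear in the function above, fds 3 and 4 are assumed to be free, and the command is assumed to be passed as separate words rather than as a single string for bash -c.

```shell
#!/usr/bin/env bash
# Sketch only. start_in_group and STARTED_PGID are hypothetical names.
# Starts "$@" in a new session/process group (via setsid) and records the
# group's id in STARTED_PGID, while leaving the command's stdout/stderr
# pointing wherever the caller's stdout/stderr point.
start_in_group() {
    local pgid
    {
        # Inside $(...), fd 1 is the capture pipe. Move the pipe to fd 3
        # (3>&1), put the caller's real stdout back on fd 1 (1>&4) and
        # drop the spare copy (4>&-).
        pgid=$(setsid bash -c '
            echo $$ >&3      # report our pid (which is the new PGID) on fd 3
            "$@" 3>&- &      # run the command in the background without fd 3
        ' bash "$@" 3>&1 1>&4 4>&-)
    } 4>&1                   # fd 4: temporary copy of the caller stdout
    STARTED_PGID=$pgid
}
```

With something like this, a watchdog could signal the whole group with kill -TERM -$STARTED_PGID. Note that the started command is not a child of the calling shell, so wait cannot be used on that pid; the sketch only addresses learning the pgid without disturbing stdout.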