time limiting command execution

Brian J. Murrell Thu, 17 Jan 2008 13:47:35 -0800

I am trying to write a function that puts a wall-clock time limit on a
command's execution, but damned if I can eliminate all of the races.


Here is my current iteration of the function:

 1 timed_run() {
 2    local SLEEP_TIME=$1
 3    shift
 4
 5    set +o monitor
 6
 7    # start command running
 8    # NOTE: if you put the & after the bash -c "" the & will background
 9    # setsid which causes a race trying to kill the PGID that setsid tries
10    # to create
11    setsid bash -c "$@" &
12    local child_pid=$!
13    # wait for setsid to have done its work
14    while ! kill -0 -$child_pid 2>/dev/null; do
15        sleep 1
16    done
17    
18    #
19    # start the watchdog
20    # - use setsid to put all of the children of the watchdog into their
21    #   own process group so that we can use kill -<pgid> to kill them
22    #   all off
27    setsid bash -c "sleep $SLEEP_TIME
28    echo \"$1 was killed due to a $SLEEP_TIME second timeout expiry\"
29    kill -TERM -$child_pid 2>/dev/null
30    sleep 5
41    kill -KILL -$child_pid 2>/dev/null" &
42    local dog_pid=$!
43    # wait for setsid to have done its work
44    while ! kill -0 -$dog_pid 2>/dev/null; do
45        sleep 1
46    done
47
48    # waiting on the command will end when either it completes or the
49    # watchdog kills it
50    wait $child_pid
51
52    # status of the child command will be set to 143 if the process had
53    # to be killed due to timeout
54    child_status=${PIPESTATUS[0]}
55
56    # kill off the watchdog
57    kill -TERM -$dog_pid
58
59    # return the command's status back to the caller
60    return $child_status
61}

Some notes:
      * the setsids are needed to allow kill to kill off a whole group
        of processes
      * set +o monitor is to disable job control notification messages
        (i.e. Killed, Done, etc.)
      * the kill -0 calls are needed to ensure that setsid has finished
        its work (i.e. created a new process group) and that a
        subsequent kill ... -<pgid> will not be called before the setsid
        is done
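
As a minimal sketch of these notes (assuming a Linux setsid from
util-linux and a non-interactive script, so that $! is the PID of the
process that becomes the group leader), the following starts a small
process group, waits for it to exist, and then signals the whole group
with a negative PID. The variable names pgid and status are only for
illustration:

```shell
# Run a shell in its own session; its two sleeps share its process group.
setsid bash -c 'sleep 30 & sleep 30 & wait' &
pgid=$!

# Poll until the group exists. This is safe here only because the
# command is known to live for 30 seconds.
until kill -0 -- "-$pgid" 2>/dev/null; do
    sleep 1
done

# A negative PID argument addresses every process in the group.
kill -TERM -- "-$pgid"

# Collect the exit status of the group leader.
wait "$pgid"
status=$?
echo "group leader exit status: $status"
```

The polling loop is the same technique as lines 14-16 of the function
above, and carries the same caveat about short-lived commands.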

All of that leaves one race still.  The process (group) that setsid
creates could be done (and gone) before the kill -0 is called.  But if I
eliminate the kill -0 then I can't be sure that the setsid has created
the process group before calling the kill on it to kill off the
processes.
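
The remaining race can be reproduced in isolation. This is a sketch under
the same assumptions as the function above (Linux setsid, non-interactive
script); the explicit wait stands in for "the command finished before we
got around to polling", and group_state is a name made up for the demo:

```shell
# A command that finishes almost immediately.
setsid bash -c 'true' &
child_pid=$!

# Stand-in for the command completing before the first kill -0 runs.
wait "$child_pid"

# The process group no longer exists, so kill -0 fails. A loop of the
# form "while ! kill -0 -$child_pid" would never terminate at this point.
if kill -0 -- "-$child_pid" 2>/dev/null; then
    group_state=present
else
    group_state=gone
fi
echo "process group $child_pid is $group_state"
```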

I have tried an alternate solution where I can be sure that setsid has
done its work:

 1 timed_run() {
 2    local SLEEP_TIME=$1
 3    shift
 4
 5    set +o monitor
 6
 7    # start command running
 8    # NOTE: if you put the & after the bash -c "" the & will background
 9    # setsid which causes a race trying to kill the PGID that setsid tries
10    # to create
11    setsid bash -c "($@) &"
12    local child_pid=???
17    
18    #
19    # start the watchdog
20    # - use setsid to put all of the children of the watchdog into their
21    #   own process group so that we can use kill -<pgid> to kill them
22    #   all off
27    setsid bash -c "(sleep $SLEEP_TIME
28    echo \"$1 was killed due to a $SLEEP_TIME second timeout expiry\"
29    kill -TERM -$child_pid 2>/dev/null
30    sleep 5
41    kill -KILL -$child_pid 2>/dev/null) &"
42    local dog_pid=???
47
48    # waiting on the command will end when either it completes or the
49    # watchdog kills it
50    wait $child_pid
51
52    # status of the child command will be set to 143 if the process had
53    # to be killed due to timeout
54    child_status=${PIPESTATUS[0]}
55
56    # kill off the watchdog
57    kill -TERM -$dog_pid
58
59    # return the command's status back to the caller
60    return $child_status
61}

Notice that I put the & *inside* the bash commands so that the bash
command is backgrounded and not setsid (as in the previous iteration).
That allows me to get rid of the kill -0 races on lines 14 and 44 but I
lose the ability to learn the pgids that setsid creates which I need to
kill on lines 29/41 or 57.
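
A small sketch of why the PID is lost (assuming a Linux setsid; the
variable names before and after are only for illustration): when setsid
runs in the foreground and the & is inside the inner bash, the calling
shell never starts a background job of its own, so its $! does not
change.

```shell
# Record $! before; it is unset if no background job has been started yet.
before=${!:-unset}

# Foreground setsid: returns as soon as the inner bash has backgrounded
# its subshell and exited. Output is discarded to keep the demo quiet.
setsid bash -c '(sleep 1) >/dev/null 2>&1 &'

# $! in the calling shell is untouched by the inner shell's background job.
after=${!:-unset}
printf '$! before: %s, after: %s\n' "$before" "$after"
```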

Now, the pgids that I want to know are the pids of the two bash commands
that the setsids create.  So I could almost do:

 1 timed_run() {
 2    local SLEEP_TIME=$1
 3    shift
 4
 5    set +o monitor
 6
 7    # start command running
 8    # NOTE: if you put the & after the bash -c "" the & will background
 9    # setsid which causes a race trying to kill the PGID that setsid tries
10    # to create
11    child_pid=$(setsid bash -c "echo \$\$; ($@) &")
17    
18    #
19    # start the watchdog
20    # - use setsid to put all of the children of the watchdog into their
21    #   own process group so that we can use kill -<pgid> to kill them
22    #   all off
27    dog_pid=$(setsid bash -c "(sleep $SLEEP_TIME
28    echo \"$1 was killed due to a $SLEEP_TIME second timeout expiry\"
29    kill -TERM -$child_pid 2>/dev/null
30    sleep 5
41    kill -KILL -$child_pid 2>/dev/null) &")
47
48    # waiting on the command will end when either it completes or the
49    # watchdog kills it
50    wait $child_pid
51
52    # status of the child command will be set to 143 if the process had
53    # to be killed due to timeout
54    child_status=${PIPESTATUS[0]}
55
56    # kill off the watchdog
57    kill -TERM -$dog_pid
58
59    # return the command's status back to the caller
60    return $child_status
61}

Except that I want whatever stdout/stderr the command in $@ on line 11
produces to stay on stdout/stderr so that timed_run()'s caller can
collect it.  I can probably send the echo on line 28 to stderr, which
would solve the problem for that one.  I am still left with the first
issue.
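
One possible direction, offered only as a sketch and not as part of the
functions above: keep a copy of the caller's stdout on a spare file
descriptor, so that the command substitution captures only the PID while
the command's own output still reaches the caller. The helper name
start_in_group, the variable child_pgid and the choice of fd 4 are all
assumptions made for this example, and the command is passed as separate
arguments rather than as a single string:

```shell
# Hypothetical helper: start "$@" in a new session and store the new
# process group id in child_pgid without capturing the command's stdout.
start_in_group() {
    # fd 4 is a copy of the caller's stdout for the duration of the group.
    {
        child_pgid=$(setsid bash -c '
            echo "$$"                  # goes to the command substitution
            ("$@") >&4 4>&- &          # command output goes to the caller
        ' bash "$@")
    } 4>&1
}

start_in_group sh -c 'echo "output from the command"'
sleep 1
echo "process group id: $child_pgid"
```

The command substitution returns once the inner bash exits, because the
backgrounded subshell no longer holds the pipe open. This does not
address waiting on the command, since the backgrounded process is not a
direct child of the calling shell.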

Any ideas?

Thanx,
b.
