[PATCH 1/2] remote: Use `catch' in killing pending force-kills

Maciej W. Rozycki Wed, 20 May 2020 14:23:07 -0700

Address an execution race in `close_wait_program' by using `catch' when 
killing the pending force-kills issued there in the recovery of a stuck 
test case.  If the force-kill sequence has completed before the command 
to kill the sequence has had a chance to run, then `kill' fails and 
`exec' throws an error, which interrupts the testsuite run early like:


PASS: gcc.c-torture/execute/postmod-1.c   -O0  (test for excess errors)
Executing on remote-localhost: .../gcc/testsuite/gcc/postmod-1.exe    (timeout = 15)
spawn [open ...]
WARNING: program timed out
ERROR: tcl error sourcing .../gcc/testsuite/gcc.c-torture/execute/execute.exp.
ERROR: child process exited abnormally
    while executing
"exec sh -c "exec > /dev/null 2>&1 && kill -9 $exec_pid""
    (procedure "close_wait_program" line 57)
    invoked from within
"close_wait_program $spawn_id $pid wres"
    (procedure "local_exec" line 104)
[...]
"uplevel #0 source .../gcc/testsuite/gcc.c-torture/execute/execute.exp"
    invoked from within
"catch "uplevel #0 source $test_file_name""
testcase .../gcc/testsuite/gcc.c-torture/execute/execute.exp completed in 196 seconds

                === gcc Summary ===

# of expected passes            1

-- therefore not letting `execute.exp' continue (here with the GCC `c' 
testsuite invoked with `execute.exp=postmod-1.c' for 8 compilation and 8 
execution tests).

For the race to trigger, the force-kill sequence would have to complete 
in the window between the return of the `wait' command, which would at 
worst happen as a result of the final `kill -9' command in the sequence, 
and the `kill -9 $exec_pid' command issued here.  The `sleep 5' command 
issued at the end of the force-kill sequence makes the likelihood of 
such a scenario low, but it might still happen with a loaded host 
system and there is no drawback to using `catch' here, so let's do it.
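
The failure mode can be sketched in isolation as below.  This is only an 
illustration, not code from `remote.exp': the `stale_kill' procedure is a 
hypothetical stand-in that uses `false' in place of a `kill -9' aimed at 
a PID that no longer exists, on the assumption that both simply exit 
with a non-zero status.

```tcl
# Hypothetical stand-in for `kill -9 $exec_pid' hitting a PID that is
# already gone: `false' exits with a non-zero status, as `kill' would.
proc stale_kill {} {
    exec sh -c "exec > /dev/null 2>&1 && false"
}

# Unguarded, `exec' turns the non-zero exit status into a Tcl error,
# which would propagate to the caller.  `catch' is used here only to
# observe the error without aborting the script.
set status [catch {stale_kill} msg]
puts "unguarded: status=$status msg=$msg"

# Guarded, as in this change: the error is swallowed and execution
# simply carries on.
set guarded [catch {stale_kill}]
puts "guarded: status=$guarded, still running"
```

With `catch' the outcome of the `kill' is simply ignored, which is fine 
here as the command is only a best-effort cancellation.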

        * lib/remote.exp (close_wait_program): Use `catch' in killing 
        pending force-kills.

Signed-off-by: Maciej W. Rozycki <ma...@wdc.com>
---
Hi,

 I have only observed it in a debug scenario, where an artificial delay 
was inserted before the `wait' command referred to in the change 
description, while tracking down a testsuite hang with a stuck test case.  
As noted, however, the use of `catch' here is otherwise harmless, and 
while the likelihood of the scenario where the race triggers might be 
epsilon, it is not nil.

 Therefore, please apply.  FAOD this has been formatted for `git am' use.

  Maciej
---
 lib/remote.exp |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

dejagnu-remote-close-wait-kill-catch.diff
Index: dejagnu/lib/remote.exp
===================================================================
--- dejagnu.orig/lib/remote.exp
+++ dejagnu/lib/remote.exp
@@ -113,7 +113,10 @@ proc close_wait_program { program_id pid
        # We reaped the process, so cancel the pending force-kills, as
        # otherwise if the PID is reused for some other unrelated
        # process, we'd kill the wrong process.
-       exec sh -c "exec > /dev/null 2>&1 && kill -9 $exec_pid"
+       #
+       # Use `catch' in case the force-kills have completed, so as not
+       # to cause TCL to choke if `kill' returns a failure.
+       catch "exec sh -c \"exec > /dev/null 2>&1 && kill -9 $exec_pid\""
     }
 
     return $res
