Address an execution race in `close_wait_program' by using `catch' when
killing the pending force-kills issued there in the recovery of a stuck
test case.  The force-kill sequence may have completed before the
command to kill the sequence had a chance to run; with `catch' no error
is thrown in that case and a testsuite run does not get interrupted
early like:

PASS: gcc.c-torture/execute/postmod-1.c   -O0  (test for excess errors)
Executing on remote-localhost: .../gcc/testsuite/gcc/postmod-1.exe    (timeout = 15)
spawn [open ...]
WARNING: program timed out
ERROR: tcl error sourcing .../gcc/testsuite/gcc.c-torture/execute/execute.exp.
ERROR: child process exited abnormally
    while executing
"exec sh -c "exec > /dev/null 2>&1 && kill -9 $exec_pid""
    (procedure "close_wait_program" line 57)
    invoked from within
"close_wait_program $spawn_id $pid wres"
    (procedure "local_exec" line 104)
[...]
"uplevel #0 source .../gcc/testsuite/gcc.c-torture/execute/execute.exp"
    invoked from within
"catch "uplevel #0 source $test_file_name""
testcase .../gcc/testsuite/gcc.c-torture/execute/execute.exp completed in 196 seconds

		=== gcc Summary ===

# of expected passes		1

-- therefore not letting `execute.exp' continue (here with the GCC `c'
testsuite invoked with `execute.exp=postmod-1.c' for 8 compilation and 8
execution tests).

The completion of the force-kill sequence would have to happen in the
window between the return of the `wait' command, which would at worst
happen as a result of the final `kill -9' command in the sequence, and
the `kill -9 $exec_pid' command issued here.  The `sleep 5' command
issued at the end of the force-kill sequence makes the likelihood of
such a scenario low, but this might still happen with a loaded host
system and there is no drawback from using `catch' here, so let's do it.

	* lib/remote.exp (close_wait_program): Use `catch' in killing
	pending force-kills.

Signed-off-by: Maciej W. Rozycki <ma...@wdc.com>
---
Hi,

 Please apply.  FAOD this has been formatted for `git am' use.

  Maciej

Changes from v1:

- Simplify `catch' invocation, no functional change.
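
For illustration only, the failure mode can be sketched outside DejaGnu
with plain shell.  This is a minimal sketch, not part of the change: it
assumes a POSIX shell, uses signal 0 as a harmless stand-in for `-9',
and the variable names `unguarded' and `guarded' are made up for the
example.  The `|| :' guard is only a rough shell analogue of wrapping
`exec' in Tcl `catch'.

```shell
# Start a short-lived child and reap it, so that its PID no longer
# refers to a live process (barring immediate PID reuse).
sleep 0 &
pid=$!
wait "$pid"

# Unguarded: `kill' is expected to fail for the reaped PID.  Signal 0
# only probes for the process and is used here instead of -9.
unguarded=0
kill -0 "$pid" 2>/dev/null || unguarded=$?

# Guarded: the failure is swallowed, so the caller never sees it.
guarded=0
{ kill -0 "$pid" 2>/dev/null || :; } || guarded=$?

echo "unguarded=$unguarded guarded=$guarded"
```

The unguarded status is expected to be nonzero, which is what Tcl `exec'
turns into the "child process exited abnormally" error, while the
guarded form reports success regardless.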
---
 lib/remote.exp |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

dejagnu-remote-close-wait-kill-catch.diff
Index: dejagnu/lib/remote.exp
===================================================================
--- dejagnu.orig/lib/remote.exp
+++ dejagnu/lib/remote.exp
@@ -113,7 +113,10 @@ proc close_wait_program { program_id pid
 	# We reaped the process, so cancel the pending force-kills, as
 	# otherwise if the PID is reused for some other unrelated
 	# process, we'd kill the wrong process.
-	exec sh -c "exec > /dev/null 2>&1 && kill -9 $exec_pid"
+	#
+	# Use `catch' in case the force-kills have completed, so as not
+	# to cause TCL to choke if `kill' returns a failure.
+	catch {exec sh -c "kill -9 $exec_pid" >& /dev/null}
     }
 
     return $res