These log lines about the prolog script look very suspicious to me:
[2020-11-18T10:19:35.388] debug: [job 110] attempting to run prolog [/cm/local/apps/cmd/scripts/prolog]
then
[2020-11-18T10:21:10.121] debug: Waiting for job 110's prolog to complete
[2020-11-18T10:21:10.121] debug: Finis
The epilog script does have exit 0 set at the end. Epilogs exit cleanly
when run.
With log set to debug5 I get the following results for any scancel call.
Submit host slurmctld.log:
[2020-11-18T10:19:34.944] _slurm_rpc_submit_batch_job: JobId=110 InitPrio=110503 usec=191
[2020-11-18T10:19:35.
Hi,
Check the epilog's return value, which is the return value of the last
command the epilog script runs. For testing purposes, you can also add
an "exit 0" line as the last line of the epilog script to ensure it
returns zero.
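
To illustrate the point, here is a minimal sketch. The two small scripts
are hypothetical stand-ins written to a temporary directory, not the real
epilog from this cluster. It only shows that a shell script returns the
status of its last command unless an explicit "exit 0" overrides it.

```shell
#!/bin/sh
# Hypothetical stand-ins for an epilog script (assumption: plain sh scripts).
tmpdir=$(mktemp -d)

# Variant 1: the last command fails, so the script itself returns non-zero.
cat > "$tmpdir/epilog_fail.sh" <<'EOF'
#!/bin/sh
false
EOF

# Variant 2: same body, but with an explicit "exit 0" as the last line.
cat > "$tmpdir/epilog_ok.sh" <<'EOF'
#!/bin/sh
false
exit 0
EOF

# Capture each exit status without aborting if the script fails.
rc_fail=0
sh "$tmpdir/epilog_fail.sh" || rc_fail=$?
rc_ok=0
sh "$tmpdir/epilog_ok.sh" || rc_ok=$?

echo "without exit 0: $rc_fail"   # prints "without exit 0: 1"
echo "with exit 0: $rc_ok"        # prints "with exit 0: 0"

rm -rf "$tmpdir"
```

Running the real epilog by hand on a compute node and then checking
"echo $?" gives the same kind of check against the actual script.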
Ahmet M.
On 18.11.2020 at 20:00, William Markuske wrote: