Hi All,

New user migrating from UGE/SGE here. I'm baffled by how the ExitCode is
recorded in slurmdb; I'm not sure whether this is a 'new user' issue or a bug,
so I'm raising it here first.


I'm running simple sbatch scripts with these relevant headers:

#!/bin/bash

#SBATCH --mail-user <me>@<work>
#SBATCH --mail-type END

#SBATCH -J T_113491_<redacted>_20150522


The sbatch script calls various tools and, as its last step, a
'completion_reporter' bash script that reports whether all calls ran to
completion.

If they did not, the return code from that script is passed to an exit command
in the sbatch script, so the expectation is that the sbatch script's return
code in these circumstances is the one from 'completion_reporter'. That return
code is 141.
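
To make the pattern concrete, here is a minimal sketch of the tail end of such
a script. The completion_reporter function below is a hypothetical stand-in
for the real script (which is not shown here), hard-coded to return 141, and
job_body is just an illustrative name for the body of the batch script.

```shell
#!/bin/bash

# Hypothetical stand-in for the real completion_reporter script:
# it returns 141 to say that some call did not run to completion.
completion_reporter() {
    return 141
}

# Illustrative body of the batch script: capture the reporter's
# return code and exit with it unchanged.
job_body() {
    local rc=0
    completion_reporter || rc=$?
    exit "$rc"
}

# Run the body in a subshell so its exit status can be inspected here
# instead of terminating the current shell.
status=0
( job_body ) || status=$?
echo "batch script would exit with: $status"
```

Run outside Slurm, this shows the exit status the shell itself sees for the
script, which is the value I expected to find recorded for the job.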


GOOD

The emails received have a subject line consistent with expectations:

'Slurm Job_id=196 Name=T_113491_<redacted>_20150522 Ended, Run time 00:00:24, 
FAILED, ExitCode 141'


UNEXPECTED

However sacct output is not consistent with expectations...

$ sacct -j 196

------------ ---------- ---------- ---------- ---------- ---------- --------
196          T_113491_+ all_slt_l+        slt          1     FAILED     13:0
196.batch         batch                   slt          1     FAILED     13:0



I've spent some time reading through the (excellent, frankly) documentation for
sbatch and job_exit_code and, while I learned a great deal, nothing there
explains this anomaly.
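
One observation, offered as a guess rather than an explanation: the two
numbers are related arithmetically, since 141 - 128 = 13. Shells conventionally
report 128 + N when a command is terminated by signal N, and signal 13 is
SIGPIPE on Linux. Whether sacct applies a similar convention when displaying
ExitCode is an assumption on my part, not something I found in the
documentation. The arithmetic can be sketched as:

```shell
#!/bin/bash

# Exit code reported in the job-completion email.
rc=141

# By shell convention, a status above 128 usually means
# "terminated by signal (status - 128)".
sig=0
if [ "$rc" -gt 128 ]; then
    sig=$((rc - 128))
    # kill -l translates a signal number into its name.
    echo "exit code $rc = 128 + $sig (signal name: $(kill -l "$sig"))"
fi
```

If that is what is happening, any of our exit codes above 128 would be affected
in the same way, while codes up to 128 should be recorded as-is.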


Incidentally I expected to be able to use scontrol as below; any pointers on 
the unexpected outcome would be welcome

$ scontrol show step 196.batch
Job step 196.0 not found



We have put a fair bit of work into informatively coding our fail exit_codes so 
suggestions as to what's going on here would be welcome.


Thanks in advance


Matt


