Thanks for confirming the issue.
I found the source of the problem with the help of SchedMD support.
18.08.4 has this bugfix to prevent commands in the cwd from taking
precedence over commands in your PATH:
https://github.com/SchedMD/slurm/commit/ccafaf7b60090155639edcbdbf4a3ab5e36967c6
There is a command /usr/bin/batch which is part of the at package:
$ which batch
/usr/bin/batch
$ rpm -qf /usr/bin/batch
at-3.1.10-49.el6.x86_64
The at package is installed on just about every Linux system, so
/usr/bin/batch is almost always found in PATH. As a result,
sbatch batch
becomes
sbatch /usr/bin/batch
The fix is to give sbatch an explicit path to your batch file, either
relative (with a leading ./) or absolute, like this:
sbatch ./batch
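To illustrate the lookup described above, here is a rough shell sketch.
It only approximates the behavior seen in this thread and is not
sbatch's actual code; the function name resolve_script is made up for
the example. A name containing a slash is used as given, a bare name is
searched for in PATH first, and only when nothing is found there does it
fall back to the file in the current directory.

```shell
# Hypothetical helper, NOT part of Slurm: approximates how a script
# name given to sbatch 18.08.4 appears to be resolved.
resolve_script() {
    name=$1

    # A name containing a slash (./batch, /home/me/batch) is used as given.
    case $name in
        */*) printf '%s\n' "$name"; return 0 ;;
    esac

    # A bare name is searched for in each PATH directory, in order.
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        IFS=$old_ifs
        candidate="${dir:-.}/$name"
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    IFS=$old_ifs

    # Nothing in PATH matched: fall back to the name as typed (cwd).
    printf '%s\n' "$name"
}
```

On a system with at installed, resolve_script batch should print
/usr/bin/batch, while resolve_script ./batch prints ./batch.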
SchedMD support told me to send them the output of sbatch -v batch. When
I ran that command, I saw this in the output:
sbatch: remote command : `/usr/bin/batch'
Once I saw that, I understood what was going on, and SchedMD support
confirmed that it was caused by the bugfix in 18.08.4.
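If you want to check what your own script name resolves to, a small
filter like the one below can pull the path out of the verbose output.
This is only a sketch: the function name remote_command is made up, it
assumes the "remote command" line looks like the one quoted above, and
the 2>&1 in the usage line assumes the verbose messages go to stderr.

```shell
# Hypothetical helper, NOT part of Slurm: reads "sbatch -v" output on
# stdin and prints the path from the "remote command" line, if any.
remote_command() {
    sed -n "s/^sbatch: remote command *: \`\(.*\)'\$/\1/p"
}

# Example usage (needs a working Slurm installation):
#   sbatch -v batch 2>&1 | remote_command
```

If this prints /usr/bin/batch instead of your own file, the script name
is being shadowed by a command in your PATH.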
Prentice
On 12/19/18 2:43 PM, mercan wrote:
Hi;
We upgraded from 18.08.3 to 18.08.4, and we also have a job_submit.lua
script. We see nearly the same issue on our cluster:
$ sbatch batch
sbatch: error: Batch job submission failed: Unspecified error
$ mv batch nobatchy
$ sbatch nobatchy
Submitted batch job 172174
I hope this helps.
Ahmet M.
On 19.12.2018 at 21:54, Prentice Bisbal wrote:
Yesterday I upgraded from 18.08.3 to 18.08.4. After the upgrade, I
found that batch scripts named "batch" are being rejected. Simply
changing the script name fixes the problem. For example:
$ sbatch batch
sbatch: error: ERROR: A time limit must be specified
sbatch: error: Batch job submission failed: Time limit specification required, but not provided
$ mv batch different_name
$ sbatch different_name
Submitted batch job 398507
I'm not sure if this is a bug in sbatch, my job_submit.lua file, or the
lua plugin. My job_submit.lua script hasn't been modified since
10/16/2018. I had been using 18.08.3 since 11/20, and the user who
reported this used the same batch script to submit jobs before the
update without any issues.
Has anyone else upgraded to 18.08.4? If so, can you replicate this
issue? I have already reported this to SchedMD. The bug ID is 6271.