Thanks for confirming the issue.

I found the source of the problem with the help of SchedMD support. Version 18.08.4 includes this bugfix, which prevents commands in the cwd from taking precedence over commands in your PATH:

https://github.com/SchedMD/slurm/commit/ccafaf7b60090155639edcbdbf4a3ab5e36967c6

There is a command /usr/bin/batch which is part of the at package:

$ which batch
/usr/bin/batch

$ rpm -qf /usr/bin/batch
at-3.1.10-49.el6.x86_64

I'm sure just about every Linux system has at installed. As a result,

sbatch batch

becomes

sbatch /usr/bin/batch

The fix is to use a relative or absolute path to your batch file, like this:

sbatch ./batch
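
To check ahead of time whether a script name will run into this, something like the sketch below might help. It only approximates the PATH search described above and is not taken from the sbatch source; path_collision is a hypothetical helper name, not part of Slurm.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): report whether a bare script
# name also matches an executable found via PATH.
path_collision() {
    name=$1
    # A name containing a slash is treated as a path, so no PATH search applies.
    case $name in
        */*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        IFS=$old_ifs
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example: warn before submitting a script called "batch".
if hit=$(path_collision batch); then
    echo "warning: 'batch' also matches $hit; submit it as ./batch instead"
fi
```

On a system with the at package installed, the check should report /usr/bin/batch for the name batch, and nothing for ./batch.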

SchedMD support told me to send them the output of sbatch -v batch. When I ran that command, I saw this in the output:

sbatch: remote command    : `/usr/bin/batch'

Once I saw that, I understood what was going on, and SchedMD support confirmed that it was caused by the bugfix in 18.08.4.
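
If you want to pull the resolved path out of saved verbose output, a small filter like this may do. It is a sketch that assumes the "remote command" line has exactly the format shown above; resolved_command is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical filter: read saved "sbatch -v" output on stdin and print the
# path from the "remote command" line. Assumes the line format quoted above.
resolved_command() {
    # The trailing "." matches the closing quote character after the path.
    sed -n 's/^sbatch: remote command *: `\(.*\).$/\1/p'
}

# Demonstration on the line from the output above.
sample="sbatch: remote command    : \`/usr/bin/batch'"
printf '%s\n' "$sample" | resolved_command
```

If the printed path is not the script you meant to submit, the name was resolved through PATH.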

Prentice

On 12/19/18 2:43 PM, mercan wrote:
Hi;

We upgraded from 18.08.3 to 18.08.4, and we also have a job_submit.lua script. We see nearly the same issue on our cluster:

$ sbatch batch
sbatch: error: Batch job submission failed: Unspecified error
$ mv batch nobatchy
$ sbatch nobatchy
Submitted batch job 172174

I hope this helps.

Ahmet M.


On 19.12.2018 21:54, Prentice Bisbal wrote:
Yesterday I upgraded from 18.08.3 to 18.08.4. After the upgrade, I found that batch scripts named "batch" are being rejected. Simply changing the script name fixes the problem. For example:

$ sbatch batch
sbatch: error: ERROR: A time limit must be specified
sbatch: error: Batch job submission failed: Time limit specification required, but not provided

$ mv batch different_name

$ sbatch different_name
Submitted batch job 398507

Not sure if this is a bug in sbatch, my job_submit.lua file, or the lua plugin. My job_submit.lua script hasn't been modified since 10/16/2018. I had been using 18.08.3 since 11/20, and the user who reported this had used the same batch script to submit jobs before the update without any issues.

Has anyone else upgraded to 18.08.4? If so, can you replicate this issue? I have already reported this to SchedMD. The bug ID is 6271.

