Hello,

After we set MaxMemPerCPU=9000 on a partition, Slurm crashes when we submit 
a job that specifies --mem-per-cpu= without -n.

When both -n and --mem-per-cpu= were in the sbatch script,
#SBATCH -n 1
#SBATCH --mem-per-cpu=20000

it worked fine and Slurm automatically increased the number of CPUs:
NumNodes=1 NumCPUs=3 NumTasks=1 CPUs/Task=3 ReqB:S:C:T=0:0:*:*
TRES=cpu=3,mem=20001M,node=1,billing=3
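
For what it's worth, the numbers above look consistent with Slurm rounding the request up to fit under MaxMemPerCPU. The sketch below is only my guess at the arithmetic, inferred from the output and not taken from the Slurm source; the function name scale_request is made up for illustration.

```python
def scale_request(mem_per_cpu_mb, max_mem_per_cpu_mb, cpus_per_task=1):
    """Hypothetical helper: guess at how a --mem-per-cpu request above
    MaxMemPerCPU is converted into more CPUs with less memory each."""
    # Number of CPUs needed so that each one stays under the limit (ceiling division).
    ratio = (mem_per_cpu_mb + max_mem_per_cpu_mb - 1) // max_mem_per_cpu_mb
    # Spread the requested memory over those CPUs, rounding up per CPU.
    new_mem_per_cpu = (mem_per_cpu_mb + ratio - 1) // ratio
    new_cpus = cpus_per_task * ratio
    return new_cpus, new_mem_per_cpu, new_cpus * new_mem_per_cpu


cpus, mem_per_cpu, total_mem = scale_request(20000, 9000)
print(f"cpu={cpus}, mem-per-cpu={mem_per_cpu}M, mem={total_mem}M")
```

With 20000 MB requested and a 9000 MB limit this gives 3 CPUs at 6667 MB each, i.e. 20001 MB in total, which matches the cpu=3,mem=20001M shown in TRES.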


However, when only --mem-per-cpu= was in the sbatch script,
#SBATCH --mem-per-cpu=20000

Slurm crashed with the following error:
[2020-03-02T10:36:41.677] _slurm_rpc_submit_batch_job: JobId=5190345 InitPrio=1000000 usec=423
[2020-03-02T10:36:42.167] error: _compute_c_b_task_dist: request was for 0 tasks, setting to 1
[2020-03-02T10:36:42.167] error: cons_res: _compute_c_b_task_dist oversubscribe for job 5190345
[2020-03-02T10:36:42.167] fatal: cons_res: cpus computation error

According to the log, Slurm set the task count to 1, but then failed with an 
oversubscribe error.

Any idea how to fix this issue?

Thanks!

