[slurm-users] Job allocation from a heterogeneous pool of nodes

2022-12-07 Thread Le, Viet Duc
Dear slurm community, I am encountering a unique situation where I need to allocate jobs to nodes with different numbers of CPU cores. For instance:
node01: Xeon 6226, 32 cores
node02: EPYC 7543, 64 cores
$ salloc --partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=32 --comment

[slurm-users] srun --mem issue

2022-12-07 Thread Felho, Sandor
TransUnion is running a ten-node site using Slurm with multiple queues. We have an issue with the --mem parameter. There is one user who has read the Slurm manual and found --mem=0. This gives the job the maximum memory on the node (500 GiB) for that single job. How can I block a --mem=0 request? We
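For context, --mem=0 is documented Slurm behavior: rather than a specific amount, it requests all of the memory configured on the allocated node, which is why that one job ends up with the full 500 GiB. A minimal illustration (the partition and program names here are made up):

  # Grabs all of the node's configured memory for this single job:
  $ srun --partition=batch --mem=0 --ntasks=1 ./memory_hog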

Re: [slurm-users] Job allocation from a heterogeneous pool of nodes

2022-12-07 Thread Brian Andrus
You may want to look here: https://slurm.schedmd.com/heterogeneous_jobs.html
Brian Andrus
On 12/7/2022 12:42 AM, Le, Viet Duc wrote: Dear slurm community, I am encountering a unique situation where I need to allocate jobs to nodes with different numbers of CPU cores. For instance: node01
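For the original question, a hedged sketch of what a heterogeneous allocation could look like: each component is separated by a colon and can request its own node and task count (node names, partition and core counts are taken from the question above; verify the exact syntax against your Slurm version):

  $ salloc --partition=all --nodes=1 --nodelist=node01 --ntasks-per-node=32 : \
           --nodes=1 --nodelist=node02 --ntasks-per-node=64

  # Within the allocation, srun can target one component explicitly,
  # e.g. srun --het-group=0 ... or srun --het-group=1 ...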

Re: [slurm-users] srun --mem issue

2022-12-07 Thread Moshe Mergy
Hi Sandor
I personally block "--mem=0" requests in file job_submit.lua (slurm 20.02):
if (job_desc.min_mem_per_node == 0 or job_desc.min_mem_per_cpu == 0) then
    slurm.log_info("%s: ERROR: unlimited memory requested", log_prefix)
    slurm.log_info("%s: ERROR: job %s from user %s
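To make that approach self-contained, here is a minimal sketch of a complete job_submit.lua built around the same check (the job_desc field names match the snippet above; the log_user call and return codes are an assumption to verify against your Slurm version):

  -- /etc/slurm/job_submit.lua (enable with JobSubmitPlugins=lua in slurm.conf)
  function slurm_job_submit(job_desc, part_list, submit_uid)
      -- --mem=0 / --mem-per-cpu=0 mean "all memory on the node"; reject them
      if job_desc.min_mem_per_node == 0 or job_desc.min_mem_per_cpu == 0 then
          slurm.log_user("--mem=0 (unlimited memory) is not allowed; request a specific amount")
          return slurm.ERROR
      end
      return slurm.SUCCESS
  end

  function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
      return slurm.SUCCESS
  end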