[slurm-users] Job allocation from a heterogeneous pool of nodes

2022-12-07 Thread Le, Viet Duc
Dear slurm community,


I am encountering a unique situation where I need to allocate jobs to nodes
with different numbers of CPU cores. For instance:

node01:  Xeon 6226 32 cores

node02:  EPYC 7543 64 cores


$ salloc --partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=32 --comment=etc

If --ntasks-per-node is larger than 32, the job cannot be allocated, since node01
has only 32 cores.


In the context of NVIDIA's HPL container, we need to pin MPI processes according
to NUMA affinity for best performance.

On HGX-1, the eight A100 GPUs have affinity with the 1st, 3rd, 5th, and 7th
NUMA domains.

With --ntasks-per-node=32, only the first half of the EPYC's NUMA domains is
available, so we had to assign the 4th-7th A100s to the 0th and 2nd NUMA domains,
leading to some performance degradation.
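
For reference, the kind of per-rank pinning we are after can be sketched as a
small wrapper around the application (the wrapper name and the GPU-to-NUMA
mapping below are only illustrative, not the exact HGX-1 topology):

#!/bin/bash
# numa_wrap.sh (illustrative): bind each node-local MPI rank to the NUMA
# node of "its" GPU. SLURM_LOCALID is the node-local rank ID set by srun.
numa_map=(1 1 3 3 5 5 7 7)
node=${numa_map[$SLURM_LOCALID]}
exec numactl --cpunodebind=$node --membind=$node "$@"

srun would then run this wrapper in front of the actual HPL command.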


I am looking for a way to request more tasks than the number of physically
available cores, e.g.:

$ salloc --partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=64 --comment=etc


Your suggestions are much appreciated.


Regards,

Viet-Duc


[slurm-users] srun --mem issue

2022-12-07 Thread Felho, Sandor
TransUnion is running a ten-node site using Slurm with multiple queues. We have
an issue with the --mem parameter. One user has read the Slurm manual and found
--mem=0, which gives that single job the maximum memory on the node (500 GiB).
How can I block a --mem=0 request?

We are running:

  *   OS: RHEL 7
  *   cgroups version 1
  *   slurm: 19.05

Thank you,

Sandor Felho

Sr Consultant, Data Science & Analytics





Re: [slurm-users] Job allocation from a heterogeneous pool of nodes

2022-12-07 Thread Brian Andrus

You may want to look here:

https://slurm.schedmd.com/heterogeneous_jobs.html
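
For the two nodes in the example, a heterogeneous request could be written
roughly like this (untested sketch, reusing the partition and node names from
the original post; each component carries its own task count):

$ salloc --partition=all --nodes=1 --nodelist=gpu01 --ntasks-per-node=32 : \
         --partition=all --nodes=1 --nodelist=gpu02 --ntasks-per-node=64

Inside the allocation, srun can then address one or both components with
--het-group (--pack-group on older Slurm releases).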

Brian Andrus

On 12/7/2022 12:42 AM, Le, Viet Duc wrote:

[...]


Re: [slurm-users] srun --mem issue

2022-12-07 Thread Moshe Mergy
Hi Sandor


I personally block "--mem=0" requests in job_submit.lua (Slurm 20.02):


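   -- reject jobs that request unlimited memory (--mem=0 or --mem-per-cpu=0)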
   if (job_desc.min_mem_per_node == 0 or job_desc.min_mem_per_cpu == 0) then
      slurm.log_info("%s: ERROR: unlimited memory requested", log_prefix)
      slurm.log_info("%s: ERROR: job %s from user %s rejected because of an invalid (unlimited) memory request.",
                     log_prefix, job_desc.name, job_desc.user_name)
      slurm.log_user("Job rejected because of an invalid memory request.")
      return slurm.ERROR
   end
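
A note on enabling it, in case it is useful (the names and paths below are the
usual defaults; adjust for your installation):

   # in slurm.conf:
   JobSubmitPlugins=lua
   # place the script next to slurm.conf as job_submit.lua, then:
   $ scontrol reconfigure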

Maybe there is a better or nicer solution...

All the best
Moshe



From: slurm-users on behalf of Felho, Sandor
Sent: Wednesday, December 7, 2022 7:03 PM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] srun --mem issue

[...]