Re: [slurm-users] Is it possible to define multiple partitions for the same node, but each one having a different subset of GPUs?

Brian Andrus Wed, 31 Mar 2021 10:48:40 -0700

So the node definition is separate from the partition definition.

You would need to define all the GPUs as part of the node. Partitions do not have physical characteristics, but they do have QOS capabilities that you may be able to use. You could also use a job_submit lua script to reject jobs that request resources you do not want used in a particular queue.

Both would take some research to find the best approach, but I think those are the two options available that may do what you are looking for.
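
As a rough illustration of the layout described above, a slurm.conf fragment might look like the sketch below. It is untested and makes several assumptions: the GRES type names (large, medium, small) and the QOS names (part_large, part_medium, part_small) are hypothetical, the node line leaves out the CPU and memory settings a real definition needs, and a matching gres.conf is still required. The per-type GPU limits themselves would be set on each QOS in the accounting database, which is not shown here because the right limits depend on the site and the Slurm version.

```
# slurm.conf (fragment): illustrative sketch only, not a tested configuration.
# The GRES type names and QOS names below are hypothetical.

# All GPUs are declared once, on the node itself.
GresTypes=gpu
NodeName=gpuComputer Gres=gpu:large:4,gpu:medium:4,gpu:small:16

# All three partitions point at the same node; each has its own partition QOS.
PartitionName=large  Nodes=gpuComputer QOS=part_large
PartitionName=medium Nodes=gpuComputer QOS=part_medium
PartitionName=small  Nodes=gpuComputer QOS=part_small

# Only needed for the job_submit route: enables the Lua job_submit plugin.
JobSubmitPlugins=lua
```

With this layout the partitions do not own any GPUs. Whatever keeps a job in the "small" partition from using large GPUs has to come from the limits on its QOS or from a check in the job_submit script.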


Brian Andrus


On 3/31/2021 8:21 AM, Cristóbal Navarro wrote:

Hi Community,
I was checking the documentation but could not find clear information on what I am trying to do.
Here at the university we have a large compute node with 3 classes of GPUs. Let's say the node's hostname is "gpuComputer"; it is composed of:
  * 4x large GPUs
  * 4x medium GPUs (MIG devices)
  * 16x small GPUs (MIG devices)

Our plan is that we want to have one partition for each class of GPUs.
So if a user chooses the "small" partition, it will only see up to 16x small GPUs, and would not interfere with other jobs running on the "medium" or "large" partitions.
Can I create three partitions and specify the corresponding subset of GPUs for each one?
If not, would NodeName and NodeHostname serve as an alternative way? I.e., to specify the node three times with different NodeName values, all using the same NodeHostname=gpuComputer, and specifying the corresponding subset of "Gres" resources for each one. Then, on each partition, to choose the corresponding NodeName.
Any feedback or advice on the best way to accomplish this would be much appreciated.
Best regards



--
Cristóbal A. Navarro
