Hi all,

I'm relatively new to Slurm, and my Internet searches so far have turned up plenty of examples from the client perspective but not much from the admin perspective on how to set this up. I'm hoping someone can point us in the right direction; this should be pretty simple...  :-)

We have a test cluster running Slurm 21.08.1 and are trying to figure out how to limit a partition to 200 CPU cores in use at once.  Basically, if someone submits a thousand single-core jobs, 200 of them should run and the other 800 should wait in the queue, with the next queued job starting each time a running one finishes.  Similarly, if someone has a 180-core job running and submits a 30-core job, the new job should wait in the queue until the 180-core job finishes.  And if someone submits a single job requesting 201 CPU cores, it should be rejected with an error.
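
For context, this is the sort of thing I've been experimenting with: a QOS attached to the partition that caps both the total cores in use and the size of any single job.  The partition name, node list, and QOS name below are just placeholders, and I'm not sure this is the intended mechanism:

    # QOS capping total CPUs in use across the partition and the per-job size,
    # with DenyOnLimit so an over-sized job is rejected at submit time
    sacctmgr add qos partcap
    sacctmgr modify qos partcap set GrpTRES=cpu=200 MaxTRESPerJob=cpu=200 Flags=DenyOnLimit

    # slurm.conf: attach the QOS to the partition and make sure limits are enforced
    PartitionName=test Nodes=node[01-10] QOS=partcap State=UP
    AccountingStorageEnforce=limits,qos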

According to the Slurm resource limits hierarchy, if a partition limit is set, we should be able to set up a user association that overrides it, for example if we want a particular user to be able to use 300 CPU cores in that partition.
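
On the override side, I was guessing at something like a higher association limit for that user (the user and account names below are placeholders), though I don't know whether an association limit actually wins over a partition-level cap, which is really what question 2 below is about:

    sacctmgr modify user where name=alice account=research set GrpTRES=cpu=300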

I can see in the Slurm documentation how to set up a max-nodes limit on a partition, but have not been able to find how to do the same with CPU cores.
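
What I did find is the MaxNodes setting on the partition definition in slurm.conf, e.g. (names again are placeholders):

    PartitionName=test Nodes=node[01-10] MaxNodes=5 State=UP

but as far as I can tell that limits the nodes for a single job rather than the CPU cores in use across the whole partition.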

My questions are:

1) How do we set up a CPU core limit on a partition that applies to all users?

2) How do we set up a user association that allows a single person to use more than the default CPU core limit set on the partition?

3) Is there a better way to accomplish this than the approach I'm describing?


For reference, Slurm accounting is set up and GPU allocations are working properly, so I think we are close and just missing something obvious for the CPU core limits.


Thank you,


-Dj
