Ciao Massimo,
How about creating another queue cpus_in_the_gpu_nodes (or something less silly) which targets the GPU nodes, does not allow allocation of the GPUs via GRES, and allocates 96-8 = 88 (or whatever other number you deem appropriate) of the CPUs (and similarly for memory)? Actually it
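In slurm.conf that could look roughly like the sketch below. The partition/node names, the 88 CPUs (96 minus 8) and the memory figure are just placeholders for your setup, and since a partition cannot by itself forbid a GRES, the GPU lock-out goes through a partition QOS:

# extra partition on the GPU nodes exposing only the "spare" CPUs and memory
PartitionName=cpus_in_the_gpu_nodes Nodes=gpu[01-04] MaxCPUsPerNode=88 MaxMemPerNode=400000 QOS=nogpu Default=NO

# QOS that rejects, at submission time, any job in this partition requesting GPUs
sacctmgr add qos nogpu
sacctmgr modify qos nogpu set flags=DenyOnLimit MaxTRESPerNode=gres/gpu=0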
Yes, I think so, but that should be no problem. I think that requires that your Slurm was built with the --enable-multiple-slurmd configure option, so you might need to rebuild Slurm if you didn't use that option in the first place.
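For what it's worth, a multiple-slurmd setup usually looks something like this (node names, ports and resource counts below are only illustrative):

# slurm.conf: two node definitions pointing at the same physical host,
# each with its own slurmd port and per-node spool/log paths (%n = node name)
NodeName=gpu01       NodeHostname=node001 Port=6818 CPUs=8  RealMemory=64000  Gres=gpu:4
NodeName=cpusingpu01 NodeHostname=node001 Port=6819 CPUs=88 RealMemory=320000
SlurmdSpoolDir=/var/spool/slurmd.%n
SlurmdLogFile=/var/log/slurm/slurmd.%n.log

# on node001, start one slurmd per logical node
slurmd -N gpu01
slurmd -N cpusingpu01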
On Mon, Mar 31, 2025 at 7:32 AM Massimo Sgaravatto <massimo.sgarava...> wrote:
To me at least the simplest solution would be to create 3 partitions. The first is for the CPU-only nodes, the second is for the GPU nodes, and the third is a lower-priority requeue partition. This is how we do it here. This way the requeue partition can be used to grab the CPUs on the GPU nodes wi
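A rough slurm.conf sketch of that layout (the names and node lists here are invented, and the preemption settings are just one way of making the low-priority partition give the cores back):

# preempt by partition priority; preempted jobs get requeued
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE

PartitionName=cpu     Nodes=cpu[001-020]             PriorityTier=10 Default=YES
PartitionName=gpu     Nodes=gpu[01-04]               PriorityTier=10
PartitionName=requeue Nodes=cpu[001-020],gpu[01-04]  PriorityTier=1  PreemptMode=REQUEUE

Jobs in the requeue partition can then soak up idle cores anywhere, including on the GPU nodes, and are requeued when higher-priority jobs in the cpu or gpu partitions need the resources.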
What I have done is set up partition QOSes for nodes with 4 GPUs and 64 cores:
sacctmgr add qos lcncpu-part
sacctmgr modify qos lcncpu-part set priority=20 \
    flags=DenyOnLimit MaxTRESPerNode=cpu=32,gres/gpu=0
sacctmgr add qos lcngpu-part
sacctmgr modify qos lcngpu-part set priority=20 \
    flag
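To tie those QOSes to the partitions, they would then be referenced as partition QOSes in slurm.conf, roughly like this (partition names and node list are placeholders):

PartitionName=lcn-cpu Nodes=lcn[01-08] QOS=lcncpu-part
PartitionName=lcn-gpu Nodes=lcn[01-08] QOS=lcngpu-part

With flags=DenyOnLimit, a job submitted to the CPU partition that requests a GPU, or more than 32 cores on a node, is rejected at submission time rather than left pending.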
Hi Davide,
Thanks for your feedback.
If gpu01 and cpusingpu01 are physically the same node, doesn't this mean that I have to start two slurmd daemons on that node (one with "slurmd -N gpu01" and one with "slurmd -N cpusingpu01")?
Thanks, Massimo
On Mon, Mar 31, 2025 at 3:22 PM Davide DelVento wrote: