Hi,

Frequently, all of our GPU nodes (8 GPUs each) are in the MIXED state and no node is IDLE. Some jobs require a complete node (all 8 GPUs), and such jobs therefore have to wait a very long time before they can run.
Is there a way to improve this situation, for example by not blocking IDLE nodes with jobs that use only a fraction of the 8 GPUs? Why are single-GPU jobs not scheduled to fill already-MIXED nodes before using IDLE ones? Which parameters or configuration settings need to be adjusted for this to be enforced?

Our current scheduling configuration:

slurm.conf:
  SelectType=select/cons_tres
  SelectTypeParameters=CR_Core_Memory

gres.conf (one node as an example):
  NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[0-3] COREs=0-17,36-53
  NodeName=gpu-6 Name=gpu Type=rtx2080ti File=/dev/nvidia[4-7] COREs=18-35,54-71

Thank you,
Durai
Competence Center for Machine Learning, Tübingen
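P.S. To make the last question more concrete, below is a sketch of the direction we were guessing at, based only on our reading of the slurm.conf man page. It is untested, the node names other than gpu-6 are made up for illustration, and we are not sure either knob is actually the right one for GPU jobs:

  # slurm.conf (sketch only -- not applied on our cluster)
  SelectType=select/cons_tres
  SelectTypeParameters=CR_Core_Memory
  # Put serial (single-task) jobs at the end of the node list instead of
  # using best fit, hoping this keeps small GPU jobs off idle nodes:
  SchedulerParameters=pack_serial_at_end
  # Alternatively, weight some nodes so they are selected last and tend to
  # stay free for whole-node jobs (nodes with lower Weight are preferred):
  NodeName=gpu-[1-5] Gres=gpu:rtx2080ti:8 Weight=10
  NodeName=gpu-[6-8] Gres=gpu:rtx2080ti:8 Weight=100

Would something along these lines be the intended way to do this, or is there a better mechanism?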