Hello,

I recently set up a small cluster at work using Warewulf/Slurm. Currently, I am not able to get the scheduler to work well with GPUs (GRES).
While Slurm is able to filter by GPU type, it seems to allocate all the GPUs on the node rather than the two I requested. See below:

    [abhiram@whale ~]$ srun --gres=gpu:p100:2 -n 1 --partition=gpu nvidia-smi --query-gpu=index,name --format=csv
    index, name
    0, Tesla P100-PCIE-16GB
    1, Tesla P100-PCIE-16GB
    2, Tesla P100-PCIE-16GB
    3, Tesla P100-PCIE-16GB

    [abhiram@whale ~]$ srun --gres=gpu:titanrtx:2 -n 1 --partition=gpu nvidia-smi --query-gpu=index,name --format=csv
    index, name
    0, TITAN RTX
    1, TITAN RTX
    2, TITAN RTX
    3, TITAN RTX
    4, TITAN RTX
    5, TITAN RTX
    6, TITAN RTX
    7, TITAN RTX

I am fairly new to Slurm and still figuring my way around it. I would really appreciate any help with this.

For your reference, I have attached the slurm.conf and gres.conf files.

Best,
Abhiram

--
Abhiram Chintangal
QB3 Nogales Lab
Bioinformatics Specialist @ Howard Hughes Medical Institute
University of California Berkeley
708D Stanley Hall, Berkeley, CA 94720
Phone (510)666-3344
Attachments: slurm.conf, gres.conf