Hello, we are running into trouble with gang scheduling as soon as GPUs enter the picture. We are using the following slurm.conf settings:
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
SchedulerType=sched/backfill
SchedulerTimeSlice=60
SelectType=select/cons_tres
SelectTypeParameters=CR_CPU_Memory
PreemptType=preempt/qos
PreemptMode=SUSPEND,GANG
PreemptExemptTime=-1

NodeName=cn2 Sockets=4 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=56000 Gres=gpu:geforce_gtx_1080_ti:2 [...]

PartitionName=main Nodes=cn2,cn3,cn4 Default=YES MaxTime=INFINITE State=UP OverSubscribe=FORCE:4 [...]

----------------------------------------------------------------------------

We use QoS-based preemption to run lower-priority jobs that get preempted automatically when higher-priority jobs arrive in the queue, and this works nicely. However, when we submit several GPU jobs with sbatch using a script like the one below, the jobs do not get gang-scheduled, and there is no apparent error message in the logs. For CPU-only jobs it works as expected.

We did not find any GPU-specific remarks in the gang scheduling documentation - are we attempting something that is not supported, or are we doing it wrong? Also, is there a way to obtain more detailed logs/insight into how the system actually decides whether or not to form a gang?

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --gpus=2
#SBATCH --partition=main,interactive

IMAGES_DIR="/path/to/images"
IMAGE="nvcr.io/nvidia/cuda:10.0-base"

srun --container-image="$IMAGES_DIR/$IMAGE.sqsh" bash
...

----------------------------------------------------------------------------

Thanks for reading & have a nice weekend
Tilman
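
P.S. Two bits of context that might help. First, a rough sketch of how our two QoS levels are set up - the names and priority values below are only illustrative, not our exact configuration; jobs are submitted with the corresponding --qos flag:

    # illustrative sketch only - actual QoS names and priorities differ
    sacctmgr add qos low
    sacctmgr add qos high
    sacctmgr modify qos low set Priority=10
    sacctmgr modify qos high set Priority=100 Preempt=low

Second, regarding the logging question: would raising the controller's debug level and enabling the gang/GRES debug flags, e.g.

    scontrol setdebug debug2
    scontrol setdebugflags +Gang
    scontrol setdebugflags +Gres

be the right way to see how the gang scheduler makes its decisions, or is there a better knob for this?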