We've added a new 32-core GPU node to our cluster. It has two 16-core sockets, and hyperthreading is turned off, so the total is 32 physical cores. But jobs are only being allowed to use 16 of them.
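For what it's worth, this is the topology I'd expect the node itself to report via slurmd -C (a sketch based on the hardware described above; I didn't capture this output, so the exact RealMemory figure is a guess):

# slurmd -C
NodeName=node020 CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=257600
UpTime=...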
Here's the relevant line from slurm.conf:

NodeName=node020 CoresPerSocket=16 RealMemory=257600 ThreadsPerCore=1 Boards=1 SocketsPerBoard=2 Weight=100 Gres=gpu:rtxa5000:4

And here's the scontrol output for the node. Note that even though CPUTot=32, CfgTRES shows cpu=16 instead of 32:

# scontrol show node node020
NodeName=node020 Arch=x86_64 CoresPerSocket=16
   CPUAlloc=16 CPUTot=32 CPULoad=7.29
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=gpu:rtxa5000:4
   NodeAddr=node020 NodeHostName=node020 Version=19.05.8
   OS=Linux 3.10.0-1160.59.1.el7.x86_64 #1 SMP Wed Feb 23 16:47:03 UTC 2022
   RealMemory=257600 AllocMem=126976 FreeMem=1393 Sockets=2 Boards=1
   State=MIXED ThreadsPerCore=1 TmpDisk=2038 Weight=100 Owner=N/A MCS_label=N/A
   Partitions=gpu-long,gpu-short,gpu-standard
   BootTime=2022-04-05T11:37:08 SlurmdStartTime=2022-04-05T11:43:00
   CfgTRES=cpu=16,mem=257600M,billing=16,gres/gpu=4
   AllocTRES=cpu=16,mem=124G,gres/gpu=2
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

Why isn't this node allocating all 32 cores?

Thanks,
David Guertin
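P.S. For context, this is the kind of request the node ought to be able to handle but currently can't, since only 16 CPUs show up in CfgTRES (a sketch, not a job we actually submitted; the partition choice is just an example):

$ srun --partition=gpu-short --nodelist=node020 --nodes=1 --ntasks=32 hostname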