Hello all,

I have a GPU node with 3 identical GPUs (we started with two and recently
added the third). Running nvidia-smi correctly shows that all three are
recognized. My gres.conf file has only this line:

NodeName=gpu01 File=/dev/nvidia[0-2] Type=quadro_8000 Name=gpu Count=3

And the relevant lines in slurm.conf are:

NodeName=gpu01 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=189900 State=UNKNOWN Gres=gpu:quadro_8000:3

As far as I can tell, all of this is fine (and we had no issues when we
only had the initial two GPUs in the system). However, when I now run
"sinfo -o %G" (which, as I understand it, reports the total gres
available), this is the output:

GRES
(null)
gpu:quadro_8000:2

Is this saying that Slurm doesn't recognize the third card? Any suggestions?
As always, thank you for your help!

Warmest regards,
Jason

-- 
*Jason L. Simms, Ph.D., M.P.H.*
Manager of Research and High-Performance Computing
XSEDE Campus Champion
Lafayette College
Information Technology Services
710 Sullivan Rd | Easton, PA 18042
Office: 112 Skillman Library
p: (610) 330-5632
