Dear All,
Good morning. I have set up a 4-node Slurm cluster on Ubuntu 22.04, and Slurm is working. Each node has GPU cards, and I can see the GPU information with "nvidia-smi". However, when I run "scontrol show node node-1", there is no GPU entry: "Gres" shows as (null), and the "CfgTRES" entry does not list any GPU either.

I am pasting the output of "scontrol show node node-1", "slurmd -C", and "nvidia-smi" below for reference.

Output of "scontrol show node node-1":

NodeName=node-1 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=72 CPULoad=0.03
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node-1 NodeHostName=node-1 Version=21.08.5
   OS=Linux 6.2.0-37-generic #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
   RealMemory=773685 AllocMem=0 FreeMem=770972 Sockets=72 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=debug
   BootTime=2023-11-23T09:06:28 SlurmdStartTime=2023-11-23T09:07:39
   LastBusyTime=2023-11-23T09:07:40
   CfgTRES=cpu=72,mem=773685M,billing=72
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

root@node-1:~# slurmd -C
NodeName=node-1 CPUs=72 Boards=1 SocketsPerBoard=2 CoresPerSocket=18 ThreadsPerCore=2 RealMemory=773685
UpTime=0-23:48:41

root@node-1:~# nvidia-smi
Fri Nov 24 08:55:50 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-16GB           Off | 00000000:06:00.0 Off |                    0 |
| N/A   26C    P0              23W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE-16GB           Off | 00000000:86:00.0 Off |                    0 |
| N/A   25C    P0              24W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2010      G   /usr/lib/xorg/Xorg                            4MiB |
|    1   N/A  N/A      2010      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+

I would appreciate guidance on which configuration parameters I have missed, since the GPUs do not appear in the output of "scontrol show node node-1".

Thanks,
Joseph John