Dear All, 

Good morning 

I have set up a 4-node SLURM system on Ubuntu 22.04, and SLURM is working.

Each node has GPU cards, and I can see the GPU information using "nvidia-smi".

However, when I run "scontrol show node node-1", I do not see any GPU entry:
"Gres" shows as (null), and "CfgTRES" does not list a GPU either. I am pasting
the output of "scontrol show node node-1", "slurmd -C" and "nvidia-smi" here
for reference.




 "scontrol show node node-1"




NodeName=node-1 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=72 CPULoad=0.03
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node-1 NodeHostName=node-1 Version=21.08.5
   OS=Linux 6.2.0-37-generic #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov  2 18:01:13 UTC 2
   RealMemory=773685 AllocMem=0 FreeMem=770972 Sockets=72 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=debug
   BootTime=2023-11-23T09:06:28 SlurmdStartTime=2023-11-23T09:07:39
   LastBusyTime=2023-11-23T09:07:40
   CfgTRES=cpu=72,mem=773685M,billing=72
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s







root@node-1:~# slurmd -C
NodeName=node-1 CPUs=72 Boards=1 SocketsPerBoard=2 CoresPerSocket=18 ThreadsPerCore=2 RealMemory=773685
UpTime=0-23:48:41







root@node-1:~# nvidia-smi
Fri Nov 24 08:55:50 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-16GB           Off | 00000000:06:00.0 Off |                    0 |
| N/A   26C    P0              23W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE-16GB           Off | 00000000:86:00.0 Off |                    0 |
| N/A   25C    P0              24W / 250W |      4MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2010      G   /usr/lib/xorg/Xorg                            4MiB |
|    1   N/A  N/A      2010      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+









I would appreciate guidance on which configuration parameters I have missed,
since the GPUs do not appear in the output of "scontrol show node node-1".


Thanks

Joseph John 














