Re: [slurm-users] Autodetect of nvml is not working in gres.conf

2023-11-30 Thread Shunran Zhang
Hi all, Apologies for writing something misleading in the last mail. I missed your error message. Rob was correct: your slurmd appears to have been built without NVML support at compile time. You need to set up NVML and pass the --with-nvml flag when configuring Slurm to fix the issue if you are compil

Re: [slurm-users] Autodetect of nvml is not working in gres.conf

2023-11-30 Thread Shunran Zhang
Hi all, If you could offer a few more details on your OS and Slurm version, that might shed some light. There is an interesting detail about the NVML package if you are using a RHEL-like OS. The NVML detection part of the Slurm library (/usr/lib64/slurm/gpu_nvml.so) is linked against the /lib
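
As a rough companion to the point above, here is a minimal Python sketch that reports whether the plugin file named in the message exists and whether the dynamic loader can locate an NVML shared library. The plugin path is taken from the message. The library name `nvidia-ml` is an assumption on my part and is not a reconstruction of the truncated path above. The function name `nvml_plugin_report` is hypothetical.

```python
import ctypes.util
import os

# Plugin path as quoted in the message; other distributions may differ.
PLUGIN_PATH = "/usr/lib64/slurm/gpu_nvml.so"


def nvml_plugin_report(plugin_path=PLUGIN_PATH, library_name="nvidia-ml"):
    """Report on the NVML plugin file and the NVML shared library.

    `library_name` is an assumption (the usual NVML library is
    libnvidia-ml); it is not taken from the truncated message.
    """
    return {
        # True if the Slurm NVML GPU plugin file exists at plugin_path.
        "plugin_present": os.path.isfile(plugin_path),
        # Name of the library if the loader can locate it, else None.
        "nvml_library": ctypes.util.find_library(library_name),
    }
```

Calling `nvml_plugin_report()` on the compute node returns a small dictionary. A missing plugin or a `None` library is only a hint about where to look next, not a diagnosis.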

Re: [slurm-users] Autodetect of nvml is not working in gres.conf

2023-11-30 Thread Groner, Rob
Did you have --with-nvml as part of your configuration? Go back to your config.log and verify that it ever said it found nvml.h. If not, then you'll need to make sure you have the right NVIDIA/CUDA packages installed on the host you're building Slurm on, and you might have to specify --with-nv
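
The config.log check suggested above can be sketched in Python. This is a minimal sketch and it assumes config.log uses the usual Autoconf layout, where a `configure:NNNN: checking ...` line is later followed by a `configure:NNNN: result: ...` line. The function name `nvml_check_results` is hypothetical, and the exact wording of Slurm's NVML checks is not confirmed by this thread.

```python
import re


def nvml_check_results(config_log_text):
    """Map each NVML-related configure check to its recorded result.

    Assumes the usual Autoconf config.log layout:
        configure:NNNN: checking for nvml.h
        configure:NNNN: result: yes
    """
    results = {}
    pending = None  # most recent "checking ..." message without a result yet
    for line in config_log_text.splitlines():
        check = re.search(r"configure:\d+: checking (.+)$", line)
        if check:
            pending = check.group(1).strip()
            continue
        result = re.search(r"configure:\d+: result: (.+)$", line)
        if result and pending is not None:
            # Keep only the checks that mention NVML in any letter case.
            if "nvml" in pending.lower():
                results[pending] = result.group(1).strip()
            pending = None
    return results
```

To use it, read the config.log from the Slurm build directory and pass its text to the function, for example `nvml_check_results(open("config.log").read())`. An empty dictionary suggests that configure never ran an NVML check, and a `no` result suggests that the header or library was not found on the build host.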

Re: [slurm-users] Autodetect of nvml is not working in gres.conf

2023-11-30 Thread Josef Dvoracek
Could it be that the "cuda-nvml-devel" library was not installed when you were building Slurm? Cheers, Josef. On 30. 11. 23 15:06, Ravi Konila wrote: Hello, My gres.conf has AutoDetect=nvml. When I restart the slurmd service, I get *fatal: We were configured to autodetect nvml functionality, but we w