Hmm, OK, but that is the only nvml.h I can find, as the find command shows. I downloaded the official NVIDIA-Linux-x86_64-535.54.03.run and ran it successfully; do I need to install something else besides? A Google search for 'CUDA SDK' leads directly to NVIDIA's page: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html


On 19/07/2023 12:26, Timo Rothenpieler wrote:
On 19/07/2023 11:47, Jan Andersen wrote:
I'm trying to build slurm with nvml support, but configure doesn't find it:

root@zorn:~/slurm-23.02.3# ./configure --with-nvml
...
checking for hwloc installation... /usr
checking for nvml.h... no
checking for nvmlInit in -lnvidia-ml... yes
configure: error: unable to locate libnvidia-ml.so and/or nvml.h

But:

root@zorn:~/slurm-23.02.3# find / -xdev -name nvml.h
/usr/include/hwloc/nvml.h

It's not looking for the hwloc header, but for the nvidia one.
If you have your CUDA SDK installed in, for example, /opt/cuda, you have to point configure there: --with-nvml=/opt/cuda
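
A minimal sketch of that advice, assuming the NVIDIA header sits at <prefix>/include/nvml.h under the CUDA install prefix. The helper name find_nvml_prefix and the candidate directories in the usage comment are hypothetical and only for illustration; adjust them to wherever the CUDA toolkit actually lives on your system.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory among the arguments
# that contains include/nvml.h, or return non-zero if none does.
find_nvml_prefix() {
    for prefix in "$@"; do
        if [ -f "$prefix/include/nvml.h" ]; then
            printf '%s\n' "$prefix"
            return 0
        fi
    done
    return 1
}

# Usage (not executed here; the candidate paths are assumptions):
#   prefix=$(find_nvml_prefix /opt/cuda /usr/local/cuda) \
#       && ./configure --with-nvml="$prefix"
```

If the helper prints nothing, the NVIDIA nvml.h is probably not installed at all, which would match the configure output quoted above.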

root@zorn:~/slurm-23.02.3# find / -xdev -name libnvidia-ml.so
/usr/lib32/libnvidia-ml.so
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so

I tried to figure out how to tell configure where to find them, but the script is a bit eye-watering; how should I do this?



