Hmm, OK - but that is the only nvml.h I can find, as shown by the find
command. I downloaded the official NVIDIA-Linux-x86_64-535.54.03.run and
ran it successfully; do I need to install anything else besides that? A
Google search for 'CUDA SDK' leads directly to NVIDIA's page:
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
On 19/07/2023 12:26, Timo Rothenpieler wrote:
On 19/07/2023 11:47, Jan Andersen wrote:
I'm trying to build slurm with nvml support, but configure doesn't
find it:
root@zorn:~/slurm-23.02.3# ./configure --with-nvml
...
checking for hwloc installation... /usr
checking for nvml.h... no
checking for nvmlInit in -lnvidia-ml... yes
configure: error: unable to locate libnvidia-ml.so and/or nvml.h
But:
root@zorn:~/slurm-23.02.3# find / -xdev -name nvml.h
/usr/include/hwloc/nvml.h
It's not looking for the hwloc header, but for the nvidia one.
If you have your CUDA SDK installed in, for example, /opt/cuda, you have to
point configure there: --with-nvml=/opt/cuda
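As a rough sketch of how to check this before re-running configure: the
helper below prints a --with-nvml flag for the first prefix that contains
include/nvml.h. The function name nvml_flag and the candidate prefixes are
made up for illustration, and it assumes configure looks for the header
under <prefix>/include, which I have not verified against the Slurm
sources.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print a configure flag for the
# first prefix that contains include/nvml.h.
# Assumption: configure expects the header at <prefix>/include/nvml.h.
nvml_flag() {
    for prefix in "$@"; do
        if [ -f "$prefix/include/nvml.h" ]; then
            printf '%s\n' "--with-nvml=$prefix"
            return 0
        fi
    done
    # Nothing matched: report on stderr and signal failure to the caller.
    echo "nvml.h not found under: $*" >&2
    return 1
}
```

It could then be used along the lines of
flag=$(nvml_flag /opt/cuda /usr/local/cuda) && ./configure "$flag"
so that configure only runs once a header has actually been found.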
root@zorn:~/slurm-23.02.3# find / -xdev -name libnvidia-ml.so
/usr/lib32/libnvidia-ml.so
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so
I tried to figure out how to tell configure where to find them, but
the script is a bit eye-watering; how should I do this?