That's what we do here. We build three different RPMs:

- server: because we run the latest MariaDB on our master

- general compute

- gpu compute: because we build against NVML

We name them all the same, but keep them in different repos and distribute the appropriate repo to each type of node.
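
As a rough sketch, the per-type builds can differ only in what is installed on the build host and in which spec conditionals are requested (the version number and options below are illustrative; check your slurm.spec for the conditionals it actually supports):

    # server build: on a host with mariadb-devel installed, explicitly
    # requesting MariaDB/accounting support
    rpmbuild -ta slurm-20.02.5.tar.bz2 --with mysql

    # general compute build: on a plain node with no NVIDIA driver or PMIx
    # development packages present, so the RPMs pick up no extra dependencies
    rpmbuild -ta slurm-20.02.5.tar.bz2

    # gpu compute build: on a node where the NVIDIA driver (libnvidia-ml) is
    # installed, so configure enables NVML-based GPU autodetection
    rpmbuild -ta slurm-20.02.5.tar.bz2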

We also manage our slurm.spec file in a git repo, with a branch for each version and node type, so we can keep organized.
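
Something along these lines (the branch names are just an example, not our exact layout):

    # one branch per Slurm version and node type
    git clone <spec-repo-url> slurm-spec && cd slurm-spec
    git branch -a
    #   20.02.5-compute
    #   20.02.5-gpu
    #   20.02.5-server
    #   20.11.0-compute
    #   ...
    # check out the branch matching the target nodes before building:
    git checkout 20.02.5-gpu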

-Paul Edmon-

On 9/24/2020 3:31 PM, Dana, Jason T. wrote:

Hello,

I have what is hopefully a quick question.

I compiled Slurm RPMs on a CentOS system with the NVIDIA drivers installed so that I could use the AutoDetect=nvml configuration in our GPU nodes’ gres.conf. All has been going well on the GPU nodes since then. However, I was unable to install that Slurm RPM on the control/master node, because the RPM required libnvidia-ml.so to be installed. The control/master and other compute nodes don’t have any NVIDIA cards attached, so installing the drivers just to satisfy that dependency didn’t seem like the best idea. To get around it, I rebuilt the RPM on a system without the drivers present, and everything has been working great as far as I can tell.
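
For reference, that dependency can be confirmed on the package file itself (the file name below is just an example):

    rpm -qp --requires slurm-20.02.5-1.el7.x86_64.rpm | grep libnvidia-ml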

I am now working on adding PMIx support, which I didn’t properly include initially, and I am encountering this situation again. I figured I would send up a flag and check: is it typical to have to compile the Slurm RPMs separately for different types of nodes, or am I going about this completely the wrong way?

Thanks in advance!

Jason
