I have a standalone server with 4 GeForce RTX 2080 Ti GPUs. Its purpose is to
serve as a compute server for data science jobs. My department chair wants a
job scheduler on it, so I have installed SLURM (18.08.9). That works just fine
in a basic configuration. When I add GresTypes=gpu and then add Gres=gpu:4 to
the end of the node description:


NodeName=cs-datasci CPUs=24 RealMemory=385405 Sockets=2 CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN Gres=gpu:4

and then try to restart slurmd, I get an error that it cannot find the plugin:

slurmd: error: Couldn't find the specified plugin name for select/cons_tres looking at all files

slurmd: error: cannot find select plugin for select/cons_tres

slurmd: fatal: Can't find plugin for select/cons_tres

The system was prebuilt by AdvancedHPC with CentOS 7 and CUDA 8.0.

I usually keep notes when I'm installing things, but in this case I wasn't
jotting things down as I went. I think I started with the instructions on this
page: https://slurm.schedmd.com/quickstart_admin.html and went with the usual
./configure, make, make install.

I have a feeling something did not work and that I then switched to the RPM
packages, based on some other web pages I saw, because when I run
yum list installed | grep slurm I see a lot of packages. The problem is that I
was interrupted by other tasks, and my memory was somewhat rusty when I came
back to this.

When I went looking for this error, I saw there were some issues with the
newest SLURM and CUDA 10.2, but I didn't think that should be an issue because
I am on CUDA 8.0. Just in case, I backed down to SLURM 18.

I'm willing to start all over if anyone thinks cleaning up and rebuilding will
help. I do see libraries in /etc/lib64/slurm, but I also see 2 files in
/usr/local/lib/slurm/src, so I'm not sure if those are left over from trying to
install from source. All the daemons are in /usr/sbin and the user commands are
in /usr/bin.
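
Since there may be two installs on the box, a small check like the one below
could show which plugin directory actually contains the select plugins. It is
only a sketch under some assumptions: has_select_plugin is a hypothetical
helper name, /usr/lib64/slurm is a guess at the usual RPM plugin location on
CentOS 7 (the other two directories are the ones mentioned above), and it
relies on Slurm naming its plugin files <type>_<name>.so.

```shell
#!/bin/sh
# Sketch: report which select plugins exist in each candidate plugin directory.
# Slurm plugin files are named <type>_<name>.so, for example select_cons_res.so.

# Hypothetical helper: succeeds if directory $1 contains select plugin $2.
has_select_plugin() {
    [ -f "$1/select_$2.so" ]
}

# /etc/lib64/slurm and /usr/local/lib/slurm come from the description above;
# /usr/lib64/slurm is an assumption (a common RPM location on CentOS 7).
for dir in /etc/lib64/slurm /usr/lib64/slurm /usr/local/lib/slurm; do
    for name in linear cons_res cons_tres; do
        if has_select_plugin "$dir" "$name"; then
            echo "$dir: select/$name present"
        else
            echo "$dir: select/$name not found"
        fi
    done
done
```

Comparing that output with the SelectType line in slurm.conf, and with what
slurmd -V and rpm -qf /usr/sbin/slurmd report, should indicate whether the
running daemon and the plugin directory come from the same install.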

I'm a newbie at this and very frustrated. Can anyone help?

***************************************************************

Lisa Weihl Systems Administrator

Computer Science, Bowling Green State University
Tel: (419) 372-0116   |    Fax: (419) 372-8061
lwe...@bgsu.edu
www.bgsu.edu
