As I said at the beginning, I have never played with MPS, so my answer is based 
only on what the Slurm documentation shows.
Apparently MPS does not require NVML, so you can leave out AutoDetect 
and instead list the GPU resources explicitly in the gres.conf file, the old 
way. That should get you past that fatal error without having to rebuild 
Slurm from source.
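
As a rough, untested sketch of what I mean, assuming a node with two GPUs 
exposed as /dev/nvidia0 and /dev/nvidia1 (the device paths are my assumption, 
and the node names and counts are simply the ones from the documentation's 
example, so adjust them to your cluster):

```
# gres.conf: no AutoDetect line, GPUs listed by hand
# (device paths are assumed; check what exists under /dev on the node)
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1

# slurm.conf: node definition as in the gres.html example
GresTypes=gpu,mps
NodeName=tux[1-16] Gres=gpu:2,mps:200
```

With no mps lines in gres.conf, the documentation says the mps count from 
slurm.conf is split evenly across the GPUs, so 100 per GPU in this case.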

--
Davide Vanzo, PhD
Computer Scientist
BioHPC – Lyda Hill Dept. of Bioinformatics
UT Southwestern Medical Center

From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Robert 
Kudyba
Sent: Wednesday, April 8, 2020 4:50 PM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Header lengths are longer than data received after 
changing SelectType & GresTypes to use MPS

> use yum install slurm20, here they show Slurm 19 but it's the same for 20

In that case you'll need to open a bug with Bright to get them to
rebuild Slurm with nvml support.

They told me they don't officially support MPS nor Slurm and to come here to 
get support (or pay SchedMD).

The vicious cycle continues.

Since all I want is MPS enabled, from 
https://slurm.schedmd.com/gres.html#MPS_config_example_2
"CUDA Multi-Process Service (MPS) provides a mechanism where GPUs can be shared 
by multiple jobs, where each job is allocated some percentage of the GPU's 
resources. The total count of MPS resources available on a node should be 
configured in the slurm.conf file (e.g. "NodeName=tux[1-16] 
Gres=gpu:2,mps:200"). Several options are available for configuring MPS in the 
gres.conf file as listed below with examples following that:

No MPS configuration: The count of gres/mps elements defined in the slurm.conf 
will be evenly distributed across all GPUs configured on the node. For the 
example, "NodeName=tux[1-16] Gres=gpu:2,mps:200" will configure a count of 100 
gres/mps resources on each of the two GPUs."

Do I even need to edit gres.conf? Can I just leave out AutoDetect=nvml?