Hello,
is there a best practice for activating this feature (setting
ConstrainDevices=yes)? Do I have to restart the slurmds? Does this affect running
jobs?
We are using Slurm 19.05.
Best,
Stefan
On Tuesday, 25 August 2020, 17:24:41 CEST, Christoph Brüning wrote:
> Hello,
>
> we're using cgroup
Thanks Christoph and others for the help.
It turns out it was simply a matter of the cgroup settings, which I had
mostly in place months ago; I had even left myself a note to uncomment
ConstrainDevices=yes in cgroup.conf when the GPU systems came online.
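
For reference, a minimal sketch of the settings this involves. It assumes
the task/cgroup plugin is in use and that gres.conf lists the GPU device
files; the other Constrain* lines are common companions shown only for
context and are not taken from this thread.

```conf
# cgroup.conf (sketch)
ConstrainDevices=yes     # hide GPUs that were not allocated to the job
ConstrainCores=yes       # optional; assumption, not mentioned in this thread
ConstrainRAMSpace=yes    # optional; assumption, not mentioned in this thread

# slurm.conf (sketch)
TaskPlugin=task/cgroup   # the plugin that enforces the constraints above
```
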
Kept racking my brain why the gres settings didn't include
Hello,
we're using cgroups to restrict access to the GPUs.
What I found particularly helpful are the slides by Marshall Garey from
last year's Slurm User Group Meeting:
https://slurm.schedmd.com/SLUG19/cgroups_and_pam_slurm_adopt.pdf
(NVML didn't work for us for some reason I cannot recall, b
Cgroups should work correctly _if_ you're not running with an old, corrupted
Slurm database.
There was a bug in a much earlier version of Slurm that corrupted the
database in a way that the cgroups/accounting code could no longer fence
GPUs. This was fixed in a later version, but the database corru
Sorry about that. “NJT” should have read “but;” apparently my phone decided I
was talking about our local transit authority. 😓
On Aug 25, 2020, at 10:30, Ryan Novosielski wrote:
I believe that’s done via a QoS on the partition. Have a look at the docs
there, and I think “require” is a good k
I believe that’s done via a QoS on the partition. Have a look at the docs
there, and I think “require” is a good key word to look for.
Cgroups should also help with this, NJT I’ve been troubleshooting a problem
where that seems not to be working correctly.
--
|| \\UTGERS, |--
Hello,
I'm trying to restrict access to gpu resources on a cluster I maintain
for a research group. There are two nodes put into a partition with gres
gpu resources defined. Users can access these resources by submitting
their jobs to the gpu partition and specifying gres=gpu.
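
A setup like the one described might look roughly as follows. The node
names, GPU counts, and device paths are hypothetical placeholders, not the
actual configuration of this cluster.

```conf
# slurm.conf (sketch; node names and GPU counts are made up)
GresTypes=gpu
NodeName=gpunode[01-02] Gres=gpu:2
PartitionName=gpu Nodes=gpunode[01-02] State=UP

# gres.conf on each GPU node (device paths are an assumption)
Name=gpu File=/dev/nvidia[0-1]
```

With definitions along these lines, a job would request a GPU with, for
example, sbatch --partition=gpu --gres=gpu:1 job.sh (job.sh being a
placeholder script name).
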
When a user includ