We are pleased to announce the availability of Slurm version 18.08.7.
This includes over 20 fixes since 18.08.6 was released last month,
including one for a regression that caused 'sacct -J' to not return
results correctly.
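For example, a query by job name such as

$ sacct -J my_job_name    # job name shown is illustrative

did not return the expected records under 18.08.6.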
Slurm can be downloaded from https://www.schedmd.com/downloads.php
On 4/11/19 8:27 AM, Randall Radmer wrote:
> I guess my next question is, are there any negative repercussions to
> setting "Delegate=yes" in slurmd.service?
This was Slurm bug 5292 and was fixed last year:
https://bugs.schedmd.com/show_bug.cgi?id=5292
# Commit cecb39ff087731d2 adds Delegate=yes
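To check the effective setting on a running unit (systemctl's 'show'
verb reports unit properties, Delegate among them):

$ systemctl show slurmd.service --property=Delegate
# prints Delegate=yes, or Delegate=no when delegation is off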
I guess my next question is, are there any negative repercussions to
setting "Delegate=yes" in slurmd.service?
On Thu, Apr 11, 2019 at 8:21 AM Marcus Wagner wrote:
> I assume that without Delegate=yes this would also happen to regular
> jobs, which means nightly updates could "destroy" the cgroups created
> by Slurm and therefore let the jobs out "into the wild".
I assume that without Delegate=yes this would also happen to regular
jobs, which means nightly updates could "destroy" the cgroups created by
Slurm and therefore let the jobs out "into the wild".
Best
Marcus
P.S.:
We had a similar problem with LSF
Yes, I was just testing that. Adding "Delegate=yes" seems to fix the
problem (see below), but I wanted to try a few more things before saying
anything.
[computelab-136:~]$ grep ^Delegate /etc/systemd/system/slurmd.service
Delegate=yes
[computelab-136:~]$ nvidia-smi --query-gpu=index,name --format=csv
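As a further check, the same query inside an allocation (assuming the
gres is named 'gpu') should list only the GPUs granted to the job:

[computelab-136:~]$ srun --gres=gpu:1 nvidia-smi --query-gpu=index,name --format=csv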
Hi Randall,
could you please, as a test, add the following line to the [Service]
section of the slurmd.service file (or add an override file):
Delegate=yes
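For the override route, a drop-in such as the following would do (the
file name is arbitrary, the directory is systemd's standard drop-in
location), applied with a daemon-reload and a slurmd restart:

# /etc/systemd/system/slurmd.service.d/delegate.conf
[Service]
Delegate=yes

$ systemctl daemon-reload
$ systemctl restart slurmd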
Best
Marcus
On 4/11/19 3:11 PM, Randall Radmer wrote:
> It's now distressingly simple to reproduce this, based on Kilian's
> clue.
Thanks Luca! I didn't know about these commands.
On Thu, Apr 11, 2019 at 1:53 AM Luca Capello wrote:
> Hi there,
>
> On 4/10/19 11:53 PM, Kilian Cavalotti wrote:
> > As far as I can tell, it looks like this is probably systemd messing
> > up with cgroups and deciding it's the king of cgroups on the host.
It's now distressingly simple to reproduce this, based on Kilian's clue
(off topic: "Kilian's Clue" sounds like a good title for a Hardy Boys
mystery story).
After limited testing, it seems to me that running "systemctl
daemon-reload" followed by "systemctl restart slurmd" breaks it. See
below:
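In short, with a GPU job already running on the node:

[computelab-136:~]$ systemctl daemon-reload
[computelab-136:~]$ systemctl restart slurmd
# after this, nvidia-smi inside the existing allocation is no longer
# constrained to the job's cgroup-granted GPUs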
Thanks Kilian! I'll look at this today.
-Randy
On Wed, Apr 10, 2019 at 3:59 PM Kilian Cavalotti
<kilian.cavalotti.w...@gmail.com> wrote:
> Hi Randy!
>
> > We have a slurm cluster with a number of nodes, some of which have more
> > than one GPU. Users select how many or which GPUs they want with
Hi there,
On 4/10/19 11:53 PM, Kilian Cavalotti wrote:
> As far as I can tell, it looks like this is probably systemd messing
> up with cgroups and deciding it's the king of cgroups on the host.
FYI, given that I found no mention of those tools, `systemd-cgls` and
`systemd-cgtop` help when debugging.
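Both ship with systemd and take no arguments for a first look:

$ systemd-cgls     # recursively dump the control group tree
$ systemd-cgtop    # top-like live view of cgroups by resource usage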