On 4/15/20 10:57 am, Dean Schulze wrote:
error: Munge decode failed: Invalid credential
ENCODED: Wed Dec 31 17:00:00 1969
DECODED: Wed Dec 31 17:00:00 1969
error: authentication: Invalid authentication credential
That's really interesting; I had one of these last week while on call, fo…
You might want to check the Munge section in my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#munge-authentication-service
/Ole
On 15-04-2020 19:57, Dean Schulze wrote:
I've installed two new nodes onto my slurm cluster. One node works, but
the other one complains about an invalid credential for munge. …
Thanks Erik.
Last night I made the changes. I defined the following in slurm.conf on all the nodes as well as on the Slurm server:
TmpFS=/lscratch
NodeName=node[01-10] CPUs=44 RealMemory=257380 Sockets=2
CoresPerSocket=22 ThreadsPerCore=1 TmpDisk=160 State=UNKNOWN
Feature=P4000 Gres=gpu:2
These nodes…
Who owns the munge directory and key? Is it the right uid/gid? Is the munge
daemon running?
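Sean's checklist can be scripted; a minimal diagnostic sketch to run on the failing node (standard munge package paths and service names assumed, adjust for your install):

```shell
# Check ownership/modes of the munge dir and key, the munge account,
# and whether the daemon is up; prints a fallback message when absent.
stat -c '%U:%G %a %n' /etc/munge /etc/munge/munge.key 2>/dev/null \
  || echo "munge dir/key missing"
id munge 2>/dev/null || echo "no munge user/group"
pgrep -x munged >/dev/null && echo "munged running" || echo "munged NOT running"
```

On a healthy node you would expect `munge:munge 700 /etc/munge`, `munge:munge 400 /etc/munge/munge.key`, a matching `id munge` uid/gid on every node, and "munged running".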
--
Sean Crosby | Senior DevOpsHPC Engineer and HPC Team Lead
Research Computing Services | Business Services
The University of Melbourne, Victoria 3010 Australia
On Thu, 16 Apr 2020 at 04:57, Dean Schulze wrote:
Hi Slurm-Users,
Hope this post finds all of you healthy and safe amidst the ongoing COVID19
craziness. We've got a strange error state that occurs when we enable
preemption and we need help diagnosing what is wrong. I'm not sure if we
are missing a default value or other necessary configuration, b…
/etc/munge is 700
/etc/munge/munge.key is 400
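Those are the expected modes (along with owner munge:munge). A sketch of setting them, demonstrated on a scratch directory so it is safe to run anywhere; on a real node the targets would be /etc/munge and /etc/munge/munge.key, and you would also chown them to munge:munge:

```shell
d=$(mktemp -d)                          # stand-in for /etc
mkdir -p "$d/munge"
head -c 1024 /dev/urandom > "$d/munge/munge.key"
chmod 700 "$d/munge"                    # drwx------
chmod 400 "$d/munge/munge.key"          # -r--------
stat -c '%a %n' "$d/munge" "$d/munge/munge.key"
rm -rf "$d"
```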
On Wed, Apr 15, 2020 at 12:11 PM Riebs, Andy wrote:
> Two trivial things to check:
>
> 1. Permissions on /etc/munge and /etc/munge/munge.key
>
> 2. Is munged running on the problem node?
>
> Andy
>
> *From:* slurm-users [mailto:slurm-…
The default value for TmpDisk is 0, so if you want local scratch available on a
node, the amount of TmpDisk space must be defined in the node configuration in
slurm.conf.
Example:
NodeName=TestNode01 CPUs=8 Boards=1 SocketsPerBoard=2 CoresPerSocket=4
ThreadsPerCore=1 RealMemory=24099 TmpDisk=1
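TmpDisk is specified in megabytes. One hedged way to derive a value is from the actual size of the node's scratch filesystem; the path /tmp below is only a placeholder for your TmpFS location (e.g. /lscratch), and GNU df is assumed:

```shell
# Print the scratch filesystem size in MB -- a candidate TmpDisk value.
df -BM --output=size /tmp | tail -n 1 | tr -dc '0-9'
echo
```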
I’d check NTP, as your ENCODED time seems odd to me; Wed Dec 31 17:00:00 1969 is the Unix epoch.
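For context, a zeroed timestamp renders as the Unix epoch, which is exactly what the ENCODED line above shows. A quick clock sanity check to run on both nodes (the threshold is arbitrary; any date after mid-2017 passes):

```shell
now=$(date -u +%s)
echo "epoch seconds: $now"
if [ "$now" -gt 1500000000 ]; then
  echo "clock looks sane"
else
  echo "clock looks wrong - check NTP/chrony sync"
fi
```

On systemd hosts, `timedatectl` will also report whether NTP synchronization is active.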
On Wed, 15 Apr 2020 at 19:59, Dean Schulze wrote:
> I've installed two new nodes onto my slurm cluster. One node works, but
> the other one complains about an invalid credential for munge. I've
> verified that the munge.key is the same as on all other nodes…
Two trivial things to check:
1. Permissions on /etc/munge and /etc/munge/munge.key
2. Is munged running on the problem node?
Andy
From: slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] On Behalf Of
Dean Schulze
Sent: Wednesday, April 15, 2020 1:57 PM
To: Slurm User Community List
I've installed two new nodes onto my slurm cluster. One node works, but
the other one complains about an invalid credential for munge. I've
verified that the munge.key is the same as on all other nodes with
sudo cksum /etc/munge/munge.key
I recopied a munge.key from a node that works. I've verified…
The more flexible way to do this is with QoS (PreemptType=preempt/qos). You'll
need to have accounting enabled, and you'll probably want qos listed in
AccountingStorageEnforce. Once you do that, you create a "shared" QoS for the
scavenger jobs and a QoS for each group that buys into resources. Assign the…
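A configuration sketch along those lines; the QoS names "scavenger" and "groupA" and the user "alice" are hypothetical, and accounting must already be working:

```
# slurm.conf
PreemptType=preempt/qos
PreemptMode=REQUEUE
AccountingStorageEnforce=limits,qos

# one-time setup with sacctmgr
sacctmgr add qos scavenger
sacctmgr modify qos scavenger set Priority=10
sacctmgr add qos groupA
sacctmgr modify qos groupA set Priority=100 Preempt=scavenger
sacctmgr modify user alice set QOS+=groupA
```

Jobs submitted with `--qos=scavenger` can then be preempted by jobs in a QoS that lists scavenger in its Preempt field.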
Dear all,
The official Slurm documentation says that trigger events are not processed
instantly, but a check is performed for trigger events on a periodic basis
(currently every 15 seconds).
https://slurm.schedmd.com/strigger.html
Is it possible to reduce this time?
Thank You.