, April 17, 2020 1:58 PM
To: Slurm User Community List
Subject: Re: [slurm-users] Munge decode failing on new node
A couple of quick checks to see if the problem is munge:
1. On the problem node, try
$ echo foo | munge | unmunge
2. If (1) works, try this from the node running slurmctld
I went through the exercise of making the other user the same on the
slurmctld as on the slurmd nodes, but that had no effect. I still have 3
nodes that have connectivity and one node where slurmd cannot contact
slurmctld. That node has ssh connectivity to and from slurmctld node, but
no slurm co
Hi Dean,
On Wed, Apr 22, 2020 at 07:28:15PM -0600, dean.w.schu...@gmail.com wrote:
> Even for users other than slurm and munge? It seems strange that 3 of
> 4 worker nodes work with the same UIDs/GIDs as the non-working nodes.
As in:
https://slurm.schedmd.com/quickstart_admin.html
Super Quick
Subject: Re: [slurm-users] Munge decode failing on new node
On 4/22/20 12:56 PM, dean.w.schu...@gmail.com wrote:
> There is a third user account on all machines in the cluster that is
> the user account for using the cluster. That account has uid 1000 on
> all four worker nodes, b
On 4/22/20 12:56 PM, dean.w.schu...@gmail.com wrote:
There is a third user account on all machines in the cluster that is the
user account for using the cluster. That account has uid 1000 on all four
worker nodes, but on the controller it is 1001. So that is probably why the
question marks.
work have the same uid
mismatch for that user (nor the slurm or munge user).
-----Original Message-----
From: slurm-users On Behalf Of Chris
Samuel
Sent: Monday, April 20, 2020 12:03 AM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Munge decode failing on new node
On Friday, 17 April
this one is going nowhere.
From: slurm-users On Behalf Of Brian
Andrus
Sent: Sunday, April 19, 2020 9:30 AM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Munge decode failing on new node
I see potentially 2 things you should likely do:
1. Run ntpd on your nodes. You can
On Friday, 17 April 2020 2:22:00 PM PDT Dean Schulze wrote:
> Both work. The only discrepancy is that the slurm controller output had
> these two lines:
>
> UID: ??? (1000)
> GID: ??? (1000)
>
> Like the controller doesn't know the username for UID 1000.
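The `???` in the output quoted above suggests that the host could not map UID 1000 to a name. One way to check this directly is to query the name service on each host. This is a minimal sketch, assuming `getent` is installed; the helper name `uid_name` is made up for illustration and is not part of munge or Slurm.

```shell
# Hypothetical helper: print the login name that a numeric UID resolves to
# on this host, or return non-zero if the host has no entry for it.
uid_name() {
    entry=$(getent passwd "$1") || return 1
    printf '%s\n' "${entry%%:*}"
}

# Run on the controller and on a worker, then compare the answers.
uid_name 1000 || echo "UID 1000 has no passwd entry on $(uname -n)"
```

If the controller prints the fallback message while the workers print a user name, the account simply has a different UID (or does not exist) on the controller.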
What does thi
, 2020 3:40 PM
*To:* Slurm User Community List <slurm-users@lists.schedmd.com>
*Subject:* Re: [slurm-users] Munge decode failing on new node
There is no ntp service running on any of my nodes, and all but
this one is working. I haven't heard that ntp is a require
2020 3:40 PM
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] Munge decode failing on new node
>
>
>
> There is no ntp service running on any of my nodes, and all but this one
> is working. I haven't heard that ntp is a requirement for slurm, just that
> *From:* slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] *On
> Behalf Of *Dean Schulze
> *Sent:* Friday, April 17, 2020 3:40 PM
> *To:* Slurm User Community List
> *Subject:* Re: [slurm-users] Munge decode failing on new node
>
>
>
> There is no ntp serv
...@lists.schedmd.com] On Behalf Of
Dean Schulze
Sent: Friday, April 17, 2020 3:40 PM
To: Slurm User Community List
Subject: Re: [slurm-users] Munge decode failing on new node
There is no ntp service running on any of my nodes, and all but this one is
working. I haven't heard that ntp
There is no ntp service running on any of my nodes, and all but this one is
working. I haven't heard that ntp is a requirement for slurm, just that
the time be synchronized across the cluster. And it is.
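Synchronization can be confirmed numerically rather than by eye, since munge credentials carry an encode time and a time-to-live and a large skew between hosts can make decoding fail. The following is a rough sketch, assuming passwordless ssh to the node; `node4` is a placeholder hostname and `skew_seconds` is a hypothetical helper, neither of which appears in the thread.

```shell
# Hypothetical helper: absolute difference, in seconds, between two
# epoch timestamps passed as arguments.
skew_seconds() {
    d=$(( $1 - $2 ))
    echo "${d#-}"
}

# Usage (node4 is a placeholder; ssh latency adds a second or so of noise):
#   skew_seconds "$(date +%s)" "$(ssh node4 date +%s)"
```

A result of a few seconds is normal for this kind of measurement; a result in the minutes would point at the clocks.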
On Wed, Apr 15, 2020 at 12:17 PM Carlos Fenoy wrote:
> I’d check ntp as your encoding time
On 4/15/20 10:57 am, Dean Schulze wrote:
error: Munge decode failed: Invalid credential
ENCODED: Wed Dec 31 17:00:00 1969
DECODED: Wed Dec 31 17:00:00 1969
error: authentication: Invalid authentication credential
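The ENCODED and DECODED times in this log look like Unix epoch zero rendered in a UTC-7 local time zone, which would suggest that the decode failed before any real timestamps were read, rather than that the clocks disagree. That reading is an interpretation and is not stated in the thread. It can be checked with GNU `date`, using the POSIX zone string `MST7` as an assumed stand-in for the poster's time zone; `epoch_zero_local` is a hypothetical helper name.

```shell
# Print epoch second 0 in a UTC-7 zone (MST7 is an assumption about the
# poster's locale), using the same field layout as the log lines.
epoch_zero_local() {
    TZ=MST7 LC_ALL=C date -d @0 '+%a %b %d %H:%M:%S %Y'
}

epoch_zero_local   # prints: Wed Dec 31 17:00:00 1969
```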
That's really interesting, I had one of these last week when on call,
fo
You might want to check the Munge section in my Slurm Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#munge-authentication-service
/Ole
On 15-04-2020 19:57, Dean Schulze wrote:
I've installed two new nodes onto my slurm cluster. One node works, but
the other one complains abou
> *From:* slurm-users [mailto:slurm-users-boun...@lists.schedmd.com] *On
> Behalf Of *Dean Schulze
> *Sent:* Wednesday, April 15, 2020 1:57 PM
> *To:* Slurm User Community List
> *Subject:* [slurm-users] Munge decode failing on new node
>
>
>
> I've installed two n
I’d check ntp as your encoding time seems odd to me
On Wed, 15 Apr 2020 at 19:59, Dean Schulze wrote:
> I've installed two new nodes onto my slurm cluster. One node works, but
> the other one complains about an invalid credential for munge. I've
> verified that the munge.key is the same as on
List
Subject: [slurm-users] Munge decode failing on new node
I've installed two new nodes onto my slurm cluster. One node works, but the
other one complains about an invalid credential for munge. I've verified that
the munge.key is the same as on all other nodes with
sudo cksum
I've installed two new nodes onto my slurm cluster. One node works, but
the other one complains about an invalid credential for munge. I've
verified that the munge.key is the same as on all other nodes with
sudo cksum /etc/munge/munge.key
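The same comparison can be scripted so that the checksums are compared mechanically instead of by reading them. This is a minimal sketch, assuming copies of the key from two nodes have already been fetched to local paths; `same_key` is a hypothetical helper and the paths in the usage line are placeholders.

```shell
# Hypothetical helper: succeed only if two files produce identical cksum
# output (CRC and byte count). Reading from stdin keeps the file names
# out of the compared strings.
same_key() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# Usage (placeholder paths for keys copied from two nodes):
#   same_key /tmp/munge.key.good /tmp/munge.key.new && echo "keys match"
```

Matching checksums only show that the key contents agree; they say nothing about the file's ownership or permissions on each node.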
I recopied a munge.key from a node that works. I've ver