Hi Dean,
First, make sure that munge.key really is identical on all systems. The
user accounts must also be the same on all systems, since the submission
itself is done on the controller. Please also make sure that all systems
have the same date and time.
After that, restart the munge service and then the Slurm services.
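In case it helps, here is a minimal Python sketch of the three checks
above (key, users, clock). It is only an illustration: the key path
/etc/munge/munge.key and the account names "munge" and "slurm" are
assumptions about a typical install, and the function names are made up
for this example. Adjust them to your setup.

```python
import hashlib
import pwd
import time
from pathlib import Path


def key_fingerprint(path="/etc/munge/munge.key"):
    """Return the SHA-256 hex digest of the key file.

    The default path is an assumption; reading the key usually needs root.
    """
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def user_ids(names=("munge", "slurm")):
    """Map each account name to (uid, gid), or None if it does not exist.

    The default account names are assumptions about a typical install.
    """
    ids = {}
    for name in names:
        try:
            entry = pwd.getpwnam(name)
            ids[name] = (entry.pw_uid, entry.pw_gid)
        except KeyError:
            # getpwnam raises KeyError when the account is unknown
            ids[name] = None
    return ids


def host_report(key_path="/etc/munge/munge.key"):
    """Collect the values that should match across controller and nodes."""
    return {
        "key_sha256": key_fingerprint(key_path),
        "users": user_ids(),
        "epoch": int(time.time()),  # seconds since 1970-01-01 UTC
    }
```

Run print(host_report()) as root on the controller and on each node and
compare the output. The key digest and the user IDs should be identical
everywhere, and the epoch values should differ only by the few seconds
between the runs.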
Best
Marcus
On 12/16/19 9:58 PM, Dean Schulze wrote:
I have my controller running (slurmctld and slurmdbd) and my
controller and node host can ping each other by name so they resolve
via /etc/hosts settings. When I try to start the slurmd.service it
shows that it is active (running), but gives these errors:
Unable to register: Zero Bytes were transmitted or received
The controller shows this from slurmctld.service:
Munge decode failed: Invalid credential
I copied the munge.key from controller to node (copying via an NFS
shared directory required changing ownership and permissions and then
changing them back).
Apparently the node is communicating with the controller, but munge
thinks I have a bad credential.
Any idea how to troubleshoot this?
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de