It looks like you don't have the munged daemon running.
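One quick way to narrow this down is to check whether the munge socket exists on the host running slurmctld. The following is only a minimal sketch: the socket path is an assumed default (your packaging may place it elsewhere), and the helper name `munge_socket_status` is made up for illustration.

```python
import os
import stat

# ASSUMPTION: a commonly used default location for the munged socket.
# Your build or distribution package may use a different path.
DEFAULT_SOCKET = "/var/run/munge/munge.socket.2"


def munge_socket_status(path=DEFAULT_SOCKET):
    """Classify the path as 'missing', 'unreadable', 'not-a-socket', or 'present'."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Nothing at that path: munged is probably not running,
        # or it was started with a different socket location.
        return "missing"
    except PermissionError:
        # A parent directory could not be searched by this user.
        return "unreadable"
    # Something exists at the path; check that it is a Unix domain socket.
    return "present" if stat.S_ISSOCK(mode) else "not-a-socket"


if __name__ == "__main__":
    print(munge_socket_status())
```

A result of "present" only shows that the socket file exists, not that munged is answering on it. If munge is installed, running `munge -n | unmunge` as the slurm user is the usual end-to-end check.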
On 11/29/2017 08:01 AM, Bruno Santos wrote:
Hi everyone,
I have set up slurm to use slurm_db and all was working fine. However,
I had to change slurm.conf to play with user priority, and upon
restarting, slurmctld fails with the messages below. It seems that it
is somehow trying to use the mysql password as a munge socket?
Any idea how to solve it?
Nov 29 12:56:30 plantae slurmctld[29613]: Registering slurmctld at
port 6817 with slurmdbd.
Nov 29 12:56:32 plantae slurmctld[29613]: error: If munged is up,
restart with --num-threads=10
Nov 29 12:56:32 plantae slurmctld[29613]: error: Munge encode
failed: Failed to access "magic": No such file or directory
Nov 29 12:56:32 plantae slurmctld[29613]: error: authentication:
Socket communication error
Nov 29 12:56:32 plantae slurmctld[29613]: error:
slurm_persist_conn_open: failed to send persistent connection init
message to localhost:6819
Nov 29 12:56:32 plantae slurmctld[29613]: error: slurmdbd: Sending
PersistInit msg: Protocol authentication error
Nov 29 12:56:34 plantae slurmctld[29613]: error: If munged is up,
restart with --num-threads=10
Nov 29 12:56:34 plantae slurmctld[29613]: error: Munge encode
failed: Failed to access "magic": No such file or directory
Nov 29 12:56:34 plantae slurmctld[29613]: error: authentication:
Socket communication error
Nov 29 12:56:34 plantae slurmctld[29613]: error:
slurm_persist_conn_open: failed to send persistent connection init
message to localhost:6819
Nov 29 12:56:34 plantae slurmctld[29613]: error: slurmdbd: Sending
PersistInit msg: Protocol authentication error
Nov 29 12:56:36 plantae slurmctld[29613]: error: If munged is up,
restart with --num-threads=10
Nov 29 12:56:36 plantae slurmctld[29613]: error: Munge encode
failed: Failed to access "magic": No such file or directory
Nov 29 12:56:36 plantae slurmctld[29613]: error: authentication:
Socket communication error
Nov 29 12:56:36 plantae slurmctld[29613]: error:
slurm_persist_conn_open: failed to send persistent connection init
message to localhost:6819
Nov 29 12:56:36 plantae slurmctld[29613]: error: slurmdbd: Sending
PersistInit msg: Protocol authentication error
Nov 29 12:56:36 plantae slurmctld[29613]: fatal: It appears you
don't have any association data from your database. The
priority/multifactor plugin requires this information to run
correctly. Please check your database connection and try again.
Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Main
process exited, code=exited, status=1/FAILURE
Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Unit
entered failed state.
Nov 29 12:56:36 plantae systemd[1]: slurmctld.service: Failed with
result 'exit-code'.