Re: [slurm-users] [External] ERROR: slurmctld: auth/munge: _print_cred: DECODED

2022-12-01 Thread Ole Holm Nielsen
Hi Nousheen, It seems that you have configured the nodes incorrectly in slurm.conf. I notice this: RealMemory=1 This means 1 megabyte of RAM; we only had this with IBM PCs back in the 1980s :-) See how to configure nodes in https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_confi

Re: [slurm-users] slurm_persist_conn_open_without_init: failed to open persistent connection to host

2022-12-01 Thread Sushil Mishra
Many thanks, William. That may have been the issue. I changed the hostname to the FQDN and set "StorageHost=localhost", and now it seems to try connecting to the database. [root@mannose sushil]# cat /var/log/slurm/slurmctld.log [2022-12-01T15:26:50.942] Job accounting information stored, but details not g

Re: [slurm-users] [External] ERROR: slurmctld: auth/munge: _print_cred: DECODED

2022-12-01 Thread Nousheen
Dear Robbert, Thank you so much for your response. I was so focused on time synchronization that I missed the date on one of the nodes, which was 1 day behind as you said. I have corrected it and now I get the following output in status. *(base) [nousheen@nousheen slurm]$ systemctl status slurmctld.service

Re: [slurm-users] [External] ERROR: slurmctld: auth/munge: _print_cred: DECODED

2022-12-01 Thread Michael Robbert
I believe that the error you need to pay attention to for this issue is this line: Dec 01 16:17:19 nousheen slurmctld[1631]: slurmctld: error: Check for out of sync clocks It looks like your compute node's clock is a full day ahead of your controller node: Dec. 2 instead of Dec. 1. The clo

Re: [slurm-users] Licenses: Remote vs Reservation

2022-12-01 Thread Richard Ems
Hi Brian, Thanks for your answer and sorry for my late reply. Yes, the cluster is not large, perhaps medium sized? What is considered a small/medium/large cluster for Slurm? I think I would go for the remote license method, to avoid what you mentioned about keeping the slurm.conf consistent on al

[slurm-users] ERROR: slurmctld: auth/munge: _print_cred: DECODED

2022-12-01 Thread Nousheen
Hello Everyone, I am using Slurm version 21.08.5 and CentOS 7. I successfully start slurmd on all compute nodes, but when I start slurmctld on the server node it gives the following error: *(base) [nousheen@nousheen ~]$ systemctl status slurmctld.service -l* ● slurmctld.service - Slurm controller da