Good morning (at least for those on the West Coast of the US),

My nodes are no longer “down”:
eric@radoncmaster:~$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
debug*       up   infinite      4   idle radonc[01-04]

I think the NTP configuration did the trick:

> So one possibility there is that the clocks are out of step between the
> nodes. Usually that's configured via NTP to have a common reference source.

After the NTP configuration I had to reboot the nodes, and I restarted and enabled ntp. I then restarted slurmd on all execute nodes and slurmctld on the head node/master, and ran:

scontrol update nodename=radonc[01-04] state=UNDRAIN
scontrol update nodename=radonc[01-04] state=IDLE

All seems good for now.

Thank you everyone for your help. I learned a lot through everyone’s comments, tips and advice. I look forward to my post-docs running their jobs. I am certain that I will have more questions by then.

Again, I greatly appreciate everyone’s help.

Cheers,

Eric
_____________________________________________________________________________________________________

Eric F. Alemany
System Administrator for Research

Division of Radiation & Cancer Biology
Department of Radiation Oncology

Stanford University School of Medicine
Stanford, California 94305

Tel:1-650-498-7969 No Texting
Fax:1-650-723-7382

On May 7, 2018, at 5:30 PM, Chris Samuel <ch...@csamuel.org> wrote:

> On Tuesday, 8 May 2018 9:40:53 AM AEST Eric F. Alemany wrote:
>
> > I followed the link as well as the instruction on “Securing the
> > installation” and “Testing the installation”
>
> Great.
>
> > The only thing that I am not able to do is:
> > Check if a credential can be remotely decoded
>
> So one possibility there is that the clocks are out of step between the
> nodes. Usually that's configured via NTP to have a common reference source.
>
> That's pretty standard as if you're running an HPC system with a distributed
> filesystem like GPFS or Lustre then you need the clocks in lockstep for it to
> function properly.
>
> Good luck!
> Chris
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC