On Thursday, 15 March 2018 6:04:47 PM AEDT Arie Blumenzweig wrote:

> # sinfo
> PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> cloud* up infinite 1 down* slurm-node0

It looks like Slurm thinks the node was booted, but cannot talk to it.

> [2018-03-13T15:38:21.401] debug2: Error connecting slurm stream socket at
> 172.31.38.99:6818: Connection timed out

Could it have booted with that IP address, but with slurmd blocked by a
firewall?

I've not played with the cloud stuff for a long time but you may need to try:

scontrol update node=slurm-node0 state=POWER_DOWN

to see if that gets it back into its offline state properly, so that it can
be booted again.

Good luck!
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC