Re: [slurm-users] Restart Job after sudden reboot of the node

2020-07-24 Thread Christopher Samuel
On 7/24/20 12:28 pm, Saikat Roy wrote: If SLURM restarts automatically, is there any way to stop it? If you would rather Slurm not start scheduling jobs when it is restarted then you can set your partitions to have `State=DOWN` in slurm.conf. That way should the node running slurmctld reboo

Re: [slurm-users] Restart Job after sudden reboot of the node

2020-07-24 Thread Steven Dick
Both. See man sbatch, --requeue. The default is to not requeue (unless it was changed in slurm.conf), and your job can check $SLURM_RESTART_COUNT to see if it has been restarted. This is handy if your job can checkpoint / restart. On Fri, Jul 24, 2020 at 3:33 PM Saikat Roy wrote: > Hello, > > I
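
As a rough illustration of the checkpoint / restart idea above, a job submitted with `sbatch --requeue` could read $SLURM_RESTART_COUNT at startup and decide whether to resume. This is only a sketch: the checkpoint file name and its one-integer format are hypothetical, and it assumes the variable is unset on the first run of a job.

```python
import os


def restart_count(env=None):
    """Return how many times Slurm has restarted this job (0 on the first run)."""
    env = os.environ if env is None else env
    raw = env.get("SLURM_RESTART_COUNT", "")
    # Assumption: the variable is missing or empty on the first run.
    return int(raw) if raw.isdigit() else 0


def choose_start_step(env=None, checkpoint_path="checkpoint.txt"):
    """Pick the step to start from.

    checkpoint_path is a hypothetical file holding the last completed step
    as a single integer; the job itself is responsible for writing it.
    """
    if restart_count(env) == 0 or not os.path.exists(checkpoint_path):
        return 0  # fresh start, or nothing saved to resume from
    with open(checkpoint_path) as fh:
        return int(fh.read().strip())


if __name__ == "__main__":
    print(f"restart count: {restart_count()}, starting at step {choose_start_step()}")
```

Falling back to step 0 when no checkpoint exists keeps a requeued job from failing just because it was interrupted before its first save.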

[slurm-users] Restart Job after sudden reboot of the node

2020-07-24 Thread Saikat Roy
Hello, I have recently installed SLURM on our Ubuntu cluster. I have one question: if the system restarts unexpectedly due to a power failure, what will happen to the running jobs? Will they resume automatically, or do we have to restart them manually? If SLURM restarts automatically, is th