Just in case, increase SlurmdTimeout in slurm.conf, so that once the controller is back you have time to fix any communication problems between slurmd and slurmctld before the nodes are marked down. Otherwise the reboot should not affect running or pending jobs. Stop slurmctld first, then slurmdbd. Once the disk changes are done, start slurmdbd first and then slurmctld.
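
The ordering above can be sketched as a small shell helper. This is only a sketch under assumptions: it assumes the daemons run as systemd units named slurmctld and slurmdbd, the SlurmdTimeout value in the comment is illustrative, and the function names (run, stop_slurm, start_slurm) are made up for this example. It prints the commands by default instead of running them.

```shell
#!/bin/sh
# Beforehand, in slurm.conf (value is illustrative, pick one that covers
# the expected downtime), then apply it with "scontrol reconfigure":
#   SlurmdTimeout=3600

# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (as root, on the controller VM).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Before the reboot: controller first, then the accounting daemon.
stop_slurm() {
    run systemctl stop slurmctld
    run systemctl stop slurmdbd
}

# After the disk has been expanded: reverse order.
start_slurm() {
    run systemctl start slurmdbd
    run systemctl start slurmctld
}
```

Call stop_slurm before the reboot and start_slurm once /var has been expanded. Afterwards, "sinfo" and "squeue" should show whether the nodes and jobs came back as expected.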

Cheers,

Barbara

On 6/24/21 12:54 AM, Amjad Syed wrote:
Hello all
We have a cluster running CentOS 7. Our Slurm scheduler runs on a VM, and we are running out of disk space on /var. The Slurm InnoDB database is taking most of the space. We intend to expand the vdisk for the Slurm server, which will require a reboot for the changes to take effect. Do we have to stop users from submitting jobs by draining all partitions before restarting the server, that is, slurmctld, slurmdbd and mariadb? Or will restarting the Slurm VM have no effect on running/pending jobs?

Sincerely

Amjad
