> I thought setting partitions to DOWN will kill jobs?
No, it just prevents new jobs from being started from the job queue in the given partition.
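
(A minimal sketch, with "compute" as an example partition name:

  scontrol update PartitionName=compute State=DOWN
  squeue -p compute -t RUNNING    # already-running jobs stay in state R

and State=UP resumes scheduling later.)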

josef

On 24. 06. 21 11:26, Tina Friedrich wrote:
I thought setting partitions to DOWN will kill jobs?

Amjad - in my experience, the slurmdbd & slurmctld server can be rebooted with no effect on running jobs. You can't submit whilst it's down, and I'm not precisely sure what happens to jobs that are just finishing - but really the impact should be minimal.

(I've done exactly what you're needing to do - reboot so a change in disk size is picked up - at least once with the cluster running.)

It is absolutely safe to restart slurmctld (and slurmdbd) with jobs running on the cluster; it's something I do all the time.
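
(Assuming the standard systemd units, the restart itself is just:

  systemctl restart slurmdbd
  systemctl restart slurmctld

slurmctld recovers job state from its StateSaveLocation on startup.)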

Tina

On 24/06/2021 10:16, Josef Dvoracek wrote:
hi,

just set the partitions to "DOWN" to avoid unexpected behavior for users and reboot the slurm(ctl|dbd)+sql box. Running jobs are, in my experience, not affected.
No need to drain nodes.
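
A rough sketch of the whole sequence, assuming slurmctld, slurmdbd and mariadb all live on the one VM and "compute" is an example partition name:

  scontrol update PartitionName=compute State=DOWN   # no new jobs start
  systemctl stop slurmctld slurmdbd mariadb          # clean shutdown before the reboot
  # grow the vdisk, reboot, resize the filesystem holding /var
  systemctl start mariadb slurmdbd slurmctld
  scontrol update PartitionName=compute State=UP

Running jobs on the compute nodes are untouched throughout.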

josef

On 24. 06. 21 0:54, Amjad Syed wrote:
Hello all
We have a cluster running CentOS 7. Our Slurm scheduler runs on a VM, and we are running out of disk space for /var; the Slurm InnoDB data takes most of the space. We intend to expand the vdisk for the Slurm server, which will require a reboot for the change to take effect. Do we have to stop users from submitting jobs by draining all partitions and then restart the server, i.e. slurmctld, slurmdbd and mariadb? Or will restarting the Slurm VM have no effect on running/pending jobs?
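
For reference, what's eating /var can be confirmed with something like this (paths assume a default MariaDB install on CentOS 7):

  df -h /var
  du -sh /var/lib/mysql   # default MariaDB/InnoDB data directory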

Sincerely

Amjad


--
Josef Dvoracek
Institute of Physics | Czech Academy of Sciences
cell: +420 608 563 558 | https://telegram.me/jose_d | FZU phone nr. : 2669

