I need to update the configuration for the nodes in a cluster and I’d like to 
let jobs keep running while I do so. Specifically I need to add 
RealMemory=<blah> to the node definitions (NodeName=). Is it safe to do this 
for nodes where jobs are currently running? Or do I need to make sure the nodes
are drained while updating their config? We are using SelectType=select/linear on
this cluster. Users would only be allocating complete nodes.

Additionally, do I need to restart the Slurm daemons (slurmctld and slurmd) to 
make this change? I understand that if I were adding completely new nodes I would
need to do so (and that it’s advised to stop slurmctld, update config files, 
restart slurmd on all computes, and then start slurmctld). But is restarting 
the Slurm daemons also required when updating node config as I would like to 
do, or would ‘scontrol reconfigure’ suffice?
