On 23-12-2023 05:09, Jeffrey Tunison wrote:
> Is there a straightforward way to create a batch job that runs once on
> every node in the cluster?
> A technique simpler than generating a list from sinfo output and
> dispatching the job in a for loop for the N nodes.
> That’s not very hard, but I thought there might be an elegant solution
> which would make dispatching maintenance jobs easier.

One solution is the method in this script:
https://github.com/OleHolmNielsen/Slurm_tools/blob/master/nodes/update.sh
This works very reliably for us when we need to apply OS or firmware
updates.
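
For readers who only want the general idea, here is a minimal sketch of the one-job-per-node pattern. It is not a copy of update.sh and I make no claim that it matches what that script does in detail. The function name submit_per_node, the SBATCH_CMD override (there only so the loop can be dry-run with "echo"), the "maint_" job-name prefix and the use of --exclusive are all assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: submit one single-node batch job per node name read from stdin.
# SBATCH_CMD is a hypothetical override; set it to "echo" for a dry run.
SBATCH_CMD="${SBATCH_CMD:-sbatch}"

# Usage: <node names, one per line> | submit_per_node <batch-script>
submit_per_node() {
    local script="$1" node
    while read -r node; do
        # Skip blank lines in the node list.
        [ -z "$node" ] && continue
        # Pin the job to exactly this node. --exclusive is an assumption:
        # maintenance work usually should not share the node with user jobs.
        # stdin is redirected so the submit command cannot eat the node list.
        "$SBATCH_CMD" --nodelist="$node" --nodes=1 --exclusive \
            --job-name="maint_$node" "$script" < /dev/null
    done
}
```

On a real cluster the node list could come from sinfo, for example: sinfo -N -h -o "%N" | sort -u | submit_per_node maintenance.sh (maintenance.sh is a placeholder name). The sort -u is there because sinfo -N prints a node once per partition it belongs to.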

> SLURM 22.05.09

Note: You should apply the recent Slurm security updates ASAP!
/Ole