Make sure RebootProgram is configured in slurm.conf, and that the
program exists on the nodes and is executable by the user slurmd runs as.
This is usually /sbin/reboot.
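
For example, the slurm.conf line would typically read RebootProgram=/sbin/reboot. A rough way to check the second half of the advice on a compute node is sketched below. The function name check_reboot_program is made up for illustration, and the path passed to it is an assumption; substitute whatever your slurm.conf actually sets.

```shell
# Hypothetical helper (not part of Slurm): reports whether the given
# path exists and is executable by the user running this check.
# Run it on a compute node, as the user slurmd runs as.
check_reboot_program() {
    prog="$1"
    if [ -x "$prog" ]; then
        echo "ok: $prog is executable"
    else
        echo "problem: $prog is missing or not executable"
        return 1
    fi
}

# To see the value the running daemons have loaded (needs a working
# Slurm install, so it is left commented out here):
#   scontrol show config | grep -i RebootProgram
```

Calling check_reboot_program /sbin/reboot on each node (assuming that is the configured path) should print the "ok" line. If the value shown by scontrol differs from what is in slurm.conf, the daemons probably have not picked up the change yet.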
Brian Andrus
On 6/7/2023 7:50 AM, Heinz, Michael wrote:
Hey, all.
So I added slurmdbd to our slurm-23.02 install and made my account an
admin, but when I try to do an srun with --reboot it just sits
forever: no errors, nothing in the logs. The node stays in “CF” state
until I cancel the job, set the node to down, and then set it back to
idle again.
I tried setting RebootProgram to a script that just writes to a file
in /tmp but the program never runs.
Any suggestions?
Michael Heinz
End-to-End Network Software Engineer
michael.he...@intel.com