Hi Doug,
Slurm has the strigger[1] mechanism, which can do exactly that; the
man page even has your use case as an example. It works quite well for us.
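
For illustration, here is a rough sketch along the lines of the man page example, not a tested recipe. The install path, the recipient address, and the function names are assumptions of mine; only the strigger options (--set, --node, --down, --program) come from the man page. It also assumes a working local `mail` command, and that the script is run as SlurmUser, which is normally required to set triggers.

```shell
#!/bin/bash
# Hypothetical notify script. The path, address, and function names below
# are assumptions for illustration, not anything prescribed by Slurm.

ADMIN_EMAIL="${ADMIN_EMAIL:-root@localhost}"           # assumed recipient
SELF="${SELF:-/usr/local/sbin/slurm_node_down_notify}" # assumed install path of this script

# Build the mail subject from the node names passed in as arguments.
build_subject() {
    echo "Slurm nodes DOWN: $*"
}

main() {
    # A trigger fires only once by default, so re-arm it for the next event.
    strigger --set --node --down --program="$SELF"
    # Send the notification to the administrator.
    echo "Reported by strigger on $(hostname) at $(date)" \
        | mail -s "$(build_subject "$@")" "$ADMIN_EMAIL"
}

# Act only when node names were passed in (as slurmctld does when the
# trigger fires); with no arguments there is nothing to report.
if [[ $# -gt 0 ]]; then
    main "$@"
fi
```

To arm it the first time, run the same `strigger --set --node --down --program=...` line once by hand, pointing at wherever you install the script. `strigger --get` should then list the pending trigger.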
Best,
Marcus
[1] https://slurm.schedmd.com/strigger.html
On 26.06.21 19:10, Doug Niven wrote:
Hi Folks,
I’d like to set up an email notification, perhaps via cron (unless there’s a
better method), to alert the sysadmin when a Slurm node is down and/or not
firing off jobs...
For example, using ‘squeue’ in NODELIST(REASON) I recently saw:
(Nodes required for job are DOWN, DRAINED or re