20230130 21:25:14 [slurm] /usr/libexec/nhc/node-mark-online mcn26.chicagobooth.edu
/usr/libexec/nhc/node-mark-online: Not sure how to handle node state "" on mcn26.chicagobooth.edu
/usr/libexec/nhc/node-mark-online:
Hi Jim,
Maybe you'll find these Wiki pages relevant for setting up your Slurm
database:
https://wiki.fysik.dtu.dk/Niflheim_system/
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_database/
/Ole
On 1/30/23 20:43, Jim Klo wrote:
I’ve been working on updating our small slurm cluster over the la
Hi,
Currently, some of our nodes are overloaded. The installed NHC used to
check the load and drain a node when it was overloaded. However, for the
past few days, it has not been showing the state of the node. When I run
/usr/sbin/nhc manually, it says:
20230130 21:25:14 [slurm] /usr/libexec/nhc/node
Greetings,
I’ve been working on updating our small Slurm cluster over the last few days.
I’ve successfully updated the cluster. However, our cluster is missing the
slurmdbd configuration, and while I know it’s not required, I would like to add
that as it would be helpful to access job history d