Hi Kevin,

On 11/4/20 6:00 pm, Kevin Buckley wrote:

> In looking at the SlurmCtlD log we see pairs of lines as follows
>
>   update_node: node nid00245 reason set to: slurm.conf
>   update_node: node nid00245 state set to DRAINED
I'd go looking in your healthcheck scripts. I took a quick look at the source last night and couldn't see anything that looked related, and it's not a message I remember seeing before.
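If it helps, here is a rough sketch of how you might scan a directory of healthcheck scripts for anything that drains a node through `scontrol update`. It is only a guess at what to look for: the function name `find_drain_calls` and the regular expression are my own illustration, and your scripts may set the state some other way (a wrapper, a different tool), so an empty result does not rule them out.

```python
import re
from pathlib import Path

# Heuristic (an assumption, not anything Slurm defines): a line that runs
# "scontrol ... update ..." and sets State to a drain or down value.
DRAIN_RE = re.compile(r"scontrol\b.*\bupdate\b.*\bstate=(drain|down)", re.IGNORECASE)


def find_drain_calls(root):
    """Return (path, line_number, line) for each matching line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # errors="replace" keeps the scan going past non-text files.
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file, skip it
        for lineno, line in enumerate(text.splitlines(), start=1):
            if DRAIN_RE.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

You would call it as `find_drain_calls("/path/to/your/healthcheck/scripts")`, where the path is whatever directory holds the scripts on your system. Searching the same tree for the literal reason string `slurm.conf` would be another cheap check.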
Also take a look in the slurmd logs on the node for that time, to see if there's anything that correlates there.
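For the correlation, a small filter like the one below might save some scrolling. It is a sketch under one assumption: that your slurmd log lines start with a bracketed timestamp of the form `[YYYY-MM-DDTHH:MM:SS.mmm]`. If your logs go via syslog or use another format, the pattern will need changing. The name `lines_near` and the 60 second default window are mine, picked only for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed log prefix: "[2020-01-01T00:00:00.000] message" (placeholder value).
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?\]")


def lines_near(lines, when, window_seconds=60):
    """Return the log lines stamped within window_seconds of `when`."""
    window = timedelta(seconds=window_seconds)
    nearby = []
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # continuation line or a different format, skip it
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if abs(stamp - when) <= window:
            nearby.append(line.rstrip("\n"))
    return nearby
```

Open the slurmd log from the node (wherever `SlurmdLogFile` points in your configuration), pass the file object as `lines`, and pass the time of the `update_node` entry from the slurmctld log as `when`.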
Good luck!
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA