Hi Kevin,

I have no experience with version 20 of Slurm, but this is probably a misconfiguration.
Have you changed any settings in your slurm.conf file since the upgrade? Check the documentation to verify whether any of the slurm.conf directives changed in the new release.
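One way to check for changed settings is to compare the slurm.conf you ran before the upgrade with the current one. The following is only a minimal sketch, not a Slurm tool: the function names and the sample file contents are made up for illustration, and it assumes that everything after a "#" is a comment, which is a simplification.

```python
def load_directives(text):
    """Return the set of meaningful lines from slurm.conf-style text."""
    lines = set()
    for raw in text.splitlines():
        # Assumption: anything after '#' is a comment (simplified parsing).
        line = raw.split("#", 1)[0].strip()
        if line:
            # Collapse runs of whitespace so spacing changes are ignored.
            lines.add(" ".join(line.split()))
    return lines


def diff_configs(old_text, new_text):
    """Return (removed, added) lines, each sorted, between two config texts."""
    old, new = load_directives(old_text), load_directives(new_text)
    return sorted(old - new), sorted(new - old)


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    before = "ClusterName=demo\nSlurmctldTimeout=120  # seconds\n"
    after = "ClusterName=demo\nSlurmctldTimeout=300\n"
    removed, added = diff_configs(before, after)
    print("removed:", removed)
    print("added:", added)
```

In practice you would read the two files from disk and pass their contents to diff_configs. An empty result on both sides would suggest the file itself did not change and the cause lies elsewhere.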
On 05/11/2020 02:00, Kevin Buckley wrote:
We have had a couple of nodes enter a DRAINED state where scontrol gives the reason as

  Reason=slurm.conf

In looking at the SlurmCtlD log we see pairs of lines as follows

  update_node: node nid00245 reason set to: slurm.conf
  update_node: node nid00245 state set to DRAINED

A search of the interweb thing for "Reason=slurm.conf" and "reason set to: slurm.conf" has so far proved too much for my search-fu.

The slurm.conf files do match across the Cray nodes so it seems unlikely that it would be a "something out of sync" thing.

Other "update_node" actions seen in the logs, e.g. "reason set to: NHC-Admindown", also put the node into a DRAINED state but then, that's to be expected.

We upgraded to 20.02.5 a couple of days ago and all of the occurrences of the "reason set to: slurm.conf" message only appear after the update.

Anyone else seen anything like this, especially any of you who have just gone 20.02.5?

Yours, diving into the source soon,
Kevin
--
*Cumprimentos / Best Regards,*
Zacarias Benta
INCD @ LIP - Universidade do Minho