Mostly, our problem was that we forgot to add a node to, or remove it from, the partitions or topology file, which caused slurmctld to refuse to start. So I wrote a simple checker for that. Here is the output of a sample run:

reading '../conf/rcc/slurm.conf' ...
reading '../conf/rcc/nodes.conf' ...
reading '../conf/rcc/partitions.conf' ...
reading '../conf/rcc/topology.conf' ...
reading '../conf/rcc/gres.conf' ...
[OK]: All nodeweights are correct.
[OK]: All nodes are defined only once.
[OK]: All nodes are used in partitions.
[OK]: There are no nonexisting nodes in the partitions.
[OK]: No nodes seen more than once in topology file.
[OK]: There are no nodes missing in topology.conf
[OK]: All nodes in topology.conf exist in slurm.conf
WARNING: GRES checking not yet implemented.

If someone is interested ...

Best
Marcus

On 13.10.2021 at 15:36, Paul Edmon wrote:
Sadly no. There is a feature request for one though:
https://bugs.schedmd.com/show_bug.cgi?id=3435

What we've done in the meantime is put together a gitlab runner which basically starts up a mini instance of the scheduler and runs slurmctld on the slurm.conf we want to put in place. We then have it reject any changes that cause failure. It's not perfect but it works. A real syntax checker would be better.

-Paul Edmon-

On 10/12/2021 4:08 PM, bbenede...@goodyear.com wrote:

Is there any sort of syntax checker that we could run our slurm.conf file through before committing it? (And sometimes crashing slurmctld in the process...)

Thanks!
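
For anyone who wants to try something similar, the node consistency checks named in the sample output above could be sketched roughly as follows. This is not the actual checker from this thread, only a minimal illustration in Python. The function names (expand_hostlist, nodes_from, check) are made up for this sketch. It assumes one definition per line, case-sensitive keys, at most one bracket group per host name, and it does not handle Nodes=ALL, line continuations, node weights, or GRES.

```python
import re
from collections import Counter


def expand_hostlist(expr):
    """Expand a simplified hostlist such as 'n[01-03,07],login1'.

    Assumption: at most one bracket group per name. Real Slurm hostlists
    allow more forms than this sketch handles.
    """
    names = []
    # Split on commas that are not inside a bracket group.
    for part in re.split(r",(?![^\[]*\])", expr):
        m = re.fullmatch(r"([^\[\]]*)\[([^\]]+)\]([^\[\]]*)", part)
        if not m:
            if part:
                names.append(part)
            continue
        prefix, ranges, suffix = m.groups()
        for rng in ranges.split(","):
            lo, _, hi = rng.partition("-")
            hi = hi or lo
            for i in range(int(lo), int(hi) + 1):
                # Keep the zero padding of the lower bound, e.g. 01 -> 02.
                names.append(f"{prefix}{i:0{len(lo)}d}{suffix}")
    return names


def nodes_from(text, line_key, list_key):
    """Collect node names from lines that carry both line_key and list_key."""
    found = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = dict(f.split("=", 1) for f in line.split() if "=" in f)
        if line_key in fields and list_key in fields:
            found.extend(n for n in expand_hostlist(fields[list_key])
                         if n != "DEFAULT")
    return found


def check(nodes_conf, partitions_conf, topology_conf):
    """Return a dict mapping each problem class to the offending node names."""
    defined = nodes_from(nodes_conf, "NodeName", "NodeName")
    in_parts = set(nodes_from(partitions_conf, "PartitionName", "Nodes"))
    in_topo = nodes_from(topology_conf, "SwitchName", "Nodes")
    known = set(defined)

    def dupes(names):
        return sorted(n for n, c in Counter(names).items() if c > 1)

    return {
        "defined more than once": dupes(defined),
        "not used in any partition": sorted(known - in_parts),
        "in partitions but undefined": sorted(in_parts - known),
        "more than once in topology": dupes(in_topo),
        "missing from topology": sorted(known - set(in_topo)),
        "in topology but undefined": sorted(set(in_topo) - known),
    }
```

Each function takes the file contents as a string, so a wrapper would read nodes.conf, partitions.conf and topology.conf, call check(), and print an [OK] line for every empty list and an error otherwise.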

-- 
Dipl.-Inf. Marcus Wagner

IT Center
Group: Server, Storage, HPC
Department: Systems and Operations
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wag...@itc.rwth-aachen.de
www.itc.rwth-aachen.de

IT Center social media channels:
https://blog.rwth-aachen.de/itc/
https://www.facebook.com/itcenterrwth
https://www.linkedin.com/company/itcenterrwth
https://twitter.com/ITCenterRWTH
https://www.youtube.com/channel/UCKKDJJukeRwO0LP-ac8x8rQ