We currently have a test cluster and a production cluster, both on the same 
network.  We try things on the test cluster, then gather those changes and 
apply them to the production cluster.  We do that through two different 
repos, but we'd like a single repo, to make the transition from testing 
configs to publishing them more seamless.  The problem is, of course, that 
the test and production clusters have different cluster names, as well as 
different nodes within them.

Using the include directive, I can pull all of the NodeName lines out of 
slurm.conf and put them into %c-nodes.conf files, one for production, one for 
test.  That still leaves me with two problems:

  *   The ClusterName itself will still be a problem.  I WANT the same 
slurm.conf file on test and production...but the ClusterName line has to be 
different on each.  Can I use an env var in that cluster name, so that 
production and test can each set a different value?
  *   The gres.conf file.  I tried using the same "include" trick that works on 
slurm.conf, but it failed because it did not know what the "ClusterName" was.  
I think that means that either it doesn't work for anything other than 
slurm.conf, or that the clustername will have to be defined in gres.conf as 
well?
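
For concreteness, the slurm.conf arrangement I have in mind looks roughly like 
the sketch below.  The /etc/slurm path, the node names and the node sizes are 
made up for illustration; only the %c-nodes.conf naming is what I described 
above.

    # slurm.conf (the file I'd like to be identical on test and production)
    # This is the one line that still has to differ per cluster:
    ClusterName=test
    # %c is replaced with the ClusterName value, so this reads test-nodes.conf
    include /etc/slurm/%c-nodes.conf

    # /etc/slurm/test-nodes.conf (hypothetical nodes)
    NodeName=tnode[01-04] CPUs=16 RealMemory=64000 State=UNKNOWN

    # /etc/slurm/production-nodes.conf (hypothetical nodes)
    NodeName=pnode[001-128] CPUs=64 RealMemory=256000 State=UNKNOWN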

Any other suggestions of how to keep our slurm files in a single source control 
repo, but still have the flexibility to have them run elegantly on either test 
or production systems?

Thanks.
