Yeah, I had that problem as well (trying to set up a partition that
didn't have any nodes - they're not here yet).
I found that you can define partitions whose nodes don't exist yet,
though. As in, not even in DNS.
I currently have this:
[arc-slurm ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
short        up   12:00:00      1  down* arc-c023
short        up   12:00:00      1  alloc arc-c001
short        up   12:00:00     43   idle arc-c[002-022,024-045]
medium       up 2-00:00:00      0    n/a
long*        up   infinite      0    n/a
with medium & long partition containing nodes 'arc-c[046-297]':
PartitionName=medium
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=12:00:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0
Hidden=NO
MaxNodes=UNLIMITED MaxTime=2-00:00:00 MinNodes=0 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=arc-c[046-297]...
which don't exist as of today:
[arc-slurm ~]$ host arc-c046
Host arc-c046 not found: 3(NXDOMAIN)
which - as you can see - simply ends up with SLURM showing the partition
with no nodes.
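To check a whole range rather than a single name, here is a rough bash sketch. The helper names expand_hostlist and unresolved_nodes are made up for illustration, and the expansion only handles one simple bracket group like the ones above. On a machine with Slurm installed, `scontrol show hostnames` does the expansion properly.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Slurm.

# Expand a simple Slurm-style hostlist such as "arc-c[046-048,050]" into
# one hostname per line. Only a single bracket group is supported; the
# zero padding of each range's lower bound is preserved.
expand_hostlist() {
    local spec=$1
    # No bracket group: the spec is already a single hostname.
    if [[ $spec != *"["*"]"* ]]; then
        printf '%s\n' "$spec"
        return
    fi
    local prefix=${spec%%\[*}
    local body=${spec#*\[}
    body=${body%%\]*}
    local suffix=${spec#*\]}
    local item lo hi width n
    local IFS=,
    for item in $body; do
        if [[ $item == *-* ]]; then
            lo=${item%-*}
            hi=${item#*-}
            width=${#lo}
            # 10# forces base 10 so that "046" is not read as octal.
            for ((n = 10#$lo; n <= 10#$hi; n++)); do
                printf '%s%0*d%s\n' "$prefix" "$width" "$n" "$suffix"
            done
        else
            printf '%s%s%s\n' "$prefix" "$item" "$suffix"
        fi
    done
}

# Print every name in the hostlist that `host` cannot resolve.
unresolved_nodes() {
    local name
    expand_hostlist "$1" | while IFS= read -r name; do
        host "$name" >/dev/null 2>&1 || printf '%s\n' "$name"
    done
}

# Example call (needs working DNS, so it is left commented out):
# unresolved_nodes 'arc-c[046-297]'
```

The DNS lookup sits in its own function so the expansion can be checked without touching the network.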
So you could just put a dummy nodename in the slurm.conf file?
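A minimal, untested sketch of what that could look like, reusing the partition line from your mail. The placeholder node names are hypothetical, and State=FUTURE is an assumption on my part: slurm.conf(5) documents it for nodes that are defined for future use and need not exist when the daemons start.

```
# Hypothetical placeholder nodes; the names and the count are made up.
# State=FUTURE is an assumption, see slurm.conf(5).
NodeName=placeholder-[001-004] State=FUTURE

# Partition line from the original mail, now pointing at the placeholders.
PartitionName=compute Default=YES MaxTime=86400 State=UP Nodes=placeholder-[001-004]
```

Once the real nodes arrive, the placeholder names would be swapped for the real ones in both lines.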
Tina
On 18/12/2020 11:13, Steve Brasier wrote:
Having tried not defining any partitions at all, you hit this check
<https://github.com/SchedMD/slurm/blob/master/src/common/node_conf.c#L383>,
which seems to ensure you can't create a cluster with no nodes. Is it
possible to create a control node without any compute nodes, e.g. as
part of a staged deployment?
http://stackhpc.com/
Please note I work Tuesday to Friday.
On Fri, 18 Dec 2020 at 10:56, Steve Brasier <ste...@stackhpc.com> wrote:
Hi all,
According to the relevant manpage
<https://slurm.schedmd.com/archive/slurm-20.02.5/slurm.conf.html>
it's possible to define an empty partition using "Nodes= ".
However this doesn't seem to work (Slurm 20.02.5):
[centos@testohpc-login-0 ~]$ grep -n Partition /etc/slurm/slurm.conf
72:PriorityWeightPartition=1000
105:PartitionName=compute Default=YES MaxTime=86400 State=UP Nodes=
(note there is a space after that final "=" but I've tried both
with and without)
[centos@testohpc-login-0 ~]$ sinfo
sinfo: error: Parse error in file /etc/slurm/slurm.conf line 105: " Nodes= "
sinfo: fatal: Unable to process configuration file
Is this a bug, or am I doing it wrong?
thanks for any suggestions
Steve