Yeah, I had that problem as well (trying to set up a partition that didn't have any nodes - they're not here yet).

I found that you can have partitions whose nodes don't exist, though. As in, not even in DNS.

I currently have this:

[arc-slurm ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
short        up   12:00:00      1  down* arc-c023
short        up   12:00:00      1  alloc arc-c001
short        up   12:00:00     43   idle arc-c[002-022,024-045]
medium       up 2-00:00:00      0    n/a
long*        up   infinite      0    n/a

with the medium & long partitions containing nodes 'arc-c[046-297]':

PartitionName=medium
   AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=N/A
   DefaultTime=12:00:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=2-00:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=arc-c[046-297]...

which don't exist as of today:

[arc-slurm ~]$ host arc-c046
Host arc-c046 not found: 3(NXDOMAIN)

As you can see, SLURM simply shows those partitions with no nodes.

So maybe you could just put a dummy node name in the slurm.conf file?
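
A placeholder along those lines might look like the fragment below. This is only a sketch under assumptions: the node name dummy-c001 and its CPUs=1 are made up for illustration, and State=FUTURE is the slurm.conf node state documented for nodes that are defined for future use and need not exist when the daemons start. Whether a cluster containing only such nodes gets past the node_conf.c check mentioned in the quoted message below is untested here.

```conf
# Hypothetical placeholder node: "dummy-c001" is an illustrative name, not a real host.
# State=FUTURE marks it as defined for future use, so it need not resolve in DNS yet.
NodeName=dummy-c001 CPUs=1 State=FUTURE

# Partition line taken from the original post, with the placeholder in place of "Nodes= ".
PartitionName=compute Default=YES MaxTime=86400 State=UP Nodes=dummy-c001
```

Once the real nodes arrive, the placeholder entry would be replaced with their actual NodeName definitions and the partition's Nodes= list updated to match.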

Tina


On 18/12/2020 11:13, Steve Brasier wrote:
Having tried not defining any partitions at all, you hit this check <https://github.com/SchedMD/slurm/blob/master/src/common/node_conf.c#L383>, which seems to ensure you can't create a cluster with no nodes. Is it possible to create a control node without any compute nodes, e.g. as part of a staged deployment?

http://stackhpc.com/
Please note I work Tuesday to Friday.


On Fri, 18 Dec 2020 at 10:56, Steve Brasier <ste...@stackhpc.com> wrote:

    Hi all,

    According to the relevant manpage
    <https://slurm.schedmd.com/archive/slurm-20.02.5/slurm.conf.html>
    it's possible to define an empty partition using "Nodes= ".

    However, this doesn't seem to work (Slurm 20.02.5):

    [centos@testohpc-login-0 ~]$ grep -n Partition /etc/slurm/slurm.conf
    72:PriorityWeightPartition=1000
    105:PartitionName=compute Default=YES MaxTime=86400 State=UP Nodes=

    (note there is a space after that final "=" but I've tried both
    with and without)

    [centos@testohpc-login-0 ~]$ sinfo
    sinfo: error: Parse error in file /etc/slurm/slurm.conf line 105:
    " Nodes= "
    sinfo: fatal: Unable to process configuration file

    Is this a bug, or am I doing it wrong?

    thanks for any suggestions

    Steve


