Nodes are probably misconfigured in slurm.conf, yes. You can use the output of
'slurmd -C' on a compute node to get started on what your NodeName entry in
slurm.conf should be:
[root@node001 ~]# slurmd -C
NodeName=node001 CPUs=28 Boards=1 SocketsPerBoard=2 CoresPerSocket=14
ThreadsPerCore=1 Re
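
As a rough sketch of how that output could be captured for slurm.conf: slurmd -C typically prints a second line (UpTime=...) after the NodeName line, so a small filter can keep only the part you would paste. The helper name nodename_line is made up for illustration, and the result should still be reviewed by hand before it goes into slurm.conf.

```shell
# Keep only the NodeName=... line from `slurmd -C` output read on stdin,
# dropping any other line (such as UpTime=...) that slurmd prints.
# nodename_line is a hypothetical helper name, not a Slurm command.
nodename_line() {
    grep '^NodeName='
}

# On a compute node, assuming slurmd is on PATH; skipped otherwise.
if command -v slurmd >/dev/null 2>&1; then
    slurmd -C | nodename_line
fi
```

The printed line is only a starting point; the same entry has to be consistent in the slurm.conf used by the controller and by the nodes.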
Mike,
I’m working through your suggestions. I tried
$ salloc --ntasks=20 --cpus-per-task=24 --verbose myscript.bash
but salloc says that the resources are not available:
salloc: defined options
salloc:
salloc: cpus-per-task : 24
salloc: ntasks
The end of the MPICH section at [1] shows an example using salloc [2].
Worst case, you should be able to use the output of “scontrol show hostnames”
[3] and use that data to make mpiexec command parameters to run one rank per
node, similar to what’s shown at the end of the synopsis section of [4
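
A minimal sketch of that worst-case approach, assuming MPICH's Hydra mpiexec (which accepts -hosts and -ppn) and a shell running inside the allocation so that SLURM_JOB_NODELIST is set. The helper names and ./my_mpi_program are placeholders, not anything from the referenced pages. It prints the mpiexec command instead of running it, so the host list can be checked first.

```shell
# Join newline-separated hostnames from stdin into one comma-separated list.
join_hosts() {
    paste -sd, -
}

# Print an mpiexec command line that starts one rank per listed host.
# $1 is the comma-separated host list; remaining args are the program to run.
# Assumes MPICH's Hydra launcher for the -hosts and -ppn options.
build_mpiexec_cmd() {
    hosts=$1
    shift
    # Count the hosts so that -n matches one rank per node.
    nnodes=$(printf '%s\n' "$hosts" | tr ',' '\n' | grep -c .)
    printf 'mpiexec -hosts %s -ppn 1 -n %s %s\n' "$hosts" "$nnodes" "$*"
}

# Inside a Slurm allocation only; ./my_mpi_program is a placeholder.
if command -v scontrol >/dev/null 2>&1 && [ -n "${SLURM_JOB_NODELIST:-}" ]; then
    hosts=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | join_hosts)
    build_mpiexec_cmd "$hosts" ./my_mpi_program
fi
```

Once the printed command looks right, it can be run directly from the same shell.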