Nodes are probably misconfigured in slurm.conf, yes. You can use the output of 
'slurmd -C' on a compute node as a starting point for what your NodeName entry 
in slurm.conf should be:


[root@node001 ~]# slurmd -C
NodeName=node001 CPUs=28 Boards=1 SocketsPerBoard=2 CoresPerSocket=14 
ThreadsPerCore=1 RealMemory=64333
UpTime=161-22:35:13

[root@node001 ~]# grep -i 'nodename=node\[001' /etc/slurm/slurm.conf
NodeName=node[001-022]  CoresPerSocket=14 RealMemory=62000 Sockets=2 
ThreadsPerCore=1 Weight=10201


Make sure that RealMemory in slurm.conf is no larger than what 'slurmd -C' 
reports. If I recall correctly, my slurm.conf settings are otherwise equivalent 
to, though not word-for-word identical with, what 'slurmd -C' reports (I 
specified Sockets instead of both Boards and SocketsPerBoard, for example).
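
For example, based on the 'scontrol show nodes' output further down (24 CPUs, 
roughly 126 GB of free memory), a corrected entry for your nodes might look 
something like the line below. The Sockets/CoresPerSocket split and the 
RealMemory figure are placeholders, not taken from your system; use the values 
that 'slurmd -C' reports on one of those nodes, and leave a little headroom on 
RealMemory:

NodeName=n[001-020] Sockets=2 CoresPerSocket=12 ThreadsPerCore=1 RealMemory=120000

After changing slurm.conf on all nodes, restart slurmctld and the slurmd 
daemons so the new node definition takes effect.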

From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Mccall, 
Kurt E. (MSFC-EV41) <kurt.e.mcc...@nasa.gov>
Date: Friday, November 26, 2021 at 1:22 PM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Reserving cores without immediately launching tasks 
on all of them

Mike,

I’m working through your suggestions.   I tried

$ salloc --ntasks=20 --cpus-per-task=24 --verbose myscript.bash

but salloc says that the resources are not available:

salloc: defined options
salloc: -------------------- --------------------
salloc: cpus-per-task       : 24
salloc: ntasks              : 20
salloc: verbose             : 1
salloc: -------------------- --------------------
salloc: end of defined options
salloc: Linear node selection plugin loaded with argument 4
salloc: select/cons_res loaded with argument 4
salloc: Cray/Aries node selection plugin loaded
salloc: select/cons_tres loaded with argument 4
salloc: Granted job allocation 34299
srun: error: Unable to create step for job 34299: Requested node configuration 
is not available

$ scontrol show nodes  /* oddly says that there is one core per socket.  could 
our nodes be misconfigured? */

NodeName=n020 Arch=x86_64 CoresPerSocket=1
   CPUAlloc=0 CPUTot=24 CPULoad=0.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=n020 NodeHostName=n020 Version=20.02.3
   OS=Linux 4.18.0-305.7.1.el8_4.x86_64 #1 SMP Mon Jun 14 17:25:42 EDT 2021
   RealMemory=1 AllocMem=0 FreeMem=126431 Sockets=24 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal,low,high
   BootTime=2021-11-18T08:43:44 SlurmdStartTime=2021-11-18T08:44:31
   CfgTRES=cpu=24,mem=1M,billing=24
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s



From: slurm-users <slurm-users-boun...@lists.schedmd.com> On Behalf Of Renfro, 
Michael
Sent: Friday, November 26, 2021 8:15 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: [EXTERNAL] Re: [slurm-users] Reserving cores without immediately 
launching tasks on all of them

The end of the MPICH section at [1] shows an example using salloc [2].

Worst case, you should be able to use the output of “scontrol show hostnames” 
[3] and use that data to make mpiexec command parameters to run one rank per 
node, similar to what’s shown at the end of the synopsis section of [4].
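
As a rough sketch of that worst-case approach (the node counts, the executable 
name, and the Hydra-style -hosts/-ppn flags are assumptions on my part, so 
adjust for your MPI stack):

#!/bin/bash
#SBATCH --nodes=20
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24

# Turn the allocation's node list into a comma-separated host list
HOSTS=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | paste -sd, -)

# Start one manager rank per node; the other cores on each node stay
# allocated to the job but free for MPI_Comm_spawn workers
mpiexec -hosts "$HOSTS" -ppn 1 ./manager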

[1] https://slurm.schedmd.com/mpi_guide.html#mpich2
[2] https://slurm.schedmd.com/salloc.html
[3] https://slurm.schedmd.com/scontrol.html
[4] https://www.mpich.org/static/docs/v3.1/www1/mpiexec.html
--
Mike Renfro, PhD  / HPC Systems Administrator, Information Technology Services
931 372-3601      / Tennessee Tech University



On Nov 25, 2021, at 12:45 PM, Mccall, Kurt E. (MSFC-EV41) 
<kurt.e.mcc...@nasa.gov> wrote:


I want to launch an MPICH job with sbatch with one task per node (each a 
manager), while also reserving a certain number of cores on each node for the 
managers to fill up with spawned workers (via MPI_Comm_spawn). I'd like to 
avoid using --exclusive.

I tried the arguments --ntasks=20 --cpus-per-task=24, but it appears that 
20 * 24 tasks will be launched. Is there a way to reserve cores without 
immediately launching tasks on them? Thanks for any help.
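
The command was roughly the following (the script name here is just a 
placeholder):

$ sbatch --test-only --ignore-pbs --verbose --ntasks=20 --cpus-per-task=24 myscript.bash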

sbatch: defined options
sbatch: -------------------- --------------------
sbatch: cpus-per-task       : 24
sbatch: ignore-pbs          : set
sbatch: ntasks              : 20
sbatch: test-only           : set
sbatch: verbose             : 1
sbatch: -------------------- --------------------
sbatch: end of defined options
sbatch: Linear node selection plugin loaded with argument 4
sbatch: select/cons_res loaded with argument 4
sbatch: Cray/Aries node selection plugin loaded
sbatch: select/cons_tres loaded with argument 4
sbatch: Job 34274 to start at 2021-11-25T12:15:05 using 480 processors on nodes 
n[001-020] in partition normal
