Cristóbal,
Your approach is a little off.
Slurm needs to know about the node properties. It can then allocate
resources based on the job and partition settings.
So, you should have a single "NodeName" entry for the node that
accurately describes everything you want to allow access to.
Then you limit what is allowed to be requested in the partition
definition and/or a QOS (if you are using accounting).
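As a sketch of what I mean (the QOS names and limits below are just
illustrative, not taken from your setup): describe the node once, and
put the restrictions on the partitions, optionally backed by a QOS if
accounting is enabled:

    # slurm.conf -- illustrative sketch only
    NodeName=nodeGPU01 SocketsPerBoard=8 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=1024000 State=UNKNOWN Gres=gpu:A100:8

    # limits live in the partitions, not in duplicate NodeName lines
    PartitionName=gpu Nodes=nodeGPU01 MaxCPUsPerNode=64 QOS=gpuq MaxTime=1-00:00:00 State=UP
    PartitionName=cpu Nodes=nodeGPU01 MaxCPUsPerNode=64 QOS=cpuq MaxTime=1-00:00:00 State=UP

    # with accounting, a QOS can additionally cap TRES, e.g. keep GPUs
    # out of the cpu partition entirely:
    #   sacctmgr add qos cpuq set MaxTRESPerUser=cpu=64,gres/gpu=0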
Brian Andrus
On 5/7/2021 8:11 PM, Cristóbal Navarro wrote:
Hi community,
I am unable to tell if SLURM is handling the following situation
efficiently in terms of CPU affinities at each partition.
Here we have a very small cluster with just one GPU node with 8x GPUs,
which offers two partitions: "gpu" and "cpu".
Part of the config file:

    ## Nodes list
    ## use native GPUs
    NodeName=nodeGPU01 SocketsPerBoard=8 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=1024000 State=UNKNOWN Gres=gpu:A100:8 Feature=gpu
    ## Default CPU layout (same total cores as others)
    #NodeName=nodeGPU01 SocketsPerBoard=8 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=1024000 State=UNKNOWN Gres=gpu:a100:4,gpu:a100_20g:2,gpu:a100_10g:2,gpu:a100_5g:16 Feature=ht,gpu

    ## Partitions list
    PartitionName=gpu OverSubscribe=FORCE MaxCPUsPerNode=64 DefCpuPerGPU=8 DefMemPerGPU=65556 MaxTime=1-00:00:00 State=UP Nodes=nodeGPU01 Default=YES
    PartitionName=cpu OverSubscribe=FORCE MaxCPUsPerNode=64 DefMemPerNode=16384 MaxTime=1-00:00:00 State=UP Nodes=nodeGPU01
The node has 128 CPU cores (2x 64-core AMD CPUs, SMT disabled), and the
resources have been subdivided via the partition options: a maximum of
64 cores for each partition.
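With that layout, jobs pick their share through the partition they
submit to; for example (standard sbatch flags, job script name is just
a placeholder):

    # a GPU job: 1 A100 plus the default 8 CPUs per GPU
    sbatch -p gpu --gres=gpu:A100:1 job.sh
    # a pure CPU job, capped at 64 cores by MaxCPUsPerNode
    sbatch -p cpu -n 32 job.sh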
The gres file is auto-generated with NVML, and it obeys the following
GPU topology (focus on the CPU affinity column) shown below:
    ➜ ~ nvidia-smi topo -m
          GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 mlx5_0 mlx5_1 mlx5_2 mlx5_3 mlx5_4 mlx5_5 mlx5_6 mlx5_7 mlx5_8 mlx5_9 CPU Affinity NUMA Affinity
    GPU0   X   NV12 NV12 NV12 NV12 NV12 NV12 NV12  PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    SYS    SYS    48-63        3
    GPU1  NV12  X   NV12 NV12 NV12 NV12 NV12 NV12  PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    SYS    SYS    48-63        3
    GPU2  NV12 NV12  X   NV12 NV12 NV12 NV12 NV12  SYS    SYS    PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    16-31        1
    GPU3  NV12 NV12 NV12  X   NV12 NV12 NV12 NV12  SYS    SYS    PXB    PXB    SYS    SYS    SYS    SYS    SYS    SYS    16-31        1
    GPU4  NV12 NV12 NV12 NV12  X   NV12 NV12 NV12  SYS    SYS    SYS    SYS    PXB    PXB    SYS    SYS    SYS    SYS    112-127      7
    GPU5  NV12 NV12 NV12 NV12 NV12  X   NV12 NV12  SYS    SYS    SYS    SYS    PXB    PXB    SYS    SYS    SYS    SYS    112-127      7
    GPU6  NV12 NV12 NV12 NV12 NV12 NV12  X   NV12  SYS    SYS    SYS    SYS    SYS    SYS    PXB    PXB    SYS    SYS    80-95        5
    GPU7  NV12 NV12 NV12 NV12 NV12 NV12 NV12  X    SYS    SYS    SYS    SYS    SYS    SYS    PXB    PXB    SYS    SYS    80-95        5
If we look closely, we can see specific CPU affinities for the GPUs; I
therefore assume that multi-core CPU jobs should use the 64 CPU cores
that are not listed here, e.g. cores 0-15, 32-47, ...
Will Slurm realize that CPU jobs should have this core affinity? If
not, is there a way to make these the default CPU affinities for all
jobs launched on the "cpu" partition?
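For reference, this is the kind of check and binding I have in mind
(the --cpu-bind flags are from the srun man page; the core list below
is illustrative and assumes the task/affinity plugin is enabled):

    # inspect which cores a cpu-partition job actually receives
    srun -p cpu -n 1 --cpu-bind=verbose grep Cpus_allowed_list /proc/self/status
    # force a specific binding manually if needed
    srun -p cpu -n 1 --cpu-bind=map_cpu:0,1,2,3 ./app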
Any help is welcome
--
Cristóbal A. Navarro