Hello,

I have compiled Slurm 24.11.3 and configured two GPUs on my system 
(slurmctld and slurmd run on the same computer). The computer has an old 
Intel i7 processor with 4 cores and hyperthreading (8 hardware threads). 
The node is configured with 
"NodeName=mysystem CPUs=8 Boards=1 SocketsPerBoard=1 CoresPerSocket=4 
ThreadsPerCore=2 RealMemory=7940 
Gres=gpu:geforce_gtx_titan_x:1,gpu:geforce_gtx_titan_black:1". The "lscpu" 
command returns:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
BIOS Vendor ID:      Intel(R) Corporation
CPU family:          6
Model:               26
Model name:          Intel(R) Core(TM) i7 CPU         950  @ 3.07GHz
BIOS Model name:     Intel(R) Core(TM) i7 CPU         950  @ 3.07GHz
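
As a sanity check, the topology values in the node definition agree with the lscpu output: Boards x SocketsPerBoard x CoresPerSocket x ThreadsPerCore = 1 x 1 x 4 x 2 = 8, which equals CPUs=8. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and used only for illustration.

```python
def logical_cpus(boards, sockets_per_board, cores_per_socket, threads_per_core):
    """Number of logical CPUs implied by a node's topology values."""
    return boards * sockets_per_board * cores_per_socket * threads_per_core


# Values copied from the NodeName line above.
configured_cpus = 8
derived_cpus = logical_cpus(
    boards=1, sockets_per_board=1, cores_per_socket=4, threads_per_core=2
)

# True when the CPUs= value matches the product of the topology values.
print(derived_cpus == configured_cpus)
```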

The gres.conf file is:
NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_x File=/dev/nvidia0 CPUs=0-1
NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_black File=/dev/nvidia1 CPUs=2-3

However, when I start the "slurmctld" daemon, it logs these errors:
[2025-04-28T09:35:41.003] error: _check_core_range_matches_sock: gres/gpu GRES core specification 0-1 for node aopcvis5 doesn't match socket boundaries. (Socket 0 is cores 0-3)
[2025-04-28T09:35:41.003] error: Setting node aopcvis5 state to INVAL with reason:gres/gpu GRES core specification 0-1 for node aopcvis5 doesn't match socket boundaries. (Socket 0 is cores 0-3)
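
Read literally, the message says that the core range 0-1 does not line up with socket 0, which spans cores 0-3. A minimal Python sketch of such a check is below, assuming the message means that a GRES core range must cover whole sockets. This is not Slurm's actual implementation, and the helper names are hypothetical.

```python
def parse_range(spec):
    """Parse a 'first-last' or single-index spec such as '0-1' into a set."""
    first, _, last = spec.partition("-")
    return set(range(int(first), int(last or first) + 1))


def socket_cores(socket, cores_per_socket):
    """Core indices that belong to the given socket."""
    start = socket * cores_per_socket
    return set(range(start, start + cores_per_socket))


def matches_socket_boundaries(spec, sockets, cores_per_socket):
    """True if the core range is exactly a union of whole sockets."""
    cores = parse_range(spec)
    covered = set()
    for s in range(sockets):
        sc = socket_cores(s, cores_per_socket)
        if cores & sc:
            if not sc <= cores:
                return False  # the range touches only part of this socket
            covered |= sc
    return bool(cores) and cores == covered


# Topology from the node definition: 1 socket with 4 cores.
for spec in ("0-1", "2-3", "0-3"):
    print(spec, matches_socket_boundaries(spec, sockets=1, cores_per_socket=4))
```

Under this reading, both ranges in the gres.conf above (0-1 and 2-3) fail the check, and only 0-3 would pass on a single 4-core socket.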

Where is the error in my configuration?

Thanks.
-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com
