Hello,

I have compiled Slurm 24.11.3 and configured two GPUs on my system (slurmctld and slurmd are running on the same computer). The computer has an old Intel i7 processor with 4 cores and hyperthreading (2 threads per core). The node is configured with:

NodeName=mysystem CPUs=8 Boards=1 SocketsPerBoard=1 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=7940 Gres=gpu:geforce_gtx_titan_x:1,gpu:geforce_gtx_titan_black:1

The "lscpu" command returns:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
CPU family: 6
Model: 26
Model name: Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
BIOS Model name: Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz

The gres.conf file is:

NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_x File=/dev/nvidia0 CPUs=0-1
NodeName=mysystem Autodetect=off Name=gpu Type=geforce_gtx_titan_black File=/dev/nvidia1 CPUs=2-3

However, when I start the "slurmctld" daemon, it logs these errors:

[2025-04-28T09:35:41.003] error: _check_core_range_matches_sock: gres/gpu GRES core specification 0-1 for node aopcvis5 doesn't match socket boundaries. (Socket 0 is cores 0-3)
[2025-04-28T09:35:41.003] error: Setting node aopcvis5 state to INVAL with reason:gres/gpu GRES core specification 0-1 for node aopcvis5 doesn't match socket boundaries. (Socket 0 is cores 0-3)

Where is my configuration error? Thanks.
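
For reference, this is how I read the error message, written as a minimal Python sketch. The function name and the logic are my own assumptions based only on the log text ("doesn't match socket boundaries", "Socket 0 is cores 0-3"); this is not Slurm's actual code.

```python
def matches_socket_boundaries(first_core, last_core, cores_per_socket):
    """Return True if the inclusive core range covers only whole sockets.

    Hypothetical helper: an assumed reading of the check named in the log,
    not the real _check_core_range_matches_sock implementation.
    """
    starts_on_boundary = first_core % cores_per_socket == 0
    ends_on_boundary = (last_core + 1) % cores_per_socket == 0
    return starts_on_boundary and ends_on_boundary


# Values from the node definition above: 1 socket with 4 cores.
cores_per_socket = 4

# The two ranges from my gres.conf, plus the full socket for comparison.
for first, last in [(0, 1), (2, 3), (0, 3)]:
    ok = matches_socket_boundaries(first, last, cores_per_socket)
    status = "ok" if ok else "does not match socket boundaries"
    print(f"cores {first}-{last}: {status}")
```

If this reading is right, only a range covering the whole socket (0-3) would pass on this single-socket node, but I have not confirmed that this is what Slurm expects.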