Dear René,
this actually seems to be related to hyperthreading on Intel CPUs. We
see the same behaviour on our Xeon Gold 6134 nodes running Slurm 17.11.
The Slurm FAQ ( https://slurm.schedmd.com/faq.html#cpu_count ) states:
"Note that even on systems with hyperthreading enabled, the resources
will generally be allocated to jobs at the level of a core. Two
different jobs will not share a core except through the use of a
partition OverSubscribe configuration parameter. For example, a job
requesting resources for three tasks on a node with ThreadsPerCore=2
will be allocated two full cores."
We haven't played with the OverSubscribe option, though, so I have no
experience with it that I could share...
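If I read the FAQ right, the allocation is simply rounded up to the next
full core, which would match your numbers exactly (a back-of-the-envelope
sketch, not verified against the source code):
  --ntasks=3  ->  rounded up to 2 cores  ->  4 CPUs (2 threads each) blocked
  --ntasks=1  ->  rounded up to 1 core   ->  2 CPUs blocked
  --ntasks=8  ->  exactly 4 full cores   ->  8 CPUs, no rounding visible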
Best,
Christoph
On 13/09/2019 11.06, René Neumaier wrote:
Dear all,
recently we upgraded Slurm from 18.08.3 to 18.08.7 and we noticed the
following behaviour:
If we start the following test script:
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=test.log
#SBATCH --error=test.err
#SBATCH --ntasks=3
#SBATCH --mem=100M
#SBATCH --qos=high
sleep 30
4 CPUs instead of 3 CPUs are consumed - but only 3 are shown via
'squeue'. The slurmctld.log also shows 3 CPUs:
...
[2019-09-12T17:12:09.964] _slurm_rpc_submit_batch_job: JobId=40077
InitPrio=1000 usec=258
[2019-09-12T17:12:12.498] sched: Allocate JobId=40077 NodeList=mpcn5
#CPUs=3 Partition=lemmium
[2019-09-12T17:12:42.554] _job_complete: JobId=40077 WEXITSTATUS 0
[2019-09-12T17:12:42.554] _job_complete: JobId=40077 done
...
The same happens with 1 CPU: 2 are consumed instead of 1.
A job with 8 CPUs, however, consumed the correct amount of CPUs.
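(For the record: the requested vs. actually allocated CPUs can also be
compared after the fact, e.g. with something like
'sacct -j 40077 --format=JobID,ReqCPUS,AllocCPUS' - I am quoting the
field names from memory, so they may need adjusting.)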
This also depends on the partition / CPU type:
Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz = affected
Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz = affected
Quad-Core AMD Opteron(tm) Processor 8356 = not affected
The old AMD Opteron CPUs allow only one thread per core. The Intel CPUs
use HyperThreading, allowing 2 threads per core.
Node example:
NodeName=mpcn2 CPUs=32 RealMemory=128992 Sockets=8 CoresPerSocket=4
ThreadsPerCore=1 State=UNKNOWN
NodeName=mpcn3 CPUs=80 RealMemory=773208 Sockets=2 CoresPerSocket=20
ThreadsPerCore=2 State=UNKNOWN
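(In case it is relevant: the hardware layout as slurmd detects it can be
cross-checked by running 'slurmd -C' on each node; it prints the sockets,
cores per socket and threads per core it sees, which should match the
NodeName lines above.)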
I think this difference is the reason why odd CPU counts are not
possible for a job on the Intel partitions.
I have to admit that I am not sure whether this behaviour only appeared
with the last Slurm update, whether there is a misconfiguration, or
whether a "half core" (1 thread) simply cannot be consumed.
Normally, jobs were started with even CPU counts >= 8, especially on the
larger nodes, so this may simply not have been noticed before...
Gentoo GNU/Linux - Kernel 5.2.11
FastSchedule=1
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
TaskPlugin=task/cgroup
ProctrackType=proctrack/cgroup
I would be grateful for any idea.
Best regards,
René
--
Dr. Christoph Brüning
Universität Würzburg
Rechenzentrum
Am Hubland
D-97074 Würzburg
Tel.: +49 931 31-80499