Dear René,

this actually seems to be related to hyperthreading on Intel CPUs. We see the same behaviour on our Xeon Gold 6134 nodes running Slurm 17.11.

The Slurm FAQ ( https://slurm.schedmd.com/faq.html#cpu_count ) states: "Note that even on systems with hyperthreading enabled, the resources will generally be allocated to jobs at the level of a core. Two different jobs will not share a core except through the use of a partition OverSubscribe configuration parameter. For example, a job requesting resources for three tasks on a node with ThreadsPerCore=2 will be allocated two full cores."
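
A quick way to see this rounding in practice is to compare what the job reports with what the allocated node has blocked - a rough sketch, job ID and node name are placeholders you need to fill in:

# submit a small test job, note the job ID
sbatch --ntasks=3 --mem=100M --wrap="sleep 30"
# CPU count as reported for the job itself
squeue -j <jobid> -o "%i %C"
scontrol show job <jobid> | grep -i NumCPUs
# CPUs actually blocked on the node that got the job
scontrol show node <nodename> | grep -i CPUAlloc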

We haven't played with the OverSubscribe option, though, so I have no experience with it that I could share...
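
For reference, oversubscription is configured per partition in slurm.conf; a purely illustrative, untested sketch (partition and node names borrowed from your mail) would be:

PartitionName=lemmium Nodes=mpcn[2-5] OverSubscribe=YES MaxTime=INFINITE State=UP

As far as I understand the documentation, YES only shares cores with jobs that explicitly request it, while FORCE shares them unconditionally.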

Best,
Christoph


On 13/09/2019 11.06, René Neumaier wrote:
Dear all,

recently we upgraded Slurm from 18.08.3 to 18.08.7 and we noticed the
following behaviour:

If we start the following test script:


#!/bin/bash

#SBATCH --job-name=test
#SBATCH --output=test.log
#SBATCH --error=test.err
#SBATCH --ntasks=3
#SBATCH --mem=100M
#SBATCH --qos=high

sleep 30


4 CPUs instead of 3 CPUs are consumed - but only 3 are shown via 'squeue'.
The slurmctld.log also shows 3 CPUs:
...
[2019-09-12T17:12:09.964] _slurm_rpc_submit_batch_job: JobId=40077 InitPrio=1000 usec=258
[2019-09-12T17:12:12.498] sched: Allocate JobId=40077 NodeList=mpcn5 #CPUs=3 Partition=lemmium
[2019-09-12T17:12:42.554] _job_complete: JobId=40077 WEXITSTATUS 0
[2019-09-12T17:12:42.554] _job_complete: JobId=40077 done
...

The same thing happens with 1 CPU: 2 instead of 1 are consumed.
But a job with 8 CPUs consumed the correct number of CPUs.
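
With the cgroup task plugin we use (configuration further below), the real consumption can also be read from the cpuset the job is placed in - a sketch, the exact path depends on the node's cgroup layout:

# run on the node while the job is active; lists the hardware threads the job may use
cat /sys/fs/cgroup/cpuset/slurm/uid_$(id -u)/job_<jobid>/cpuset.cpus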

This also depends on the partition / CPU type:
Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz = affected
Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz = affected
Quad-Core AMD Opteron(tm) Processor 8356 = not affected

The old AMD Opteron CPUs allow only one thread per core. The Intel CPUs
use HyperThreading and allow 2 threads per core.

Node example:
NodeName=mpcn2 CPUs=32 RealMemory=128992 Sockets=8 CoresPerSocket=4 ThreadsPerCore=1 State=UNKNOWN
NodeName=mpcn3 CPUs=80 RealMemory=773208 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 State=UNKNOWN
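
(For cross-checking, the values Slurm should see can be printed directly on a node with

slurmd -C

which reports the detected Sockets, CoresPerSocket and ThreadsPerCore - for mpcn3 that is 2 x 20 x 2 = 80 CPUs.)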

I think this difference is the reason why odd numbers in the CPU
specification are not possible for a job on the Intel partitions: with
ThreadsPerCore=2, a request for 3 CPUs is apparently rounded up to 2
full cores, i.e. 4 threads.

I have to admit that I am not sure whether this behaviour appeared with
the last Slurm update, whether there is a misconfiguration, or whether
a "half core" (1 thread) simply cannot be consumed.
Normally jobs were started with even CPU counts >= 8, especially on the larger nodes...

Gentoo GNU/Linux - Kernel 5.2.11
FastSchedule=1
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
TaskPlugin=task/cgroup
ProctrackType=proctrack/cgroup
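
For completeness, I also wondered whether counting the consumable resource per core rather than per thread would behave differently on the hyperthreaded nodes - an untested sketch of that alternative, not what we currently run:

SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory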


I would be grateful for any idea.


Best regards,
René


--
Dr. Christoph Brüning
Universität Würzburg
Rechenzentrum
Am Hubland
D-97074 Würzburg
Tel.: +49 931 31-80499
