Hello all,

Thanks for the useful observations. Here are some further env vars:
# non-problematic case
$ srun -c 3 --partition=gpu-2080ti env
SRUN_DEBUG=3
SLURM_JOB_CPUS_PER_NODE=4
SLURM_NTASKS=1
SLURM_NPROCS=1
SLURM_CPUS_PER_TASK=3
SLURM_STEP_ID=0
SLURM_STEPID=0
SLURM_NNODES=1
SLURM_JOB_NUM_NODES=1
SLURM_STEP_NUM_NODES=1
SLURM_STEP_NUM_TASKS=1
SLURM_STEP_TASKS_PER_NODE=1
SLURM_CPUS_ON_NODE=4
SLURM_NODEID=0
*SLURM_PROCID=0*
*SLURM_LOCALID=0*
*SLURM_GTIDS=0*

# problematic case - prints two sets of env vars
$ srun -c 1 --partition=gpu-2080ti env
SRUN_DEBUG=3
SLURM_JOB_CPUS_PER_NODE=2
SLURM_NTASKS=2
SLURM_NPROCS=2
SLURM_CPUS_PER_TASK=1
SLURM_STEP_ID=0
SLURM_STEPID=0
SLURM_NNODES=1
SLURM_JOB_NUM_NODES=1
SLURM_STEP_NUM_NODES=1
SLURM_STEP_NUM_TASKS=2
SLURM_STEP_TASKS_PER_NODE=2
SLURM_CPUS_ON_NODE=2
SLURM_NODEID=0
*SLURM_PROCID=0*
*SLURM_LOCALID=0*
*SLURM_GTIDS=0,1*
SRUN_DEBUG=3
SLURM_JOB_CPUS_PER_NODE=2
SLURM_NTASKS=2
SLURM_NPROCS=2
SLURM_CPUS_PER_TASK=1
SLURM_STEP_ID=0
SLURM_STEPID=0
SLURM_NNODES=1
SLURM_JOB_NUM_NODES=1
SLURM_STEP_NUM_NODES=1
SLURM_STEP_NUM_TASKS=2
SLURM_STEP_TASKS_PER_NODE=2
SLURM_CPUS_ON_NODE=2
SLURM_NODEID=0
*SLURM_PROCID=1*
*SLURM_LOCALID=1*
*SLURM_GTIDS=0,1*

Please see the variables marked in bold (asterisks): with -c 1 the step is
created with two tasks (SLURM_NTASKS=2, SLURM_STEP_NUM_TASKS=2), so the
command is executed twice.

@Hermann Schwärzler: how do you plan to handle this bug? We have currently
set SLURM_NTASKS_PER_NODE=1 cluster-wide.
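A per-command alternative (just a sketch from my side, not yet verified
across all node types) would be to pin the task count explicitly, since the
second task only appears when -c is given without -n:

$ srun -n 1 -c 1 --partition=gpu-2080ti env

With -n 1 (i.e. --ntasks=1) srun launches a single task, so only one set of
SLURM_PROCID/SLURM_LOCALID values should be printed.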
Best,
Durai

On Fri, Mar 25, 2022 at 12:45 PM Juergen Salk <juergen.s...@uni-ulm.de> wrote:

> Hi Bjørn-Helge,
>
> that's very similar to what we did as well in order to avoid confusion
> with Core vs. Threads vs. CPU counts when Hyperthreading is kept enabled
> in the BIOS.
>
> Adding CPUs=<core_count> (not <thread_count>) will tell Slurm to only
> schedule physical cores.
>
> We have
>
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
>
> and
>
> NodeName=DEFAULT CPUs=48 Sockets=2 CoresPerSocket=24 ThreadsPerCore=2
>
> This is for compute nodes that have 2 sockets, 2 x 24 physical cores
> with hyperthreading enabled in the BIOS. (Although, in general, we do
> not encourage our users to make use of hyperthreading, we have decided
> to leave it enabled in the BIOS as there are some corner cases that
> are known to benefit from hyperthreading.)
>
> With this setting Slurm does also show the total physical core
> counts instead of the thread counts and also treats the --mem-per-cpu
> option as "--mem-per-core", which is in our case what most of our users
> expect.
>
> As to the number of tasks spawned with `--cpus-per-task=1`, I think this
> is intended behavior. The following sentence from the srun manpage is
> probably relevant:
>
> -c, --cpus-per-task=<ncpus>
>
>   If -c is specified without -n, as many tasks will be allocated per
>   node as possible while satisfying the -c restriction.
>
> In our configuration, we allow multiple jobs to run for the same user
> on a node (ExclusiveUser=yes) and we get
>
> $ srun -c 1 echo foo | wc -l
> 1
> $
>
> However, in case of CPUs=<thread_count> instead of CPUs=<core_count>,
> I guess, this would have been 2 lines of output, because the smallest
> unit to schedule for a job is 1 physical core, which allows 2 tasks to
> run with hyperthreading enabled.
>
> In case of exclusive node allocation for jobs (i.e. no node
> sharing allowed) Slurm would give all cores of a node to the job,
> which allows even more tasks to be spawned:
>
> $ srun --exclusive -c 1 echo foo | wc -l
> 48
> $
>
> 48 lines correspond exactly to the number of physical cores on the
> node. Again, with CPUs=<thread_count> instead of CPUs=<core_count>, I
> would expect 2 x 48 = 96 lines of output, but I did not test that.
>
> Best regards
> Jürgen
>
>
> * Bjørn-Helge Mevik <b.h.me...@usit.uio.no> [220325 08:49]:
> > For what it's worth, we have a similar setup, with one crucial
> > difference: we are handing out physical cores to jobs, not hyperthreads,
> > and we are *not* seeing this behaviour:
> >
> > $ srun --cpus-per-task=1 -t 10 --mem-per-cpu=1g -A nn9999k -q devel echo foo
> > srun: job 5371678 queued and waiting for resources
> > srun: job 5371678 has been allocated resources
> > foo
> > $ srun --cpus-per-task=3 -t 10 --mem-per-cpu=1g -A nn9999k -q devel echo foo
> > srun: job 5371680 queued and waiting for resources
> > srun: job 5371680 has been allocated resources
> > foo
> >
> > We have
> >
> > SelectType=select/cons_tres
> > SelectTypeParameters=CR_CPU_Memory
> >
> > and node definitions like
> >
> > NodeName=DEFAULT CPUs=40 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 RealMemory=182784 Gres=localscratch:330G Weight=1000
> >
> > (so we set CPUs to the number of *physical cores*, not *hyperthreads*).
> >
> > --
> > Regards,
> > Bjørn-Helge Mevik, dr. scient,
> > Department for Research Computing, University of Oslo
>
> --
> Jürgen Salk
> Scientific Software & Compute Services (SSCS)
> Kommunikations- und Informationszentrum (kiz)
> Universität Ulm
> Telefon: +49 (0)731 50-22478
> Telefax: +49 (0)731 50-22471
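P.S. For completeness: as I understand Jürgen's and Bjørn-Helge's replies,
the configuration-side fix is to set CPUs in the node definition to the
physical core count (CPUs = Sockets x CoresPerSocket). A rough sketch for
slurm.conf (the socket/core counts below are placeholders, not our actual
gpu-2080ti hardware):

NodeName=DEFAULT CPUs=32 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2

Slurm then counts and schedules physical cores rather than hardware threads,
so an allocation for -c 1 amounts to one CPU and srun spawns only one task.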