Hi;

You can submit each pimpleFoam case as a separate job, or, if you really want to submit them as a single job, you can use a program such as GNU Parallel to run only as many of them at a time as you have CPUs:

https://www.gnu.org/software/parallel/
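A minimal sketch of the GNU Parallel approach, assuming GNU Parallel is installed on the compute nodes (the case paths are taken from Max's script below; the `-j` value and the `srun` options are illustrative, not tested on this cluster):

```shell
#!/bin/bash
#SBATCH -N 2 -n 9

# Let GNU Parallel act as the throttle: it starts at most -j cases at a
# time and launches the next one only when a slot frees up.
# {} is replaced by each case directory in turn.
parallel -j "$SLURM_NTASKS" \
    'srun pimpleFoam -case {} -parallel > /dev/null' \
    ::: /mnt/NFS/users/quast/channel395-{10..17}
```

The same throttling effect can also be had without GNU Parallel by submitting each case as its own sbatch job and letting the scheduler queue them.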

Regards,

Ahmet M.


On 10.10.2020 14:05, Max Quast wrote:

Dear slurm-users,

I built a slurm system consisting of two nodes (Ubuntu 20.04.1, slurm 20.02.5):

# COMPUTE NODES

GresTypes=gpu

NodeName=lsm[216-217] Gres=gpu:tesla:1 CPUs=64 RealMemory=192073 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 State=UNKNOWN

PartitionName=admin Nodes=lsm[216-217] Default=YES MaxTime=INFINITE State=UP

The slurmctld is running on a separate Ubuntu system where no slurmd is installed.

If a user executes this script (sbatch srun2.bash)

#!/bin/bash
#SBATCH -N 2 -n9

srun pimpleFoam -case /mnt/NFS/users/quast/channel395-10 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-11 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-12 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-13 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-14 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-15 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-16 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-17 -parallel > /dev/null &
wait

Eight job steps, each with 9 tasks, are launched and distributed across the two nodes.

If several such scripts are started at the same time, all of the srun commands are executed even though no free cores are available, so the nodes are overallocated.

How can this be prevented?

Thx :)

Greetings

max

