Hi,
You can submit each pimpleFoam run as a separate job. Or, if you really
want to submit them as a single job, you can use a tool such as GNU
Parallel to run only as many of them at a time as you have CPUs:
https://www.gnu.org/software/parallel/
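For the single-job route, here is a rough sketch of what that could look
like (the -n 72 allocation, the -j 8 limit and the --exclusive flag are my
assumptions, based on the 8 cases with 9 tasks each in your script below;
adjust them to your setup):

#!/bin/bash
#SBATCH -N 2 -n 72

# Let GNU Parallel run at most 8 cases at a time; each slot starts
# one srun job step with 9 tasks. --exclusive keeps the steps on
# separate cores, so extra steps wait instead of oversubscribing.
parallel -j 8 \
    'srun --exclusive -n 9 pimpleFoam -case {} -parallel > /dev/null' \
    ::: /mnt/NFS/users/quast/channel395-{10..17}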
Regards,
Ahmet M.
On 10.10.2020 14:05, Max Quast wrote:
Dear slurm-users,
I built a Slurm system consisting of two nodes (Ubuntu 20.04.1, Slurm
20.02.5):
# COMPUTE NODES
GresTypes=gpu
NodeName=lsm[216-217] Gres=gpu:tesla:1 CPUs=64 RealMemory=192073 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 State=UNKNOWN
PartitionName=admin Nodes=lsm[216-217] Default=YES MaxTime=INFINITE State=UP
The slurmctld is running on a separate Ubuntu system where no slurmd is
installed.
If a user executes this script (sbatch srun2.bash)
#!/bin/bash
#SBATCH -N 2 -n9
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-10 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-11 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-12 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-13 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-14 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-15 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-16 -parallel > /dev/null &
srun pimpleFoam -case /mnt/NFS/users/quast/channel395-17 -parallel > /dev/null &
wait
8 jobs with 9 threads are launched and distributed across the two nodes.
If more such scripts are started at the same time, all the srun commands
are executed even though no free cores are available, so the nodes become
overallocated.
How can this be prevented?
Thx :)
Greetings
max