Hi Mike,
What version of Slurm are you using?
If you are running Slurm 20.11.x or newer, the default behavior changed:
srun no longer allows job steps to overlap on the same resources.
https://bugs.schedmd.com/show_bug.cgi?id=11863#c3
I would see if adding the --overlap flag to the srun call for the parent
mpi process fixes the problem.
https://slurm.schedmd.com/srun.html#OPT_overlap
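As a rough sketch of what I mean (untested on your cluster), the job script
might look something like the following. I am assuming the batch script
launches the parent with srun as a single task, which your message does not
say; the --ntasks=1 on the parent and the PARENT_STEP variable are only my
illustration. The command is executed only when srun is found, so the script
can be dry-run outside of Slurm.

```shell
#!/bin/bash
#SBATCH --ntasks 16
#SBATCH --cpus-per-task 1

# Assumption: the parent runs as its own single-task job step.
# --overlap lets steps created later share the resources this step holds.
PARENT_STEP="srun --overlap --ntasks=1 ./Parent.mpi"

# Print the command so it shows up in the job output file.
echo "$PARENT_STEP"

# Run it only when srun is available (i.e. on a Slurm node).
if command -v srun >/dev/null 2>&1; then
    $PARENT_STEP
fi
```

If the children still report "Requested nodes are busy", it may be worth
trying --overlap on the srun inside the system() call as well.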
Thanks,
David
On 6/20/2023 4:08 AM, Vanhorn, Mike wrote:
I have a user who is submitting a job to slurm which requests 16 tasks, i.e.
#SBATCH --ntasks 16
#SBATCH --cpus-per-task 1
The slurm script runs an mpi program called Parent.mpi, which then (fails to)
call 15 mpi child processes. He’s tried two different ways for the parent to
spawn the children:
1. A system() call, such as system("srun --ntasks=4 mpirun -np 4
./child.mpi") or system("mpirun -np 4 ./child.mpi")
2. MPI_Comm_spawn
Both ways generate the following in the slurm output file:
srun: Job ### step creation temporarily disabled, retrying (Requested nodes are
busy)
srun: error: Unable to create step for job ###: Job/step already completing or
completed
So, basically, he’s requesting 16 tasks, one of which is used by the parent and
the other 15 are supposed to get used by the children, but the children can’t
use the other 15 because...well, I’m not sure why.
Is there something I need to change in the slurm.conf to allow this to work?
---
Mike VanHorn
Senior Computer Systems Administrator
College of Engineering and Computer Science
Wright State University
265 Russ Engineering Center
937-775-5157
michael.vanh...@wright.edu