Hi Mike,

What version of Slurm are you using?
If you are running Slurm 20.11.x or newer, the default scheduler behavior changed so that srun no longer allows job steps to overlap on the same resources.
https://bugs.schedmd.com/show_bug.cgi?id=11863#c3
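
A rough way to check which side of that change a cluster is on is sketched below. It assumes that `srun --version` prints a line of the form `slurm 20.11.8` and that the version has at least a major and a minor part; `is_2011_or_newer` is a made-up helper name, not a Slurm command.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a "major.minor[.patch]" version string
# is 20.11 or newer. Assumes at least "major.minor" is present.
is_2011_or_newer() {
    major=${1%%.*}        # text before the first dot
    rest=${1#*.}          # text after the first dot
    minor=${rest%%.*}     # second component
    [ "$major" -gt 20 ] || { [ "$major" -eq 20 ] && [ "$minor" -ge 11 ]; }
}

# Only query Slurm if srun is actually on the PATH.
if command -v srun >/dev/null 2>&1; then
    ver=$(srun --version | awk '{print $2}')   # assumed output: "slurm X.Y.Z"
    if is_2011_or_newer "$ver"; then
        echo "Slurm $ver: job steps do not overlap by default"
    else
        echo "Slurm $ver: older than 20.11"
    fi
fi
```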

I would see if adding the --overlap flag to the srun call for the parent mpi process fixes the problem.
https://slurm.schedmd.com/srun.html#OPT_overlap
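
As a minimal sketch of what that could look like in the batch script, assuming the parent is launched with srun as a single-task step: `run_step` and `STEP_FLAGS` are hypothetical names used only for illustration, and the dry-run branch exists so the script can be tried outside an allocation.

```shell
#!/bin/sh
#SBATCH --ntasks 16
#SBATCH --cpus-per-task 1

# Flags added to every job step. --overlap lets a step share the
# allocation's resources with other steps (available since Slurm 20.11).
STEP_FLAGS="--overlap"

# Hypothetical helper: run_step NTASKS COMMAND [ARGS...]
run_step() {
    ntasks=$1
    shift
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        # Inside an allocation: create the job step for real.
        srun $STEP_FLAGS --ntasks="$ntasks" "$@"
    else
        # Outside an allocation: only print the command that would run.
        echo "srun $STEP_FLAGS --ntasks=$ntasks $*"
    fi
}

# Start the parent as one task so the remaining tasks stay available.
run_step 1 ./Parent.mpi
```

If the parent starts its children through srun inside system(), the same flag may be needed on those calls as well; that part is untested here.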


Thanks,
David

On 6/20/2023 4:08 AM, Vanhorn, Mike wrote:
I have a user who is submitting a job to slurm which requests 16 tasks, i.e.

#SBATCH --ntasks 16
#SBATCH --cpus-per-task 1

The Slurm script runs an MPI program called Parent.mpi, which then tries
(and fails) to launch 15 MPI child processes. He’s tried two different ways
for the parent to spawn the children:


   1.  A system() call, such as system("srun --ntasks=4 mpirun -np 4 ./child.mpi")
       or system("mpirun -np 4 ./child.mpi")

   2.  MPI_Comm_spawn


Both ways generate the following in the slurm output file:

srun: Job ### step creation temporarily disabled, retrying (Requested nodes are busy)
srun: error: Unable to create step for job ###: Job/step already completing or completed

So, basically, he’s requesting 16 tasks, one of which is used by the parent and
the other 15 are supposed to get used by the children, but the children can’t
use the other 15 because...well, I’m not sure why.

Is there something I need to change in the slurm.conf to allow this to work?

---
Mike VanHorn
Senior Computer Systems Administrator
College of Engineering and Computer Science
Wright State University
265 Russ Engineering Center
937-775-5157
michael.vanh...@wright.edu

