Hi Chris,

On Tuesday, 10 December 2019 11:49:44 AM PST Chris Woelkers - NOAA Federal 
wrote:

> Test jobs, submitted via sbatch, are able to run on one node with no problem
> but will not run on multiple nodes. The jobs are using mpirun and mvapich2
> is installed.

Is there a reason why you aren't using srun for launching these?

https://slurm.schedmd.com/mpi_guide.html

If you're using mpirun then (unless you've built mvapich2 with Slurm support) 
then you'll be relying on ssh to launch tasks and so that could be what's 
broken for you.  Running with srun will avoid that and allow Slurm to track 
your processes correctly.

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA




Reply via email to