Hello all,

We've recently (don't laugh!) updated two of our SLURM installations from 
16.05.10-2 to 20.11.3 and 17.11.9, respectively.  On the newer of the two, 
20.11.3, OpenMPI no longer seems to function in interactive mode across 
multiple nodes as it did previously: using `srun` and `mpirun` on a single node 
gives the desired results, while using multiple nodes causes a hang.  Jobs 
submitted via `sbatch` do _work as expected_.

[desantis@sclogin0 ~]$ scontrol show config |grep VERSION; srun -n 2 -N 2-2 -t 00:05:00 --pty /bin/bash
SLURM_VERSION           = 17.11.9
[desantis@sccompute0 ~]$ for OPENMPI in mpi/openmpi/1.8.5 mpi/openmpi/2.0.4 mpi/openmpi/2.0.4-psm2 mpi/openmpi/2.1.6 mpi/openmpi/3.1.6 compilers/intel/2020_cluster_xe; do module load $OPENMPI ; which mpirun; mpirun hostname; module purge; echo; done
/apps/openmpi/1.8.5/bin/mpirun
sccompute0
sccompute1

/apps/openmpi/2.0.4/bin/mpirun
sccompute1
sccompute0

/apps/openmpi/2.0.4-psm2/bin/mpirun
sccompute1
sccompute0

/apps/openmpi/2.1.6/bin/mpirun
sccompute0
sccompute1

/apps/openmpi/3.1.6/bin/mpirun
sccompute0
sccompute1

/apps/intel/2020_u2/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/mpirun
sccompute1
sccompute0


15:58:28 Mon Apr 26 <0>
desantis@itn0
[~] $ scontrol show config|grep VERSION; srun -n 2 -N 2-2 --qos=devel --partition=devel -t 00:05:00 --pty /bin/bash
SLURM_VERSION           = 20.11.3
srun: job 1019599 queued and waiting for resources
srun: job 1019599 has been allocated resources
15:58:46 Mon Apr 26 <0>
desantis@mdc-1057-30-1
[~] $ for OPENMPI in mpi/openmpi/1.8.5 mpi/openmpi/2.0.4 mpi/openmpi/2.0.4-psm2 mpi/openmpi/2.1.6 mpi/openmpi/3.1.6 compilers/intel/2020_cluster_xe; do module load $OPENMPI ; which mpirun; mpirun hostname; module purge; echo; done
/apps/openmpi/1.8.5/bin/mpirun
^C
/apps/openmpi/2.0.4/bin/mpirun
^C
/apps/openmpi/2.0.4-psm2/bin/mpirun
^C
/apps/openmpi/2.1.6/bin/mpirun
^C
/apps/openmpi/3.1.6/bin/mpirun
^C
/apps/intel/2020_u2/compilers_and_libraries_2020.2.254/linux/mpi/intel64/bin/mpirun
^C[mpiexec@mdc-1057-30-1] Sending Ctrl-C to processes as requested
[mpiexec@mdc-1057-30-1] Press Ctrl-C again to force abort
^C

Our SLURM installations are fairly straightforward.  We `rpmbuild` directly 
from the bzip2 files without any additional arguments.  We've done this since 
we first started using SLURM with version 14.03.3-2 and through all upgrades.  
Due to SLURM's awesomeness(!), we've simply used the same configuration files 
between version changes, with the only changes being made to parameters which 
have been deprecated/renamed.  Our "Mpi{Default,Params}" have always been set 
to "none".  The only real difference we're able to ascertain is that the MPI 
plugin for openmpi has been removed in 20.11.3, as the `srun` MPI type listings 
for each version show:

svc-3024-5-2: SLURM_VERSION           = 16.05.10-2
svc-3024-5-2: srun: MPI types are...
svc-3024-5-2: srun: mpi/openmpi
svc-3024-5-2: srun: mpi/mpich1_shmem
svc-3024-5-2: srun: mpi/mpichgm
svc-3024-5-2: srun: mpi/mvapich
svc-3024-5-2: srun: mpi/mpich1_p4
svc-3024-5-2: srun: mpi/lam
svc-3024-5-2: srun: mpi/none
svc-3024-5-2: srun: mpi/mpichmx
svc-3024-5-2: srun: mpi/pmi2

viking: SLURM_VERSION           = 20.11.3
viking: srun: MPI types are...
viking: srun: cray_shasta
viking: srun: pmi2
viking: srun: none

sclogin0: SLURM_VERSION           = 17.11.9
sclogin0: srun: MPI types are...
sclogin0: srun: openmpi
sclogin0: srun: none
sclogin0: srun: pmi2
sclogin0: 
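
For anyone wanting to compare against their own site, listings like the ones 
above could be gathered per host with something along the lines of the sketch 
below.  The function name `show_mpi_config` is made up for illustration, and 
the fallback message is only there so the snippet does something sensible on a 
machine without SLURM; the hostname prefixes in the output above came from 
running the commands across several hosts and are not reproduced here.

```shell
#!/bin/sh
# Sketch: print the SLURM version, the MPI defaults, and the MPI plugin types
# known to srun on the current host.
# show_mpi_config is a hypothetical helper name, not part of SLURM.
show_mpi_config() {
    if command -v scontrol >/dev/null 2>&1; then
        # Version plus the MpiDefault/MpiParams settings from the running config
        scontrol show config | grep -E 'SLURM_VERSION|MpiDefault|MpiParams'
        # srun reports the plugin list on stderr, so fold it into stdout
        srun --mpi=list 2>&1
    else
        echo "scontrol not found; run this on a SLURM login or compute node"
    fi
}

show_mpi_config
```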

As far as building OpenMPI, we've always withheld any SLURM-specific flags, 
i.e. "--with-slurm", although during the build process SLURM is detected.  

Because OpenMPI was always built using this method, we never had to recompile 
OpenMPI after subsequent SLURM upgrades, and no cluster-ready applications had 
to be rebuilt.  The only time OpenMPI had to be rebuilt was for the OPA 
hardware, which required only the addition of the "--with-psm2" flag.
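
One way to confirm what the build detected is to ask `ompi_info` which SLURM 
components a given OpenMPI install carries.  This is only a sketch: the 
function name `check_ompi_slurm` is hypothetical, it assumes the relevant 
OpenMPI module is already loaded so that `ompi_info` is on the PATH, and the 
exact component lines printed will vary by OpenMPI version.

```shell
#!/bin/sh
# Sketch: list the SLURM-related MCA components of the OpenMPI currently on PATH.
# check_ompi_slurm is a hypothetical helper name.
check_ompi_slurm() {
    if command -v ompi_info >/dev/null 2>&1; then
        # Components built with SLURM support show up with "slurm" in their name
        ompi_info | grep -i slurm \
            || echo "no SLURM components reported by ompi_info"
    else
        echo "ompi_info not found; load an OpenMPI module first"
    fi
}

check_ompi_slurm
```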

It is my understanding that the openmpi plugin "never really did anything" (per 
the mailing list), which is why it was removed.  Furthermore, searching the 
mailing list suggests that the appropriate method is to use `salloc` first, 
even though version 17.11.9 does not need `salloc` for "interactive" sessions.
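
For reference, the `salloc`-first approach could be sketched as below.  The 
node count, task count, and time limit simply mirror the `srun` invocations 
earlier in this message; the variable names are made up, any site-specific 
options such as `--qos` or `--partition` are left out, and whether this avoids 
the hang on 20.11.3 is an assumption rather than something demonstrated here.

```shell
#!/bin/sh
# Sketch of the salloc-first workflow: obtain the allocation with salloc and
# let mpirun launch the ranks inside it, instead of starting a shell via
# "srun --pty".  Values mirror the srun calls shown earlier.
NODES=2
TASKS=2
LIMIT=00:05:00

# salloc runs the trailing command inside the allocation it is granted
ALLOC_CMD="salloc -N ${NODES} -n ${TASKS} -t ${LIMIT} mpirun hostname"

if command -v salloc >/dev/null 2>&1; then
    ${ALLOC_CMD}
else
    # No SLURM on this machine: just show what would be run
    echo "would run: ${ALLOC_CMD}"
fi
```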

Before we go further down this rabbit hole, were other sites affected by the 
transition from SLURM versions 16.x, 17.x, 18.x(?) to versions 20.x?  If so, 
did the methodology for multinode interactive MPI sessions change?

Thanks!
John DeSantis



