Are you sure that /share/local/pmix-3.2.1 exists on the compute nodes?
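One way to check, as a rough sketch: the snippet below reports whether each directory exists on the host it runs on. The function name check_prefixes is only illustrative, and the two paths are the ones that appear in your build commands (the Slurm build points at /fs/local, while OpenPMIx was installed under /share/local).

```shell
#!/bin/sh
# Hypothetical helper: report whether each given directory exists on this host.
check_prefixes() {
    for p in "$@"; do
        if [ -d "$p" ]; then
            echo "$(uname -n): found $p"
        else
            echo "$(uname -n): MISSING $p"
        fi
    done
}

# Paths taken from the quoted build commands; adjust as needed.
check_prefixes /share/local/pmix-3.2.1 /fs/local/pmix-3.2.1
```

Saved as a script on a shared filesystem (check_prefixes.sh is a made-up name), it could be run on the compute nodes with something like "srun --mpi=none -N5 ./check_prefixes.sh", since the "none" plugin is listed as available.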
On 12/4/2020 2:54 PM, Yuengling, Philip J. wrote:
Hi everyone,
I’ve been having difficulty getting the --mpi=pmix_v3 option to work.
--mpi=pmi2 works fine, but I’d like to understand what I’m doing wrong
with PMIx. Everything appears to build without errors.
$ srun --mpi=list
srun: MPI types are...
srun: pmix
srun: pmix_v3
srun: cray_shasta
srun: none
srun: pmi2
$ srun --mpi=pmix_v3 -N5 date
srun: error: task 1 launch failed: Invalid MPI plugin name
srun: error: task 2 launch failed: Invalid MPI plugin name
srun: error: task 3 launch failed: Invalid MPI plugin name
srun: error: task 4 launch failed: Invalid MPI plugin name
srun: error: task 0 launch failed: Invalid MPI plugin name
$ srun --mpi=pmi2 -N5 date
Fri Dec 4 13:52:39 EST 2020
Fri Dec 4 13:52:39 EST 2020
Fri Dec 4 13:52:39 EST 2020
Fri Dec 4 13:52:39 EST 2020
Fri Dec 4 13:52:39 EST 2020
OpenPMIx 3.2.1:
CC=/opt/rh/devtoolset-10/root/usr/bin/gcc ./configure \
    --prefix=/share/local/pmix-3.2.1 --with-hwloc=/share/local/hwloc-2.4.0
Slurm 20.11.0:
rpmbuild --define "_with_pmix --with-pmix=/fs/local/pmix-3.2.1" -ta \
    slurm-20.11.0.tar.bz2
From config.log:
./configure --build=x86_64-redhat-linux-gnu \
    --host=x86_64-redhat-linux-gnu --program-prefix= \
    --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr \
    --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc/slurm \
    --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 \
    --libexecdir=/usr/libexec --localstatedir=/var \
    --sharedstatedir=/var/lib --mandir=/usr/share/man \
    --infodir=/usr/share/info --with-pmix=/fs/local/pmix-3.2.1 \
    --disable-slurmrestd
Open MPI 4.0.5:
./configure '--prefix=/share/openmpi-4.0.5' '--with-cuda' \
    '--with-pmix=/share/local/pmix-3.2.1' '--with-pmi=/usr' '--with-slurm' \
    '--without-ucx' '--without-verbs'
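As a quick cross-check of the configure lines above, here is a small sketch that pulls out every distinct --with-pmix= prefix from text on stdin. The function name pmix_prefixes is hypothetical, and the sample input is just the relevant options copied from the commands above; more than one line of output means the builds are not all pointing at the same PMIx install.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct --with-pmix=... prefix read from stdin.
pmix_prefixes() {
    # -e keeps grep from treating the leading "--" of the pattern as an option.
    grep -o -e "--with-pmix=[^ \"']*" | sed 's/^--with-pmix=//' | sort -u
}

# Sample input: options copied from the OpenPMIx, Slurm, and Open MPI commands above.
pmix_prefixes <<'EOF'
--prefix=/share/local/pmix-3.2.1
--define "_with_pmix --with-pmix=/fs/local/pmix-3.2.1"
'--with-pmix=/share/local/pmix-3.2.1'
EOF
```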
--
Philip J. Yuengling
Johns Hopkins University