Make sure the unversioned .so symlink for the pmix lib (libpmix.so) is 
available, not just the versioned file, e.g. libpmix.so.2. Slurm requires that 
.so symlink. Some distros split packages into base/devel, so you may need to 
install a pmix-devel package, if available, to get the .so symlink (which is 
considered a "development" file).
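
A quick way to check for this, as a minimal sketch: the function name check_pmix_symlink is made up for illustration, and the lib subdirectory under the install prefix is an assumption (some builds use lib64 instead).

```shell
#!/bin/sh
# Hypothetical helper: report whether an unversioned libpmix.so exists
# in the given library directory.
check_pmix_symlink() {
    libdir=$1
    if [ -e "$libdir/libpmix.so" ]; then
        echo "ok: $libdir/libpmix.so present"
        return 0
    fi
    echo "missing: $libdir/libpmix.so"
    # List any versioned files, to show what is there instead.
    ls "$libdir"/libpmix.so.* 2>/dev/null
    return 1
}

# Example call; the path is an assumption based on the prefix in this thread.
# check_pmix_symlink /share/local/pmix-3.2.1/lib
```

Running this on a compute node rather than only on the build host would also show whether the library directory is visible there.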
    On Monday, December 7, 2020, 09:22:06 AM EST, Yuengling, Philip J. 
<philip.yuengl...@jhuapl.edu> wrote:  
 
Thanks Andy,
 
  
 
Slurm was compiled with --with-pmix=/share/local/pmix-3.2.1.  The build of pmix 
is installed under /share/local/pmix-3.2.1 which is an NFS share across all the 
nodes.  I should also note I used devtoolset-10 (gcc 10) on RHEL7 and confirmed 
that everything was compiled with that version of compiler.
 
  
 
I also set LD_LIBRARY_PATH to include /share/local/pmix-3.2.1
 
  
 
Cheers!
 
Phil
 
  
 
From: slurm-users <slurm-users-boun...@lists.schedmd.com> on behalf of Andy 
Riebs <a...@candooz.com>
Reply-To: "a...@candooz.com" <a...@candooz.com>, Slurm User Community List 
<slurm-users@lists.schedmd.com>
Date: Friday, December 4, 2020 at 3:07 PM
To: "slurm-users@lists.schedmd.com" <slurm-users@lists.schedmd.com>
Subject: [EXT] Re: [slurm-users] pmix issue
 
  
 


 
 
Also, Slurm was built with "/fs/local/pmix-3.2.1" -- does that translate well 
to "/share/local/pmix-3.2.1"?
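
One way to answer that question, sketched under assumptions: same_dir is a hypothetical helper name, and the check only tells you something if it is run on a compute node, where the Slurm daemons load the plugin.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both paths exist and resolve
# to the same physical directory (symlinks are followed by pwd -P).
same_dir() {
    a=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    b=$(cd "$2" 2>/dev/null && pwd -P) || return 1
    [ "$a" = "$b" ]
}

# Example, using the two prefixes mentioned in this thread:
# same_dir /fs/local/pmix-3.2.1 /share/local/pmix-3.2.1 && echo same || echo different
```

If the two prefixes do not resolve to the same directory, the prefix Slurm was configured with may simply not exist on the nodes.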
 
Andy
 
On 12/4/2020 2:59 PM, Andy Riebs wrote:
 

Are you sure that /share/local/pmix-3.2.1 exists on the compute nodes?
 
On 12/4/2020 2:54 PM, Yuengling, Philip J. wrote:
 

Hi everyone,
 
 
 
I’ve been having difficulty getting the --mpi=pmix_v3 option to work for me.  I 
can get --mpi=pmi2 to work ok, but I really want to understand what I’m doing 
wrong here.  Everything seems to build ok.
 
 
 
$ srun --mpi=list
srun: MPI types are...
srun: pmix
srun: pmix_v3
srun: cray_shasta
srun: none
srun: pmi2
 
 
 
$ srun --mpi=pmix_v3 -N5 date
srun: error: task 1 launch failed: Invalid MPI plugin name
srun: error: task 2 launch failed: Invalid MPI plugin name
srun: error: task 3 launch failed: Invalid MPI plugin name
srun: error: task 4 launch failed: Invalid MPI plugin name
srun: error: task 0 launch failed: Invalid MPI plugin name
 
 
 
$ srun --mpi=pmi2 -N5 date
Fri Dec  4 13:52:39 EST 2020
Fri Dec  4 13:52:39 EST 2020
Fri Dec  4 13:52:39 EST 2020
Fri Dec  4 13:52:39 EST 2020
Fri Dec  4 13:52:39 EST 2020
 
 
 
 
 
openpmix:
 
CC=/opt/rh/devtoolset-10/root/usr/bin/gcc ./configure 
--prefix=/share/local/pmix-3.2.1 --with-hwloc=/share/local/hwloc-2.4.0
 
 
 
Slurm 20.11.0:
 
rpmbuild --define "_with_pmix --with-pmix=/fs/local/pmix-3.2.1" -ta 
slurm-20.11.0.tar.bz2
 
From config.log:
 
./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu 
--program-prefix= --disable-dependency-tracking --prefix=/usr 
--exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin 
--sysconfdir=/etc/slurm --datadir=/usr/share --includedir=/usr/include 
--libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var 
--sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info 
--with-pmix=/fs/local/pmix-3.2.1 --disable-slurmrestd
 
 
 
Open MPI 4.0.5:
 
./configure  '--prefix=/share/openmpi-4.0.5' '--with-cuda' 
'--with-pmix=/share/local/pmix-3.2.1' '--with-pmi=/usr' '--with-slurm' 
'--without-ucx' '--without-verbs'
 
-- 
 
 
 
Philip J. Yuengling
 
Johns Hopkins University
 

