Hi:

I’ve just started watching “the ABCs of OpenMPI” videos (the videos and 
slides are available at https://www.open-mpi.org/video/?category=general ).  
I haven’t finished part 1 (of 3) yet, and have already seen very 
well-explained material specifically about building OpenMPI, Slurm, and 
related libraries.

These are a fantastic resource!

--
Paul Brunk, system administrator
Advanced Computing Resource Center
Enterprise IT Svcs, the University of Georgia


On 3/27/23, 2:29 PM, "slurm-users" <slurm-users-boun...@lists.schedmd.com> 
wrote:


Can someone please clarify the "best practices" for building OpenMPI compatible 
with Slurm?

https://slurm.schedmd.com/mpi_guide.html#open_mpi tells me what I _can_ do 
but I'm unclear as to what I _should_ do.

I've built OpenMPI 4.1.5 with: --with-pmix --with-libevent=internal 
--with-hwloc=internal --with-slurm. If I run an MPI program on my cluster 
(slurm 18.08.8) with "srun -N2 foo" it seems to work fine. (slurm.conf has 
MpiDefault=pmix).
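
For reference, the build described above can be sketched roughly as follows. 
The configure flags are the ones quoted above; the install prefix is a 
hypothetical location chosen only for illustration, and the script just 
prints the command so it can be reviewed before being run from the unpacked 
OpenMPI 4.1.5 source tree.

```shell
#!/bin/sh
# Sketch of the OpenMPI 4.1.5 configure step described in this message.
# PREFIX is a hypothetical install location, not taken from the original post.
PREFIX="${PREFIX:-$HOME/opt/openmpi-4.1.5}"

# Compose the configure command from the flags quoted in the message.
ompi_configure_cmd() {
    printf './configure --prefix=%s --with-pmix --with-libevent=internal --with-hwloc=internal --with-slurm\n' "$PREFIX"
}

# Print the command for review; paste it into a shell inside the source
# tree, then follow it with "make" and "make install".
ompi_configure_cmd
```

Printing rather than executing keeps the sketch safe to run anywhere; the 
resulting install lives entirely under the chosen prefix.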

If I run "srun --mpi=openmpi -N2 foo", it chokes with:

OPAL_ERROR: Unreachable in file 
../../../../../opal/mca/pmix/pmix3/pmix3x_client.c at line 112
-------------------------------------------------------------------------------------------------------------------
This application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

version 16.05 or later: you can use SLURM's PMIx support. This
requires that you configure and build SLURM --with-pmix.
.
.
.

So I guess the question is: what is the "right" way to build OpenMPI with 
Slurm? Is the fact that my non-Slurm pmix works "correct", or am I just 
getting lucky that the various software I have happens to be compatible? If 
I build OpenMPI, am I supposed to use Slurm's pmix/libevent/hwloc, or is that 
optional? If it's optional, when and why might I choose to do so? If I need 
Slurm's versions, is there some way to find which pmix/libevent/hwloc my 
current Slurm install is using? Note: my sysadmins are not going to be 
helpful, as they think Slurm 18 and OpenMPI 4.0.2a are adequate for users' 
needs :^(.
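
One partial way to approach the "which pmix is my Slurm using" question is 
to ask srun which MPI plugin types it offers. The sketch below is only an 
assumption-laden starting point: "srun --mpi=list" is a real option, but the 
layout of its output differs between Slurm versions, so the helper (a 
hypothetical name, list_pmix_plugins) simply pulls out anything that looks 
like "pmix" or "pmix_vN". It does not report the libevent or hwloc versions.

```shell
#!/bin/sh
# Extract PMIx-related plugin names from text in the style printed by
# "srun --mpi=list". The helper name is hypothetical, and the output layout
# is assumed to contain tokens such as "pmix" or "pmix_v2".
list_pmix_plugins() {
    grep -Eo 'pmix(_v[0-9]+)?' | sort -u
}

# Some Slurm versions print the list on stderr, so merge both streams.
# The query is skipped on machines where srun is not installed.
if command -v srun >/dev/null 2>&1; then
    srun --mpi=list 2>&1 | list_pmix_plugins
fi
```

On the OpenMPI side, "ompi_info" lists the components that a given install 
was built with, so filtering its output for "pmix" gives the matching view 
from the other direction.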

I like the idea of _not_ tying my OpenMPI to the installed Slurm just in case 
our support people ever decide to upgrade system software.

Thanks.
