Bug#984956: me too

2021-05-12 Thread Vassilis Virvilis
On Wed, 12 May 2021 21:54:01 +0200 Lucas Nussbaum wrote:
> With OpenMPI 4.1.0-4, downloaded from
> https://snapshot.debian.org/package/openmpi/4.1.0-4/, things worked
> fine.
> That version used the internal PMIx library.
>
> Lucas

That's a great tip. I will test it tomorrow and if it works i

Bug#984956: me too

2021-05-12 Thread Lucas Nussbaum
The problem is also reproducible with the current tip of the v4.1.x branch from OpenMPI upstream. Also, there has been some activity in the upstream bug, https://github.com/open-mpi/ompi/issues/8596. Nice! Lucas

Bug#984956: me too

2021-05-12 Thread Lucas Nussbaum
With OpenMPI 4.1.0-4, downloaded from https://snapshot.debian.org/package/openmpi/4.1.0-4/, things worked fine. That version used the internal PMIx library. Lucas

Bug#984956: me too

2021-05-12 Thread Lucas Nussbaum
On 12/05/21 at 12:47 +0300, Vassilis Virvilis wrote:
> > > However in /usr/lib/x86_64-linux-gnu/pmix2/include/pmix_common.h the list
> > > goes further ending in PMIX_COMPRESSED_BYTE_OBJECT 59 with 56 being
> > > PMIX_TOPO
> >
> > I tried to ensure that pmix_common.h (inside the sources) was unus

Bug#984956: me too

2021-05-12 Thread Vassilis Virvilis
> > However in /usr/lib/x86_64-linux-gnu/pmix2/include/pmix_common.h the list
> > goes further ending in PMIX_COMPRESSED_BYTE_OBJECT 59 with 56 being
> > PMIX_TOPO
>
> I tried to ensure that pmix_common.h (inside the sources) was unused
> during build, and added an #error inside it. The build succe

Bug#984956: Processed: Re: Bug#984956: me too

2021-05-11 Thread Vassilis Virvilis
On Tue, 11 May 2021 14:04:56 +0200 Lucas Nussbaum wrote: > That's because it is loaded dynamically. >> mca_pmix_ext3x.so is linked to libpmix.so.2: >> > Aah that's a great hint. Thanks The output is the same as yours. It's not available. I rebuilt it locally, and got: > Yes I also rebuilt it l

Bug#984956: Processed: Re: Bug#984956: me too

2021-05-11 Thread Lucas Nussbaum
On 11/05/21 at 14:48 +0300, Vassilis Virvilis wrote:
> I believe the problem is that mpirun is built with the internal pmix
> library when there is external available.
>
> bill@odin:~/src/openmpi-4.1.0$ dpkg -l '*pmix*' | grep ^ii
> ii libpmix-dev:amd64 4.0.0-4 amd64 Development files
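For readers who want to compare installed PMIx package versions the way the quoted `dpkg -l` check does, here is a minimal Python sketch that splits such a listing line into fields. The function names `parse_dpkg_line` and `installed_packages` are hypothetical helpers, and the sample line is the one quoted above, whose description is cut off in the archive.

```python
# Minimal sketch: split `dpkg -l` style lines into fields.
# The helper names are hypothetical; the sample line comes from the thread
# (its description column is truncated in the archive).

def parse_dpkg_line(line):
    """Return a dict for one `dpkg -l` row, or None if it has too few columns."""
    fields = line.split(None, 4)  # status, name, version, arch, description
    if len(fields) < 4:
        return None
    status, name, version, arch = fields[:4]
    description = fields[4] if len(fields) > 4 else ""
    return {
        "status": status,
        "name": name,
        "version": version,
        "arch": arch,
        "description": description,
    }


def installed_packages(lines):
    """Keep only rows whose status is 'ii' (installed), like `grep ^ii`."""
    parsed = (parse_dpkg_line(line) for line in lines)
    return [p for p in parsed if p is not None and p["status"] == "ii"]


if __name__ == "__main__":
    sample = ["ii libpmix-dev:amd64 4.0.0-4 amd64 Development files"]
    for pkg in installed_packages(sample):
        print(pkg["name"], pkg["version"])
```

This only reads text that has already been captured; it does not call dpkg itself.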

Bug#984956: me too

2021-05-11 Thread Lucas Nussbaum
On 07/05/21 at 17:24 +0300, Vassilis Virvilis wrote:
> The second value (index == 1) has value.type = 56 (pmix.topo2) which is
> outside the range of supported value types. I think the last entry is
> PMIX_REGEX 46 at
> ./debian/build-gfortran/opal/mca/pmix/pmix3x/pmix/include/pmix_common.h
>
> H
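The mismatch described in this thread can be illustrated with a small Python sketch. It uses only the numbers quoted in the messages (46 for PMIX_REGEX as the last entry of the bundled header, 56 for PMIX_TOPO and 59 for PMIX_COMPRESSED_BYTE_OBJECT in the external header). It assumes, for illustration only, that type ids are contiguous from 0 and that the external header is a superset of the bundled one; the function names are hypothetical and this is not PMIx code.

```python
# Illustrative model of the two pmix_common.h type ranges discussed in the thread.
# Only the numeric values 46, 56 and 59 come from the thread; the contiguous
# numbering and the superset relationship are assumptions made for this sketch.

INTERNAL_LAST_TYPE = 46  # PMIX_REGEX, last entry in the bundled pmix3x header
EXTERNAL_LAST_TYPE = 59  # PMIX_COMPRESSED_BYTE_OBJECT in the external header
EXTERNAL_TYPE_NAMES = {56: "PMIX_TOPO", 59: "PMIX_COMPRESSED_BYTE_OBJECT"}


def known_to_internal(type_id):
    """True if code built against the bundled header would recognise type_id."""
    return 0 <= type_id <= INTERNAL_LAST_TYPE


def classify(type_id):
    """Describe which of the two headers knows about type_id."""
    if known_to_internal(type_id):
        return "known to both headers"
    if INTERNAL_LAST_TYPE < type_id <= EXTERNAL_LAST_TYPE:
        name = EXTERNAL_TYPE_NAMES.get(type_id, "unnamed in the thread")
        return f"only in the external header ({name})"
    return "beyond both headers"


if __name__ == "__main__":
    # value.type = 56 is the case reported in the thread.
    print(classify(56))
```

Under these assumptions, a value of type 56 produced by the external library falls outside what a consumer built against the bundled header can interpret, which matches the symptom reported above.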

Bug#984956: me too

2021-05-11 Thread Lucas Nussbaum
Control: severity -1 serious

Hi,

This breaks OpenMPI in very basic cases, so I'm upgrading the severity to serious.

Lucas

Bug#984956: me too

2021-05-07 Thread Vassilis Virvilis
Ok, I think I made some headway, but I would welcome some insight from somebody more knowledgeable. I think the problem is a potential mixup of the internal vs the external pmix library in openmpi. In my setup the call to `rc = PMIx_Get(&p, key, pinfo, sz, &pval);` at ext3x_client.c:656 fills the

Bug#984956: Me too

2021-03-21 Thread sixerjman
I am hit by this bug also. Going to try and learn SLURM in the meantime. Is there any progress? Thank you for your work on this excellent package.