I'm hoping one of you has been to the end of this road already and can point out what is going wrong.

I have some Perl scripts, carried along for a couple of decades now, which use PVM to start simple jobs on the compute nodes, wait for them to finish (listing jobs as they close out), and then clean up. Since this is the only thing PVM is used for, it seemed like it might be (way past) time to migrate that to MPI, specifically OpenMPI 4.0.1, which is what is on the cluster.
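
To make the control pattern concrete, here is a minimal local sketch of it: start jobs, list each one as it closes out, then clean up. It is only an illustration under stated assumptions, not the actual scripts. Plain bash background processes stand in for PVM or MPI tasks on compute nodes, and the job names, the durations, the JOB_LIST variable, and the run_jobs function are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: local background processes stand in for jobs on compute nodes.

# Hypothetical job list, one "name seconds" pair per entry.
JOB_LIST=("jobA 0.7" "jobB 0.1" "jobC 0.4")

run_jobs() {
  local scratch spec name secs
  scratch=$(mktemp -d)                 # per-run scratch directory for job logs
  for spec in "${JOB_LIST[@]}"; do
    read -r name secs <<<"$spec"
    # Run the stand-in job in the background; when it exits, report its
    # status as a single line so jobs are listed as they close out.
    ( sleep "$secs" >"$scratch/$name.log" 2>&1
      echo "finished $name status=$?" ) &
  done
  wait                                 # returns once every job has exited
  rm -rf "$scratch"                    # clean up
}

run_jobs
```

With these durations the lines should appear in completion order (jobB, jobC, jobA) rather than in start order. A real replacement would need some remote launcher in place of the local subshell.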

Apparently there are tricks required. Either that, or the test script does not run on a single standalone machine, or perhaps OpenMPI is not configured right?

There are already modules for OpenMPI and bioperl, and I decided to install Parallel::MPI::Simple into the latter, since it holds all the Perl modules which were not installed with dnf on this CentOS 8 system. Like so:

  module load bioperl
  module load OpenMPI
  cd /usr/common/src/perl_modules
  cpanm -l $ROOT_BIOPERL Parallel::MPI::Simple 2>&1 \
    | tee install_perl_parallel_mpi_simple_2020_11_20.log

(No errors or warnings.)

There is a little test program, "ic.pl", which comes with Parallel::MPI::Simple. However, just invoking it shows that it cannot find Simple.so. I have been down this road before with Perl and MPI with the "Maker" program: some libraries must be preloaded or Perl just will not find them. Once that is done, all the missing library and symbol errors go away. But it still does not run:

  LD_PRELOAD=/usr/common/modules/el8/x86_64/software/bioperl/1.7.7-CentOS-vanilla/lib/perl5/x86_64-linux-thread-multi/auto/Parallel/MPI/Simple/Simple.so:/opt/ompi401/lib/libmpi.so \
    $ROOT_BIOPERL/lib/perl5/x86_64-linux-thread-multi/Parallel/MPI/ic.pl
[poweredge:04423] *** An error occurred in MPI_Send
[poweredge:04423] *** reported by process [603979777,0]
[poweredge:04423] *** on communicator MPI_COMM_WORLD
[poweredge:04423] *** MPI_ERR_RANK: invalid rank
[poweredge:04423] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[poweredge:04423] ***    and potentially your MPI job)


Any idea what might be wrong here?

Also, searching turned up very little information on using MPI with Perl
(lots on using MPI with other languages, of course).
The Parallel::MPI::Simple module is itself almost a decade old.
We have a batch manager but I would prefer not to use it in this case.
Is there some library/method other than MPI which people typically use these days for this sort of compute cluster process control with Perl from the head node?

Thanks,

David Mathog



_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing