GNU Parallel ( http://www.gnu.org/software/parallel/ ) might allow for
similar workflows.
On 11/21/20 3:56 AM, Lux, Jim (US 7140) via Beowulf wrote:
If Joe has interpreted your need correctly, I’ll second the suggestion
of pdsh – it’s simple, it works pretty well, it’s “transport”
independent (I use it to manage a cluster of beagleboards over WiFi).
Typically I wind up with a shell script on the head node and some shell
scripts on the worker nodes, and the head node script fires pdsh, which
starts the worker bee scripts.
From: Beowulf <beowulf-boun...@beowulf.org> on behalf of Joe Landman
<joe.land...@gmail.com>
Date: Friday, November 20, 2020 at 2:03 PM
To: "beowulf@beowulf.org" <beowulf@beowulf.org>
Subject: [EXTERNAL] Re: [Beowulf] perl with OpenMPI gotcha?
On 11/20/20 4:43 PM, David Mathog wrote:
[...]
Also, searching turned up very little information on using MPI with
perl.
(Lots on using MPI with other languages of course.)
The Parallel::MPI::Simple module is itself almost a decade old.
We have a batch manager but I would prefer not to use it in this case.
Is there some library/method other than MPI which people typically
use these days for this sort of compute cluster process control with
Perl from the head node?
I can't say I've ever used Perl and MPI. I suppose it is doable, but if
you were doing it, I'd recommend encapsulating it with FFI::Platypus
(https://metacpan.org/pod/FFI::Platypus).
This, however, doesn't seem to be your problem per se. Your problem
sounds like "how do I launch a script on N compute nodes at once, and
wait for it to complete".
If I have that correct, then you want to learn about pdsh
(https://github.com/chaos/pdsh
and info here:
https://www.rittmanmead.com/blog/2014/12/linux-cluster-sysadmin-parallel-command-execution-with-pdsh/
).
I write most of my admin scripts in perl, and you can use pdsh as a
function within them.
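As a rough illustration of calling pdsh from a Perl admin script, here is a minimal sketch. It assumes pdsh is installed on the head node and that its output uses the usual "host: text" line prefix. The helper names pdsh_command and parse_pdsh_output are made up for this example and are not part of pdsh or any module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: build the pdsh argument list.
# -f sets the fanout, -w takes a comma-separated host list.
sub pdsh_command {
    my ($nodes, $fanout, @remote) = @_;
    return ('pdsh', '-f', $fanout, '-w', join(',', @$nodes), @remote);
}

# Hypothetical helper: group pdsh stdout lines ("host: text") by host.
sub parse_pdsh_output {
    my (@lines) = @_;
    my %by_host;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /^([^:\s]+): ?(.*)$/;
        push @{ $by_host{$1} }, $2;
    }
    return %by_host;
}

# Only runs when called as: script.pl node1,node2 command [args...]
if (@ARGV >= 2) {
    my ($nodelist, @remote) = @ARGV;
    my @cmd = pdsh_command([ split /,/, $nodelist ], 25, @remote);

    # List-form pipe open: no local shell is involved.
    open(my $fh, '-|', @cmd) or die "cannot run pdsh: $!\n";
    my %out = parse_pdsh_output(<$fh>);
    close($fh);    # returns once pdsh, and so every remote command, is done

    for my $host (sort keys %out) {
        print "$host: $_\n" for @{ $out{$host} };
    }
}
```

Only stdout is captured here; pdsh's own error messages go to stderr and would need separate handling.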
However ...
MCE::Loop is your friend.
Combine that with something like this:
$mounts=`ssh -o ConnectTimeout=20 $node grep o2ib /proc/mounts`;
and you can get pdsh-like control directly in Perl without invoking pdsh.
The general template looks like this:
#!/usr/bin/env perl
use strict;
use warnings;
use MCE::Loop;

MCE::Loop->init(
    max_workers => 25, chunk_size => 1
);

my $nfile = shift or die "usage: $0 nodefile\n";

# grab file contents into @nodes array, one node name per line
open(my $fh, '<', $nfile) or die "cannot open $nfile: $!\n";
chomp(my @nodes = <$fh>);
close($fh);

# looping over nodes, max_workers at a time
mce_loop {
    my ($mce, $chunk_ref, $chunk_id) = @_;
    # do stuff to node $_
} @nodes;
This runs the loop body over the @nodes array, up to 25 copies
(max_workers) at a time. Incorporate the ssh bit above in the
#do stuff area, and you get basically what I think you are asking for.
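Putting the two pieces together might look something like the sketch below. It assumes the MCE module is installed from CPAN and that passwordless ssh to the nodes already works. The run_on_nodes helper and its arguments are invented for this illustration; MCE->gather is used to send each node's result back to the manager process.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use MCE::Loop;

# Hypothetical helper: run one command per node, at most $max_workers at
# a time, and return a hash of node => [exit status, output].
# $make_cmd is a code ref that turns a node name into an argument list.
sub run_on_nodes {
    my ($make_cmd, $max_workers, @nodes) = @_;

    MCE::Loop->init(max_workers => $max_workers, chunk_size => 1);

    my %result = mce_loop {
        my $node = $_;    # chunk_size => 1, so $_ is a single node name
        my @cmd  = $make_cmd->($node);
        my ($out, $status);
        # List-form pipe open keeps the node name away from a local shell.
        if (open(my $fh, '-|', @cmd)) {
            local $/;                 # slurp all output at once
            $out = <$fh> // '';
            close($fh);               # sets $? for a piped open
            $status = $? >> 8;
        }
        else {
            ($out, $status) = ("cannot run $cmd[0]: $!", -1);
        }
        MCE->gather($node, [ $status, $out ]);
    } @nodes;

    MCE::Loop->finish;
    return %result;
}

# Only runs when a node file is given on the command line.
if (@ARGV) {
    my $nfile = shift @ARGV;
    open(my $in, '<', $nfile) or die "cannot open $nfile: $!\n";
    chomp(my @nodes = <$in>);
    close($in);

    my %r = run_on_nodes(
        sub { ('ssh', '-o', 'ConnectTimeout=20', $_[0],
               'grep', 'o2ib', '/proc/mounts') },
        25, grep { length } @nodes);

    for my $node (sort keys %r) {
        printf "%s (exit %d): %s\n", $node, $r{$node}[0], $r{$node}[1];
    }
}
```

The call to run_on_nodes returns only after every node has been handled, which covers the "wait for it to complete" part. Passing the command as a code ref is just a convenience so the same loop can be tried with a harmless local command before pointing it at ssh.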
FWIW, I've been using this pattern for a few years, most recently on
large supers over the past few months.
Thanks,
David Mathog
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
--
Joe Landman
e: joe.land...@gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman