At 08:24 PM 3/14/2006, Robert G. Brown wrote:
..snip...
Using PVM it is fairly easy to write programs that CAN be tuned by hand or
dynamically to a heterogeneous environment. In fact, PVM can actually
run a single program on multiple architectures -- one of my first
exposures to it was one of Vaidy's presentations in which he showed
scaling results for a computation that was being run in parallel across
a Cray, a cluster of Sun workstations, a cluster of DEC workstations
(this WAS 1992 and DEC still existed:-), and a cluster of (I think) HPs or
AIX boxes, I cannot remember. One computation, four or five distinct
binaries, ethernet for all IPCs. Tres cool.
Even today, I'm not at all certain that a version of MPI exists that
can do this. Sure, with Linux nearly ubiquitous in production
cluster environments there is less incentive than there was a decade
plus ago, but even now PVM "could" be used to run a single computation
across e.g. i386 and x64 architectures, using native binaries on both
(not i386 compatibility binaries and libraries on both). PVM also gives
you fairly straightforward control over just how the job distributes
itself, permitting you (with some effort) to invoke multiple instances
of a job per node, respawn a worker task on a crashed node, and so on.
..snap...
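For readers who haven't touched PVM, the kind of control rgb describes
looks roughly like the following. This is only a rough, untested sketch
of the PVM 3 calls involved; the "worker" binary name and the architecture
strings are made-up examples, and it assumes the worker is installed as a
native binary under $PVM_ROOT/bin/$PVM_ARCH on each architecture.

/* Sketch: spawn one worker per architecture, respawn on crash.
 * "worker", "LINUX" and "SUN4SOL2" are illustrative names only. */
#include <stdio.h>
#include <pvm3.h>

#define EXIT_TAG 99          /* message tag used for pvm_notify() exit events */

static int spawn_on_arch(const char *arch)
{
    int tid;
    /* PvmTaskArch: let PVM pick any host of the named architecture;
     * the native binary installed for that arch is used automatically. */
    if (pvm_spawn("worker", NULL, PvmTaskArch, (char *)arch, 1, &tid) != 1) {
        fprintf(stderr, "spawn on %s failed\n", arch);
        return -1;
    }
    /* Ask the pvmd to send us a message tagged EXIT_TAG if this task
     * dies (e.g. its node crashes), so we can respawn it.            */
    pvm_notify(PvmTaskExit, EXIT_TAG, 1, &tid);
    return tid;
}

int main(void)
{
    const char *archs[] = { "LINUX", "SUN4SOL2" };   /* example arch names */
    int i, dead_tid;

    pvm_mytid();                     /* enroll in the virtual machine */

    for (i = 0; i < 2; i++)
        spawn_on_arch(archs[i]);

    for (;;) {
        /* Block until some worker exits; the notify message carries
         * the tid of the task that died.                             */
        pvm_recv(-1, EXIT_TAG);
        pvm_upkint(&dead_tid, 1, 1);
        fprintf(stderr, "task t%x died, respawning\n", dead_tid);
        /* Respawn wherever PVM chooses this time (PvmTaskDefault),
         * and watch the new task too.                                */
        if (pvm_spawn("worker", NULL, PvmTaskDefault, NULL, 1, &dead_tid) == 1)
            pvm_notify(PvmTaskExit, EXIT_TAG, 1, &dead_tid);
    }
    /* not reached */
    pvm_exit();
    return 0;
}
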
I used to do this type of stuff with mpich: AIX machines (2 versions,
2 types of POWER chips, some single-proc, others 2-way SMP), Solaris
(with 2-way SMP UltraSPARC CPUs), Linux machines (several
distributions) on a variety of Intel and AMD hardware, and once,
during a test, a DEC machine (only used it once to test this, can't
even remember the details; it was a 4-way SMP machine, and I might
even have the DEC part wrong), all to run distributed programs across
the network.
Nice mix of big-endian and little-endian :-) The jobs started in a
parallel queue on the SP2 cluster of the computation center, spread
out to our own machines here at chemistry, some interactive machines
on the SP2 where I was allowed to run jobs (it was not _explicitly_
forbidden to have them started by a script or a job inside the
cluster so...) and some machines made temporarily available for these
tests/runs at other research groups. Worked very nicely, no problems
in the communications, and since the algorithm I used was
self-balancing it scaled pretty well too. I gather MPICH2 no longer
supports this type of heterogeneity nowadays?
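For what it's worth, heterogeneous conversion only works when messages
are described with MPI datatypes instead of raw MPI_BYTE, so the library
can convert the representation in transit between big- and little-endian
hosts. A minimal sketch (the rank numbers and buffer size below are just
illustrative):

#include <mpi.h>

int main(int argc, char **argv)
{
    double buf[1024];            /* illustrative buffer size */
    int rank;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* ... fill buf ... then send typed data, not raw bytes */
        MPI_Send(buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &status);
        /* buf now holds doubles in this host's byte order, whatever
         * the sender's architecture was */
    }

    MPI_Finalize();
    return 0;
}
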
Luc Vereecken
Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf