Mark Hahn wrote:
>> IMHO the hybrid approach (MPI+threads) is interesting in case every
>> MPI process has lots of local data.
> yes. but does this happen a lot?  the appealing case would be threads
> that make heavy use of some large data, _but_ without needing
> synchronization/locking.  once you need locking among the threads,
> message passing starts to catch up.
Direct solvers (for Finite Elements, for instance) need a lot of data.
Additionally, distributing the matrix generates interfaces (between the
different submatrices) which are hard to solve for. In such a situation,
one tries to minimize the number of interfaces (by having one submatrix
per MPI process) and to speed up the solve of each submatrix using
threads.
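
To make that structure a bit more concrete, here is a rough, hypothetical
sketch: one MPI rank per submatrix/subdomain, with OpenMP threads sharing
the single local copy during the factorization. The function factor_row,
the matrix size and the loop structure are made-up placeholders (a real
direct solver has a much more involved dependency structure); the point
is the memory layout, one submatrix per node instead of one per core.

#include <mpi.h>
#include <omp.h>
#include <stdlib.h>

/* Stand-in for the real per-row (or per-front) elimination work. */
static void factor_row(double *a, int n, int row)
{
    (void)a; (void)n; (void)row;
}

int main(int argc, char **argv)
{
    int provided, rank;

    /* FUNNELED is enough here: only the master thread calls MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    (void)rank;   /* would select this rank's subdomain */

    int n = 10000;                              /* local submatrix order */
    double *submatrix = malloc((size_t)n * n * sizeof(double));
    /* ... assemble this rank's submatrix here ... */

    /* All threads work on the single shared copy of the submatrix;
       nothing is replicated per thread and, as long as the rows (or
       fronts) handled in parallel are independent, no locking is needed. */
    #pragma omp parallel for schedule(dynamic)
    for (int row = 0; row < n; row++)
        factor_row(submatrix, n, row);

    /* ... exchange interface contributions with neighbouring ranks via MPI ... */

    free(submatrix);
    MPI_Finalize();
    return 0;
}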
Finance is another example. Financial applications need to evaluate a
large number of open positions against simulated, current or past
market data. There are many dependencies between all the different data,
which makes it hard to decompose the data into largely independent
chunks.
>> The latter is simpler because it only requires MPI parallelism, but if
>> the code is memory-bound and every MPI process holds much of the same
>> data, it will be better to share this common data among all the
>> processes on the same CPU and thus use threads intra-node.
> what kind of applications behave like that? I agree that if your MPI
> app is keeping huge amounts of (static) data replicated in each rank,
> you should rethink your design.
See above.
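
To put a rough number on the memory argument, here is another
hypothetical sketch along the lines of the finance example. The sizes
and the pricing function evaluate_position are invented placeholders;
the shared, read-only market data is just allocated, not loaded.

#include <mpi.h>
#include <omp.h>
#include <stdlib.h>

#define N_TICKS     (50L * 1000 * 1000)   /* shared read-only market data */
#define N_POSITIONS 100000                /* positions evaluated per node */

/* Stand-in for the real pricing model. */
static double evaluate_position(const double *ticks, long n, int pos)
{
    (void)pos;
    return ticks[n - 1];
}

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    /* One copy of the market data per MPI rank.  With one rank per core
       the same data would be replicated core-count times per node; with
       one rank per node and OpenMP threads it exists once. */
    double *ticks = calloc(N_TICKS, sizeof(double));
    double total = 0.0;

    /* Threads only read ticks, so no locking is needed during evaluation. */
    #pragma omp parallel for reduction(+:total)
    for (int pos = 0; pos < N_POSITIONS; pos++)
        total += evaluate_position(ticks, N_TICKS, pos);

    free(ticks);
    MPI_Finalize();
    return total < 0.0;   /* keep total from being optimised away */
}

With, say, 4 GB of common data and 16 cores per node, running one MPI
rank per core would burn 64 GB per node just on the replicated data,
while one rank per node with 16 threads needs 4 GB.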