provide some thread-level parallelism on a cluster where you primarily use MPI to achieve your parallel execution.*
either threads for low-investment newbie parallelism or MPI for serious scalability.
Have you used compiler auto-parallel features mixed with MPI with success on your clusters?
Only for running HPL (where, IIRC, it wasn't a win). It makes sense: if you've taken the trouble to use MPI, you probably have more concurrency available than just threaded BLAS/LAPACK calls.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf