The SPEC HPG (High Performance Group) has been discussing the use of a hybrid of MPI and thread-level parallelism in the SPEC MPI2007 benchmark suite. Since we already have a separate OpenMP suite (SPEC OMP2001), we chose not to allow the source-code complications of hybrid MPI/OpenMP parallelism in this benchmark suite with "MPI" foremost in its name.
I was wondering how many people use either auto-parallelizing compiler features or multi-threaded math libraries (GotoBLAS, MKL, ACML, etc.) to provide some thread-level parallelism on a cluster where MPI is the primary means of parallel execution.*

Have you used compiler auto-parallelization mixed with MPI successfully on your clusters? Have you used multi-threaded math or scientific libraries mixed with MPI successfully on your clusters?

If you just want to 'reply' to me only with simple Yes/No answers, I will report a summary of the results to this list and to the SPEC HPG committee. If you have success or failure stories that might be useful to the Beowulf list, please 'reply-all'.

Thanks,
Tom Elken
member, SPEC HPG committee

-----------------------------
* For example, if an auto-parallelizing compiler could find effective 4-way thread-level parallelism in an MPI code, and you were running on a cluster of 8 nodes, each with two quad-core CPUs (64 cores total), you might choose to run with 16 MPI processes and set your NUM_THREADS variable to 4, so that all 64 cores of the cluster execute work with reasonable efficiency.

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
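For readers who want to try the footnote's scenario, a minimal launch sketch might look like the following. This assumes an OpenMP-style runtime that honors OMP_NUM_THREADS (auto-parallelizers and math libraries may use a different or vendor-specific variable, e.g. MKL_NUM_THREADS for MKL), and the exact mpirun flags for node placement vary by MPI implementation; `mpi_app` is a hypothetical binary name.

```shell
# Hypothetical hybrid run on 8 nodes x 8 cores = 64 cores total:
# 16 MPI ranks, each running 4 threads, so 16 x 4 = 64 cores busy.

# Threads per rank; the variable name depends on your compiler/library
# (OMP_NUM_THREADS is the OpenMP-standard name, MKL_NUM_THREADS for MKL).
export OMP_NUM_THREADS=4

# -np is the near-universal rank-count flag; distributing 2 ranks per
# node (so each rank's 4 threads get one quad-core socket) is done with
# implementation-specific options such as a hostfile or -ppn/--npernode.
mpirun -np 16 ./mpi_app
```

Pinning each rank's threads to one socket (so a rank's 4 threads share a quad-core CPU and its cache) is usually what makes this efficient, and is also configured through implementation-specific affinity options.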