Hi

Which CLASS are you using?

One point is how the OpenMP version deals with memory, compared with the MPI version.

I have two suggestions:

It would be a good idea to try the Intel compiler, and
why not download and run the multi-zone NAS benchmarks?
They can give a more complete view of the "problem".

They have:

   NPB3.2-MZ-SER:  a serial version
   NPB3.2-MZ-MPI:  a hybrid MPI + OpenMP version
   NPB3.2-MZ-SMP:  a hybrid SMP + OpenMP version

I ran them on a very small cluster (4 Core 2 Duo), using the Intel compilers,
and for CLASS=C I get these numbers for the MPI version and for MPI + OpenMP
with one (Hybrid(1)) and two (Hybrid(2)) threads:


N. Processors  -     MPI     -  Hybrid (1)  -  Hybrid (2)
       1       -   2496.59   -      *       -   1682.07
       2       -   1085.58   -      *       -    846.82
       4       -    624.69   -    674.28    -    498.65
       8       -    447.70   -    467.79    -      *
Unfortunately they don't have an OpenMP-only version, and I don't know whether
the results with one MPI process and several threads can be useful as a
stand-in; the hybrid model, and that degenerate case, are sketched below.
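For anyone who has not written one, here is a minimal sketch of the hybrid
MPI + OpenMP model (my illustration in C, not code from the NPB sources; the
mpicc/mpirun build and run lines are assumptions for a typical setup). Each
MPI rank runs a team of OpenMP threads, so OMP_NUM_THREADS=1 or 2 gives the
Hybrid(1)/Hybrid(2) configurations above:

/* Minimal hybrid MPI + OpenMP sketch -- illustration only, not NPB code.
   Build (assumed):  mpicc -fopenmp hybrid.c -o hybrid
   Run "Hybrid(2)":  OMP_NUM_THREADS=2 mpirun -np 4 ./hybrid            */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* FUNNELED is enough when only the master thread calls MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* Each rank owns its piece of the work; threads split the loop. */
    double local = 0.0;
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < 1000000; i++)
        local += 1.0 / (double)(i + 1 + rank);

    /* Partial results from the ranks meet in a message. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d ranks x %d threads, sum = %f\n",
               nranks, omp_get_max_threads(), global);

    MPI_Finalize();
    return 0;
}

Run with mpirun -np 1 and many threads, this degenerates to the almost
OpenMP-only case I mentioned, modulo the overhead of a single-rank MPI job.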

Notice that the benchmarks use a different LU version from NPB2.0;
they changed the "memory access".


Renato


Douglas Eadline wrote:

I like answering these types of questions with numbers,
so in my Sept 2007 Linux magazine column (which should
be showing up on the website soon) I did the following.

Downloaded the latest NAS benchmarks written in both
OpenMP and MPI. Ran them both on an 8 core Clovertown
(dual socket) system (multiple times) and reported
the following results:

Test      OpenMP              MPI
      gcc/gfortran 4.2    LAM 7.1.2
------------------------------------
CG         790.6             739.1
EP         166.5             162.8
FT        3535.9            2090.8
IS          51.1             122.5
LU        5620.5            5168.8
MG        1616.0            2046.2

My conclusion: it was a draw of sorts.
The article was basically examining the
lazy assumption that threads (OpenMP) are
always better than MPI on an SMP machine.
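
For readers who have only written one of the two, the assumption under test
is easy to show in code. Here is the same trivial kernel in both models (a
sketch of mine, not code from the NPB sources; the mpicc/mpirun names are
assumptions): the OpenMP version shares the array and needs no explicit
communication, while the MPI version gives each rank a disjoint slice and
combines partial results with a message.

/* The same trivial kernel in the two models -- illustration, not NPB code.
   Build (assumed):  mpicc -fopenmp models.c -o models
   Run:              OMP_NUM_THREADS=8 mpirun -np 8 ./models             */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1000000

/* OpenMP: threads share the whole array; no explicit communication. */
static double sum_omp(const double *x, int n)
{
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* MPI: each rank owns a disjoint slice; partial sums meet in a message. */
static double sum_mpi(const double *slice, int local_n)
{
    double s = 0.0, total = 0.0;
    for (int i = 0; i < local_n; i++)
        s += slice[i];
    MPI_Allreduce(&s, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    return total;
}

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    int local_n = N / nranks;       /* assume N divisible, for brevity */
    double *slice = malloc(local_n * sizeof *slice);
    for (int i = 0; i < local_n; i++)
        slice[i] = 1.0;

    double mpi_total = sum_mpi(slice, local_n);

    if (rank == 0) {
        /* The shared-memory version sees all the data at once. */
        double *all = malloc(N * sizeof *all);
        for (int i = 0; i < N; i++)
            all[i] = 1.0;
        printf("OpenMP: %.0f   MPI: %.0f   (both should be %d)\n",
               sum_omp(all, N), mpi_total, N);
        free(all);
    }

    free(slice);
    MPI_Finalize();
    return 0;
}

The interesting part is everything this toy leaves out and the NPB kernels
stress: memory access patterns, cache sharing, and communication volume,
which is why the two columns above don't order the same way for every kernel.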

I'm going to re-run the tests using Harpertowns
real soon, maybe try other compilers and MPI
versions. It is easy to do. You can get the code here:

http://www.nas.nasa.gov/Resources/Software/npb.html

--
Doug

On this list there is almost unanimous agreement that MPI is the way to go
for parallelism, and that combining multi-threading (MT) and message-passing
(MP) is not even worth it; just sticking to MP is all that is necessary.

However, in real life most people are talking about and investing in MT while
very few are interested in MP. I also just read on the blog of Arch Robison:
"TBB perhaps gives up a little performance short of optimal so you don't have
to write message-passing" (here:
http://softwareblogs.intel.com/2007/11/17/supercomputing-07-computer-environment-and-evolution/
)

How come there is almost unanimous agreement in the Beowulf community while
the rest is almost unanimously convinced of the opposite? Are we just patting
ourselves on the back, or is MP not sufficiently disseminated, or ... ?

toon



_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
