Re: [Beowulf] many cores and ib

Patrick Geoffray Tue, 06 May 2008 01:55:23 -0700

Gilad Shainer wrote:

It is the same benchmark that QLogic was and is using for MPI message
rate, and I guess you know that better than me, don't you?....  I want
to make sure that when one does a comparison, he/she will be using the
same benchmark/output to compare.

It is not the benchmark, it's the MPI implementation. The benchmark in itself is stupid, because it sends a gazillion messages to a single node. The MPI implementation is dishonest, because it says "eh, you are trying to send a gazillion messages to a single node, let me pack them into a single message on the wire for you", completely changing what the benchmark is trying to measure.

You are a marketing guy, you just repeat the numbers without understanding what they mean. Message coalescing in MVAPICH does nothing but make the message rate micro-benchmark irrelevant; it was designed that way, and only for that purpose. With message coalescing, *everybody* can send 20 million messages per second, as long as you have over 1GB/s of bandwidth.
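
To make the arithmetic behind that claim concrete, here is a minimal sketch of a toy model, not a description of MVAPICH internals. It assumes the reported rate is capped either by how many packets per second the NIC can inject or by link bandwidth. The function name, the 2 million packets/s injection rate, and the 16-messages-per-packet coalescing factor are all made up for illustration; the 50 bytes per message is simply 1 GB/s divided by 20 million messages/s.

```python
def reported_message_rate(link_bytes_per_s, packet_rate_per_s,
                          bytes_per_msg, msgs_per_packet):
    """Application-level messages per second under a toy model.

    The rate is capped by whichever runs out first:
    - link bandwidth divided by the bytes each message occupies, or
    - packets the NIC can inject per second times messages per packet.
    """
    bandwidth_limit = link_bytes_per_s / bytes_per_msg
    packet_limit = packet_rate_per_s * msgs_per_packet
    return min(bandwidth_limit, packet_limit)


LINK = 1e9          # 1 GB/s of bandwidth, as in the text
MSG_BYTES = 50      # 1e9 / 20e6: bytes available per message at 20M msg/s
PACKET_RATE = 2e6   # hypothetical NIC injection rate, illustration only

# One message per wire packet: the NIC's packet rate is the limit.
no_coalescing = reported_message_rate(LINK, PACKET_RATE, MSG_BYTES, 1)

# Hypothetically 16 messages packed per wire packet: bandwidth is the limit.
with_coalescing = reported_message_rate(LINK, PACKET_RATE, MSG_BYTES, 16)

print(f"without coalescing: {no_coalescing:.0f} msg/s")
print(f"with coalescing:    {with_coalescing:.0f} msg/s")
```

In this model the coalesced figure stops depending on the NIC's packet rate at all, so it no longer measures what a message rate benchmark is meant to measure.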

This is like the header caching "optimization": change the MPI tag for each Send in your pingpong benchmark, and watch your latency go up. It's because the MPI implementation is smart enough to say "eh, you are sending the same message envelope over and over, let me compact the MPI header for you". It does not help anything but a micro-benchmark.
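
The effect of varying the tag can be sketched with another toy model. This assumes a sender that transmits a full header whenever the envelope differs from the previous one and a compact header otherwise. The two header sizes and the function name are hypothetical, the envelope is reduced to the tag alone, and none of this is taken from any real MPI implementation.

```python
FULL_HEADER_BYTES = 32     # hypothetical size of a complete MPI header
COMPACT_HEADER_BYTES = 4   # hypothetical size of a "same as last time" header


def header_bytes_sent(tags):
    """Total header bytes for a sequence of sends under the toy model.

    A send whose tag matches the previous send's tag gets the compact
    header; any other send pays for the full header.
    """
    last_tag = None
    total = 0
    for tag in tags:
        if tag == last_tag:
            total += COMPACT_HEADER_BYTES
        else:
            total += FULL_HEADER_BYTES
            last_tag = tag
    return total


# Typical pingpong: the same tag on every send, so the cache always hits
# after the first message.
same_tag = header_bytes_sent([7] * 1000)

# Same loop with a different tag on each send: the cache never hits.
varying_tag = header_bytes_sent(range(1000))

print(same_tag, varying_tag)
```

Under these assumptions the saving exists only when the envelope repeats on every send, which is what a pingpong loop does.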

I can imagine the next optimization from here: if you happen to send messages full of zeros in your ping-pong, MVAPICH will "compress" them for you. And somewhere, someone will claim a gazillion bytes per second...


Patrick
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
