Patrick Geoffray wrote:
Greg Lindahl wrote:
On Wed, Jun 28, 2006 at 07:28:53AM -0400, Patrick Geoffray wrote:

I have kept quiet even when you were saying things driven by
marketing rather than technical considerations (the packet per
second nonsense),

Patrick, that "packet per second nonsense" is the technical reason our
interconnect does so well. If you'd like to argue about it,
technically, I'd be happy to do so. No need to keep quiet.

My reservation was about the way you present it, not the technical idea behind it. Actually, my real concern was that there was no technical content in your post, just references to white papers, i.e. marketing fluff.

An offer for "getting a secret white paper on request" is marketing, you are right. But at least the SPEC number was technical content - and we don't want to analyse every posting sentence-by-sentence, do we?

So, let's finally talk about the technical part. You claim that the key metric in your product is the messaging rate, i.e. the number of packets you can send per second. You even have a fancy name for it, something like Hyper Duper Messaging :-)

[...]
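Since neither side has published the actual benchmark code behind such a "messages per second" number, here is, just to make the discussion concrete, a generic sketch of what is usually measured. This is only my own illustration; the message size, window depth and iteration count are arbitrary values, not anyone's real benchmark.

/* Generic small-message rate sketch -- my own illustration, not any
 * vendor's actual benchmark. Rank 0 streams windows of non-blocking
 * sends to rank 1, which acknowledges each window; the result is
 * reported as messages per second. Run with exactly 2 ranks, e.g.
 * mpicc -O2 msgrate.c && mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

#define MSG_SIZE   8      /* "a few doubles" worth of payload          */
#define WINDOW     64     /* messages in flight before synchronizing   */
#define ITERATIONS 10000  /* windows per timing run                    */

int main(int argc, char **argv)
{
    int rank, i, w;
    char buf[WINDOW][MSG_SIZE], ack = 0;
    double t0, t1;
    MPI_Request req[WINDOW];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();

    for (i = 0; i < ITERATIONS; i++) {
        if (rank == 0) {
            for (w = 0; w < WINDOW; w++)
                MPI_Isend(buf[w], MSG_SIZE, MPI_CHAR, 1, 0,
                          MPI_COMM_WORLD, &req[w]);
            MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
            /* per-window ack keeps the sender from running arbitrarily
             * far ahead of the receiver */
            MPI_Recv(&ack, 1, MPI_CHAR, 1, 1, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            for (w = 0; w < WINDOW; w++)
                MPI_Irecv(buf[w], MSG_SIZE, MPI_CHAR, 0, 0,
                          MPI_COMM_WORLD, &req[w]);
            MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
            MPI_Send(&ack, 1, MPI_CHAR, 0, 1, MPI_COMM_WORLD);
        }
    }

    t1 = MPI_Wtime();
    if (rank == 0)
        printf("message rate: %.0f messages/s\n",
               (double)ITERATIONS * WINDOW / (t1 - t0));

    MPI_Finalize();
    return 0;
}

The number you get this way depends heavily on the window depth and on how many unexpected messages the MPI library tolerates, which is part of why I think the benchmark itself needs to be published.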

Let me summarize what I consider the key issues:
- Explicit MPI_Irecv/MPI_Send/MPI_Wait, or similar patterns used implicitly in MPI_Reduce/MPI_Alltoall/MPI_Allreduce with small messages (a few doubles, or a few kB), are the dominant communication pattern in many MPI applications. There are quite a few studies (though not as many as one could wish) that show this.
- This means it is generally a good thing if the "ping" latency (the duration of MPI_Send in CPU cycles) is as low as possible.
- At this message size, CPU utilization or overlapping computation and communication is not relevant, as (zero-copy) RDMA does not pay off until the message reaches at least some (typically >32 or more) kB in size, due to the implied pinning and rendezvous overhead. Also, MPI_Send offers no opportunity for overlap, and having a progress thread on the receiving CPU steal cycles from the application doesn't really help either.
- In these cases, all(?) interconnects do some sort of memcpy() within MPI_Send to get rid of the data. The differences are (see the timing sketch after this list):
 * How long does it take to prepare things for the memcpy()? This is Greg's message rate.
 * When does the data arrive at the destination?
- But you never want to send millions of messages at once. This is micro-benchmarking at its best. It gives some indications, but seen alone, it is no proof of anything.
- *If* you feel you need to use such a new metric for whatever reason, you should at least publish the benchmark used to gather these numbers, to allow others to do comparative measurements. This goes to Greg.
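For the two sub-points above, the usual way to get numbers is a plain ping-pong plus a timing of MPI_Send itself. Again only a rough sketch with arbitrary constants; the MPI_Send-overhead part is only meaningful as long as the library really does an eager copy and no flow control kicks in.

/* Sketch of the two timings mentioned above (run with exactly 2 ranks):
 *   1. half round-trip latency  -- "when does the data arrive?"
 *   2. time spent inside MPI_Send for a small (eager) message
 *      -- "how long to prepare things for the memcpy()?"
 * Message size and iteration count are arbitrary illustration values. */
#include <mpi.h>
#include <stdio.h>

#define MSG_SIZE 8
#define ITERS    10000

int main(int argc, char **argv)
{
    int rank, i;
    char buf[MSG_SIZE];
    double t0, halfrtt = 0.0, send_overhead = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 1. classic ping-pong, report half the round-trip time */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    halfrtt = (MPI_Wtime() - t0) / (2.0 * ITERS);

    /* 2. sender-side overhead: MPI_Send returns as soon as the small
     *    message has been copied away (eager protocol assumed) */
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 0) {
        t0 = MPI_Wtime();
        for (i = 0; i < ITERS; i++)
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        send_overhead = (MPI_Wtime() - t0) / ITERS;
        printf("half round-trip: %.2f us   MPI_Send overhead: %.2f us\n",
               halfrtt * 1e6, send_overhead * 1e6);
    } else if (rank == 1) {
        for (i = 0; i < ITERS; i++)
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}

Multiplying the second number by the clock rate gives the "duration of MPI_Send in CPU cycles" I mean above.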

But I don't think that Greg's "Real Application Performance" white paper is infamous. It states where the data comes from, you have to trust him for his own numbers, and it does not directly link the differences in application performance to the messaging rate. Of course, it does not offer a scientific analysis, and you cannot compare it to papers like the ones from Leonid Oliker. But I don't think it's unfair, and it surely stimulates the competition for better technical solutions or better white papers.

--
Joachim - reply to joachim at domain ccrl-nece dot de

Opinion expressed is personal and does not constitute
an opinion or statement of NEC Laboratories.
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
