Hi Richard, I meant to reply earlier but got busy.
On 2/27/2010 11:17 PM, richard.wa...@comcast.net wrote:
> If anyone finds errors in it please let me know so that I can fix them.
You don't consider the protocol efficiency, and this is a major issue on PCIe.
First of all, I would change the labels "Raw" and "Effective" to "Signal" and "Raw". Then I would add a third column, "Effective", which accounts for the protocol overhead, i.e. the amount of raw bandwidth that is not used for useful payload.

On PCIe, on the Read side, the data comes in small packets with a 20-Byte header (24 with the optional ECRC) for a 64-, 128- or 256-Byte payload. Most PCIe chipsets only support a 64-Byte Read Completion MTU, and even the ones that support larger sizes still use a majority of 64-Byte completions because that size maps well to the transaction size on the memory bus (HT, QPI). With 64-Byte Read Completions, the PCIe efficiency is 64/84 = 76%, so 32 Gb/s becomes 24 Gb/s, which corresponds to the hero number quoted by MVAPICH, for example (3 GB/s unidirectional).

Bidirectional efficiency is a bit worse because PCIe Acks take some raw bandwidth too. They are coalesced, but the pipeline is not very deep, so you end up with roughly 20+20 Gb/s bidirectional.
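To make the arithmetic concrete, here is a minimal sketch of the efficiency calculation in plain Python. The 20-Byte overhead is the framing + sequence number + 3DW TLP header + LCRC (the optional ECRC adds 4 more), and the 32 Gb/s is a Gen2 x8 link after 8b/10b encoding:

# PCIe read-completion efficiency: payload / (payload + per-packet
# overhead). 20 Bytes of overhead per completion, 24 with ECRC.

def pcie_efficiency(payload_bytes, header_bytes=20):
    """Fraction of the raw link bandwidth carrying useful payload."""
    return float(payload_bytes) / (payload_bytes + header_bytes)

raw_gbps = 32.0  # PCIe Gen2 x8 raw rate, after 8b/10b

for mtu in (64, 128, 256):
    eff = pcie_efficiency(mtu)
    print("%3dB completions: %2.0f%% -> %4.1f Gb/s"
          % (mtu, eff * 100, raw_gbps * eff))

With 64-Byte completions this prints 24.4 Gb/s, i.e. the ~3 GB/s unidirectional number above.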
There is a similar protocol overhead at the IB or Ethernet level, but the MTU is large enough that the overhead is much smaller than on PCIe.
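For comparison, the same arithmetic with a 2048-Byte IB MTU, assuming on the order of 26 Bytes of per-packet headers (LRH 8 + BTH 12 + ICRC 4 + VCRC 2, no GRH; the exact total depends on the configuration):

payload, header = 2048.0, 26  # header total is an assumption, see above
print("IB 2048B MTU: %.1f%%" % (100 * payload / (payload + header)))
# ~98.7% efficient, versus 76% for 64-Byte PCIe completions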
Now, all of this does not matter because Marketers will keep using useless Signal rates. They will even have the balls to (try to) rewrite history about packet rate benchmarks...
Patrick