At 21:00 20.04.2007, "Steffen Persvold" <[EMAIL PROTECTED]> wrote:
So I'm guessing both Myrinet MX and Qlogic Infinipath (confirmed) are
using PIO for "small" messages. Are we sure that Mellanox ConnectX
doesn't? It seems they would have to in order to get the 1.2us numbers.
There's nothing that stops them from doing:

verbs_post_rdma_write() {
...
    if (msg_size < MAX_PIO_THRESHOLD) {
        /* small message: let the CPU push the data itself */
        copybuffertoremotewithpio();
    } else {
        /* large message: hand it to the NIC's DMA engine */
        setupdmaengine();
    }
...
}


PIO is a term with two different interpretations. For a shared address space NIC, such as Dolphin's SCI adapters, PIO means the sending CPU writes data directly into the user space of a remote process on a remote node; the cluster interconnect emulates a PCI-to-PCI bridge in this case. On other NICs, PIO means using the processor to transmit the DMA descriptor and the data to the local NIC, which then issues a DMA to transmit the data/message to the remote node from a local buffer on the NIC. The main point is that the local NIC doesn't have to issue a DMA read to local memory in order to fetch the DMA descriptor and data.
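
To make the second interpretation concrete, here is a minimal sketch in C of such a send path. Everything in it is hypothetical and invented for illustration (the descriptor layout, the names, the assumption of an x86 host with a write-combining mapping of a NIC buffer); it is not any particular vendor's interface:

#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>   /* _mm_sfence() on x86 */

/* Hypothetical on-NIC send descriptor; layout invented for illustration. */
struct pio_descriptor {
    uint32_t dest_node;      /* target node id */
    uint32_t length;         /* payload length in bytes */
    uint64_t remote_offset;  /* offset in the remote window */
};

/*
 * PIO send: the CPU writes descriptor + payload straight into a
 * write-combining (WC) mapped buffer on the NIC. The NIC can then put
 * the message on the wire from its own memory, without issuing a DMA
 * read to host memory for either the descriptor or the data.
 */
static void pio_send(volatile uint8_t *nic_wc_window,
                     const struct pio_descriptor *desc,
                     const void *payload)
{
    memcpy((void *)nic_wc_window, desc, sizeof(*desc));
    memcpy((void *)(nic_wc_window + sizeof(*desc)), payload, desc->length);

    /* Flush the CPU's write-combining buffers so the stores actually
     * reach the device, ideally as one long burst on the bus. */
    _mm_sfence();
}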

So, when Mellanox reduces the latency from around 4 usec to around 1 usec, I assume they have modified the hardware-software interface of their HCA to enable PIO-mode send operations, where the DMA descriptor and data are transmitted on the PCI(e) bus in a single write-combined (WC) bus tenure. I haven't put a PCI analyzer on their HCAs, but a rule of thumb is that every I/O operation to a NIC takes on the order of 1 usec. So maybe they have managed to go from three I/O operations to one in order to kick off a transfer. Pure speculation from my side though.
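
For what it's worth, the verbs API already exposes a user-visible knob in this direction: the IBV_SEND_INLINE flag, which asks the driver to carry the payload inside the work request so the HCA doesn't have to DMA-read it from the user's buffer. Below is a minimal sketch with libibverbs (error handling omitted; the QP must have been created with a large enough max_inline_data). Whether this alone explains the 1.2us number is exactly the speculation above:

#include <stdint.h>
#include <infiniband/verbs.h>

/* Post a small RDMA write with the payload carried inline in the work
 * request. qp, remote_addr and rkey come from the usual connection
 * setup; buf/len is the small message to send. */
static int post_small_rdma_write(struct ibv_qp *qp, void *buf, uint32_t len,
                                 uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = len,
        /* With IBV_SEND_INLINE the data is copied from the WQE, so the
         * buffer does not need to come from a registered MR. */
        .lkey   = 0,
    };
    struct ibv_send_wr wr = {
        .wr_id               = 1,
        .sg_list             = &sge,
        .num_sge             = 1,
        .opcode              = IBV_WR_RDMA_WRITE,
        .send_flags          = IBV_SEND_INLINE | IBV_SEND_SIGNALED,
        .wr.rdma.remote_addr = remote_addr,
        .wr.rdma.rkey        = rkey,
    };
    struct ibv_send_wr *bad_wr;

    return ibv_post_send(qp, &wr, &bad_wr);
}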


Håkon



--
Håkon Bugge
CTO
dir. +47 22 62 89 72
mob. +47 92 48 45 14
fax. +47 22 62 89 51
[EMAIL PROTECTED]
Skype: hakon_bugge

Scali - http://www.scali.com
Scaling the Linux Datacenter


_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
