At 09:37 PM 2/14/2007, Devesh Sharma wrote:
>On 2/14/07, Michael Krause <[EMAIL PROTECTED]> wrote:
>>At 05:37 AM 2/13/2007, Devesh Sharma wrote:
>> >On 2/12/07, Devesh Sharma <[EMAIL PROTECTED]> wrote:
>> >>On 2/10/07, Tang, Changqing <[EMAIL PROTECTED]> wrote:
>> >> > > >
>> >> > > >Not for the receiver, but the sender will be severely slowed down by
>> >> > > >having to wait for the RNR timeouts.
>> >> > >
>> >> > > RNR = Receiver Not Ready so by definition, the data flow isn't
>> >> > > going to progress until the receiver is ready to receive data.
>> >> > > If a receive QP enters RNR for a RC, then it is likely not
>> >> > > progressing as desired. RNR was initially put in place to
>> >> > > enable a receiver to create back pressure to the sender without
>> >> > > causing a fatal error condition. It should rarely be entered
>> >> > > and therefore should have negligible impact on overall
>> >> > > performance however when a RNR occurs, no forward progress will
>> >> > > occur so performance is essentially zero.
>> >> >
>> >> > Mike:
>> >> > I still do not quite understand this issue. I have two
>> >> > situations that have RNR triggered.
>> >> >
>> >> > 1. process A and process B is connected with QP. A first post a send to
>> >> > B, B does not post receive. Then A and B are doing a long time
>> >> > RDMA_WRITE each other, A and B just check memory for the RDMA_WRITE
>> >> > message. Finally B will post a receive. Does the first pending send in A
>> >> > block all the later RDMA_WRITE ?
>> >>According to IBTA spec HCA will process WR entries in strict order in
>> >>which they are posted so the send will block all WR posted after this
>> >>send, Until-unless HCA has multiple processing elements, I think even
>> >>then processing order will be maintained by HCA
>> >> > If not, since RNR is triggered
>> >> > periodically till B post receive, does it affect the RDMA_WRITE
>> >> > performance between A and B ?
>> >> >
>> >> > 2. extend above to three processes, A connect to B, B connect to C, so B
>> >> > has two QPs, but one CQ. A posts a send to B, B does not post receive,
>> >post ordering accross QP is not guaranteed hence presence of same CQ
>> >or different CQ will not affect any thing.
>> >> > rather B and C are doing a long time RDMA_WRITE,or send/recv. But B
>> >If RDMA WRITE _on_ B, no effect on performance. If RDMA WRITE _on_ C,
>I am sorry I have missed that in both cases same DMA channel is in use.
>> >_may_ affect the performance, since load is on same HCA. In case of
>> >Send/Recv again _may_ affect the performance, with the same reason.
>>
>>Seems orthogonal. Any time h/w is shared, multiple flows will have an
>>impact on one another. That is why we have the different arbitration
>>mechanisms to enable one to control that impact.
>Please, can you explain it more clearly?

Most I/O devices are shared by multiple applications / kernel subsystems.
Hence, the device acts as a serialization point for what goes on the wire /
link. Sharing = resource contention, and in order to add any structure to
that contention, a number of technologies provide arbitration options. In
the case of IB, the arbitration is confined to VL arbitration, where a given
data flow is assigned to a VL and that VL is serviced at some particular
rate. A number of years ago I wrote up how one might also provide QP
arbitration (not part of the IBTA specifications), and I understand some
implementations have incorporated that or a variation of the mechanisms
into their products.

In addition to IB link contention, there is also PCI link / bus contention.
For PCIe, given most designs did not want to waste resources on multiple VCs,
there really isn't any standard arbitration mechanism. However, many
devices, especially a device like an HCA or an RNIC, already have the concept
of separate resource domains, e.g. QP, and they provide a mechanism to
control how the QP's DMA requests or interrupt requests are scheduled
to the PCIe link.

>> >> > must sends RNR periodically to A, right?. So does the pending message
>> >> > from A affects B's overall performance between B and C ?
>> >But RNR NAK is not for very long time.....possibly this performance
>> >hit you will not be able to observe even. The moment rnr_counter
>> >expires connection will be broken!
>>
>>Keep in mind the timeout can be infinite. RNR NAK are not expected to be
>>frequent so their performance impact was considered reasonable.
>Thanks I missed that.

It is a subtlety within the specification that is easy to miss.

Mike
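
P.S. To make the RNR retry point concrete, here is a small toy simulation. It is only a sketch under stated assumptions, not how any HCA implements it. It assumes the RNR retry count is a 3-bit value where 7 means "retry forever" (as in the verbs `rnr_retry` QP attribute); any other value bounds the retries, and the QP goes to error once they are exhausted. The function name, the abstract time units, and the fixed `rnr_timer` are hypothetical and chosen just for illustration.

```python
# Toy model of a sender retrying one send on an RC QP after RNR NAKs.
# Assumption: retry count 7 means "retry forever"; other values are a bound.
INFINITE_RNR_RETRY = 7


def send_with_rnr_retry(receiver_ready_at, rnr_retry, rnr_timer=1):
    """Return (outcome, time, naks) for a single send.

    receiver_ready_at: abstract time at which the receiver posts a receive
    rnr_retry:         retries allowed after the first RNR NAK (7 = infinite)
    rnr_timer:         abstract delay the sender waits after each RNR NAK
    """
    now = 0
    naks = 0
    retries_left = rnr_retry
    while now < receiver_ready_at:
        naks += 1  # no receive posted yet, so the receiver answers RNR NAK
        if rnr_retry != INFINITE_RNR_RETRY:
            if retries_left == 0:
                # Retries exhausted: the connection is broken.
                return ("error", now, naks)
            retries_left -= 1
        now += rnr_timer  # back off, then try the send again
    return ("delivered", now, naks)


if __name__ == "__main__":
    # Bounded retries: the receiver is too slow, so the QP ends in error.
    print(send_with_rnr_retry(receiver_ready_at=5, rnr_retry=3))
    # Infinite retries: the send waits as long as it takes.
    print(send_with_rnr_retry(receiver_ready_at=5, rnr_retry=7))
```

In both cases nothing posted behind the stalled send on the same QP makes progress while the loop is spinning, which is the "performance is essentially zero" situation described earlier in the thread.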
