Hi,

Sean Hefty wrote:
>> I am using IB as a cluster interconnect. If a node which had
>> established several connections with a remote node was reset (not
>> rebooted) and it came back up quickly is it possible for the node
>> to get stale REQ/DREQ callbacks ? If yes, is there an API to purge
>> stale states in the CM or should it be detected by the module
>> getting the callback ?
>
> It's possible for stale REQ/DREQ messages to appear at the reset node,
> but I don't see any problem with that occurring. The DREQs should be
> dropped, since there's no connections to match them with. The REQs
> should be rejected without a matching listen. If the listen occurs
> before the REQ appears, then a new connection would result. I don't
> see a problem in either case.
Our code isn't handling stale callbacks. Thanks for clarifying that.

> As for purging stale states, I'm not sure what you mean. The reset
> node will have purged the local CM state.

This is what I meant, but please note that I have yet to confirm the
behaviour. If a node which has established several connections

- reboots (goes down and comes back up gracefully), then there seems to
  be no problem establishing connections the next time.
- resets (goes down abruptly and comes back up), then it seems more
  likely to get stale callbacks from the CM.

In the above scenario the node comes back up more quickly in the reset
case than in the reboot case. So I was just wondering whether the extra
delay in the reboot case is what keeps the problem from occurring. In
other words, does the switch cache the reset node's state and discard
it after some fixed amount of time?

Also, should a remote node with which the reset node had established
connections call ib_destroy_cm_id() during its disconnect processing?
Currently, our code only destroys the CQs and QPs (by calling
ib_destroy_cq() and ib_destroy_qp()).

Thanks,
Sreevatsa

> - Sean

_______________________________________________
openib-general mailing list
http://openib.org/mailman/listinfo/openib-general
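For illustration only, the behaviour Sean describes (stale DREQs dropped
because nothing matches them, REQs rejected unless a listen exists, a new
connection if the listen is already in place) can be modelled with a small
user-space sketch. Everything here is hypothetical: the types, the
cm_reset()/cm_listen()/cm_handle() helpers and the matching on a single
remote communication ID are simplifying assumptions, not the kernel ib_cm
API or its actual matching rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a node's CM state; NOT the kernel ib_cm API. */
enum msg_type { MSG_REQ, MSG_DREQ };
enum outcome  { OUT_DROPPED, OUT_REJECTED, OUT_CONNECTED, OUT_DISCONNECTED };

#define MAX_CONNS   8
#define MAX_LISTENS 4

struct conn {
    bool     in_use;
    uint32_t remote_comm_id;   /* assumed sole key for matching a DREQ */
};

struct cm_state {
    struct conn conns[MAX_CONNS];
    uint64_t    listens[MAX_LISTENS];
    size_t      n_listens;
};

/* A reset purges all local CM state: no connections, no listens. */
static void cm_reset(struct cm_state *cm)
{
    assert(cm != NULL);
    *cm = (struct cm_state){0};
}

/* Register a listen on a service ID; returns false if the table is full. */
static bool cm_listen(struct cm_state *cm, uint64_t service_id)
{
    if (cm->n_listens == MAX_LISTENS)
        return false;
    cm->listens[cm->n_listens++] = service_id;
    return true;
}

/* Dispatch one incoming message against the current local state. */
static enum outcome cm_handle(struct cm_state *cm, enum msg_type type,
                              uint64_t service_id, uint32_t remote_comm_id)
{
    size_t i;

    if (type == MSG_DREQ) {
        /* A DREQ is only meaningful if it matches a live connection. */
        for (i = 0; i < MAX_CONNS; i++) {
            if (cm->conns[i].in_use &&
                cm->conns[i].remote_comm_id == remote_comm_id) {
                cm->conns[i].in_use = false;
                return OUT_DISCONNECTED;
            }
        }
        return OUT_DROPPED;            /* stale DREQ: nothing to match */
    }

    /* MSG_REQ: reject unless some listen matches the service ID. */
    for (i = 0; i < cm->n_listens; i++)
        if (cm->listens[i] == service_id)
            break;
    if (i == cm->n_listens)
        return OUT_REJECTED;           /* REQ without a matching listen */

    for (i = 0; i < MAX_CONNS; i++) {
        if (!cm->conns[i].in_use) {
            cm->conns[i].in_use = true;
            cm->conns[i].remote_comm_id = remote_comm_id;
            return OUT_CONNECTED;      /* listen was up: new connection */
        }
    }
    return OUT_REJECTED;               /* no free connection slot */
}
```

In this model, calling cm_reset() and then feeding in a DREQ or REQ left
over from before the reset yields OUT_DROPPED or OUT_REJECTED respectively;
only a REQ that arrives after cm_listen() has been called for the same
service ID produces OUT_CONNECTED.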
