One question: why does the RTO get so large that it limits failover?

If Linux TCP is working correctly, RTO should be srtt + 2*rttvar.

So either there is a huge srtt or variance, or something is going
wrong with RTT estimation.  Given some reasonable maximums of
srtt = 500ms and rttvar = 250ms, that would give an RTO of
500 + 2*250 = 1000ms, or 1 second.
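
To make that arithmetic concrete, here is a minimal Python sketch of the formula as stated above. The function name and the multiplier parameter k are made up for illustration and this is not kernel code; k=2 is the value used in this message (RFC 6298 uses 4 for its RTTVAR term), so treat it as a back-of-the-envelope check only.

```python
def rto_ms(srtt_ms, rttvar_ms, k=2):
    """RTO estimate in milliseconds: srtt + k * rttvar.

    k=2 follows the formula quoted in this message; it is an
    assumption of this sketch, not a value read from the kernel.
    """
    return srtt_ms + k * rttvar_ms


# The "reasonable maximums" from the message.
print(rto_ms(500, 250))  # 1000
```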

I suspect that what is happening here is that a link goes down in a trunk somewhere for some number of seconds, so a given TCP segment gets retransmitted several times, with the RTO doubling on each retransmission.
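
As a rough illustration of how quickly that doubling adds up, here is a small Python sketch. The function name, the 1 second starting RTO (taken from the example figures earlier in the thread), and the 120 second cap are assumptions of the sketch, not measurements from the failing setup.

```python
def backoff_schedule(initial_rto_s, retransmits, rto_max_s=120.0):
    """Return a list of (rto_waited, elapsed_since_first_send) tuples,
    one per retransmission, with the RTO doubling after each timeout.

    rto_max_s is an assumed upper bound on the RTO.
    """
    schedule = []
    rto = initial_rto_s
    elapsed = 0.0
    for _ in range(retransmits):
        elapsed += rto                 # timer expires, segment is resent
        schedule.append((rto, elapsed))
        rto = min(rto * 2, rto_max_s)  # back off for the next attempt
    return schedule


for n, (rto, elapsed) in enumerate(backoff_schedule(1.0, 6), start=1):
    print(f"retransmit {n}: waited {rto:.0f}s, {elapsed:.0f}s since first send")
```

With a 1 second starting RTO the retransmissions go out at 1, 3, 7, 15, 31 and 63 seconds after the original send. So if the link comes back at, say, 35 seconds, the connection still sits idle for close to another half minute before the next attempt.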

rick jones
