From: "Ian McDonald" <[EMAIL PROTECTED]>
Subject: [MaybeSpam] Re: [PATCH 2.6.22-rc5] TCP: Make TCP_RTO_MAX a variable
Date: Tue, 26 Jun 2007 10:18:46 +1200

> On 6/26/07, OBATA Noboru <[EMAIL PROTECTED]> wrote:
> > From: OBATA Noboru <[EMAIL PROTECTED]>
> >
> > Make TCP_RTO_MAX a variable, and allow a user to change it via a
> > new sysctl entry /proc/sys/net/ipv4/tcp_rto_max. A user can
> > then guarantee TCP retransmission to be more controllable, say,
> > at least once per 10 seconds, by setting it to 10. This is
> > quite helpful on failover-capable network devices, such as an
> > active-backup bonding device. On such devices, it is desirable
> > that TCP retransmits a packet shortly after the failover, which
> > is what I would like to do with this patch. Please see
> > Background and Problem below for rationale in detail.
> >
> RFC2988 says this:
> (2.4) Whenever RTO is computed, if it is less than 1 second then the
> RTO SHOULD be rounded up to 1 second.
>
> Traditionally, TCP implementations use coarse grain clocks to
> measure the RTT and trigger the RTO, which imposes a large
> minimum value on the RTO. Research suggests that a large
> minimum RTO is needed to keep TCP conservative and avoid
> spurious retransmissions [AP99]. Therefore, this
> specification requires a large minimum RTO as a conservative
> approach, while at the same time acknowledging that at some
> future point, research may show that a smaller minimum RTO is
> acceptable or superior.
>
> (2.5) A maximum value MAY be placed on RTO provided it is at least 60
> seconds.
>
> Your code doesn't seem to meet requirements of section 2.5 as your
> minimum is 1 second.
>
> I think if you're trying to solve the bonding issue then you should
> solve that issue, not hack the TCP implementation as that opens it up
> to abuse in other ways.

I think this is rather a new problem, or requirement, arising in the
combined case of "TCP on a failover-capable network device," and it is
not easily solved by bonding alone. A notification mechanism from
bonding to TCP has been suggested, but I think that is really hard to
do in a virtualized environment like Xen.
The hypervisor (Dom-0) takes care of physical devices, including
bonding, and guests (Dom-U) handle TCP. Notifying TCP in Dom-U from
bonding in Dom-0 is a real challenge.

My problem (TCP retransmission may not happen within the expected time
frame, e.g., 10 seconds after a bonding failover) still occurs in such
an environment, and my code (capping TCP_RTO_MAX) still works in a VM
environment. So solving this in the TCP layer makes sense to me.

Regards,

--
OBATA Noboru ([EMAIL PROTECTED])
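
To make the effect of capping TCP_RTO_MAX concrete, here is a minimal
user-space sketch of exponential RTO backoff with a cap. It is not the
kernel code from the patch. The function names, the 1-second initial
RTO, and the 70-second outage are assumptions chosen only for
illustration; 120 seconds stands in for the kernel's compile-time
TCP_RTO_MAX.

```python
# User-space illustration only; this is NOT the kernel code from the patch.
# All names below are hypothetical.

INITIAL_RTO = 1        # seconds; assumed starting RTO for this sketch
DEFAULT_RTO_MAX = 120  # seconds; stands in for the compile-time TCP_RTO_MAX


def retransmit_times(rto_max, initial_rto=INITIAL_RTO, count=10):
    """Return when each of `count` retransmissions fires, in seconds
    after the original send. The timeout doubles after every
    retransmission and is clamped to rto_max."""
    times = []
    now = 0
    rto = min(initial_rto, rto_max)
    for _ in range(count):
        now += rto
        times.append(now)
        rto = min(rto * 2, rto_max)
    return times


def first_retransmit_after(outage_end, rto_max, initial_rto=INITIAL_RTO):
    """Return the time of the first retransmission that fires at or
    after outage_end, i.e. the first one that can get through."""
    now = 0
    rto = min(initial_rto, rto_max)
    while True:
        now += rto
        if now >= outage_end:
            return now
        rto = min(rto * 2, rto_max)


if __name__ == "__main__":
    # Hypothetical outage that ends 70 seconds after the first send.
    for cap in (DEFAULT_RTO_MAX, 10):
        t = first_retransmit_after(70, cap)
        print(f"rto_max={cap:3d}s: first retransmit after recovery at {t}s")
```

Under these assumptions, the uncapped connection does not retransmit
until 127 seconds, which is 57 seconds after the path recovers. With
the cap set to 10 it retransmits at 75 seconds, which is within 10
seconds of recovery.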