From: "Ian McDonald" <[EMAIL PROTECTED]>
Subject: [MaybeSpam] Re: [PATCH 2.6.22-rc5] TCP: Make TCP_RTO_MAX a variable
Date: Tue, 26 Jun 2007 10:18:46 +1200

> On 6/26/07, OBATA Noboru <[EMAIL PROTECTED]> wrote:
> > From: OBATA Noboru <[EMAIL PROTECTED]>
> >
> > Make TCP_RTO_MAX a variable, and allow a user to change it via a
> > new sysctl entry /proc/sys/net/ipv4/tcp_rto_max.  A user can
> > then guarantee TCP retransmission to be more controllable, say,
> > at least once per 10 seconds, by setting it to 10.  This is
> > quite helpful on failover-capable network devices, such as an
> > active-backup bonding device.  On such devices, it is desirable
> > that TCP retransmits a packet shortly after the failover, which
> > is what I would like to do with this patch.  Please see
> > Background and Problem below for rationale in detail.
> >
> RFC2988 says this:
>    (2.4) Whenever RTO is computed, if it is less than 1 second then the
>          RTO SHOULD be rounded up to 1 second.
> 
>          Traditionally, TCP implementations use coarse grain clocks to
>          measure the RTT and trigger the RTO, which imposes a large
>          minimum value on the RTO.  Research suggests that a large
>          minimum RTO is needed to keep TCP conservative and avoid
>          spurious retransmissions [AP99].  Therefore, this
>          specification requires a large minimum RTO as a conservative
>          approach, while at the same time acknowledging that at some
>          future point, research may show that a smaller minimum RTO is
>          acceptable or superior.
> 
>    (2.5) A maximum value MAY be placed on RTO provided it is at least 60
>          seconds.
> 
> Your code doesn't seem to meet requirements of section 2.5 as your
> minimum is 1 second.
> 
> I think if you're trying to solve the bonding issue then you should
> solve that issue, not hack the TCP implementation as that opens it up
> to abuse in other ways.

I think this is rather a new problem, or requirement, that arises
in the combined case of "TCP on a failover-capable network device,"
and it cannot easily be solved by bonding alone.

A notification mechanism from bonding to TCP has been suggested,
but I think it is really hard to do in a virtualized environment
such as Xen.  The hypervisor (Dom-0) takes care of physical devices,
including bonding, and the guests (Dom-U) handle TCP.  Notifying
TCP in Dom-U from bonding in Dom-0 is really a challenge.

My problem (TCP retransmission may not happen within the expected
time frame, e.g., 10 seconds after a bonding failover) still
occurs in such an environment, and my code (capping TCP_RTO_MAX)
still works in a VM environment.
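
To illustrate why the cap matters, here is a minimal sketch (not the
patch itself) of how the retransmission timeout grows under
exponential backoff and where a cap stops it.  The function name
rto_after_backoffs() and the 1-second starting RTO are assumptions
made only for this example; 120 seconds is the compiled-in
TCP_RTO_MAX in Linux, and 10 is the value discussed above.

```c
/*
 * Hypothetical helper, for illustration only: returns the RTO (in
 * seconds) in effect after `backoffs` consecutive timeouts, assuming
 * the RTO doubles on every timeout and is clamped to `rto_max`.
 */
static unsigned int rto_after_backoffs(unsigned int rto_initial,
                                       unsigned int rto_max,
                                       unsigned int backoffs)
{
    /* Never start above the cap. */
    unsigned int rto = rto_initial < rto_max ? rto_initial : rto_max;

    while (backoffs-- > 0 && rto < rto_max) {
        /* Double, but clamp to the cap (and avoid overflow). */
        rto = (rto > rto_max / 2) ? rto_max : rto * 2;
    }
    return rto;
}
```

Under these assumptions, a connection that has already backed off six
times waits 64 seconds before its next retransmission when the cap is
120, but only 10 seconds when the cap is 10, which is the behavior
wanted after a failover.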

So solving this in the TCP layer makes sense to me.

Regards,

-- 
OBATA Noboru ([EMAIL PROTECTED])