Stephen Hemminger wrote:
On Fri, 10 Mar 2006 16:16:07 -0800
Rick Jones <[EMAIL PROTECTED]> wrote:
I would have thought that byte based growth of the CWND would have meant
that the ACK's above would have allowed more bytes to flow, yet more
bytes are not flowing. That makes it seem like cwnd isn't strictly
bytes, but is also tracked in terms of number of outstanding segments.
Linux cwnd is in packets.
How is the ABC cwnd of bytes mapped to packets? Does it only go up by
one packet after an MSS has been ACKed then?
/*
 * Linear increase during slow start
 */
void tcp_slow_start(struct tcp_sock *tp)
{
	if (sysctl_tcp_abc) {

ah, so there is a sysctl to turn this off :)

		/* RFC3465: Slow Start
		 * TCP sender SHOULD increase cwnd by the number of
		 * previously unacknowledged bytes ACKed by each incoming
		 * acknowledgment, provided the increase is not more than L
		 */
		if (tp->bytes_acked < tp->mss_cache)
			return;

And it only increases cwnd after a full MSS has been ACKed, which IIRC is
not part of the ABC RFC.

		/* We MAY increase by 2 if discovered delayed ack */
		if (sysctl_tcp_abc > 1 && tp->bytes_acked > 2*tp->mss_cache) {
			if (tp->snd_cwnd < tp->snd_cwnd_clamp)
				tp->snd_cwnd++;
		}
	}
	tp->bytes_acked = 0;

	if (tp->snd_cwnd < tp->snd_cwnd_clamp)
		tp->snd_cwnd++;
}
Think of the congestion window as a measurement of the available sewer pipe.
If everyone's idea of the congestion window is too large, then the sewer pipe
would back up and everything would overflow.
Small packets are like a leaky faucet dripping: just because a drip goes
down the drain doesn't tell you much about the available pipe diameter.
I agree that if I can have five drips outstanding I should not be able
to then put five buckets out there, but should I have to exchange
another 1460 drips before I can have six drips outstanding?
The drips count for nothing as far as congestion is concerned when
we need to count toilet bowls (enough with this analogy)...
I didn't take us here :)
I got the impression that ABC was written with a byte-based cwnd in mind,
not a packet-based one, which makes me wonder if the mapping to a packet
cwnd above is really "correct". I really don't think that ABC or any of the
cwnd stuff meant that to go from five single-byte packets outstanding at
one time to six, one first has to send 1460 single-byte packets. And this
application is caught in the middle of an attempt to map a byte cwnd onto
a packet cwnd.
IIRC, all (well, most of) the RFCs talk about the cwnd in bytes because at
the time VJ did his work (in units of packets/segments), none of the common
stacks (MPE did :) actually knew how many segments they had outstanding
at any one time. So we got the "increase by an MSS on each ACK"
heuristic - it didn't overly penalize bulk transfers. It was a proxy
for tracking segments, and given the existence of the ABC RFC we can
assume not all that good a proxy.
In the original VJ paper, when a packet was known to have left the
network, the stack was free to replace it and add another. The queues
in the network are (as near as I can tell) in units of packets, not in
units of bytes.
But in the code above, it is doing cwnd in packets but being _really_
conservative in a "conservation of packets" sense, by only increasing
the packet cwnd by one after a full MSS has been acked. That is much,
Much, MUCH more conservative than the original heuristic. And much much
more conservative than I think ABC was looking to be.
So, seems we don't want too many packets out there, but we also don't
want too many bytes out there, which seems to mean there needs to be two
cwnds - a packet cwnd and a byte cwnd. The packet cwnd increases based on
knowledge of packets having left the network, the byte cwnd based on how
many bytes have left the network. Then the small packet application can get
its cwnd grown in reasonable time, and still not be able to dump a
boatload of bytes onto the network, and the large packet application
will get its cwnd grown in a reasonable time and still not be able to
generate some massive spike of small packets onto the network.
Admittedly, this specific application is a bad client for the case I'm
trying to make, but if it were properly putting each message to the
transport in one call, while trying to have five of them outstanding at a
time, I get the impression it would be a very long time before it could
get all five outstanding. I guess netperf TCP_RR built with
--enable-burst would be one way to check that.
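For reference, burst mode in netperf is a compile-time option; something along these lines (host name is a placeholder) should approximate the five-outstanding-transactions case:

```shell
# Build netperf with burst-mode support enabled at configure time
./configure --enable-burst && make

# Single-byte request/response, with up to 4 additional transactions
# kept in flight ("testhost" is illustrative)
netperf -t TCP_RR -H testhost -- -r 1,1 -b 4
```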
The other problem this application has is that by the time it builds up
enough bytes acked to open the congestion window, it goes back to sleep for
a long enough time for the window to be restarted.
Figures. Can't say as I've ever really liked restarting slow start
after idle to begin with. But that would be an entirely different discussion.
rick