On Thu, 06 Dec 2007 02:33:46 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote:
> From: "Ilpo_Järvinen" <[EMAIL PROTECTED]>
> Date: Thu, 6 Dec 2007 01:18:28 +0200 (EET)
>
> > On Wed, 5 Dec 2007, David Miller wrote:
> >
> > > I assume you're using something like carefully crafted printk's,
> > > kprobes, or even ad-hoc statistic counters. That's what I used to do
> > > :-)
> >
> > No, that's not at all what I do :-). I usually look at time-seq graphs,
> > except for the cases when I just find things out by reading code (or
> > by just thinking about it).
>
> Can you briefly detail what graph tools and command lines
> you are using?
>
> The last time I did graphing to analyze things, the tools
> were hit-or-miss.
>
> > Much of the info is available in tcpdump already, it's just hard to read
> > without graphing it first because there are so many overlapping things
> > to track in two-dimensional space.
> >
> > ...But yes, I have to admit that a couple of problems come to my mind
> > where having some variable from tcp_sock would have made the problem
> > more obvious.
>
> The most important are the cwnd and ssthresh, which you could guess
> using graphs, but it is important to know on a packet-to-packet
> basis why we might have sent a packet or not, because this has
> rippling effects down the rest of the RTT.
>
> > Not sure what the benefit of having distributions with it is, because
> > those people hardly ever report problems here anyway; they're just too
> > happy with TCP performance unless we print something to their logs,
> > which implies that we must set up a *_ON() condition :-(.
>
> That may be true, but if we could integrate the information with
> tcpdumps, we could gather internal state using tools the user
> already has available.
>
> Imagine if tcpdump printed out:
>
> 02:26:14.865805 IP $SRC > $DEST: . 11226:12686(1460) ack 0 win 108
>         ss_thresh: 129 cwnd: 133 packets_out: 132
>
> or something like that.
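A post-processing script for such annotated output could be quite small. Below is a minimal sketch in Python; the two-line record layout (packet line followed by an indented `ss_thresh: N cwnd: N packets_out: N` line) is an assumption taken from the example in the mail, not an existing tcpdump feature:

```python
import re

# Hypothetical parser for the imagined annotated-tcpdump format quoted
# above. The state-line layout is an assumption from the mail's example.
PKT_RE = re.compile(r'^(\d{2}:\d{2}:\d{2}\.\d+) IP (\S+) > (\S+): ')
STATE_RE = re.compile(r'ss_thresh: (\d+) cwnd: (\d+) packets_out: (\d+)')

def parse_trace(lines):
    """Pair each packet line with its following TCP-state line."""
    events = []
    pending = None
    for line in lines:
        m = PKT_RE.match(line)
        if m:
            pending = {'time': m.group(1), 'src': m.group(2),
                       'dst': m.group(3)}
            continue
        m = STATE_RE.search(line)
        if m and pending is not None:
            pending.update(ssthresh=int(m.group(1)),
                           cwnd=int(m.group(2)),
                           packets_out=int(m.group(3)))
            events.append(pending)
            pending = None
    return events

def cwnd_deltas(events):
    """Per-packet state diffs: how cwnd moved between successive packets."""
    return [(b['time'], b['cwnd'] - a['cwnd'])
            for a, b in zip(events, events[1:])]
```

From the parsed events one could print exactly the kind of "this ACK raised CWND by one" diffs discussed later in the thread, or feed them to a plotting tool for time-seq graphs.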
>
> > Some problems are simply such that things cannot be accurately verified
> > without high processing overhead until it's far too late (e.g. skb bits vs.
> > *_out counters). Maybe we should start to build an expensive state
> > validator as well, which would automatically check invariants of the write
> > queue and tcp_sock in a straightforward, unoptimized manner? That would
> > definitely do a lot of work for us; just ask people to turn it on and it
> > spits out everything that went wrong :-) (unless they really depend on
> > very high-speed things and are therefore unhappy if we scan thousands of
> > packets unnecessarily per ACK :-)). ...Early enough! ...That would work
> > also for distros, but there's always human judgement needed to decide
> > whether the bug reporter will be happy when his TCP processing no
> > longer scales ;-).
>
> I think it's useful as a TCP_DEBUG config option or similar, sure.
>
> But sometimes the algorithms are working as designed; it's just that
> they provide poor pipe utilization, and CWND analysis embedded inside
> of a tcpdump would be one way to see that, as well as determine the
> flaw in the algorithm.
>
> > ...Hopefully you found some of my comments useful.
>
> Very much so, thanks.
>
> I put together a sample implementation anyway just to show the idea,
> against net-2.6.25 below.
>
> It is untested since I didn't write the userland app yet to see that
> proper things get logged. Basically you could run a daemon that
> writes per-connection traces into files based upon the incoming
> netlink events. Later, using the binary pcap file and these traces,
> you can piece together traces like the above, using the timestamps
> etc. to match up pcap packets to ones from the TCP logger.
>
> The userland tools could do analysis and print pre-cooked state-diff
> logs, like "this ACK raised CWND by one" or whatever else you wanted
> to know.
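The timestamp-matching step described above — pairing pcap packets with entries from the proposed netlink TCP logger — could be sketched roughly as follows. The record shapes and the tolerance value are illustrative assumptions; the real format would come from the sample kernel patch and its userland daemon:

```python
import bisect

# Hypothetical merge of pcap packet timestamps with per-connection trace
# entries from the proposed TCP logger daemon. Event records are assumed
# to be dicts with a float 'time' key, sorted by time.
def match_events(pcap_times, logger_events, tolerance=0.001):
    """For each pcap timestamp, find the logger event closest in time
    (within `tolerance` seconds), or None if nothing is close enough."""
    times = [ev['time'] for ev in logger_events]
    matched = []
    for t in pcap_times:
        i = bisect.bisect_left(times, t)
        best = None
        # Check the neighbours on either side of the insertion point.
        for j in (i - 1, i):
            if 0 <= j < len(times) and abs(times[j] - t) <= tolerance:
                if best is None or abs(times[j] - t) < abs(times[best] - t):
                    best = j
        matched.append(logger_events[best] if best is not None else None)
    return matched
```

Clock skew between the pcap capture point and the kernel logger would make an exact-equality match fragile, hence the nearest-within-tolerance approach; sequence numbers carried in both records would be a more robust join key if the logger exported them.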
>
> It's nice that an expert like you can look at graphs and understand,
> but we'd like to create more experts, and besides reading code, one
> way to become an expert is to be able to extract live, real data
> from the kernel's working state and try to understand how things
> got that way. This information is permanently lost currently.

Tools and scripts for testing that generate graphs are at:

        git://git.kernel.org/pub/scm/tcptest/tcptest