From: Andrew Morton <[EMAIL PROTECTED]>
Date: Sun, 31 Jul 2005 15:12:51 -0700
> I've been trying to upgrade kernel from 2.6.12.3 to 2.6.13-rc4 on a
> rather loaded http server, but i'm currently having a kernel panic a few
> minutes only after booting. The bug was reproducible (the crash
> happened after every reboot, with the same backtrace).
The two bug checks there are supposed to be impossible.
I wonder how this can trigger other than through some bizarre
memory corruption, but it's too precise a BUG() for it
to really be something like that.
The first check asserts that tcp_skb_pcount() is not equal to one.
The caller of tcp_tso_should_defer() (where the BUG() is
triggering) looks like this:
static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
{
	...
	tso_segs = tcp_init_tso_segs(sk, skb);
	...
	while (likely(tcp_snd_wnd_test(tp, skb, mss_now))) {
		BUG_ON(!tso_segs);
		...
		if (tso_segs == 1) {
			...
		} else {
			if (tcp_tso_should_defer(sk, tp, skb))
				break;
		}
		...
		skb = sk->sk_send_head;
		if (!skb)
			break;
		tso_segs = tcp_init_tso_segs(sk, skb);
	}
	...
}
So tso_segs is _always_ updated to be the tcp_skb_pcount(skb)
value, and due to the "if (tso_segs == 1)" test it can never
be "1" when we get to tcp_tso_should_defer().
That leaves the other branch of the assertion, namely:
(tp->snd_cwnd <= in_flight)
First, tcp_tso_should_defer() checks for the special case
of the FIN bit being set, which causes us to return early
and not get to the assertion check, like so:
	if (TCP_SKB_CB(skb)->flags & TCPCB_FLAG_FIN)
		return 0;
That is the only exception to the "(tp->snd_cwnd <= in_flight)"
rule.
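To make the two halves of the assertion concrete, here is a small
userspace sketch of the predicate as I read it. It is not the kernel
code: model_defer_bug_fires() and its parameters are made-up names, and
it assumes the first condition trips when the segment count is one (or
less) and that in_flight is already computed by the caller.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the assertion in
 * tcp_tso_should_defer(); names and signature are illustrative only.
 *
 * Returns true when the modelled BUG() would fire.
 */
static bool model_defer_bug_fires(unsigned int pcount, bool fin,
				  unsigned int snd_cwnd,
				  unsigned int in_flight)
{
	/* FIN frames return early and never reach the assertion. */
	if (fin)
		return false;

	/*
	 * Assumed shape of the check: a single-segment frame, or a
	 * congestion window that is already full, is "impossible" here.
	 */
	return pcount <= 1 || snd_cwnd <= in_flight;
}
```

With this model, a three-segment frame with snd_cwnd = 10 and
in_flight = 4 passes, while the same frame with in_flight = 10 would
trip the check.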
Next, the top level tcp_write_xmit() congestion window tracking
looks like:
static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
{
	...
	cwnd_quota = tcp_cwnd_test(tp, skb);
	if (unlikely(!cwnd_quota))
		goto out;
	...
	while (likely(tcp_snd_wnd_test(tp, skb, mss_now))) {
		...
		if (unlikely(tcp_transmit_skb(sk, skb_clone(skb, GFP_ATOMIC))))
			break;
		...
		update_send_head(sk, tp, skb);
		...
		cwnd_quota -= tcp_skb_pcount(skb);
		BUG_ON(cwnd_quota < 0);
		if (!cwnd_quota)
			break;
	}
	...
}
1) cwnd_quota is initialized to the value:
(tp->snd_cwnd - tcp_packets_in_flight(tp))
at the top of tcp_write_xmit(), as long as this value
is positive, else zero.
2) cwnd_quota is decremented by tcp_skb_pcount(skb) for every
packet we send.
3) In parallel, tp->packets_out is incremented by tcp_skb_pcount(skb)
as each packet goes out (via update_send_head()).
So cwnd_quota must decrease by exactly as much as
tcp_packets_in_flight(tp) increases, which should keep the two
in step.
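The accounting in steps 1) to 3) can be sketched as a toy userspace
model. This is only an illustration under stated assumptions: struct
model_tp and the model_* functions are invented names, packets_out
stands in for tcp_packets_in_flight() (i.e. the other terms of the
in-flight count are assumed not to change during the loop), and the
model simply stops when a frame would exceed the remaining quota, since
how the real code avoids that is outside the excerpt above.

```c
#include <assert.h>

/* Hypothetical stand-in for the few tcp_sock fields discussed above. */
struct model_tp {
	unsigned int snd_cwnd;		/* congestion window, in packets */
	unsigned int packets_out;	/* modelled packets in flight */
};

/* Step 1: quota is (snd_cwnd - in_flight) if positive, else zero. */
static unsigned int model_cwnd_quota(const struct model_tp *tp)
{
	if (tp->packets_out >= tp->snd_cwnd)
		return 0;
	return tp->snd_cwnd - tp->packets_out;
}

/*
 * Send up to n frames, where pcounts[i] is the segment count of
 * frame i. Returns the number of frames sent.
 */
static int model_write_xmit(struct model_tp *tp,
			    const unsigned int *pcounts, int n)
{
	unsigned int quota = model_cwnd_quota(tp);
	int sent = 0;

	if (!quota)
		return 0;

	while (sent < n) {
		unsigned int pcount = pcounts[sent];

		/* Model-only guard: never let the quota go negative. */
		if (pcount > quota)
			break;

		tp->packets_out += pcount;	/* step 3 */
		quota -= pcount;		/* step 2 */
		sent++;

		/* The two counters must move in lockstep. */
		assert(quota == tp->snd_cwnd - tp->packets_out);

		if (!quota)
			break;
	}
	return sent;
}
```

Starting from snd_cwnd = 10 and packets_out = 4, frames of 2, 3 and 1
segments consume the quota of 6 exactly, and a fourth frame is never
attempted. As long as nothing else touches either counter inside the
loop, the in-loop assertion cannot fire in this model.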
There are no SMP issues as the socket is fully locked for this
entire code path.
In short, I'm stumped :-)