From: Andrew Morton <[EMAIL PROTECTED]>
Date: Sun, 31 Jul 2005 15:12:51 -0700

> I've been trying to upgrade kernel from 2.6.12.3 to 2.6.13-rc4 on a
> rather loaded http server, but I'm currently having a kernel panic a few
> minutes only after booting. The bug was reproducible (the crash
> happened after every reboot, with the same backtrace).

The two bug checks there are supposed to be impossible.  I wonder
how this can trigger other than due to some bizarre memory corruption,
but it's too precise a BUG() for it to really be something like that.

The first check is tcp_skb_pcount() being not equal to one.  The
caller of tcp_tso_should_defer() (where the BUG() is triggering)
looks like this:

static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
{
 ...
	tso_segs = tcp_init_tso_segs(sk, skb);
 ...
	while (likely(tcp_snd_wnd_test(tp, skb, mss_now))) {
		BUG_ON(!tso_segs);
 ...
		if (tso_segs == 1) {
 ...
		} else {
			if (tcp_tso_should_defer(sk, tp, skb))
				break;
		}
 ...
		skb = sk->sk_send_head;
		if (!skb)
			break;
		tso_segs = tcp_init_tso_segs(sk, skb);
	}
 ...
}

So tso_segs is _always_ updated to be the tcp_skb_pcount(skb) value,
and due to the "if (tso_segs == 1)" test it can never be "1" when
we get to tcp_tso_should_defer().

That leaves the other branch of the assertion, namely:

	(tp->snd_cwnd <= in_flight)

First, tcp_tso_should_defer() checks for the special case of the
FIN bit being set, which causes us to return early and not get to
the assertion check, like so:

	if (TCP_SKB_CB(skb)->flags & TCPCB_FLAG_FIN)
		return 0;

That is the only exception to the "(tp->snd_cwnd <= in_flight)" rule.

Next, the top level tcp_write_xmit() congestion window tracking
looks like:

static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
{
 ...
	cwnd_quota = tcp_cwnd_test(tp, skb);
	if (unlikely(!cwnd_quota))
		goto out;
 ...
	while (likely(tcp_snd_wnd_test(tp, skb, mss_now))) {
 ...
		if (unlikely(tcp_transmit_skb(sk, skb_clone(skb, GFP_ATOMIC))))
			break;
 ...
		update_send_head(sk, tp, skb);
 ...
		cwnd_quota -= tcp_skb_pcount(skb);
		BUG_ON(cwnd_quota < 0);
		if (!cwnd_quota)
			break;
	}
 ...
}

1) cwnd_quota is initialized to the value:

	(tp->snd_cwnd - tcp_packets_in_flight(tp))

   at the top of tcp_write_xmit(), as long as this value is
   positive, else zero.

2) cwnd_quota is decremented by tcp_skb_pcount(skb) for every
   packet we send.

3) In parallel, tp->packets_out is incremented by
   tcp_skb_pcount(skb) as each packet goes out (via
   update_send_head()).

Therefore, cwnd_quota must decrease exactly as much as
tcp_packets_in_flight(tp) increases.  That should keep
everything in check.

There are no SMP issues, as the socket is fully locked for this
entire code path.

In short, I'm stumped :-)
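
For anyone who wants to poke at the accounting argument in points 1) to 3) outside the kernel, here is a minimal userspace sketch.  It is not the kernel code: struct model_tp and the model_*() functions are hypothetical stand-ins, in-flight is assumed to equal packets_out (nothing SACKed, lost or retransmitted), and each send is simply clamped to the remaining quota, which is a simplification of my own.  It only checks that the quota and (snd_cwnd - in_flight) move in lockstep.

```c
#include <assert.h>

/* Hypothetical, much-reduced stand-in for struct tcp_sock. */
struct model_tp {
	unsigned int snd_cwnd;		/* congestion window, in packets */
	unsigned int packets_out;	/* packets sent and not yet acked */
};

/* Assumption: nothing is SACKed, lost or retransmitted, so the
 * in-flight count is just packets_out.
 */
static unsigned int model_packets_in_flight(const struct model_tp *tp)
{
	return tp->packets_out;
}

/* Point 1: (snd_cwnd - in_flight) if that is positive, else zero. */
static int model_cwnd_test(const struct model_tp *tp)
{
	unsigned int in_flight = model_packets_in_flight(tp);

	return tp->snd_cwnd > in_flight ? (int)(tp->snd_cwnd - in_flight) : 0;
}

/* Walk a queue of n skbs whose segment counts are in pcount[]
 * (each assumed >= 1).  Returns how many skbs were sent.
 */
static int model_write_xmit(struct model_tp *tp, const int *pcount, int n)
{
	int cwnd_quota = model_cwnd_test(tp);
	int sent = 0;

	if (!cwnd_quota)
		return 0;

	while (sent < n) {
		/* Simplification: never send more than the quota allows. */
		int segs = pcount[sent] < cwnd_quota ? pcount[sent] : cwnd_quota;

		tp->packets_out += segs;	/* point 3 */
		cwnd_quota -= segs;		/* point 2 */
		sent++;

		assert(cwnd_quota >= 0);
		/* The quota tracks the remaining window exactly. */
		assert((int)(tp->snd_cwnd - model_packets_in_flight(tp)) ==
		       cwnd_quota);
		if (!cwnd_quota)
			break;
	}
	return sent;
}
```

Calling model_write_xmit() on a model_tp with snd_cwnd = 10 and packets_out = 4, and a queue with segment counts { 2, 3, 4 }, walks the quota 6 -> 4 -> 1 -> 0 without tripping either assert.  Under these assumptions the only way to end up with snd_cwnd <= in_flight while quota remains is for something outside this loop to change snd_cwnd or the in-flight count.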