On Mon, 12 Nov 2007, Ilpo Järvinen wrote:

> Yeah, it's more likely a miscount somewhere rather than corruption but
> that wasn't obvious from the first mail...
>
> ...but alas, I haven't yet been able to come up with any theory on how
> a miscount could occur....

Cancel that, first idea is presented in this patch (not sure if it's one
that fixes your symptoms, but at least it seems a potential place where
such thing could happen, no idea what events can cause that to occur
though :-():

--
[PATCH] [TCP] FRTO: Plug potential LOST-bit leak

It might be possible that, in some extreme scenario that
I just cannot now construct in my mind, end_seq <=
frto_highmark check does not match causing the lost_out
and LOST bits become out-of-sync due to clearing and
recounting in the loop.

This may fix LOST-bit leak reported by Chazarain Guillaume
<[EMAIL PROTECTED]>.

Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---
 net/ipv4/tcp_input.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 23a0092..cc358d4 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1706,6 +1706,8 @@ static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 	tcp_for_write_queue(skb, sk) {
 		if (skb == tcp_send_head(sk))
 			break;
+
+		TCP_SKB_CB(skb)->sacked &= ~TCPCB_LOST;
 		/*
 		 * Count the retransmission made on RTO correctly (only when
 		 * waiting for the first ACK and did not get it)...
@@ -1719,7 +1721,7 @@ static void tcp_enter_frto_loss(struct sock *sk, int allowed_segments, int flag)
 		} else {
 			if (TCP_SKB_CB(skb)->sacked & TCPCB_RETRANS)
 				tp->undo_marker = 0;
-			TCP_SKB_CB(skb)->sacked &= ~(TCPCB_LOST|TCPCB_SACKED_RETRANS);
+			TCP_SKB_CB(skb)->sacked &= ~TCPCB_SACKED_RETRANS;
 		}
 
 		/* Don't lost mark skbs that were fwd transmitted after RTO */
-- 
1.5.0.6
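
To make the suspected leak easier to picture, here is a toy userspace model of the clear-and-recount pattern. It is not the kernel code: struct seg, F_LOST, recount() and count_lost_bits() are hypothetical stand-ins, the plain <= comparison ignores sequence number wraparound, and the assumption that only the first segment skips the clearing is a simplification of the branch structure visible in the diff.

```c
#define F_LOST 0x1u	/* hypothetical stand-in for TCPCB_LOST */

/* Hypothetical, stripped-down stand-in for one write queue entry. */
struct seg {
	unsigned int end_seq;
	unsigned int flags;
};

/* Count the segments that currently carry the F_LOST bit. */
static int count_lost_bits(const struct seg *q, int n)
{
	int i, c = 0;

	for (i = 0; i < n; i++)
		if (q[i].flags & F_LOST)
			c++;
	return c;
}

/*
 * Rebuild the lost counter from zero and return it.
 *
 * clear_always == 0: the first segment skips the clearing step, which
 *                    models the pre-patch loop (assumption, see lead-in).
 * clear_always == 1: every segment is cleared at the top of the loop,
 *                    which models the patched loop.
 */
static int recount(struct seg *q, int n, unsigned int highmark,
		   int clear_always)
{
	int i, lost_out = 0;

	for (i = 0; i < n; i++) {
		if (clear_always || i != 0)
			q[i].flags &= ~F_LOST;

		/* Re-mark and count only segments at or below highmark. */
		if (q[i].end_seq <= highmark) {
			q[i].flags |= F_LOST;
			lost_out++;
		}
	}
	return lost_out;
}
```

Take a two-segment queue with end_seq values 300 and 400, where the first segment already carries F_LOST, and a highmark of 200. With clear_always set to 0, recount() returns 0 while count_lost_bits() still reports 1, so the counter and the bits disagree. With clear_always set to 1, both come out as 0.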