Hi Ilpo,

Thanks for your reply. 
Most of my response are in the reply to patch 1/2.

From: "Ilpo Järvinen" <[EMAIL PROTECTED]>
Subject: Re: [RFC][PATCH 2/2] TCP: skip processing cached SACK blocks
Date: Fri, 5 Oct 2007 13:37:21 +0300 (EEST)

> On Thu, 4 Oct 2007, TAKANO Ryousei wrote:
> 
> > This patch allows to process only newly reported SACK blocks at the
> > sender side. An ACK packet contains up to three SACK blocks, and some
> 
>   "A SACK option that specifies n blocks will have a length of 8*n+2
>    bytes, so the 40 bytes available for TCP options can specify a
>    maximum of 4 blocks.  It is expected that SACK will often be used in
>    conjunction with the Timestamp option used for RTTM [Jacobson92],
>    which takes an additional 10 bytes (plus two bytes of padding); thus
>    a maximum of 3 SACK blocks will be allowed in this case." [RFC2018]
> 
> :-)
> 
Yes, indeed:-)

> > of them may be already reported and processed blocks.  This patch 
> > prevents processing of such already processed SACK blocks.
> > 
> > Signed-off-by: Ryousei Takano <[EMAIL PROTECTED]>
> > Signed-off-by: Yuetsu Kodama <[EMAIL PROTECTED]>
> > ---
> >  net/ipv4/tcp_input.c |   24 ++++++++++++++++++++++++
> >  1 files changed, 24 insertions(+), 0 deletions(-)
> > 
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index bbad2cd..9615fc9 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -978,6 +978,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff 
> > *ack_skb, u32 prior_snd_
> >     int cached_fack_count;
> >     int i;
> >     int first_sack_index;
> > +   u8 sack_block_skip[4] = {0,0,0,0};
> >  
> >     if (!tp->sacked_out)
> >             tp->fackets_out = 0;
> > @@ -1012,6 +1013,21 @@ tcp_sacktag_write_queue(struct sock *sk, struct 
> > sk_buff *ack_skb, u32 prior_snd_
> >     if (before(TCP_SKB_CB(ack_skb)->ack_seq, prior_snd_una - 
> > tp->max_window))
> >             return 0;
> >  
> > +   /* Skip processing cached SACK blocks. */
> > +   for (i = 0; i < num_sacks; i++) {
> > +           __be32 start_seq = sp[i].start_seq;
> > +           __be32 end_seq = sp[i].end_seq;
> > +           int j;
> > +
> > +           for (j = 0; j < ARRAY_SIZE(tp->recv_sack_cache); j++) {
> > +                   if ((tp->recv_sack_cache[j].start_seq == start_seq) &&
> > +                       (tp->recv_sack_cache[j].end_seq == end_seq)) {
> > +                           sack_block_skip[i] = 1;
> > +                           break;
> > +                   }
> > +           }
> > +   }
> > +
> 
> I'm somewhat against adding more and more special cases to sacktag, 
> there's still need for more special cases after this one to avoid very 
> expensive processing (I guess they just won't occur in your scenario)!
> ...I would rather remove whole special case mess of the fastpath and
> have a more generic solution (see the patch I point into in the reply
> to patch 1/2)...
> 
> >     /* SACK fastpath:
> >      * if the only SACK change is the increase of the end_seq of
> >      * the first block then only apply that SACK block
> > @@ -1051,11 +1067,16 @@ tcp_sacktag_write_queue(struct sock *sk, struct 
> > sk_buff *ack_skb, u32 prior_snd_
> >                             if (after(ntohl(sp[j].start_seq),
> >                                       ntohl(sp[j+1].start_seq))){
> >                                     struct tcp_sack_block_wire tmp;
> > +                                   u8 sbtmp;
> >  
> >                                     tmp = sp[j];
> >                                     sp[j] = sp[j+1];
> >                                     sp[j+1] = tmp;
> >  
> > +                                   sbtmp = sack_block_skip[j];
> > +                                   sack_block_skip[j] = 
> > sack_block_skip[j+1];
> > +                                   sack_block_skip[j+1] = sbtmp;
> > +
> >                                     /* Track where the first SACK block 
> > goes to */
> >                                     if (j == first_sack_index)
> >                                             first_sack_index = j+1;
> > @@ -1083,6 +1104,9 @@ tcp_sacktag_write_queue(struct sock *sk, struct 
> > sk_buff *ack_skb, u32 prior_snd_
> >             int fack_count;
> >             int dup_sack = (found_dup_sack && (i == first_sack_index));
> >  
> > +           if (sack_block_skip[i])
> 
> DSACKs must always be processed, so please add:
> 
> && !dup_sack
> 
I did not notice DSACKs. Thanks.

> > +                   continue;
> 
> By doing this skipping here, you actually end up crippling lost_retrans 
> detection even more than it was broken before. ...You probably didn't just 
> notice that during tests because of unrelated suboptimal behavior (in 
> fastpath_skb_hint handling). ...Anyway, correctness of this should be 
> evaluated against the fixed lost_retrans, rather than the already 
> broken one.
> 
You can find the result that the average goodput slightly improves
against the only PATCH #1 (fixed lost_retrans) applied kernel.
Sorry, our web page is down this weekend for the power outage.

http://projects.gtrc.aist.go.jp/gnet/sack-bug.html

> > +
> >             skb = cached_skb;
> >             fack_count = cached_fack_count;
> 
> Other than what's noted above:
> 
> Acked-by: Ilpo Järvinen <[EMAIL PROTECTED]>
> 
> 
> -- 
>  i.

Regards,
Ryousei Takano
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to