Hi Eric!

Thanks for the direction. I tried packetdrill locally (with the same kernel, Linux 3.18.5, to start with) using the following script, and it doesn’t show the problem I mentioned: here the fast retransmit happens after getting the dupack. It would be good to get some information from the calls in the TCP stack (I have some printk there), but at the moment I don’t know how to get that when using packetdrill.
\ Mohammad

// Establish a connection.
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 setsockopt(3, SOL_SOCKET, TCP_NODELAY, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <...>
+.03 < . 1:1(0) ack 1 win 257
+0 accept(3, ..., ...) = 4

// Send 1 data segment and get an ACK with DATA
+0 write(4, ..., 1000) = 1000
+0 > P. 1:1001(1000) ack 1
+.03 < P. 1:11(10) ack 1001 win 257
+0 > . 1001:1001(0) ack 11
//+0.1 read(3,...,1000)=10

+0 write(4, ..., 1000) = 1000
+0 > P. 1001:2001(1000) ack 11
+.03 < P. 11:21(10) ack 2001 win 257
//+0 > . 2001:2001(0) ack 21

+0 write(4, ..., 1000) = 1000
+0 > P. 2001:3001(1000) ack 21
+.03 < P. 21:31(10) ack 3001 win 257

+0 write(4, ..., 1000) = 1000
+0 > P. 3001:4001(1000) ack 31

+0.2 write(4, ..., 1000) = 1000
+0 > P. 4001:5001(1000) ack 31
+0.03 > P. 4001:5001(1000) ack 31
+0.04 < P. 21:31(10) ack 3001 win 257
+0 > . 5001:5001(0) ack 31 <nop,nop,sack 21:31>
+0.03 < . 31:31(0) ack 3001 win 257 <nop,nop,sack 4001:5001>
+0.003 < . 31:31(0) ack 3001 win 257 <nop,nop,sack 4001:5001>
+0.006 > P. 3001:4001(1000) ack 31

1 0.000000 192.0.2.1 -> 192.168.0.1 TCP 68 60262 > http-alt [SYN] Seq=0 Win=32792 Len=0 MSS=1000 SACK_PERM=1 WS=128
2 0.000068 192.168.0.1 -> 192.0.2.1 TCP 68 http-alt > 60262 [SYN, ACK] Seq=0 Ack=1 Win=29200 Len=0 MSS=1460 SACK_PERM=1 WS=512
3 0.030294 192.0.2.1 -> 192.168.0.1 TCP 56 60262 > http-alt [ACK] Seq=1 Ack=1 Win=32896 Len=0
4 0.030370 192.168.0.1 -> 192.0.2.1 TCP 1056 [TCP segment of a reassembled PDU]
5 0.060474 192.0.2.1 -> 192.168.0.1 TCP 66 [TCP segment of a reassembled PDU]
6 0.060507 192.168.0.1 -> 192.0.2.1 TCP 56 http-alt > 60262 [ACK] Seq=1001 Ack=1 Win=29696 Len=0
7 0.060670 192.168.0.1 -> 192.0.2.1 TCP 1056 [TCP segment of a reassembled PDU]
8 0.090766 192.0.2.1 -> 192.168.0.1 TCP 66 60262 > http-alt [PSH, ACK] Seq=1 Ack=2001 Win=32896 Len=10
9 0.090809 192.168.0.1 -> 192.0.2.1 TCP 1056 [TCP segment of a reassembled PDU]
10 0.120984 192.0.2.1 -> 192.168.0.1 TCP 66 [TCP segment of a reassembled PDU]
11 0.121026 192.168.0.1 -> 192.0.2.1 TCP 1056 [TCP segment of a reassembled PDU]
12 0.321111 192.168.0.1 -> 192.0.2.1 TCP 1056 [TCP segment of a reassembled PDU]
13 0.351588 192.168.0.1 -> 192.0.2.1 TCP 1056 [TCP Retransmission] [TCP segment of a reassembled PDU]
14 0.391668 192.0.2.1 -> 192.168.0.1 TCP 66 [TCP Retransmission] [TCP segment of a reassembled PDU]
15 0.391699 192.168.0.1 -> 192.0.2.1 TCP 68 [TCP Dup ACK 13#1] http-alt > 60262 [ACK] Seq=5001 Ack=21 Win=29696 Len=0 SLE=11 SRE=21
16 0.421888 192.0.2.1 -> 192.168.0.1 TCP 68 [TCP Dup ACK 14#1] 60262 > http-alt [ACK] Seq=21 Ack=3001 Win=32896 Len=0 SLE=4001 SRE=5001
17 0.424964 192.0.2.1 -> 192.168.0.1 TCP 68 [TCP Dup ACK 14#2] 60262 > http-alt [ACK] Seq=21 Ack=3001 Win=32896 Len=0 SLE=4001 SRE=5001
18 0.431597 192.168.0.1 -> 192.0.2.1 TCP 1056 [TCP Fast Retransmission] [TCP segment of a reassembled PDU]

> On 01 Sep 2015, at 14:31, Eric Dumazet <eric.duma...@gmail.com> wrote:
>
> On Tue, 2015-09-01 at 11:36 +0200, Mohammad Rajiullah wrote:
>> Hi!
>>
>> While measuring TLP’s performance for an online gaming scenario, where both
>> the client and the server send data, TLP shows unexpected loss recovery in
>> Linux 3.18.5 kernel. Early retransmit fails in response to the dupack which
>> is later resolved using RTO. I found the behaviour consistent during the
>> whole measurement period.
>> Following is an excerpt from the tcpdump traces (taken at the server)
>> showing the behaviour:
>>
>> 0.733965 Client -> Server HTTP 431 POST /Scores HTTP/1.1
>> 0.738355 Server -> Client HTTP 407 HTTP/1.1 200 OK
>> 0.985346 Server -> Client TCP 68 [TCP segment of a reassembled PDU]
>> 0.993322 Client -> Server HTTP 431 [TCP Retransmission] POST /Scores HTTP/1.1
>> 0.993352 Server -> Client TCP 78 [TCP Dup ACK 2339#1] 8081→45451 [ACK] Seq=186995 Ack=230031 Len=0 SLE=229666 SRE=230031
>> 1.089327 Server -> Client TCP 68 [TCP Retransmission] 8081→45451 [PSH, ACK] Seq=186993 Ack=230031 Len=2
>> 1.294816 Client -> Server TCP 78 [TCP Dup ACK 2340#1] 45451→8081 [ACK] Seq=230031 Ack=186652 Len=0 SLE=186993 SRE=186995
>> 1.295018 Client -> Server TCP 86 [TCP Dup ACK 2340#2] 45451→8081 [ACK] Seq=230031 Ack=186652 Len=0 SLE=186993 SRE=186995 SLE=186993 SRE=186995
>> 1.541328 Server -> Client HTTP 407 [TCP Retransmission] HTTP/1.1 200 OK
>>
>> From some kernel debug info (using printk ..) it appears that for some
>> reason although the incoming dupack starts the early retransmit delay
>> timer, it never expires. The above measurement was taken in a wireless
>> environment. I also recreated the scenario in a wired network with
>> synthetic traffic to have regular RTTs. The behaviour remains the same.
>>
>> 0.287241 Client -> Server TCP 316 58148 > colubris [PSH, ACK] Seq=251 Ack=501 Win=31744 Len=250 TSval=98871521 TSecr=98865126
>> 0.287278 Server -> Client TCP 316 colubris > 58148 [PSH, ACK] Seq=501 Ack=501 Win=31232 Len=250 TSval=98865134 TSecr=98871521
>> 0.515351 Server -> Client TCP 316 colubris > 58148 [PSH, ACK] Seq=751 Ack=501 Win=31232 Len=250 TSval=98865191 TSecr=98871521
>> 0.518003 Client -> Server TCP 316 [TCP Retransmission] 58148 > colubris [PSH, ACK] Seq=251 Ack=501 Win=31744 Len=250 TSval=98871579 TSecr=98865126
>> 0.518021 Server -> Client TCP 78 [TCP Dup ACK 12#1] colubris > 58148 [ACK] Seq=1001 Ack=501 Win=31232 Len=0 TSval=98865191 TSecr=98871579 SLE=251 SRE=501
>> 0.518798 Server -> Client TCP 316 [TCP Retransmission] colubris > 58148 [PSH, ACK] Seq=751 Ack=501 Win=31232 Len=250 TSval=98865192 TSecr=98871579
>> 0.544700 Client -> Server TCP 78 [TCP Window Update] 58148 > colubris [ACK] Seq=501 Ack=501 Win=32768 Len=0 TSval=98871585 TSecr=98865126 SLE=751 SRE=1001
>> 0.549653 Client -> Server TCP 86 [TCP Dup ACK 16#1] 58148 > colubris [ACK] Seq=501 Ack=501 Win=32768 Len=0 TSval=98871586 TSecr=98865126 SLE=751 SRE=1001 SLE=751 SRE=1001
>> 0.778802 Server -> Client TCP 316 [TCP Retransmission] colubris > 58148 [PSH, ACK] Seq=501 Ack=501 Win=31232 Len=250 TSval=98865257 TSecr=98871586
>>
>
> Hello Mohammad
>
> It would be nice you reproduce the problem with packetdrill and possibly
> using a more recent kernel.
>
> Having a packetdrill test is easier to demonstrate the problem and
> testing a fix if needed.
>
> Thanks !
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
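
For comparing the three traces in this thread, a rough sketch of the arithmetic may help: it measures how long the sender waited between the first ACK that tshark flags as a Dup ACK for the missing segment and the sender's next retransmission of that segment. The timestamps are copied from the traces above; the dictionary labels, the function names, and the choice of which two packets to pair in each trace are assumptions made for illustration, not something stated in the thread.

```python
# Timestamps (seconds) copied from the traces quoted in this thread.
# Each pair is (first ACK flagged "Dup ACK" that reaches the data sender,
#               the data sender's next retransmission of the missing segment).
# The labels and the pairing of packets are illustrative assumptions.
TRACES = {
    "packetdrill, local": (0.421888, 0.431597),
    "wireless, original report": (1.294816, 1.541328),
    "wired, synthetic traffic": (0.549653, 0.778802),
}


def recovery_delay_ms(dupack_time, retransmit_time):
    """Return the gap between dupack and retransmission in milliseconds."""
    return round((retransmit_time - dupack_time) * 1000.0, 3)


def summarize(traces):
    """Map each trace label to its dupack-to-retransmission delay in ms."""
    return {name: recovery_delay_ms(*pair) for name, pair in traces.items()}


if __name__ == "__main__":
    for name, delay in summarize(TRACES).items():
        print(f"{name}: {delay} ms")
```

Under these assumptions the local packetdrill run recovers within roughly 10 ms of the dupack, while both reported traces wait roughly 230 to 250 ms, which is the gap the original report attributes to RTO-based recovery.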