Eric Dumazet wrote on Fri, Apr 13, 2018:
> That might be caused by some TS val/ecr breakage :
> 
> Many acks were received by the server tcpdump,
> but none of them was accepted by TCP stack, for some reason.
> 
> Try to disable TCP timestamps, it will give some hint if bug does not 
> reproduce.

This was spot on, after disabling tcp timestamps I cannot reproduce the
hang anymore.

I've had another look at the original sequence (as seen by the server)
and I don't see much wrong; tell me what I missed:
 - the replayed packet has seq 32004:33378, so the first ignored ack
would be the one with ack 33378, is that right? (meaning the server did
accept the one for 32004 and none after that)

Assuming it is, excerpt from around then (first emission of that packet then
client replies):
16:49:26.700531 IP <server>.13317 > <client>.31872: Flags [.], seq 32004:33378, 
ack 4190, win 307, options [nop,nop,TS val 1313937607 ecr 1617129440], length 
1374
...
16:49:26.728084 IP <client>.31872 > <server>.13317: Flags [.], ack 32004, win 
759, options [nop,nop,TS val 1617129473 ecr 1313937602], length 0
...
16:49:26.729531 IP <client>.31872 > <server>.13317: Flags [.], ack 33378, win 
782, options [nop,nop,TS val 1617129475 ecr 1313937607], length 0
16:49:26.730002 IP <client>.31872 > <server>.13317: Flags [.], ack 34752, win 
805, options [nop,nop,TS val 1617129475 ecr 1313937607], length 0
...
16:49:26.731634 IP <client>.31872 > <server>.13317: Flags [.], ack 36126, win 
827, options [nop,nop,TS val 1617129476 ecr 1313937607], length 0


 - the ecr value matches the val of the packet it acks
 - the val is >= that of previous packet (won't be considered
reorder/should pass paws check?)
 - even if the packets are processed in parallel and some kind of race
occurs, a "bigger" ack should ack all the previous packets, right?

 - Just to make sure, I checked /proc/net/netstat for PAWSEstab but that
is 0:
TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed EmbryonicRsts 
PruneCalled RcvPruned OfoPruned OutOfWindowIcmps LockDroppedIcmps ArpFilter TW 
TWRecycled TWKilled PAWSActive PAWSEstab DelayedACKs DelayedACKLocked 
DelayedACKLost ListenOverflows ListenDrops TCPHPHits TCPPureAcks TCPHPAcks 
TCPRenoRecovery TCPSackRecovery TCPSACKReneging TCPFACKReorder TCPSACKReorder 
TCPRenoReorder TCPTSReorder TCPFullUndo TCPPartialUndo TCPDSACKUndo TCPLossUndo 
TCPLostRetransmit TCPRenoFailures TCPSackFailures TCPLossFailures 
TCPFastRetrans TCPSlowStartRetrans TCPTimeouts TCPLossProbes 
TCPLossProbeRecovery TCPRenoRecoveryFail TCPSackRecoveryFail TCPRcvCollapsed 
TCPDSACKOldSent TCPDSACKOfoSent TCPDSACKRecv TCPDSACKOfoRecv TCPAbortOnData 
TCPAbortOnClose TCPAbortOnMemory TCPAbortOnTimeout TCPAbortOnLinger 
TCPAbortFailed TCPMemoryPressures TCPMemoryPressuresChrono TCPSACKDiscard 
TCPDSACKIgnoredOld TCPDSACKIgnoredNoUndo TCPSpuriousRTOs TCPMD5NotFound 
TCPMD5Unexpected TCPMD5Failure TCPSackShifted TCPSackMerged 
TCPSackShiftFallback TCPBacklogDrop PFMemallocDrop TCPMinTTLDrop 
TCPDeferAcceptDrop IPReversePathFilter TCPTimeWaitOverflow TCPReqQFullDoCookies 
TCPReqQFullDrop TCPRetransFail TCPRcvCoalesce TCPOFOQueue TCPOFODrop 
TCPOFOMerge TCPChallengeACK TCPSYNChallenge TCPFastOpenActive 
TCPFastOpenActiveFail TCPFastOpenPassive TCPFastOpenPassiveFail 
TCPFastOpenListenOverflow TCPFastOpenCookieReqd TCPFastOpenBlackhole 
TCPSpuriousRtxHostQueues BusyPollRxPackets TCPAutoCorking TCPFromZeroWindowAdv 
TCPToZeroWindowAdv TCPWantZeroWindowAdv TCPSynRetrans TCPOrigDataSent 
TCPHystartTrainDetect TCPHystartTrainCwnd TCPHystartDelayDetect 
TCPHystartDelayCwnd TCPACKSkippedSynRecv TCPACKSkippedPAWS TCPACKSkippedSeq 
TCPACKSkippedFinWait2 TCPACKSkippedTimeWait TCPACKSkippedChallenge TCPWinProbe 
TCPKeepAlive TCPMTUPFail TCPMTUPSuccess
TcpExt: 0 0 0 0 58 0 0 26 0 0 50 0 0 0 0 75402 17 201 0 0 6876848 59804 2258387 
0 33 0 0 3 0 0 0 0 1 102 15 0 0 0 1306 60 386 292 10 0 0 108750 201 1 228 1 8 4 
0 63 0 3 0 0 0 0 107 1 0 0 0 2834 1962 622 0 0 0 0 0 0 0 0 0 1065022 54160 0 1 
3 3 0 0 0 0 0 0 0 475 0 9578 6 8 71 257 5116325 0 0 0 0 0 0 6 0 0 0 61 85 0 0
IpExt: InNoRoutes InTruncatedPkts InMcastPkts OutMcastPkts InBcastPkts 
OutBcastPkts InOctets OutOctets InMcastOctets OutMcastOctets InBcastOctets 
OutBcastOctets InCsumErrors InNoECTPkts InECT1Pkts InECT0Pkts InCEPkts
IpExt: 0 0 0 0 206 0 16405602866 8921427728 0 0 57928 0 0 16121410 0 5388 0

Also, here are the per-socket stats I could find (ss -i after having
reproduced hang):
         reno wscale:7,7 rto:7456 backoff:5 rtt:32.924/1.41 ato:40 mss:1374
         pmtu:1500 rcvmss:1248 advmss:1448 cwnd:1 ssthresh:16
         bytes_acked:32004 bytes_received:4189 segs_out:85 segs_in:54
         data_segs_out:78 data_segs_in:18 send 333.9Kbps lastsnd:3912
         lastrcv:11464 lastack:11387 pacing_rate 21.4Mbps delivery_rate
         3.5Mbps busy:12188ms unacked:33 retrans:1/5 lost:33 rcv_rtt:37
         rcv_space:29200 rcv_ssthresh:39184 notsent:28796 minrtt:24.986


Here are the same stats with tcp timestamp disabled (after running my
reproducer, e.g. outputing a big chunk of text quickly):
         reno wscale:7,7 rto:228 rtt:27.267/1.423 ato:40 mss:1386 pmtu:1500
         rcvmss:1248 advmss:1460 cwnd:10 ssthresh:12 bytes_acked:17311070
         bytes_received:40445 segs_out:13331 segs_in:7523
         data_segs_out:13279 data_segs_in:947 send 4.1Mbps lastsnd:5
         lastrcv:5 lastack:5 pacing_rate 4.9Mbps delivery_rate 431.4Kbps
         app_limited busy:36064ms unacked:1 retrans:0/6 rcv_rtt:9112.95
         rcv_space:29233 rcv_ssthresh:39184 minrtt:25.566

So I guess lost:33 matching unacked:33 might be another hint?


I'll need a bit more time reading the code to understand what this all
implies ; feel free to beat me to it.

Thanks,
-- 
Dominique Martinet | Asmadeus

Reply via email to