Linux TCP currently uses an initial congestion window of 1 packet after multiple SYN or SYNACK timeouts, per RFC6298. However, such timeouts are often spurious on wireless or cellular networks that experience high delay variance (e.g. ramping up dormant radios or local link retransmission). Another case is when the round-trip time of the underlying path is longer than the default SYN timeout (e.g. 1 second). In these cases, starting the transfer with a minimal congestion window hurts the performance of short flows.

One naive approach is to simply ignore SYN or SYNACK timeouts and always use a larger or default initial window. This approach, however, risks pouring gas on the fire when the network is already highly congested. This is particularly true in data centers, where applications could start thousands to millions of connections over a single or multiple hosts, resulting in high SYN drops (e.g. incast).

This patch-set detects spurious SYN and SYNACK timeouts upon completing the handshake via the widely-supported TCP timestamp options. Upon such events the sender reverts to the default initial window to start the data transfer, so it gets the best of both worlds. This patch-set supports this feature for both active and passive as well as Fast Open or regular connections.

Yuchung Cheng (8):
  tcp: avoid unconditional congestion window undo on SYN retransmit
  tcp: undo initial congestion window on false SYN timeout
  tcp: better SYNACK sent timestamp
  tcp: undo init congestion window on false SYNACK timeout
  tcp: lower congestion window on Fast Open SYNACK timeout
  tcp: undo cwnd on Fast Open spurious SYNACK retransmit
  tcp: refactor to consolidate TFO passive open code
  tcp: refactor setting the initial congestion window

 net/ipv4/tcp.c           | 12 -----
 net/ipv4/tcp_input.c     | 99 +++++++++++++++++++++++++++++-----------
 net/ipv4/tcp_metrics.c   | 10 ----
 net/ipv4/tcp_minisocks.c |  5 ++
 net/ipv4/tcp_output.c    |  4 ++
 net/ipv4/tcp_timer.c     |  3 ++
 6 files changed, 84 insertions(+), 49 deletions(-)

-- 
2.21.0.593.g511ec345e18-goog
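
For illustration only, the timestamp-based detection described above can be sketched in userspace C. This is not the code from these patches. The struct, the function name initial_cwnd() and the two constants are hypothetical, and the sketch assumes that every retransmitted SYN carries a later TSval than the original, so an echoed value equal to the original TSval means the peer answered the first SYN and the timeout was spurious.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for this sketch, in packets. */
#define INIT_CWND_DEFAULT 10u /* default initial congestion window */
#define INIT_CWND_TIMEOUT 1u  /* window used after repeated handshake timeouts */

/* Hypothetical handshake bookkeeping; not a kernel structure. */
struct handshake_state {
	uint32_t first_syn_tsval;  /* TSval sent in the original SYN */
	unsigned int syn_timeouts; /* number of SYN timeouts so far */
};

/*
 * Pick the initial congestion window once the handshake completes.
 *
 * saw_tstamp: the SYNACK carried a timestamp option
 * tsecr:      the timestamp echo value from that option
 */
static uint32_t initial_cwnd(const struct handshake_state *hs,
			     bool saw_tstamp, uint32_t tsecr)
{
	bool spurious;

	assert(hs != NULL);

	/* The echo matches the original SYN, so that SYN was not lost. */
	spurious = hs->syn_timeouts > 0 && saw_tstamp &&
		   tsecr == hs->first_syn_tsval;

	/* Fall back to the minimal window only after multiple genuine timeouts. */
	if (hs->syn_timeouts > 1 && !spurious)
		return INIT_CWND_TIMEOUT;

	return INIT_CWND_DEFAULT;
}
```

With two recorded timeouts, an echo equal to first_syn_tsval yields the default window, while a different echo or a missing timestamp option yields the 1-packet window. The same comparison would apply on the passive side using the TSval of the original SYNACK.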