http://eurosys.org/blog/?p=246


Thursday 2009-08-20: SIGCOMM Conference

Session 9: Performance Optimization (Chair: Ratul Mahajan, Microsoft Research)
———————————————————-
Safe and Effective Fine-grained TCP Retransmissions for Datacenter Communication
Vijay Vasudevan (Carnegie Mellon University), Amar Phanishayee (Carnegie Mellon University), Hiral Shah (Carnegie Mellon University), Elie Krevat (Carnegie Mellon University), David Andersen (Carnegie Mellon University), Greg Ganger (Carnegie Mellon University), Garth Gibson (Carnegie Mellon University and Panasas, Inc), Brian Mueller (Panasas, Inc.)
———————————————————-
TCP has a problem in data centers: a dropped packet can take 200ms to be retransmitted, because the minimum retransmission timeout (RTO) is 200ms

Some apps cannot tolerate that delay

Solution: enable microsecond-granularity retransmissions
- improves throughput/latency in the datacenter
- safe for the wide area

Datacenter environment: RTTs of 10-100 microseconds, links of 1-10Gbps
Under heavy load, pkt loss is common

A single TCP timeout lasts 1000s of times longer than the RTT
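To make the ratio concrete, here is a small sketch that divides the 200ms timeout by the 10-100 microsecond RTTs quoted above. The function name is made up for illustration; only the numbers come from the notes.

```python
# Ratio of a 200 ms retransmission timeout to typical datacenter RTTs.
RTO_US = 200_000  # 200 ms expressed in microseconds (figure from the notes)


def timeout_to_rtt_ratio(rtt_us: float, rto_us: float = RTO_US) -> float:
    """Return how many round-trip times fit into one retransmission timeout."""
    return rto_us / rtt_us


if __name__ == "__main__":
    for rtt_us in (10, 100):
        ratio = timeout_to_rtt_ratio(rtt_us)
        print(f"RTT {rtt_us:>3} us -> one timeout is {ratio:,.0f}x the RTT")
```

With these inputs one timeout costs between 2,000 and 20,000 round trips of idle time.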

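The scenario involves the client sending a single request packet once in a while. This is contrary to a TCP design assumption: a full window of packets in flight. With too few packets following the lost one, the receiver does not generate enough duplicate ACKs, so fast retransmit is not triggered on pkt loss and the sender has to wait for the timeout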
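A minimal sketch of why fast retransmit stays silent here, assuming the standard threshold of three duplicate ACKs, a single loss per window, and no delayed ACKs. The function names are hypothetical and this is a toy model, not the authors' analysis.

```python
DUPACK_THRESHOLD = 3  # standard TCP fast-retransmit threshold


def duplicate_acks(window_size: int, lost_index: int) -> int:
    """Count duplicate ACKs produced when one packet in a window is lost.

    Every packet that arrives after the lost one triggers one duplicate ACK
    (toy assumption: no delayed ACKs, no reordering, no further losses).
    """
    if not 0 <= lost_index < window_size:
        raise ValueError("lost_index must be inside the window")
    return window_size - lost_index - 1


def fast_retransmit_fires(window_size: int, lost_index: int) -> bool:
    """True if the sender sees enough duplicate ACKs to retransmit early."""
    return duplicate_acks(window_size, lost_index) >= DUPACK_THRESHOLD


if __name__ == "__main__":
    # A lone request packet: nothing follows it, so only the timeout can recover it.
    print(fast_retransmit_fires(window_size=1, lost_index=0))
    # A full window with an early loss: plenty of duplicate ACKs.
    print(fast_retransmit_fires(window_size=10, lost_index=0))
```

Losses near the end of a window behave like the lone-packet case, since too few packets follow them.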

Solution:
1) eliminate the long 200ms minimum timeout
2) TCP must track RTT in microseconds
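The two steps above can be sketched with the standard smoothed-RTT estimator from RFC 6298, kept in microseconds and with a configurable lower bound. This is an illustration under those assumptions, not the authors' kernel change; the class and parameter names are hypothetical.

```python
class RtoEstimator:
    """RFC 6298 style RTO estimator working in microseconds.

    min_rto_us is the lower bound on the timeout: 200_000 mimics the 200ms
    floor from the notes, 0 removes it. granularity_us stands in for the
    clock granularity G of RFC 6298.
    """

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variation

    def __init__(self, min_rto_us: float = 0.0, granularity_us: float = 1.0):
        self.min_rto_us = min_rto_us
        self.granularity_us = granularity_us
        self.srtt_us = None
        self.rttvar_us = None

    def update(self, rtt_sample_us: float) -> float:
        """Feed one RTT measurement and return the resulting RTO."""
        if self.srtt_us is None:
            # First measurement: SRTT = R, RTTVAR = R / 2.
            self.srtt_us = rtt_sample_us
            self.rttvar_us = rtt_sample_us / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            deviation = abs(self.srtt_us - rtt_sample_us)
            self.rttvar_us = (1 - self.BETA) * self.rttvar_us + self.BETA * deviation
            self.srtt_us = (1 - self.ALPHA) * self.srtt_us + self.ALPHA * rtt_sample_us
        rto = self.srtt_us + max(self.granularity_us, 4 * self.rttvar_us)
        return max(rto, self.min_rto_us)


if __name__ == "__main__":
    coarse = RtoEstimator(min_rto_us=200_000)  # 200ms floor
    fine = RtoEstimator(min_rto_us=0)          # floor removed
    for sample_us in (100, 100, 120):
        print(sample_us, coarse.update(sample_us), fine.update(sample_us))
```

With the floor in place the RTO never drops below 200,000 microseconds no matter how small the samples are; without it the RTO follows the measured RTT.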

Interaction with delayed ACK?
- The throughput reduction it causes is small
Stability? Could it cause congestion collapse?
- Today’s TCP has mechanisms to cope with that

Q: Is this a problem for congestion control?
A: Exponential backoff takes care of that
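As a rough illustration of the answer, the sketch below doubles the timeout after every consecutive loss, so even a very small initial RTO quickly backs off under sustained congestion. The function name, the starting value and the 60 second cap are assumptions chosen for the example.

```python
def backoff_schedule(initial_rto_us: int, retries: int,
                     max_rto_us: int = 60_000_000) -> list:
    """Return the timeout used for each successive retransmission attempt.

    The timeout doubles after every expiry and is capped at max_rto_us
    (60 s here, an assumed upper bound).
    """
    schedule = []
    rto = initial_rto_us
    for _ in range(retries):
        schedule.append(rto)
        rto = min(rto * 2, max_rto_us)
    return schedule


if __name__ == "__main__":
    # Starting from a hypothetical 200 microsecond RTO.
    print(backoff_schedule(200, 5))
```

After ten consecutive timeouts a 200 microsecond RTO has already grown past 100ms, so a persistently congested path sees retransmissions thin out quickly.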

Tags: Sigcomm09
