260 likes | 573 Views
FAST TCP in Linux. Cheng Jin David Wei http://netlab.caltech.edu/FAST/WANinLab/nsfvisit. Outline. Overview of FAST TCP. Implementation Details. SC2002 Experiment Results. FAST Evaluation and WAN-in-Lab. Throughput (Mbps). Bmps Peta. Transfer (GB). Flows. Linux TCP txqlen=100. 1.
E N D
FAST TCP in Linux Cheng Jin David Wei http://netlab.caltech.edu/FAST/WANinLab/nsfvisit
Outline • Overview of FAST TCP. • Implementation Details. • SC2002 Experiment Results. • FAST Evaluation and WAN-in-Lab. netlab.caltech.edu
Throughput (Mbps) Bmps Peta Transfer (GB) Flows Linux TCP txqlen=100 1 1.86 185 78 Linux TCP txqlen=10000 1 2.67 266 111 FAST 19.11.2002 1 9.28 925 387 Linux TCP txqlen=100 2 3.18 317 133 Linux TCP txqlen=10000 2 9.35 931 390 FAST 19.11.2002 2 18.03 1,797 753 FAST vs. Linux TCP Distance = 10,037 km; Delay = 180 ms; MTU = 1500 B; Duration: 3600 s Linux TCP Experiments: Jan 28-29, 2003 netlab.caltech.edu
Aggregate Throughput 92% FAST • Standard MTU • Utilization averaged over 1hr 2G 48% Average utilization 95% 1G 27% 16% 19% txq=100 txq=10000 Linux TCP Linux TCP FAST Linux TCP Linux TCP FAST netlab.caltech.edu
Summary of Changes • RTT estimation: fine-grain timer. • Fast convergence to equilibrium. • Delay monitoring in equilibrium. • Pacing: reducing burstiness. netlab.caltech.edu
FAST TCP Flow Chart Fast Convergence Slow Start Normal Recovery Equilibrium Time-out Loss Recovery netlab.caltech.edu
RTT Estimation • Measure queueing delay. • Kernel timestamp with s resolution. • Use SACK to increase the number of RTT samples during recovery. • Exponential averaging of RTT samples to increase robustness. netlab.caltech.edu
Fast Convergence • Rapidly increase or decrease cwnd toward equilibrium. • Monitor the per-ack queueing delay to avoid overshoot. netlab.caltech.edu
Equilibrium • Vegas-like cwnd adjustment in large time-scale -- per RTT. • Small step-size to maintain stability in equilibrium. • Per-ack delay monitoring to enable timely detection of changes in equilibrium. netlab.caltech.edu
Pacing • What do we pace? • Increment to cwnd. • Time-Driven vs. event-driven. • Trade-off between complexity and performance. • Timer resolution is important. netlab.caltech.edu
Time-Based Pacing data ack data cwnd increments are scheduled at fixed intervals. netlab.caltech.edu
Event-Based Pacing Detect sufficiently large gap between consecutive bursts and delay cwnd increment until the end of each such burst. netlab.caltech.edu
Rf (s) TCP x AQM p Rb’(s) Theory Experiment Internet: distributed feedback system Geneva 7000km Sunnyvale Baltimore 3000km 1000km Chicago SCinetCaltech-SLAC experiments SC2002 Baltimore, Nov 2002 Highlights • FAST TCP • Standard MTU • Peak window = 14,255 pkts • Throughput averaged over > 1hr • 925 Mbps single flow/GE card • 9.28 petabit-meter/sec • 1.89 times LSR • 8.6 Gbps with 10 flows • 34.0 petabit-meter/sec • 6.32 times LSR • 21TB in 6 hours with 10 flows • Implementation • Sender-side modification • Delay based 10 9 Geneva-Sunnyvale 7 #flows FAST 2 Baltimore-Sunnyvale 1 2 1 I2 LSR netlab.caltech.edu/FAST C. Jin, D. Wei, S. Low FAST Team and Partners
Network (Sylvain Ravot, caltech/CERN) netlab.caltech.edu
FAST BMPS 10 #flows 9 Geneva-Sunnyvale 7 FAST 2 Baltimore-Sunnyvale 1 FAST • Standard MTU • Throughput averaged over > 1hr Internet2 Land Speed Record 1 2 netlab.caltech.edu
Aggregate Throughput 88% FAST • Standard MTU • Utilization averaged over > 1hr 90% 90% Average utilization 92% 95% 1.1hr 6hr 6hr 1hr 1hr 1 flow 2 flows 7 flows 9 flows 10 flows netlab.caltech.edu
Caltech-SLAC Entry Rapid recovery after possible hardware glitch Power glitch Reboot 100-200Mbps ACK traffic netlab.caltech.edu
SCinetCaltech-SLAC experiments SC2002 Baltimore, Nov 2002 Acknowledgments netlab.caltech.edu/FAST • Prototype • C. Jin, D. Wei • Theory • D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA) • Experiment/facilities • Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S. Ravot (Caltech/CERN), S. Singh • CERN: O. Martin, P. Moroni • Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip • DataTAG: E. Martelli, J. P. Martin-Flatin • Internet2: G. Almes, S. Corbato • Level(3): P. Fernes, R. Struble • SCinet: G. Goddard, J. Patton • SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J. Navratil, J. Williams • StarLight: T. deFanti, L. Winkler • TeraGrid: L. Winkler • Major sponsors • ARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF
Evaluating FAST • End-to-End monitoring doesn’t tell the whole story. • Existing network emulation (dummynet) is not always enough. • Better optimization if we can look inside and understand the real network. netlab.caltech.edu
Dummynet and Real Testbed netlab.caltech.edu
Dummynet Issues • Not running on a real-time OS -- imprecise timing. • Lack of priority scheduling of dummynet events. • Bandwidth fluctuates significantly with workload. • Much work needed to customize dummynet for protocol testing. netlab.caltech.edu
10 GbE Experiment • Long-distance testing of Intel 10GbE cards. • Sylvain Ravot (Caltech) achieved 2.3 Gbps using single stream with jumbo frame and stock Linux TCP. • Tested HSTCP, Scalable TCP, FAST, and stock TCP under Linux. 1500B MTU: 1.3 Gbps SNV -> CHI; 9000B MTU: 2.3 Gbps SNV -> GVA netlab.caltech.edu
TCP Loss Mystery • Frequent packet loss with 1500-byte MTU. None with larger MTUs. • Packet loss even when cwnd is capped at 300 - 500 packets. • Routers have large queue size of 4000 packets. • Packets captured at both sender and receiver using tcpdump. netlab.caltech.edu
How Did the Loss Happen? loss detected netlab.caltech.edu
How Can WAN-in-Lab Help? • We will know exactly where packets are lost. • We will also know the sequence of events (packet arrivals) that lead to loss. • We can either fix the problem in the network if any, or improve the protocol. netlab.caltech.edu
Conclusion • FAST improves the end-to-end performance of TCP. • Many issues are still to be understood and resolved. • WAN-in-Lab can help make FAST a better protocol. netlab.caltech.edu