Understand congestion control challenges in high bandwidth-delay product networks and how to improve performance through decoupling congestion control from fairness. Learn about XCP, a scalable congestion control protocol, and its explicit control mechanism.
CMSC 34702: ML for End-to-End Adaptation (Congestion Control) Junchen Jiang October 17, 2019
• Congestion Control for High Bandwidth-Delay Product Networks • TCP ex Machina: Computer-Generated Congestion Control
Background (Based on the lecture from https://www.youtube.com/watch?v=HxOA8HrVzPg)
Layered view of the Internet Reliability Timer Sliding window Flow control Sharing (Congestion control)
What is congestion control? (Ideal) objective: [Figure: topology of 100Mbps links around one 10Mbps link; labels: Li, C]
Why is it hard? To scale out, the network must keep no per-flow state. (Ideal) objective: [Figure: same topology as the previous slide]
Buffering [Figure: Throughput vs. Sum of load; labels: C, Congestion collapse]
Goals of congestion control • Objective: Avoid congestion collapse; Reasonably high link utilization; Fairness • Under: Dynamic N (# of competing flows); Wide range of C (link capacity); Delayed feedback
Congestion control framework [Diagram: Sender sends Packet to Receiver; Receiver returns Ack] State: Estimation of RTT (round-trip time); Congestion window (# of outstanding packets) Decision: When to send out the next packet?
Example: AIMD [Diagram: Sender sends Packet to Receiver; Receiver returns Ack] State: EWMA RTT: Estimation of RTT (round-trip time); Cwnd: Congestion window (# of outstanding packets) Decision: When to send out the next packet? AIMD logic: On Ack: Cwnd += 1/Cwnd (about +1 packet per RTT); update ewma_rtt On packet loss/timeout: Cwnd /= 2; update ewma_rtt
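The AIMD logic on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a real TCP stack: the class name `AimdSender`, the EWMA gain of 0.125, and the floor of one packet are assumptions added here. The sketch spreads the additive increase over a window of acks (+1/cwnd per ack), so the window grows by roughly one packet per RTT.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AimdSender:
    """Toy AIMD sender state (hypothetical names and constants)."""
    cwnd: float = 1.0       # congestion window, in packets
    ewma_rtt: float = 0.0   # smoothed RTT estimate, in seconds
    gain: float = 0.125     # EWMA weight of a new RTT sample (assumed value)

    def _update_rtt(self, sample: float) -> None:
        if self.ewma_rtt == 0.0:
            self.ewma_rtt = sample  # first sample initializes the estimate
        else:
            self.ewma_rtt = (1 - self.gain) * self.ewma_rtt + self.gain * sample

    def on_ack(self, rtt_sample: float) -> None:
        # Additive increase: +1/cwnd per ack, about +1 packet per RTT.
        self.cwnd += 1.0 / self.cwnd
        self._update_rtt(rtt_sample)

    def on_loss(self, rtt_sample: Optional[float] = None) -> None:
        # Multiplicative decrease: halve the window, but keep at least 1 packet.
        self.cwnd = max(1.0, self.cwnd / 2.0)
        if rtt_sample is not None:
            self._update_rtt(rtt_sample)

    def can_send(self, outstanding: int) -> bool:
        # Decision: send only while fewer than cwnd packets are unacknowledged.
        return outstanding < int(self.cwnd)


sender = AimdSender(cwnd=10.0)
for _ in range(10):          # one window's worth of acks
    sender.on_ack(rtt_sample=0.1)
print(round(sender.cwnd, 2))  # close to 11: about one packet gained per RTT
sender.on_loss()
print(round(sender.cwnd, 2))  # half of the previous value
```

Keeping `cwnd` as a float and truncating only in `can_send` is one simple way to accumulate fractional increases between acks.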
Congestion Control for High Bandwidth-Delay Product Networks (SIGCOMM’02) First systematic approach to congestion control for high bandwidth-delay product networks, which were expected to dominate in the following decade. (Based on slides from R. Stallings, M. Handley and D. Katabi)
TCP congestion control performs poorly as bandwidth or delay increases. Shown analytically in [Low01] and via simulations.
[Figure: Avg. TCP Utilization vs. Bottleneck Bandwidth (Mb/s); 50 flows in both directions, Buffer = BW x Delay, RTT = 80 ms]
[Figure: Avg. TCP Utilization vs. Round Trip Delay (sec); 50 flows in both directions, Buffer = BW x Delay, BW = 155 Mb/s]
Because TCP lacks fast response:
• When spare bandwidth is available, TCP increases by 1 pkt/RTT even if spare bandwidth is huge
• When a TCP starts, it increases exponentially, causing too many drops
• Flows ramp up by 1 pkt/RTT, taking forever to grab the large bandwidth
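A back-of-the-envelope calculation shows how slow the 1 pkt/RTT ramp is at the operating point quoted on the slide (155 Mb/s, 80 ms RTT). The 1000-byte packet size and the single-flow setting are assumptions made here for illustration; the slide's simulations use 50 flows.

```python
def aimd_recovery_time(bandwidth_bps: float, rtt_s: float, packet_bytes: int = 1000):
    """Return (BDP in packets, RTTs to regrow after one halving, seconds).

    Assumes a single flow whose window equals the bandwidth-delay product
    just before a loss, and additive increase of 1 packet per RTT afterwards.
    """
    bdp_packets = bandwidth_bps * rtt_s / (8 * packet_bytes)
    rtts_needed = bdp_packets / 2   # half the window must be regrown
    return bdp_packets, rtts_needed, rtts_needed * rtt_s


bdp, rtts, seconds = aimd_recovery_time(155e6, 0.080)
print(f"BDP = {bdp:.0f} packets, recovery = {rtts:.0f} RTTs = {seconds:.0f} s")
# prints: BDP = 1550 packets, recovery = 775 RTTs = 62 s
```

Under these assumptions a single loss costs about a minute of reduced throughput, and the cost grows linearly with both bandwidth and delay.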
Key reason: Coupled congestion control and fairness. A single mechanism controls both. Example: In TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both. Solution: Decouple Congestion Control from Fairness • Congestion control: High Utilization; Small Queues; Few Drops • Fairness: Bandwidth Allocation Policy
Key reason: Coupled congestion control and fairness. A single mechanism controls both. Example: In TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both. Solution: Decouple Congestion Control from Fairness. How does decoupling solve the problem? • To control congestion: use MIMD, which shows fast response • To control fairness: use AIMD, which converges to fairness
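The claim that AIMD converges to fairness while MIMD does not can be checked with a toy two-flow simulation. The model below is an assumption made for illustration: both flows see the same synchronized feedback (increase while the total is within capacity, decrease otherwise), and the starting rates, capacity, and gains are arbitrary.

```python
def simulate(increase, decrease, rates=(10.0, 80.0), capacity=100.0, steps=500):
    """Two flows with synchronized binary feedback; returns their final rates."""
    a, b = rates
    for _ in range(steps):
        if a + b > capacity:
            a, b = decrease(a), decrease(b)   # link overloaded: both back off
        else:
            a, b = increase(a), increase(b)   # spare capacity: both speed up
    return a, b


# AIMD: add 1 on increase, halve on decrease.
aimd = simulate(lambda r: r + 1.0, lambda r: r / 2.0)
# MIMD: multiply by 1.1 on increase, halve on decrease.
mimd = simulate(lambda r: r * 1.1, lambda r: r / 2.0)

print("AIMD gap between flows:", abs(aimd[0] - aimd[1]))
print("MIMD ratio between flows:", mimd[1] / mimd[0])
```

With AIMD the gap between the two rates is untouched by additive steps and halved by every decrease, so it shrinks toward zero. With MIMD both steps scale the rates by the same factor, so the initial 8:1 ratio never changes. That is why the slide reserves MIMD for fast response on the aggregate and AIMD for dividing it fairly.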
Nice properties of XCP • Improved Congestion Control: • Small queues • Almost no drops • Improved Fairness • Scalable (no per-flow state)
XCP: An eXplicit Control Protocol • Congestion Controller • Fairness Controller
How does XCP Work? [Diagram: each packet carries a Congestion Header with three fields: Round Trip Time, Congestion Window, Feedback] Example: Feedback = + 0.1 packet
How does XCP Work? [Diagram: a router on the path changes the header's Feedback field] Example: Feedback = + 0.1 packet becomes Feedback = - 0.3 packet
How does XCP Work? Congestion Window = Congestion Window + Feedback XCP extends ECN and CSFQ Routers compute feedback without any per-flow state
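The header round trip described on these slides can be sketched as follows. The names `CongestionHeader`, `router_process`, and `sender_on_ack` are hypothetical, and the rule that a router only ever lowers the feedback (so the most congested hop decides) is stated here as an assumption about how per-hop feedback is combined.

```python
from dataclasses import dataclass


@dataclass
class CongestionHeader:
    rtt: float       # sender's current RTT estimate, in seconds
    cwnd: float      # sender's current congestion window, in packets
    feedback: float  # window change for this packet, in packets


def router_process(header: CongestionHeader, router_feedback: float) -> CongestionHeader:
    # Assumption: a router overwrites the field only with a smaller value,
    # so the tightest bottleneck on the path determines the final feedback.
    header.feedback = min(header.feedback, router_feedback)
    return header


def sender_on_ack(cwnd: float, echoed_feedback: float, min_cwnd: float = 1.0) -> float:
    # Congestion Window = Congestion Window + Feedback (floored at min_cwnd).
    return max(min_cwnd, cwnd + echoed_feedback)


# Mirrors the slide example: the sender asks for +0.1, a router allows -0.3.
hdr = CongestionHeader(rtt=0.08, cwnd=20.0, feedback=0.1)
router_process(hdr, router_feedback=-0.3)
router_process(hdr, router_feedback=0.2)   # a less congested hop changes nothing
new_cwnd = sender_on_ack(20.0, hdr.feedback)
print(hdr.feedback, round(new_cwnd, 1))
```

The receiver's role is omitted: it is assumed to copy the feedback field into the ack unchanged.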
How Does an XCP Router Compute the Feedback?
• Congestion Controller (MIMD). Goal: Matches input traffic to link capacity & drains the queue. Looks at aggregate traffic & queue. Algorithm: Aggregate traffic changes by $\Delta$, where $\Delta \sim$ Spare Bandwidth and $\Delta \sim -$ Queue Size. So, $\Delta = \alpha \, d_{avg} \, \mathit{Spare} - \beta \, \mathit{Queue}$
• Fairness Controller (AIMD). Goal: Divides $\Delta$ between flows to converge to fairness. Looks at a flow’s state in Congestion Header. Algorithm: If $\Delta > 0$, divide $\Delta$ equally between flows. If $\Delta < 0$, divide $\Delta$ between flows proportionally to their current rates
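The two controllers can be sketched as two small functions. Here the aggregate feedback is written as Δ = α · d_avg · Spare − β · Queue, with gain constants α = 0.4 and β = 0.226 (the values reported in the XCP paper). The split function is a deliberate simplification: it takes an explicit list of flow rates for readability, whereas a real XCP router reaches the same division per packet without per-flow state. Function names and units (packets, packets per second) are assumptions.

```python
ALPHA = 0.4    # gain on spare bandwidth (value from the XCP paper)
BETA = 0.226   # gain on persistent queue (value from the XCP paper)


def aggregate_feedback(capacity: float, input_rate: float, queue: float,
                       d_avg: float, alpha: float = ALPHA, beta: float = BETA) -> float:
    """Congestion controller: total window change for one control interval.

    capacity and input_rate are in packets/s, queue in packets, d_avg in seconds.
    """
    spare = capacity - input_rate
    return alpha * d_avg * spare - beta * queue


def split_feedback(delta: float, rates: list) -> list:
    """Fairness controller (simplified, with explicit per-flow rates)."""
    if delta >= 0:
        # Positive feedback: every flow gets the same increase (additive).
        return [delta / len(rates)] * len(rates)
    # Negative feedback: decrease in proportion to current rate (multiplicative).
    total = sum(rates)
    return [delta * r / total for r in rates]


delta = aggregate_feedback(capacity=1000.0, input_rate=800.0, queue=0.0, d_avg=0.1)
print(round(delta, 6))                       # spare capacity gives positive feedback
print(split_feedback(8.0, [10.0, 30.0]))     # equal shares
print(split_feedback(-8.0, [10.0, 30.0]))    # shares proportional to rate
```

Equal increases and proportional decreases are exactly the AIMD pattern, applied to how Δ is shared rather than to how large Δ is.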
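Getting the devil out of the details …
• Congestion Controller: $\Delta = \alpha \, d_{avg} \, \mathit{Spare} - \beta \, \mathit{Queue}$. Theorem: System converges to optimal utilization (i.e., stable) for any link bandwidth, delay, number of sources if: $0 < \alpha < \frac{\pi}{4\sqrt{2}}$ and $\beta = \alpha^2 \sqrt{2}$ (Proof based on Nyquist Criterion). No Parameter Tuning
• Fairness Controller: Algorithm: If $\Delta > 0$, divide $\Delta$ equally between flows. If $\Delta < 0$, divide $\Delta$ between flows proportionally to their current rates. Need to estimate number of flows N: $N = \sum_{\text{pkts in } T} \frac{1}{T \times (Cwnd_{pkt} / RTT_{pkt})}$, where $RTT_{pkt}$: Round Trip Time in header; $Cwnd_{pkt}$: Congestion Window in header; $T$: Counting Interval. No Per-Flow State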
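The flow-count estimate can be illustrated with the three quantities the slide names (RTT and Cwnd from each packet header, and a counting interval T). The sketch assumes the estimator sums 1 / (T · Cwnd_pkt / RTT_pkt) over the packets seen during T: a flow sending Cwnd/RTT packets per second contributes T · Cwnd/RTT packets, so its terms add up to 1. The function name and the input format are hypothetical.

```python
def estimate_flows(headers, interval: float) -> float:
    """Estimate the number of flows from packet headers alone.

    headers: iterable of (cwnd_pkt, rtt_pkt) pairs, one per packet seen
             during `interval` seconds; cwnd in packets, rtt in seconds.
    """
    return sum(1.0 / (interval * (cwnd / rtt)) for cwnd, rtt in headers)


# Three flows observed for T = 1 s. Each flow sends cwnd / rtt packets per second.
interval = 1.0
flows = [(10.0, 0.1), (20.0, 0.1), (5.0, 0.05)]   # (cwnd, rtt) per flow
headers = []
for cwnd, rtt in flows:
    packets_sent = round(interval * cwnd / rtt)
    headers.extend([(cwnd, rtt)] * packets_sent)

print(round(estimate_flows(headers, interval), 6))  # close to 3
```

The router keeps only a running sum per interval, which is how the estimate stays free of per-flow state.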
XCP Remains Efficient as Bandwidth or Delay Increases
[Figure: Utilization as a function of Bandwidth; axes: Avg. Utilization vs. Bottleneck Bandwidth (Mb/s)] XCP increases proportionally to spare bandwidth
[Figure: Utilization as a function of Delay; axes: Avg. Utilization vs. Round Trip Delay (sec)] $\alpha$ and $\beta$ chosen to make XCP robust to delay
Heavily depends on operational environments • 1980s-1990s: Local-Area Networks • Reno, Tahoe, NewReno • 1990s-2000s: Diverse environments • High bandwidth-delay product (wireless, satellite, high capacity WAN) • XCP, Vegas, Cubic, Compound TCP, … • 2010s: Datacenter • Microsecond RTT, 10s of Gbps bandwidth, delay-sensitive apps, hardware support • DCTCP, Deadline-driven TCP, …
The march of congestion control mechanisms Why so many designs? Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf
Goals of congestion control • Objective: Avoid congestion collapse; Reasonably high link utilization; Fairness • Under: Dynamic N; Wide range of C; Delayed feedback • Problem formulations are too vague!
Rational choice of scheme is challenging • Different goals? • Different assumptions about network? • One scheme just plain better? VS.
Networks constrained by a fuzzy idea of TCP’s assumptions • Mask stochastic loss • Bufferbloat • Mask out-of-order delivery • No parallel/multipath routing Advice for Internet Subnetwork Designers (RFC 3819) is 21,000 words!
Design rationale? (from XCP paper) Congestion is not a binary variable, so congestion signaling should reflect the degree of congestion. … This [XCP] allows the senders to decrease their sending windows quickly when the bottleneck is highly congested, … . The resulting protocol is both more responsive and less oscillatory. A fundamental characteristic of such a system is that it becomes unstable for some large feedback delay. …In the context of congestion control, this means that as delay increases, the sources should change their sending rates more slowly. Fuzzy and qualitative (handwavy) design constraints
TCP ex Machina: Computer-Generated Congestion Control (SIGCOMM’13) First paper to demonstrate machine-generated logic can defeat handcrafted congestion control (in some cases) Based on slides from Keith Winstein: http://conferences.sigcomm.org/sigcomm/2013/slides/sigcomm/12.pdf
If congestion control is the answer, what’s the question? Are there better answers?
Free the network to evolve Transport layer should adapt to whatever: • network does • application wants
A more precise formulation of the congestion control problem • Control knobs • Objectives • Environment (Network model & Traffic model)
The knobs of congestion control * Superrational congestion control: Assuming every node is running the same algorithm
Remy: Computer-generated congestion control [Diagram: Control knobs, Performance objectives, Network model, and Traffic model feed into Remy, which outputs RemyCC (Remy-generated Congestion control)]
RemyCC workflow [Diagram: Feedback (acks, timeout, …) goes into Congestion control logic, which decides When to send the next packet]
RemyCC workflow [Diagram: Feedback (acks, timeout, …) is summarized as Congestion signals, which map to MIMD Parameters, which decide When to send the next packet]
Remy’s Job Find piecewise-continuous Rule() that optimizes expected value of objective function.
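One way to picture such a Rule() is a lookup table over regions of the congestion-signal space, with one action per region. The sketch below uses a piecewise-constant table, which is one simple instance of a piecewise rule. Everything in it is hypothetical: the two-dimensional signal vector, the region boundaries, and the action format (a window multiplier and increment) are illustrative choices, not the table Remy actually produces.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Action:
    multiplier: float  # window update: cwnd <- multiplier * cwnd + increment
    increment: float


@dataclass(frozen=True)
class Region:
    lower: Tuple[float, ...]   # inclusive lower bound per signal dimension
    upper: Tuple[float, ...]   # exclusive upper bound per signal dimension
    action: Action

    def contains(self, signals: Sequence[float]) -> bool:
        return all(lo <= s < hi for lo, s, hi in zip(self.lower, signals, self.upper))


def rule(regions: Sequence[Region], signals: Sequence[float],
         default: Action = Action(1.0, 0.0)) -> Action:
    """Piecewise-constant rule: return the action of the region holding `signals`."""
    for region in regions:
        if region.contains(signals):
            return region.action
    return default   # outside every region: leave the window unchanged


def apply_action(cwnd: float, action: Action) -> float:
    return max(1.0, action.multiplier * cwnd + action.increment)


# Hypothetical 2-D signal: (ack interarrival estimate in s, RTT / min RTT).
table = [
    Region((0.0, 0.0), (1.0, 1.5), Action(1.0, 1.0)),    # low delay: grow
    Region((0.0, 1.5), (1.0, 10.0), Action(0.5, 0.0)),   # inflated RTT: back off
]
print(apply_action(10.0, rule(table, (0.2, 1.1))))
print(apply_action(10.0, rule(table, (0.2, 2.0))))
```

In this picture, the optimizer's job is to choose the region boundaries and the per-region actions so that the expected value of the objective is as high as possible under the given network and traffic models.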
2-D Search Space Objective: Maximizing