
Opportunistic Flooding to Improve TCP Transmit Performance in Virtualized Clouds


Presentation Transcript


  1. Opportunistic Flooding to Improve TCP Transmit Performance in Virtualized Clouds Sahan Gamage, Ardalan Kangarlou, Ramana Kompella, Dongyan Xu

  2. Motivation
  • VM consolidation: a common practice
    • Enables slicing of physical resources among VMs
    • Multiple VMs sharing the same core
    • Flexibility, scalability, and economy
  • Our recent investigations: VM consolidation negatively impacts network performance
  [Figure: VM 1-VM 4 running on top of the virtualization layer and hardware]

  3. Effect of VM Consolidation on TCP Sender
  • When several VMs share a core, the sender VM must wait to be scheduled before it can process ACKs and transmit new data
  • The scheduling delay adds to the RTT observed by the connection
  • High RTT hurts TCP throughput
  [Figure: timeline of VM1-VM3 sharing a core; data and ACKs wait in the driver domain's shared buffer until the sender VM is scheduled, inflating the RTT seen between sender and TCP receiver]

  4. Previous Work on Receive Path
  • On the receive path, ACK generation is delayed due to VM scheduling
  • vSnoop [SC'10]: acknowledgement offloading
    • In-order data packets are ACKed by the driver domain
    • Faster progress of the connection
  • The same solution does not apply to the transmit path: the driver domain cannot generate data packets
  [Figure: sender, driver domain shared buffer, and scheduled VM, with the scheduling delay added to the RTT]

  5. Our Solution: vFlood
  • Key idea: offload congestion control to the driver domain
    • Alleviates the negative effect of VM scheduling on the transmit path of TCP
  • High-level steps (see the sketch below):
    • Set the VM TCP's congestion window to a large value
    • Perform congestion control at the driver domain
  • Modifications to the VM are minimal
  • Principles applicable to other VMMs (Xen, VMware, etc.)
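To make the two-step split concrete, here is a minimal standalone C sketch of the idea. All names and constants (guest_can_transmit, ddom_release_segment, ddom_cwnd, GUEST_CWND_SEGMENTS) are hypothetical illustrations, not vFlood's real interfaces, which live inside the guest TCP stack and the Xen driver domain.

    /* Conceptual sketch only: the guest's congestion window is pinned large so
     * segments flood into the driver domain buffer, while the driver domain
     * applies a real congestion window before releasing segments to the NIC. */
    #include <stdio.h>

    #define GUEST_CWND_SEGMENTS 10000u   /* guest cwnd set to a large value */

    static unsigned int ddom_cwnd = 10;  /* driver domain's congestion window */
    static unsigned int in_flight = 0;   /* segments sent but not yet ACKed   */

    /* Guest side: with cwnd effectively unlimited, the VM hands segments to
     * the driver domain as fast as the application produces them. */
    static int guest_can_transmit(unsigned int queued)
    {
        return queued < GUEST_CWND_SEGMENTS;
    }

    /* Driver domain side: standards-compliant congestion control decides when
     * a buffered segment may actually go onto the wire. */
    static int ddom_release_segment(void)
    {
        if (in_flight >= ddom_cwnd)
            return 0;                    /* hold the segment in the dom0 buffer */
        in_flight++;
        return 1;                        /* hand the segment to the NIC */
    }

    static void ddom_on_ack(void)
    {
        if (in_flight > 0)
            in_flight--;
        ddom_cwnd++;                     /* e.g. slow-start style growth */
    }

    int main(void)
    {
        printf("guest may queue another segment: %d\n", guest_can_transmit(42));
        printf("dom0 releases a segment:         %d\n", ddom_release_segment());
        ddom_on_ack();
        return 0;
    }

The point of the split is that the guest never stalls waiting to be scheduled just to grow its own window; only the driver domain's window, which tracks real network feedback, gates packets onto the wire.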

  6. Congestion Control Offload
  [Figure: with vFlood in the driver domain, the scheduled VM floods data into the shared buffer and vFlood releases it to the TCP receiver while handling ACKs, so the link stays busy even while the VM is descheduled (better network utilization)]

  7. vFlood's Impact on TCP Flows
  • TCP slow-start phase
    • Helps connections progress faster
    • Significant benefit for short transfers
  • TCP congestion-avoidance phase
    • Large flows can also benefit from vFlood
    • Benefit is not as large as in slow start

  8. vFlood Design
  • vFlood VM module (beside the guest's TCP stack): allows the VM to adopt an aggressive congestion control strategy
  • vFlood channel: enables communication between the VM module and the driver domain module (a possible message format is sketched below)
  • vFlood driver domain module, built from a congestion control state machine and a buffer manager:
    • Implements a standards-compliant congestion control strategy
    • Manages buffer space
  [Figure: the vFlood VM module inside the guest and the vFlood driver domain module in front of the NIC, connected by the vFlood channel]
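One way to picture the vFlood channel is as a small set of per-flow control messages exchanged between the two modules. The message types and struct below are illustrative assumptions only, not the actual channel protocol.

    /* Hypothetical sketch of control messages a VM module and a driver-domain
     * module could exchange over a shared channel; not vFlood's real protocol. */
    #include <stdint.h>

    enum vflood_msg_type {
        VFLOOD_FLOW_REGISTER,   /* VM announces a new TCP flow to offload       */
        VFLOOD_LOSS_DETECTED,   /* driver domain asks the VM to take over       */
        VFLOOD_FLOW_RECOVERED,  /* VM signals loss recovery; offload may resume */
        VFLOOD_FLOW_CLOSE       /* flow finished; its buffer space is freed     */
    };

    struct vflood_msg {
        uint32_t type;          /* one of enum vflood_msg_type         */
        uint32_t flow_id;       /* identifies the TCP connection       */
        uint32_t seq;           /* sequence number the event refers to */
    };

Keeping the channel down to a few small control messages is in line with the goal of minimal VM modifications; bulk data continues to travel over the regular Xen I/O channel shown on the implementation slide.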

  9. Design Challenges
  • Challenge 1: Choice of congestion control algorithm
    • Solution: reuse the driver domain's congestion control algorithms
    • Allow per-flow congestion control algorithms
  • Challenge 2: Handling connections with packet loss
    • Solution: stop congestion control offload when a packet loss is detected
    • Let the VM take over control of the connection
    • Resume congestion control offload when the VM signals recovery

  10. Bigger Challenge: Buffer Management
  • How can the driver domain buffer be used effectively?
    • Driver domain memory is a limited resource
    • A single flow may occupy the entire buffer space
  [Figure: VM1-VM3 sharing one driver domain buffer in front of the network]

  11. Buffer Management (Cont.)
  • Solution: two levels of isolation (see the sketch below)
    • Inter-VM: static buffer allocation
      • Ensures one VM cannot occupy all buffer space
    • Intra-VM: per-flow dynamic threshold scheme
      • Buffer space is fairly allocated among active flows
  [Figure: the driver domain buffer partitioned among VM1-VM3 in front of the network]
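A minimal sketch of the inter-VM level, assuming a hypothetical packet budget (TOTAL_BUF_PKTS) split statically across VMs; the intra-VM, per-flow level is sketched after the next slide.

    /* Inter-VM isolation sketch: the dom0 buffer is statically partitioned per
     * VM, so no single VM can consume the whole buffer.  Sizes are made up.  */
    #include <stdbool.h>

    #define NUM_VMS        4
    #define TOTAL_BUF_PKTS 4096

    static const unsigned int vm_quota = TOTAL_BUF_PKTS / NUM_VMS; /* static share */
    static unsigned int vm_used[NUM_VMS];                          /* per-VM usage */

    /* Admit a packet from vm_id only while that VM is within its static quota.
     * Per-flow fairness inside a VM is handled by the dynamic thresholds.     */
    static bool vm_buffer_admit(int vm_id)
    {
        if (vm_used[vm_id] >= vm_quota)
            return false;
        vm_used[vm_id]++;
        return true;
    }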

  12. Dynamic Threshold Scheme
  • Static per-flow provisioning is not as useful
    • Observation: not all flows benefit equally
    • High-RTT / low-bandwidth flows gain little from vFlood
  • Dynamic threshold scheme for sharing the buffer (idea due to Choudhury and Hahne [ToN'98])
    • Threshold T = α(B - Q(t)), exercised in the sketch below
      • α is varied according to flow priority
      • B is the total buffer size
      • Q(t) is the buffer occupancy at time t
    • Some space always remains for a new flow
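The threshold formula can be exercised directly. In the sketch below, the α values, buffer size, and occupancy numbers are arbitrary assumptions, with a larger α standing in for the higher priority given to low-RTT flows.

    /* Dynamic thresholds in the style of Choudhury and Hahne [ToN'98]: a flow
     * may buffer another packet only while its occupancy is below
     * T = alpha * (B - Q(t)).  All concrete numbers here are illustrative.  */
    #include <stdbool.h>
    #include <stdio.h>

    static double alpha_for_flow(double rtt_ms)
    {
        /* Assumption for illustration: low-RTT flows, which benefit most from
         * vFlood, get a larger alpha and therefore more buffer space.       */
        return rtt_ms < 5.0 ? 2.0 : 0.5;
    }

    static bool flow_may_buffer(unsigned int flow_occupancy,
                                unsigned int B,        /* total buffer size   */
                                unsigned int Qt,       /* occupancy at time t */
                                double rtt_ms)
    {
        double T = alpha_for_flow(rtt_ms) * (double)(B - Qt);
        return (double)flow_occupancy < T;
    }

    int main(void)
    {
        /* B = 1024 packets, Q(t) = 900: a flow already holding 70 packets is
         * still admitted if it is high priority (low RTT), but not otherwise. */
        printf("low-RTT flow admitted:  %d\n", flow_may_buffer(70, 1024, 900, 1.0));
        printf("high-RTT flow admitted: %d\n", flow_may_buffer(70, 1024, 900, 80.0));
        return 0;
    }

Because T shrinks as the shared buffer fills, existing flows are cut off before the buffer is exhausted, which is why some space always remains for a newly arriving flow.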

  13. State Machine
  • Per-flow state machine; ACKs trigger packet transmissions (see the sketch below)
  • States: START, ACTIVE, NO BUFFER, PACKET LOSS
    • START to ACTIVE on connection setup (SYN)
    • ACTIVE to NO BUFFER when buffer space runs out; back to ACTIVE when buffer space is available
    • ACTIVE or NO BUFFER to PACKET LOSS on packet loss detection
    • PACKET LOSS to ACTIVE when the VM recovers and buffer space is available
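A compact reading of this state machine as code. The enum names and the transition function are assumptions, and the exact transition set is inferred from the slide's labels.

    /* Per-flow offload state machine: a flow becomes active on SYN, floods
     * while buffer space is available, backs off when the buffer is full, and
     * returns control to the VM on packet loss until the VM signals recovery. */
    #include <stdbool.h>

    enum vflood_state { VF_START, VF_ACTIVE, VF_NO_BUFFER, VF_PACKET_LOSS };

    enum vflood_event {
        EV_SYN,              /* connection established              */
        EV_BUFFER_AVAILABLE, /* dom0 buffer space freed             */
        EV_NO_BUFFER,        /* dom0 buffer for this flow is full   */
        EV_LOSS_DETECTED,    /* packet loss detected; VM takes over */
        EV_VM_RECOVERED      /* VM signals loss recovery            */
    };

    static enum vflood_state vflood_next(enum vflood_state s, enum vflood_event e,
                                         bool buffer_space)
    {
        switch (s) {
        case VF_START:
            return e == EV_SYN ? VF_ACTIVE : s;
        case VF_ACTIVE:
            if (e == EV_NO_BUFFER)     return VF_NO_BUFFER;
            if (e == EV_LOSS_DETECTED) return VF_PACKET_LOSS;
            return s;
        case VF_NO_BUFFER:
            if (e == EV_BUFFER_AVAILABLE) return VF_ACTIVE;
            if (e == EV_LOSS_DETECTED)    return VF_PACKET_LOSS;
            return s;
        case VF_PACKET_LOSS:
            /* Offload resumes only when the VM has recovered AND buffer space
             * is available, matching the slide's combined condition.         */
            if (e == EV_VM_RECOVERED && buffer_space) return VF_ACTIVE;
            return s;
        }
        return s;
    }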

  14. Implementation in Xen
  • Approximately 1500 LOC, 40% reused
  [Figure: the vFlood VM module beside the Linux TCP stack and netfront in the guest; the vFlood dom0 module beside the bridge, netback, and NIC driver in the driver domain; the two connected by the vFlood channel alongside the Xen I/O channel, on top of the Xen VMM and the NIC]

  15. Evaluation – Setup
  • VM hosts
    • 3.06 GHz Intel Xeon CPUs, 4 GB RAM
    • Only one core/CPU enabled
    • Xen 3.3 with Linux 2.6.18 for the driver domain (dom0) and the guest VMs
  • Client machine
    • 2.4 GHz Intel Core 2 Quad CPU, 2 GB RAM
    • Linux 2.6.19
  • Gigabit Ethernet switch

  16. TCP Throughput: 2 VMs/Core
  [Graph: throughput improvement (y-axis 0x-14x) for Vanilla Xen vs. Xen+vFlood across flow sizes of 100 KB, 250 KB, 500 KB, and 1 MB]

  17. TCP Throughput: 3 VMs/Core
  [Graph: throughput improvement (y-axis 0x-14x) for Vanilla Xen vs. Xen+vFlood across flow sizes of 100 KB, 250 KB, 500 KB, and 1 MB]

  18. TCP Throughput: 4 VMs/Core
  [Graph: throughput improvement (y-axis 0x-14x) for Vanilla Xen vs. Xen+vFlood across flow sizes of 100 KB, 250 KB, 500 KB, and 1 MB]

  19. TCP Throughput: 5 VMs/Core
  [Graph: throughput improvement (y-axis 0x-14x) for Vanilla Xen vs. Xen+vFlood across flow sizes of 100 KB, 250 KB, 500 KB, and 1 MB]

  20. Apache Olio Benchmark: Test Setup
  [Figure: a Faban client emulator drives four physical servers, each running vFlood in dom0 and hosting two VMs; the VMs host the Apache httpd, Memcached, MySQL, and NFS components of the Olio stack]

  21. Apache Olio Results

  22. Other Results
  • Overhead
    • < 5% CPU overhead from vFlood routines
  • Effectiveness of buffer management schemes
    • Fair allocation scheme (α = 1): throughput improvements up to 32% for low-RTT flows
    • Prioritized allocation scheme (higher α for low-RTT flows): throughput improvements up to 64%
      • Throughput of high-RTT flows is not affected

  23. Summary and Conclusions
  • Problem: VM consolidation negatively affects TCP performance
  • Solution: vFlood
    • Targets the transmit path
    • Key idea: congestion control is offloaded to the driver domain
    • Transparent to applications
  • Results:
    • Raw TCP improvements of up to 10x
    • Apache Olio benchmark performance improved by ~33%
    • Low overhead (< 5% CPU utilization)

  24. Thank you. Questions?

  25. Related Work
  • Optimizing the virtualized I/O path
    • Menon et al. [USENIX ATC'06, '08; ASPLOS'09]
  • Improving intra-host VM communications
    • XenSocket [Middleware'07], XenLoop [HPDC'08], Fido [USENIX ATC'09], XWAY [VEE'08], IVC [SC'07]
  • I/O-aware VM scheduling
    • Govindan et al. [VEE'07], DVT [SoCC'10]
  • TCP offloading/onloading
    • Chelsio, Alacritech, etc.

  26. vFlood Overhead
  • Per-packet CPU overhead for vFlood routines in dom0 and within the VM
  • Per-packet overhead profiled using Xenoprof [Menon, VEE'05]

  27. Buffer Management Policies
  • A TCP sender in a VM sends data to two receivers
    • One resides in the same datacenter (low RTT)
    • The other on a remote node, planetlab1.ucsd.edu (high RTT)
  • 20 concurrent flows; the proportion of high-RTT to low-RTT flows is varied
  • Average throughput is measured
  [Graph: average throughput (0-12 Mbps) of low-RTT and high-RTT flows under the NoPolicy, Fair, and Prioritized schemes, for flow mixes of 30/70, 50/50, and 70/30 (low-RTT/high-RTT percentage)]
