
Peer-Assisted Content Distribution Networks: Techniques and Challenges



Presentation Transcript


  1. Peer-Assisted Content Distribution Networks: Techniques and Challenges Pei Cao Stanford University

  2. Traditional Intra-Provider Content Distribution Networks [Diagram: hierarchy with tiers National Center, Regional Center, Branch, Users]

  3. Peer-to-Peer Content Distribution [Diagram: tiers National Center, Regional Center, Branch, Users]

  4. P2P vs CDN • P2P: • No infrastructure cost • Supply grows linearly with demand • Simple distributed, randomized algorithms • No QoS • CDN: • Initial infrastructure cost • Centralized scheduling algorithms • Network efficiency • Capable of supporting QoS

  5. Combine P2P with CDN? • Use P2P to complement CDN • P2P reduces load on the CDN, covers areas where CDN is not installed • Must be able to control, or “shape”, P2P traffic • Use CDN to complement P2P • CDN steps in when peer-based distribution is falling short, enabling QoS • Must be able to detect when peers won’t meet the delivery time guarantee

  6. Outline • Review of BitTorrent • Traffic-shaping BitTorrent: biased neighbor selection • QoS in BitTorrent: delivery time prediction

  7. BitTorrent File Sharing Network Goal: replicate K chunks of data among N nodes • Form neighbor connection graph • Neighbors exchange data

  8. BitTorrent: Neighbor Selection [Diagram: tracker, file.torrent, a seed holding the whole file, and peers 1, 2, 3, 4, 5 and A]

  9. BitTorrent: Piece Replication [Diagram: tracker, file.torrent, a seed holding the whole file, and peers 1, 3, 5 and A]

  10. BitTorrent: Piece Replication Algorithms • “Tit-for-tat” (choking/unchoking): • Each peer only uploads to 7 other peers at a time • 6 of these are chosen based on amount of data received from the neighbor in the last 20 seconds • The last one is chosen randomly, with a 75% bias toward newcomers • (Local) Rarest-first replication: • When peer 3 unchokes peer A, A selects which piece to download
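
The unchoking rule on slide 10 can be sketched in a few lines. This is an illustrative sketch, not the code of any real client: the function name, the `newcomers` set, and the 3:1 weighting (one way to read the "75% bias toward newcomers") are assumptions made for the example.

```python
import random

def choose_unchoked(received_bytes, newcomers, total_slots=7, rng=random):
    """Pick the peers to upload to in the next round.

    received_bytes: dict of peer id -> bytes received from that peer in the
                    last measurement window (20 seconds on the slide).
    newcomers:      set of peer ids treated as newcomers (assumed input).
    """
    regular_slots = total_slots - 1
    # Tit-for-tat: reward the neighbors that uploaded the most to us recently.
    ranked = sorted(received_bytes, key=lambda p: received_bytes[p], reverse=True)
    unchoked = ranked[:regular_slots]
    # Last slot: one random peer among the rest, biased toward newcomers.
    # The 3:1 weight is an assumption; it gives a newcomer a 75% chance
    # when competing against a single non-newcomer.
    rest = [p for p in received_bytes if p not in unchoked]
    if rest:
        weights = [3.0 if p in newcomers else 1.0 for p in rest]
        unchoked.append(rng.choices(rest, weights=weights, k=1)[0])
    return unchoked
```

With ten neighbors, the six best recent uploaders always get a slot and the seventh slot rotates randomly among the remaining four.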

  11. Analysis of BitTorrent • Conclusion from modeling studies: BitTorrent is nearly optimal in idealized, homogeneous networks • Demonstrated by simulation studies • Confirmed by theoretical modeling studies • Intuition: in a random graph, Prob(Peer A’s content is a subset of Peer B’s) ≤ 50%

  12. Traffic-Shaping BitTorrent

  13. Random Neighbor Graph • Existing studies all assume random neighbor selection • BitTorrent is no longer optimal if nodes in the same ISP only connect to each other • Random neighbor selection → high cross-ISP traffic

  14. Difficulty in Traffic-Shaping P2P Applications • ISPs: • Different links have different monetary costs • Prefer “clustering” of traffic • P2P Applications: • No knowledge of underlying ISP topology • Use randomized algorithms that don’t do well under clustering • Current solution: throttling → users suffer

  15. A Network-Friendly BitTorrent? • ISPs inform BitTorrent of their link preferences • BitTorrent’s algorithm is adjusted such that both users and ISPs benefit • Example: Biased Neighbor Selection • Works when cost function is transitive

  16. Biased Neighbor Selection • Idea: of N neighbors, choose N-k from peers in the same ISP, and choose k randomly from peers outside the ISP ISP
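
The selection rule on slide 16 might look like the following when run by a tracker. This is a minimal sketch under stated assumptions: `isp_of` is a hypothetical peer-to-ISP map, the defaults `n=35` and `k=1` are illustrative values, and topping up from outside when the ISP has too few peers is a choice made for this example, not something the slides specify.

```python
import random

def biased_neighbors(peer, candidates, isp_of, n=35, k=1, rng=random):
    """Return up to n neighbors for `peer`: about n-k inside its ISP, k outside."""
    home = isp_of[peer]
    internal = [c for c in candidates if c != peer and isp_of[c] == home]
    external = [c for c in candidates if c != peer and isp_of[c] != home]
    # k random peers from outside the ISP keep the swarm connected.
    chosen_ext = rng.sample(external, min(k, len(external)))
    # The remaining slots go to peers inside the same ISP.
    chosen_int = rng.sample(internal, min(n - len(chosen_ext), len(internal)))
    # Assumption: if the ISP has too few peers, top up from outside.
    shortfall = n - len(chosen_int) - len(chosen_ext)
    if shortfall > 0:
        leftover = [c for c in external if c not in chosen_ext]
        chosen_ext += rng.sample(leftover, min(shortfall, len(leftover)))
    return chosen_int + chosen_ext
```

For example, with 14 ISPs of 50 peers each, a peer asking for 35 neighbors with `k=1` receives 34 peers from its own ISP and one from elsewhere.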

  17. Implementing Biased Neighbor Selection • By Tracker • Need ISP affiliations of peers • Peer to AS maps • Public IP address ranges from ISPs • Special “X-” HTTP header • By traffic shaping devices • Intercept “peer ↔ tracker” messages and manipulate responses • No need to change tracker or client

  18. Evaluation Methodology • Event-driven simulator • Use actual client and tracker code as much as possible • Calculate bandwidth contention, assume perfect fair-share from TCP • Network settings • 14 ISPs, each with 50 peers, 100Kb/s upload, 1Mb/s download • Seed node, 400Kb/s upload • Optional “university” nodes (1Mb/s upload) • Optional ISP bottleneck to other ISPs

  19. Limitation of Throttling

  20. Throttling: Cross-ISP Traffic Redundancy: Average # of times a data chunk enters the ISP
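
The redundancy metric defined on slide 20 can be computed from a log of chunk transfers. The sketch below assumes a hypothetical log format of `(source peer, destination peer, chunk index)` tuples and a hypothetical `isp_of` map; neither comes from the slides.

```python
def traffic_redundancy(transfers, isp_of, isp, num_chunks):
    """Average number of times a data chunk enters `isp` from outside.

    transfers:  iterable of (src_peer, dst_peer, chunk_index) tuples.
    isp_of:     dict of peer id -> ISP name.
    num_chunks: number of chunks in the file.
    """
    entries = sum(
        1
        for src, dst, _chunk in transfers
        if isp_of[dst] == isp and isp_of[src] != isp
    )
    return entries / num_chunks
```

When every source of the file is outside the ISP, a value of 1 means each chunk crossed the ISP boundary exactly once; larger values mean the same data was fetched across the boundary repeatedly.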

  21. Biased Neighbor Selection: Download Times

  22. Biased Neighbor Selection: Cross-ISP Traffic

  23. Importance of Rarest-First Replication • Random piece replication performs badly • Increases download time by 84% - 150% • Increases traffic redundancy from 3 to 14 • Biased neighbors + Rarest-First → More uniform progress of peers
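
For comparison with random piece replication, a local rarest-first choice can be sketched as follows. This is an assumption-laden illustration: piece availability is modeled as plain Python sets, ties are broken randomly, and the function name is hypothetical.

```python
import random
from collections import Counter

def pick_rarest_piece(my_pieces, uploader_pieces, neighbor_pieces, rng=random):
    """Choose the piece to request from an uploader that has unchoked us.

    my_pieces:       set of piece indices we already hold.
    uploader_pieces: set of piece indices the uploader holds.
    neighbor_pieces: list of sets, one per neighbor (the "local" view).
    """
    wanted = uploader_pieces - my_pieces
    if not wanted:
        return None
    # Count how many neighbors hold each piece.
    counts = Counter()
    for pieces in neighbor_pieces:
        counts.update(pieces)
    fewest = min(counts[p] for p in wanted)
    # Break ties randomly so peers do not all chase the same piece.
    return rng.choice(sorted(p for p in wanted if counts[p] == fewest))
```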

  24. Presence of External High-Bandwidth Peers • Biased neighbor selection alone: • Average download time same as regular BitTorrent • Cross-ISP traffic increases as # of “university” peers increases • Result of tit-for-tat • Biased neighbor selection + Throttling: • Download time only increases by 12% • Most neighbors do not cross the bottleneck • Traffic redundancy (i.e. cross-ISP traffic) same as the scenario without “university” peers

  25. Comparison with Simple Clustering • Gateway peer: only one peer connects to the peers outside the ISP, all other peers only connect to peers inside the ISP • Gateway peer must have high bandwidth • It is the “seed” for this ISP • Ends up benefiting peers in other ISPs

  26. Combining Biased Neighbor Selection with Caches • Under random neighbor selection • bandwidth requirement of cache is high • Under biased neighbor selection • bandwidth needed from the cache is reduced by an order of magnitude

  27. Conclusions • By choosing neighbors well, BitTorrent can achieve high peer performance without increasing ISP cost • Biased neighbor selection: choose initial set of neighbors well • Can be combined with throttling and caching → BitTorrent’s algorithm can be shaped!

  28. Delivery Time Prediction

  29. Motivation • Provide delivery time guarantee under P2P+CDN • What contributes to delivery time of a download via BitTorrent? • From simulations: seed bandwidth and even replication of blocks • Missing: node join/leave dynamics, TCP effects, etc.

  30. Side-by-Side Live Experiments • Two clients, running on the same machine, starting at the same time, downloading the same • 13 experiments from Apr-May 2006 • File sizes: 700MB ~ 1.4GB • Network size: 1100 ~ 2100 peers • Duration: 10 hrs ~ 2 days

  31. Results from Experiments • Effective download rate: 10 ~ 30KB/s • Speed difference between the two peers: 3% ~ 82% • What made the slower peer slow?

  32. Suspicion #1: Slower Neighbors? • Calculate unweighted average of observed throughput at application level • R1: average from all neighbors • R2: average from neighbors uploading >250KB of data • R3: average from neighbors uploading >2.5MB of data • Low correlation between download-time ratio and neighbor-speed ratio • 0.57 for R1, 0.43 for R2, 0.47 for R3 • Faster neighbors correspond to slower downloads in 3 experiments
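
The three averages R1, R2 and R3 differ only in which neighbors they include. A minimal sketch, assuming each neighbor is summarized as a hypothetical `(bytes uploaded to us, observed rate)` pair and that KB and MB mean 1000 and 1,000,000 bytes:

```python
def avg_neighbor_rate(neighbors, min_bytes=None):
    """Unweighted mean of observed neighbor throughput.

    neighbors: list of (bytes_uploaded_to_us, observed_rate) pairs.
    min_bytes: if given, only neighbors that uploaded more than this count.
    """
    rates = [
        rate
        for sent, rate in neighbors
        if min_bytes is None or sent > min_bytes
    ]
    return sum(rates) / len(rates) if rates else 0.0

def r1(neighbors):
    return avg_neighbor_rate(neighbors)

def r2(neighbors):
    return avg_neighbor_rate(neighbors, 250_000)

def r3(neighbors):
    return avg_neighbor_rate(neighbors, 2_500_000)
```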

  33. Suspicion #2: Fewer Neighbors Uploading to the Peer? • Slot analysis: calculate download concurrency • Maximum number of neighbors: 35 • Neighbors come and go → align neighbors into 35 slots • Calculate time-average of number of concurrent slots with neighbors uploading • Upload concurrency varies from 7 to 11 • Explains one of the download-time/neighbor-speed reversal cases • But doesn’t explain the two others
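
The time-averaged concurrency can be computed without explicitly assigning neighbors to slots, because the time-average of the number of active uploaders equals the total uploading time divided by the length of the observation window. The sketch below relies on that identity; the interval format is an assumption, not the format used in the experiments.

```python
def average_upload_concurrency(intervals, start, end):
    """Time-average of the number of neighbors uploading to the peer.

    intervals: list of (t_begin, t_end) pairs, one per period during which
               some neighbor was uploading to the peer.
    start, end: bounds of the observation window (same time unit).
    """
    if end <= start:
        raise ValueError("observation window must have positive length")
    busy = 0.0
    for begin, finish in intervals:
        # Only the part of the interval inside the window counts.
        overlap = min(finish, end) - max(begin, start)
        if overlap > 0:
            busy += overlap
    return busy / (end - start)
```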

  34. “Close” Neighbors • 90% of data downloaded from 1-4% of neighbors • Let F(p) and G(p) be the number of neighbors that provide a fraction p of the data to peers F and G, then F(p) > G(p) → peer F is slower than G • This holds for p = 90%, 75%, and 50%
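
One way to compute F(p) is to sort neighbors by contribution and count how many are needed to reach the fraction p. Reading F(p) as the smallest such count is an assumption made for this sketch, as is the `bytes_by_neighbor` input format.

```python
def neighbors_for_fraction(bytes_by_neighbor, p):
    """Smallest number of neighbors that together supplied a fraction p of the data.

    bytes_by_neighbor: dict of neighbor id -> bytes downloaded from it.
    p: fraction between 0 and 1 (for example 0.9 for 90%).
    """
    total = sum(bytes_by_neighbor.values())
    if total == 0:
        return 0
    covered = 0
    # Take the largest contributors first.
    ranked = sorted(bytes_by_neighbor.values(), reverse=True)
    for count, amount in enumerate(ranked, start=1):
        covered += amount
        if covered >= p * total:
            return count
    return len(ranked)
```

Under the slide's observation, the peer whose count is larger at a given p is the slower of the two.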

  35. What makes a neighbor close? • Not related to speed, or order of connection to peer, or order of unchoking by peer

  36. Cost of Departure of a Close Neighbor • Departure cost: if one close neighbor leaves, calculate the time until the earliest next close neighbor • The average departure cost: 30 min → The tit-for-tat algorithm converges slowly
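
The departure cost can be sketched as a lookup over two event lists. The slide does not define the events precisely, so this example assumes that the trace has already been reduced to two hypothetical lists of timestamps: when a close neighbor left, and when a neighbor became close.

```python
import bisect

def departure_costs(departures, arrivals):
    """Time from each close-neighbor departure to the next close neighbor.

    departures: times at which a close neighbor left.
    arrivals:   times at which some neighbor became a close neighbor.
    Departures with no later arrival are skipped.
    """
    arrivals = sorted(arrivals)
    costs = []
    for left_at in departures:
        # First arrival strictly after the departure.
        idx = bisect.bisect_right(arrivals, left_at)
        if idx < len(arrivals):
            costs.append(arrivals[idx] - left_at)
    return costs
```

Averaging the returned list gives a figure comparable to the 30-minute average on the slide, provided the events are extracted the same way.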

  37. Why Do Close Neighbors Leave • Five possible reasons • A: Random disconnect • B: Finished downloading • C: Peer broke off the relationship • D: Neighbor broke off the relationship • Results: B is most common, followed by C/D, then A

  38. Conclusions • Content delivery time in BitTorrent is determined by: • Neighbor upload speed • Stability of neighbor relationship • Disruption of the pairing leads to long delivery time • Neighbors may leave due to random disconnection, completion of download, or finding faster neighbors

  39. Using CDN to Complement P2P • Use CDN nodes as high-speed, specially managed seeds • Seeds are called to help whenever a node loses a close neighbor

  40. Summary • A way to shape BitTorrent traffic • Predicting BitTorrent performance by monitoring close peer relationships

  41. Related Work • Many modeling studies of BitTorrent • Simulation studies • Measurements of real torrents

  42. Ongoing Work • Live experiments with biased neighbor selection • A k-regular graph algorithm with faster convergence • Prototype implementation of “P2P+CDN”
