420 likes | 535 Views
Peer-Assisted Content Distribution Networks: Techniques and Challenges. Pei Cao Stanford University. Traditional Intra-Provider Content Distribution Networks. National Center. Regional Center. Branch. Users. Peer-to-Peer Content Distribution. National Center. Regional Center. Branch.
E N D
Peer-Assisted Content Distribution Networks: Techniques and Challenges Pei Cao Stanford University
Traditional Intra-Provider Content Distribution Networks National Center Regional Center . . . Branch . . . . . . Users . . . . . . . . . . . .
Peer-to-Peer Content Distribution National Center Regional Center . . . Branch . . . . . . Users . . . . . . . . . . . .
P2P vs CDN • P2P: • No infrastructure cost • Supply grows linearly with demand • Simple distributed, randomized algorithms • No QoS • CDN: • Initial infrastructure cost • Centralized scheduling algorithms • Network efficiency • Capable of supporting QoS
Combine P2P with CDN? • Use P2P to complement CDN • P2P reduces load on the CDN, covers areas where CDN is not installed • Must be able to control, or “shape”, P2P traffic • Use CDN to complement P2P • CDN steps in when peer-based distribution is falling short, enabling QoS • Must be able to detect when peers won’t meet the delivery time guarantee
Outline • Review of BitTorrent • Traffic-shaping BitTorrent: biased neighbor selection • QoS in BitTorrent: delivery time prediction
BitTorrent File Sharing Network Goal: replicate K chunks of data among N nodes • Form neighbor connection graph • Neighbors exchange data
BitTorrent: Neighbor Selection Tracker file.torrent Seed 1 Whole file 4 3 2 5 A
BitTorrent: Piece Replication Tracker file.torrent Seed 1 Whole file 3 5 A
BitTorrent: Piece Replication Algorithms • “Tit-for-tat” (choking/unchoking): • Each peer only uploads to 7 other peers at a time • 6 of these are chosen based on amount of data received from the neighbor in the last 20 seconds • The last one is chosen randomly, with a 75% bias toward newcomers • (Local) Rarest-first replication: • When peer 3 unchokes peer A, A selects which piece to download
Analysis of BitTorrent • Conclusion from modeling studies: BitTorrent is nearly optimal in idealized, homogeneous networks • Demonstrated by simulation studies • Confirmed by theoretical modeling studies • Intuition: in a random graph, Prob(Peer A’s content is a subset of Peer B’s) ≤ 50%
Random Neighbor Graph • Existing studies all assume random neighbor selection • BitTorrent no longer optimal if nodes in the same ISP only connect to each other • Random neighbor selection high cross-ISP traffic
Difficulty in Traffic-Shaping P2P Applications • ISPs: • Different links have different monetary costs • Prefer “clustering” of traffic • P2P Applications: • No knowledge of underlying ISP topology • Use randomized algorithms that don’t do well under clustering • Current solution: throttling users suffer
A Network-Friendly BitTorrent? • ISPs inform BitTorrent of its link preferences • Algorithm of BitTorrent is adjusted such that both users and ISPs benefit • Example: Biased Neighbor Selection • Works when cost function is transitive
Biased Neighbor Selection • Idea: of N neighbors, choose N-k from peers in the same ISP, and choose k randomly from peers outside the ISP ISP
Implementing Biased Neighbor Selection • By Tracker • Need ISP affiliations of peers • Peer to AS maps • Public IP address ranges from ISPs • Special “X-” HTTP header • By traffic shaping devices • Intercept “peer tracker” messages and manipulate responses • No need to change tracker or client
Evaluation Methodology • Event-driven simulator • Use actual client and tracker codes as much as possible • Calculate bandwidth contention, assume perfect fair-share from TCP • Network settings • 14 ISPs, each with 50 peers, 100Kb/s upload, 1Mb/s download • Seed node, 400Kb/s upload • Optional “university” nodes (1Mb/s upload) • Optional ISP bottleneck to other ISPs
Throttling: Cross-ISP Traffic Redundancy: Average # of times a data chunk enters the ISP
Importance of Rarest-First Replication • Random piece replication performs badly • Increases download time by 84% - 150% • Increase traffic redundancy from 3 to 14 • Biased neighbors + Rarest-First More uniform progress of peers
Presence of External High-Bandwidth Peers • Biased neighbor selection alone: • Average download time same as regular BitTorrent • Cross-ISP traffic increases as # of “university” peers increase • Result of tit-for-tat • Biased neighbor selection + Throttling: • Download time only increases by 12% • Most neighbors do not cross the bottleneck • Traffic redundancy (i.e. cross-ISP traffic) same as the scenario without “university” peers
Comparison with Simple Clustering • Gateway peer: only one peer connects to the peers outside the ISP, all other peers only connect to peers inside the ISP • Gateway peer must have high bandwidth • It is the “seed” for this ISP • Ends up benefiting peers in other ISPs
Combining Biased Neighbor Selection with Caches • Under random neighbor selection • bandwidth requirement of cache is high • Under biased neighbor selection • bandwidth needed from the cache is reduced by an order of magnitude
Conclusions • By choosing neighbors well, BitTorrent can achieve high peer performance without increasing ISP cost • Biased neighbor selection: choose initial set of neighbors well • Can be combined with throttling and caching BitTorrent’s algorithm can be shaped!
Motivation • Provide delivery time guarantee under P2P+CDN • What contributes to delivery time of a download via BitTorrent? • From simulations: seed bandwidth and even replication of blocks • Missing: node join/leave dynamics, TCP effects, etc.
Side-by-Side Live Experiments • Two clients, running on the same machine, starting at the same time, downloading the same • 13 experiments from Apr-May 2006 • File sizes: 700MB ~ 1.4GB • Network size: 1100 ~ 2100 peers • Duration: 10 hrs ~ 2 days
Results from Experiments • Effective download rate: 10 ~ 30KB/s • Speed difference between the two peers: 3% ~ 82% • What made the slower peer slow?
Suspicion #1: Slower Neighbors? • Calculate unweighted average of observed throughput at application level • R1: average from all neighbors • R2: average from neighbors uploading >250KB of data • R3: average from neighbors uploading >2.5MB of data • Low correlation between download-time ratio and neighbor-speed ratio • 0.57 for R1, 0.43 for R2, 0.47 for R3 • Faster neighbors corresponds to slower downloads in 3 experiments
Suspicion #2: Fewer Neighbors Uploading to the Peer? • Slot analysis: calculate download concurrency • Maximum number of neighbors: 35 • Neighbors come and go align neighbors into 35 slots • Calculate time-average of number of concurrent slots with neighbors uploading • Upload concurrency varies from 7 to 11 • Explains one of the download-time/neighbor-speed reversal case • But doesn’t explain the two others
“Close” Neighbors • 90% of data downloaded from 1-4% of neighbors • Let F(p) and G(p) be the number of neighbors that provides p of data to peers F and G, then F(p) > G(p) peer F is slower than G • This holds for p = 90%, 75%, and 50%
What makes a neighbor close? • Not related to speed, or order of connection to peer, or order of unchoking by peer
Cost of Departure of a Close Neighbor • Departure cost: if one close neighbor leaves, calculate the time until the earliest next close neighbor • The average departure cost: 30 min The convergence time of the tit-for-tat algorithm is slow
Why Do Close Neighbors Leave • Five possible reasons • A: Random disconnect • B: Finished downloading • C: Peer broke off the relationship • D: Neighbor broke off the relationship • Results: B is most common, followed by C/D, then A
Conclusions • Content delivery time in BitTorrent is determined by: • Neighbor upload speed • Stability of neighbor relationship • Disruption of the pairing leads to long delivery time • Neighbors may leave due to random disconnection, completion of download, or finding faster neighbors
Using CDN to Complement P2P • Use nodes CDN as high-speed specially managed seeds • Seeds are called to help whenever a node loses a close neighbor
Summary • A way to shape BitTorrent traffic • Predicting BitTorrent performance by monitoring close peer relationship
Related Work • Many modeling studies of BitTorrent • Simulation studies • Measurements of real torrents
Ongoing Work • Live experiments with biased neighbor selections • A k-regular graph algorithm with faster convergence • Prototype implementation of “P2P+CDN”