
Measurements, Analysis, and Modeling of BitTorrent-like Systems

Lei Guo (1), Songqing Chen (2), Zhen Xiao (3), Enhua Tan (1), Xiaoning Ding (1), and Xiaodong Zhang (1). (1) College of William and Mary, (2) George Mason University, (3) AT&T Labs - Research.


Presentation Transcript


  1. Measurements, Analysis, and Modeling of BitTorrent-like Systems • Lei Guo (1), Songqing Chen (2), Zhen Xiao (3), Enhua Tan (1), Xiaoning Ding (1), and Xiaodong Zhang (1) • (1) College of William and Mary, (2) George Mason University, (3) AT&T Labs - Research

  2. Basic Model of P2P Systems • Peers sharing different files self-organize into a P2P network • Exchange files they desire • Limitations • Free riding • Large file downloading • Examples: Gnutella, KaZaa, eDonkey/eMule/Overnet

  3. BitTorrent: Fast Delivery with Incentive • A large file is divided into chunks • Peers interested in the same file self-organize into a torrent • Peers exchange file chunks with each other • Incentive is established by tit for tat • Very simple and effective, scales fairly well during flash crowds [Figure: Torrent of Bits]

  4. BitTorrent Traffic • Online users • 6.8 million in August 2004, 9.6 million in August 2005 (BigChampagne) • Traffic volume • 53% of all P2P traffic on the Internet in June 2004 (CacheLogic) P2P traffic: 60-80% Other traffic: 20-30% Source: CacheLogic, 2004

  5. Limited Understanding of BitTorrent • Existing studies on BitTorrent systems (INFOCOM04, SIGCOMM04) • Unrealistic assumptions in system model: no evolution considered • Single-torrent based, although more than 85% of BT users join multiple torrents • What is still unclear about BitTorrent systems • Service availability • Service stability • Service fairness • Our objective in this work • Evolution of the single-torrent system, and limitations of BT • Multi-torrent model for inter-torrent relation and collaboration during the entire lifetime

  6. Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • Modeling and characterization of multi-torrent system • Inter-torrent collaboration • Conclusion

  7. How BitTorrent Works: Publishing • The publisher • Create a meta file • Publish on a Web site • Start the tracker site • Start a BT client as the initial seed • Meta file (foo.torrent) fields • announce: tracker URL for bootstrap • creation date: epoch time of file creation • length: file size • name: file name • piece length: chunk size • pieces: SHA1 hash key of each chunk [Figure: the initial seed announces itself (I am here!) to the tracker site, which keeps the peer list; foo.torrent is published on the Web site]
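The meta file fields listed on the publishing slide can be illustrated with a minimal Python sketch. It assumes the common single-file layout in which length, name, piece length, and pieces sit under an `info` key. The function name `build_metainfo`, the 256 KiB default chunk size, and the example tracker URL are illustrative assumptions, and the bencoding step that turns the dictionary into an actual .torrent file is omitted.

```python
import hashlib
import time

def build_metainfo(data: bytes, name: str, announce: str,
                   piece_length: int = 256 * 1024) -> dict:
    """Build a .torrent-style dictionary for a single file (sketch only)."""
    # One 20-byte SHA-1 digest per chunk, concatenated in chunk order.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": announce,               # tracker URL for bootstrap
        "creation date": int(time.time()),  # epoch time of file creation
        "info": {
            "length": len(data),            # file size in bytes
            "name": name,                   # file name
            "piece length": piece_length,   # chunk size in bytes
            "pieces": pieces,               # SHA-1 hash of each chunk
        },
    }

# Hypothetical example: a 600,000-byte file and a made-up tracker URL.
meta = build_metainfo(b"x" * 600_000, "foo.bin",
                      "http://tracker.example/announce")
num_chunks = len(meta["info"]["pieces"]) // 20
```

With the default chunk size, the 600,000-byte example file is split into three chunks, the last one shorter than the others.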

  8. How BitTorrent Works: Downloading • The downloader • Download the meta file • Start a BT client, connect to the tracker site • Get peer list from tracker • Get first chunk from other peers (seeds) [Figure: the downloader fetches foo.torrent from the Web site, announces itself (I am here!) to the tracker site, receives the peer list, and downloads from the seed]

  9. How BitTorrent Works: Downloading • The downloader • Download the meta file • Start a BT client, connect to the tracker site • Get peer list from tracker • Get first chunk from other peers (seeds) • Exchange file chunks with other peers • Download complete: become a new seed [Figure: seed and downloaders holding foo.torrent, with the Web site and the tracker site's peer list]

  10. How BitTorrent Works: Downloading • The downloader • Download the meta file • Start a BT client, connect to the tracker site • Get peer list from tracker • Get first chunk from other peers (seeds) • Exchange file chunks with other peers • Download complete: become a new seed • Initial seed leaves • Future performance depends on the arrival and departure of new downloaders and seeds [Figure: the initial seed leaves the peer list while a downloader that has finished stays on as a new seed]
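Because chunks come from arbitrary peers, a downloader can check each received chunk against the SHA1 hashes carried in the meta file. The sketch below shows that check under the assumption that `pieces` is the concatenation of 20-byte SHA-1 digests in chunk order; the function name `chunk_is_valid` and the sample data are hypothetical.

```python
import hashlib

SHA1_LEN = 20  # a SHA-1 digest is 20 bytes long

def chunk_is_valid(chunk: bytes, index: int, pieces: bytes) -> bool:
    """Compare the SHA-1 of a received chunk with the digest stored for it."""
    expected = pieces[index * SHA1_LEN:(index + 1) * SHA1_LEN]
    return hashlib.sha1(chunk).digest() == expected

# Hypothetical two-chunk file.
chunks = [b"first chunk", b"second chunk"]
pieces = b"".join(hashlib.sha1(c).digest() for c in chunks)

good = chunk_is_valid(b"second chunk", 1, pieces)
bad = chunk_is_valid(b"corrupted data", 1, pieces)
```

A chunk that fails the check would simply be discarded and requested again from another peer.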

  11. Our Methodology of this Study • Measurement • BitTorrent traffic pattern • Meta file downloading and tracker statistics • Analysis • BitTorrent user behavior and performance limitations • Curve fitting, parameter estimation and validation of mathematical models • Modeling • Torrent evolution and inter-torrent relation • Fluid model, probability model, and graph model

  12. Meta File Downloading • The first HTTP packets of .torrent file downloading • Cable network: 3,000+ downloads, 1,000+ torrent meta files • Server farm: 50 tracker sites host hundreds of torrents • Gigascope: fast Internet traffic monitoring tool by AT&T • What information does it contain? • Torrent birth time • Peer arrival time to the torrent (packet capture time of downloading) • About 10 days • Meta file (foo.torrent) fields • announce: tracker URL • creation date: epoch time of file creation • length: file size • name: file name • piece length: chunk size • pieces: SHA1 hash key of each chunk

  13. Torrent Statistics on Trackers • Professional/dedicated tracker sites • Each may host thousands of torrents at the same time • http://www.alluvion.org/ and http://www.crapness.com/, collected by University of Massachusetts, Amherst • Ex: alluvion -- 1,500 torrents, 550 are fully traced • What information does it contain? • Torrents: torrent birth time, file size, number of peers/seeds • Peers: request time, downloading/uploading bytes, downloading/uploading bandwidth • Sampled every half hour for 48 days

  14. Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • The evolution of torrent over time • Limitations of current BitTorrent systems • Modeling and characterization of multi-torrent system • Inter-torrent collaboration • Conclusion

  15. Torrent Popularity • Peer arrivals: decrease with time exponentially • Peer arrival rate: derivative of CCDF • Relative deviation: 6% on average [Figures: CCDF of peer arrival vs. time after torrent birth (day), raw data and linear fit, for the meta file workload and the tracker site workload; relative deviation (%) for individual torrents, ranked by population in non-ascending order]
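An exponentially decreasing arrival rate appears as a straight line on a logarithmic axis, which is why a linear fit is enough to estimate it. The sketch below assumes the form lam(t) = lam0 * exp(-t / tau); the symbol names `lam0` and `tau`, the helper `fit_exponential_decay`, and the synthetic data are assumptions for illustration, not values from the traces.

```python
import math

def fit_exponential_decay(times, rates):
    """Least-squares fit of ln(rate) = ln(lam0) - t / tau.

    Returns (lam0, tau). All rates must be strictly positive.
    """
    logs = [math.log(r) for r in rates]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx                    # equals -1 / tau
    intercept = mean_y - slope * mean_t  # equals ln(lam0)
    return math.exp(intercept), -1.0 / slope

# Synthetic example: 100 arrivals/day at birth, decaying with tau = 2 days.
days = [0, 1, 2, 3, 4, 5]
rates = [100.0 * math.exp(-d / 2.0) for d in days]
lam0, tau = fit_exponential_decay(days, rates)
```

On noise-free synthetic data the fit recovers the generating parameters; on a real trace the residuals would indicate how well the exponential assumption holds.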

  16. Torrent Death • Quantities used: peer arrival rate and inter-arrival time; seed leaving rate and seed service time; downloading rate and downloading time • Peer n arrives at time tn, peer n+1 at time tn+1 • When tn → ∞, what will happen? • Inter-arrival time > seed service time: torrent dead
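The death condition on this slide can be turned into a rough lifespan estimate. The sketch below assumes an arrival rate lam(t) = lam0 * exp(-t / tau), a mean inter-arrival time of 1 / lam(t), and a mean seed service time of 1 / gamma, where gamma is the seed leaving rate. Under those assumptions the inter-arrival time first exceeds the seed service time at t = tau * ln(lam0 / gamma). The symbol names and the numbers are illustrative, not taken from the slides.

```python
import math

def torrent_lifespan(lam0: float, tau: float, gamma: float) -> float:
    """Time at which mean inter-arrival time equals mean seed service time.

    Assumes lam(t) = lam0 * exp(-t / tau) and seed service time 1 / gamma.
    """
    if lam0 <= gamma:
        # Arrivals are already too sparse at birth for seeds to overlap.
        return 0.0
    return tau * math.log(lam0 / gamma)

# Hypothetical torrent: 10 arrivals/hour at birth, tau = 50 hours,
# seeds leave at rate 0.5/hour (mean service time 2 hours).
lam0, tau, gamma = 10.0, 50.0, 0.5
t_life = torrent_lifespan(lam0, tau, gamma)
inter_arrival_at_death = 1.0 / (lam0 * math.exp(-t_life / tau))
```

In this example the estimate is roughly 150 hours, at which point the mean inter-arrival time has grown to the 2-hour mean seed service time.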

  17. Torrent Population and Lifespan • Most torrents are small (avg 102) • Most torrents are short-lived (avg 8 days) [Figures: torrent population vs. rank of torrents, and torrent lifespan (hour) per torrent; trace vs. model]

  18. Downloading Failure Ratio • Define the downloading failure ratio Rfail • Avg downloading failure ratio about 10% • Different evolution patterns • Small population → large Rfail • Reminder: most torrents have small population! • Altruistic peers make torrents long-lived [Figure: torrent population and downloading failure ratio per torrent, torrents ranked in non-ascending order of downloading failure ratio]

  19. Torrent Evolution: Fluid Model • Existing model (SIGCOMM 04) • Constant arrival rate: λ = const • Torrent reaches equilibrium • The correct model • Exponentially decreasing arrival rate • Torrent eventually dies • Verified by our measurements • Two completely different pictures
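The contrast between the two pictures can be reproduced with a small numerical sketch. It assumes a simplified fluid model in the style of the SIGCOMM 04 work, with x downloaders and y seeds, dx/dt = lam(t) - f and dy/dt = f - gamma * y, where f = min(c * x, mu * (eta * x + y)) is the completion rate. The equations, the omission of downloader aborts, and every parameter value below are assumptions made for illustration; they are not the fitted model from the talk.

```python
import math

def simulate(lam0, tau, mu, eta, gamma, c, hours, dt=0.01, decaying=True):
    """Euler integration of a simplified downloader/seed fluid model.

    x: number of downloaders, y: number of seeds.
    lam(t) = lam0 * exp(-t / tau) if decaying, else the constant lam0.
    """
    x = y = 0.0
    xs, ys = [], []
    for k in range(int(hours / dt)):
        t = k * dt
        lam = lam0 * math.exp(-t / tau) if decaying else lam0
        # Completion rate: limited by download or by upload capacity.
        finish = min(c * x, mu * (eta * x + y))
        x = max(x + dt * (lam - finish), 0.0)
        y = max(y + dt * (finish - gamma * y), 0.0)
        xs.append(x)
        ys.append(y)
    return xs, ys

# Hypothetical parameters (per hour).
params = dict(lam0=10.0, tau=20.0, mu=0.5, eta=0.5, gamma=0.2, c=1.0,
              hours=400)
xs_decay, ys_decay = simulate(decaying=True, **params)
xs_const, ys_const = simulate(decaying=False, **params)
```

With these numbers the constant-arrival run settles at an equilibrium of about 10 downloaders and 50 seeds, while the decaying-arrival run rises to a peak and then falls toward zero.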

  20. Torrent Evolution: Modeling Results • Flash crowd: downloader # and seed # increase exponentially • Peak time: a very short duration (constant arrival model: flat peak) • Attenuation, a long tail: downloader # and seed # decrease exponentially • Constant arrival model is far from reality: no attenuation • Torrent death [Figures: # of downloaders and # of seeds vs. time (hour); trace, model, and constant arrival model]

  21. Performance Stability • Evolution over time • Only stable when torrent is large • Fluctuates significantly after peak time • Snapshot of torrents at time t • Larger torrents have higher and more stable performance [Figures: # of peers (downloaders, seeds) and avg download speed (byte/sec) vs. time (hour), model and trace; avg download speed (byte/sec) per torrent in the snapshot]

  22. Service Unfairness • Contribution ratio: uploaded bytes / downloaded bytes • Unfairness: higher download speed, lower uploading contribution • Seeds serve high speed downloaders first • Peers not willing to serve after downloading • Not due to new file downloading: selfish [Figures: peer contribution ratio, download speed (bytes/sec), and # of torrents vs. ranked peers]
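The contribution ratio defined on this slide is straightforward to compute from per-peer byte counters such as those in the tracker statistics. The sketch below uses made-up peers and byte counts; the function name and the convention for peers that downloaded nothing are assumptions.

```python
def contribution_ratio(uploaded_bytes: float, downloaded_bytes: float) -> float:
    """Uploaded bytes divided by downloaded bytes for one peer."""
    if downloaded_bytes == 0:
        # Assumed convention: a pure uploader has an unbounded ratio.
        return float("inf") if uploaded_bytes > 0 else 0.0
    return uploaded_bytes / downloaded_bytes

# Hypothetical peers: (uploaded bytes, downloaded bytes).
peers = {
    "fast_peer": (200e6, 700e6),
    "slow_peer": (900e6, 700e6),
}
ratios = {name: contribution_ratio(up, down)
          for name, (up, down) in peers.items()}
# Rank peers from smallest to largest contribution.
ranked = sorted(ratios, key=ratios.get)
```

A ratio below 1 means the peer took more than it gave; ranking peers by this ratio gives the kind of ordering used on the horizontal axis of the slide's plots.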

  23. Single-torrent Model : Summary • Torrent evolution over time • Exponentially decreasing arrival rate • Flash crowd – short peak – long tailed attenuation • BitTorrent Limitations • Content availability: torrent death • Performance stability • Service fairness

  24. Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • Modeling and characterization of multi-torrent system • Traffic pattern and user behavior • Graph based model of inter-torrent relation • Inter-torrent collaboration • Conclusion

  25. Multi-torrent Environment Dynamics • Considering peers and torrents on the Internet as an open system • Torrent birth rate, torrent request rate, and peer birth rate are constant • avg # of torrents a peer requests = torrent request rate / peer birth rate = constant • Implication: • The lifecycle of a BT peer: downloading, seeding, sleeping, …, dead [Figures: CDF of torrents, CDF of requests, and CDF of peers vs. torrent birth time, request arrival time, and peer birth time (hour); raw data with linear or asymptotic fits]

  26. Peer Request Pattern: Request Rate • Peer request rate: requests by a peer to different torrents per unit time • Peer request process: seems Poisson-like • Request a new torrent with a probability p: participation probability • Dead with probability 1-p [Figure: # of torrents and r (day) per peer, over about 4,000 peers; annotation: 77 years!]

  27. Peer Request Pattern: Participation Probability • Probability model: peers request at least m torrents • p = 0.8551 • Another estimation of p • Probability model confirmed [Figure: number of torrents (m) vs. peer rank (log scale); raw data and linear fit]
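The participation model has a simple closed form if one assumes that every peer requests at least one torrent and then continues with probability p after each one. Under that assumption the number of torrents per peer is geometric: P(at least m) = p^(m-1), and the mean is 1 / (1 - p). The sketch below also shows one plausible reading of the second estimate of p, derived from the ratio of peer birth rate to torrent request rate; that reading, and the function names, are assumptions. The two rates are the ones reported in the summary slide of this deck.

```python
def prob_at_least(m: int, p: float) -> float:
    """P(a peer requests at least m torrents) under the geometric model."""
    return p ** (m - 1)

def expected_torrents(p: float) -> float:
    """Mean number of torrents requested per peer."""
    return 1.0 / (1.0 - p)

def p_from_rates(request_rate: float, peer_birth_rate: float) -> float:
    """Estimate p from: avg torrents per peer = request rate / birth rate."""
    return 1.0 - peer_birth_rate / request_rate

p_fit = 0.8551                            # value reported on the slide
mean_torrents = expected_torrents(p_fit)  # about 6.9 torrents per peer
p_rates = p_from_rates(133.39, 19.37)     # rates per hour from the summary
```

The two estimates agree to about two decimal places, which is consistent with the slide's statement that the probability model is confirmed.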

  28. Inter-torrent Relation Graph: How Can Torrents Help Each Other? • Edge 1 between torrents i and j: some peers in torrent i have downloaded j • Edge 2 between torrents i and j: some peers in torrent j have downloaded i

  29. Inter-torrent Relation Graph: How Can Torrents Help Each Other? • Edge 1 between torrents i and j: some peers in torrent i have downloaded j • Edge 2 between torrents i and j: some peers in torrent j have downloaded i • Edge weight Wi,j: number of such peers [Figures: weighted out-degree and weighted in-degree per torrent, trace vs. model, shown with torrent size (# of online peers)]
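The relation graph can be built directly from per-peer records. The sketch below assumes each record gives the torrent a peer is currently in and the set of torrents it downloaded earlier, and it directs the edge from the current torrent i to the previously downloaded torrent j, so that W[(i, j)] counts peers in i that hold j. The edge direction, the record format, and all names are assumptions for illustration.

```python
from collections import defaultdict

def build_relation_graph(records):
    """records: iterable of (current_torrent, previously_downloaded_set).

    Returns {(i, j): weight}, where weight is the number of peers that are
    currently in torrent i and have already downloaded torrent j.
    """
    weights = defaultdict(int)
    for current, downloaded in records:
        for other in downloaded:
            if other != current:
                weights[(current, other)] += 1
    return dict(weights)

def weighted_out_degree(weights, torrent):
    """Sum of weights on edges leaving the given torrent."""
    return sum(w for (i, _), w in weights.items() if i == torrent)

def weighted_in_degree(weights, torrent):
    """Sum of weights on edges entering the given torrent."""
    return sum(w for (_, j), w in weights.items() if j == torrent)

# Hypothetical records for four online peers.
records = [
    ("A", {"B", "C"}),
    ("A", {"B"}),
    ("B", {"A"}),
    ("C", set()),
]
graph = build_relation_graph(records)
```

In this toy example two peers in torrent A hold file B, so the edge from A to B has weight 2, and torrent A has a weighted out-degree of 3.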

  30. Single-torrent vs. Multi-torrent Model • Single-torrent model • Longer seed service time, lower download failure rate • Seed service time can only be extended by a limited amount, but inter-arrival time increases exponentially • Small improvement • Multi-torrent model • Old peers come back multiple times • Higher peer arrival rate, shorter peer inter-arrival time • Significant improvement

  31. Single-torrent vs. Multi-torrent Model • Single-torrent model: 0.1 • Single-torrent model with seeds staying 10 times longer: 0.01 • Multi-torrent model: 1×10⁻⁶ ≈ 0 • Inter-torrent collaboration is much more effective than stimulating seeds to serve longer

  32. Outline • BitTorrent mechanism and our methodology • Modeling and characterization of single-torrent system • Modeling and characterization of multi-torrent system • Inter-torrent collaboration • Tracker site overlay • Instant incentive for collaboration • Conclusion

  33. Tracker Site Overlay • Neighbor-in: torrents that can serve me • Neighbor-out: torrents that I can serve (peer list) • Self-organized P2P network (a logical structure) • An instance of the inter-torrent relation graph • A built-in mechanism for content search, covering 99%+ of torrents • Trackerless BitTorrent: uses DHT to store meta file [Figure: overlay of torrents A, B, C, and D]

  34. Incentive for Inter-Torrent Collaboration • Instant incentive, similar to the “tit-for-tat” principle • Neighboring cycle detection • Neighboring cycle construction • Bandwidth trading: get one chunk, serve multiple peers [Figure: example with peers Jack and Tom, file A and file D, and torrents A, B, C, and D]

  35. Conclusion • Extensive analysis and modeling to study the behaviors of BT-like systems • Tracker trace and .torrent downloading trace • Mathematical model • BitTorrent system has its limitations due to exponentially decreasing peer arrival rate • Service availability, performance stability, and fairness • Graph based multi-torrent model • System design for inter-torrent collaboration

  36. Thank you!

  37. Backup for Questions

  38. Torrent Lifespan • Extract t and t from trace • Get λ0 and τ using linear regression • Lifespan model verified by measurement [Figure: torrent lifespan (hour) per torrent, trace vs. model]

  39. Torrent Population • Total population • Model verified by measurement • Observations: • The population of most torrents is small (102 on average) • Downloading failure ratio • Small population → large Rfail [Figure: torrent population vs. rank of torrents (in non-ascending order of modeling results), trace vs. model]

  40. Torrent Evolution: Fluid Model • Basic equation set • Parameters • Resolution

  41. Peer Request Pattern: Summary • Multi-torrent environment: an open model • Torrent birth rate: 0.9454 per hour (nearly a constant) • Peer birth rate: 19.37 per hour (nearly a constant) • Torrent request rate (for all peers over all torrents): 133.39 per hour (nearly a constant) • Actually increases slowly according to BigChampagne • Peer request pattern • Lifecycle: downloading, seeding, sleeping, …, next request with prob. p • Peer participation probability: 0.85 • Request rate (for different torrents by a peer): Poisson-like

  42. Tracker Site Overlay • Table size • Node degree distribution • Similar to unstructured P2P networks • Many content search and msg routing algorithms • Flooding • Random walk • … • Trackerless BitTorrent: uses DHT to store meta file

  43. Simulation Experiments • Comparison: without inter-collaboration vs. with inter-collaboration • Content availability (downloading failure ratio): Rfail → 0 • Performance stability (downloading speed): more stable • Service fairness (contribution ratio): more balanced • Inter-torrent collaboration can improve BitTorrent performance
