Performance in Decentralized Filesharing Networks

Theodore Hong, Freenet Project

Styles of collaboration • Centralized model • e.g. Napster • global index held by central authority (single point of failure) • direct contact between requestors and providers • Decentralized model • e.g. Freenet, Gnutella • no global index – local knowledge only (approximate answers) • contact mediated by chain of intermediaries

Key questions • Does it work? • can we find the data? • query success rates • length of query paths • Does it scale? • logarithmic / linear / polynomial • Is it robust? • participants are unreliable • different failure modes possible

An abstract model • Can model the network as a graph:

Querying the network • Answering a query means finding a path • source = requestor • destination = provider • A distributed search problem! • approximate global solution using local knowledge • same problem as IP routing
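As a point of reference for the search problem above, here is a minimal sketch of the graph model with a breadth-first search from requestor to provider. The five-node `network` and the function name `shortest_path` are hypothetical. The search uses global knowledge of the graph, so it gives the ideal path that a decentralized search can only approximate from local knowledge.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search; returns the node list from source to target, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical network: each key is a node, each value lists the peers it knows.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

path = shortest_path(network, "A", "E")  # requestor A, provider E
hops = len(path) - 1                     # query path length in hops
```

The hop count of this path is the lower bound against which the query path lengths of a local-knowledge algorithm can be compared.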

The Freenet algorithm • Graph structure actively evolves over time • new links form between nodes • files migrate through network • adaptive routing
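The three effects on this slide can be illustrated with a toy model, loosely following the routing description in the Clarke et al. paper listed under "For more information". This is a simplified sketch, not Freenet's implementation: integer keys, absolute difference as the closeness measure, and the names `Node`, `routes`, `store`, and `request` are all assumptions made for illustration.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}   # key -> data held locally
        self.routes = {}  # key -> neighbouring Node associated with that key

def request(node, key, hops_to_live, visited=None):
    """Depth-first search steered by key closeness.

    Returns (data, holder) on success or (None, None) on failure.
    """
    if visited is None:
        visited = set()
    visited.add(node.name)
    if key in node.store:
        return node.store[key], node
    if hops_to_live == 0:
        return None, None
    # Try neighbours in order of how close their associated key is to the target.
    candidates = sorted(node.routes.items(), key=lambda item: abs(item[0] - key))
    for _, neighbour in candidates:
        if neighbour.name in visited:
            continue
        data, holder = request(neighbour, key, hops_to_live - 1, visited)
        if data is not None:
            node.store[key] = data      # the file migrates toward the requestor
            node.routes[key] = holder   # a new link forms to the node that held it
            return data, holder
    return None, None

# Hypothetical three-node chain: a knows b, b knows c, c holds key 22.
a, b, c = Node("a"), Node("b"), Node("c")
a.routes[10] = b
b.routes[20] = c
c.store[22] = "file"

data, holder = request(a, 22, hops_to_live=5)
```

After the first successful request, `a` holds a copy of the file and a direct link to `c`, so later requests for nearby keys from `a` can take a shorter path. That feedback is what "adaptive routing" refers to in this sketch.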

Initial simulations • Ring topology, 1000 nodes:

Initial simulations (cont’d)

Why does it work? • The small-world model • Milgram: six degrees of separation • Watts: between order and randomness • short-distance clustering + long-distance shortcuts
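The "clustering plus shortcuts" idea can be checked numerically. The sketch below builds a ring lattice, measures its clustering coefficient and average path length, then adds a few random long-distance links and measures again. The sizes (100 nodes, 4 neighbours each, 10 shortcuts), the seed, and all function names are arbitrary choices for illustration and are not taken from the slides.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node links to its k nearest neighbours on a ring (k even)."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph

def add_shortcuts(graph, count, rng):
    """Add `count` random links between nodes that are not yet connected."""
    nodes = list(graph)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in graph[a]:
            graph[a].add(b)
            graph[b].add(a)
            added += 1

def clustering_coefficient(graph):
    """Average fraction of a node's neighbour pairs that are themselves linked."""
    total = 0.0
    for neighbours in graph.values():
        k = len(neighbours)
        if k < 2:
            continue
        links = sum(1 for a in neighbours for b in neighbours
                    if a < b and b in graph[a])
        total += links / (k * (k - 1) / 2)
    return total / len(graph)

def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS per node)."""
    total = pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(1)
lattice = ring_lattice(100, 4)
c_before = clustering_coefficient(lattice)
l_before = average_path_length(lattice)
add_shortcuts(lattice, 10, rng)
c_after = clustering_coefficient(lattice)
l_after = average_path_length(lattice)
```

Comparing `l_after` with `l_before` shows how much a handful of shortcuts shortens typical paths, while `c_after` shows how much of the local clustering survives.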

Links in the small world • “Scale-free” link distribution • $P(n) = 1/n^{k}$ • most nodes have only a few connections • some have a lot of links • $P(n) \sim 1/n^{1.5}$
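A short worked form of the power law may help in reading a link-distribution plot. Here $c$ is a normalization constant introduced for illustration, and $k = 1.5$ is the exponent shown on the slide.

```latex
\begin{align*}
P(n) &\sim n^{-k} \\
\log P(n) &= -k \log n + c \\
\frac{P(2n)}{P(n)} &= 2^{-k} = 2^{-1.5} \approx 0.354 \qquad (k = 1.5)
\end{align*}
```

The second line says the distribution appears as a straight line of slope $-k$ on log-log axes. The third says that, for $k = 1.5$, nodes with twice as many links are roughly a third as common.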

Small-world links (cont’d) • Real-world examples • movie actors (Kevin Bacon game) • world-wide web • nervous system of worm C. elegans

The importance of routing • Existence of short paths is not enough – they must be found • Adaptivity helps Freenet find good paths • Compare: a random-routing network
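To make the comparison concrete, the sketch below contrasts the shortest path with a query that is forwarded to a uniformly random neighbour at each hop. The 20-node ring, the seed, and the function names are hypothetical. This random walk stands in for "random routing" in general and is not a model of any particular network.

```python
import random
from collections import deque

def ring(n):
    """Simple ring: each node knows only its two immediate neighbours."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def shortest_hops(graph, source, target):
    """Hop count of the shortest path, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return None

def random_walk_hops(graph, source, target, rng, max_hops=100_000):
    """Forward the query to a random neighbour until the target is reached.

    Returns the number of hops taken, or None if max_hops is exhausted.
    """
    node = source
    for hops in range(max_hops + 1):
        if node == target:
            return hops
        node = rng.choice(graph[node])
    return None

rng = random.Random(0)
network = ring(20)
best = shortest_hops(network, 0, 10)
walks = [random_walk_hops(network, 0, 10, rng) for _ in range(200)]
finished = [w for w in walks if w is not None]
mean_walk = sum(finished) / len(finished) if finished else None
```

A random walk can never beat `best`, and comparing `mean_walk` with `best` gives a feel for how much path length is lost when short paths exist but nothing steers queries toward them.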

Scalability • Real-world networks are much larger • nearly 400,000 downloads of Freenet • 50 million Napster users • How well does Freenet scale?

Fault-tolerance • Unreliability is normal in peer-to-peer • Two types of failure: • random failure • targeted attack
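The two failure types can be compared on a toy network by removing nodes and measuring the largest group of survivors that can still reach one another. The two-hub `graph`, the removal count, the seed, and the function names below are hypothetical. The targeted attack is assumed to remove the best-connected nodes first, using their degrees in the intact network.

```python
import random

def largest_component(graph, removed):
    """Size of the largest connected group among the surviving nodes."""
    seen = set(removed)
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in graph[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

def random_failure(graph, count, rng):
    """Pick `count` nodes uniformly at random to fail."""
    return set(rng.sample(sorted(graph), count))

def targeted_attack(graph, count):
    """Pick the `count` nodes with the most links (degrees not recomputed)."""
    by_degree = sorted(graph, key=lambda n: len(graph[n]), reverse=True)
    return set(by_degree[:count])

# Hypothetical network: hubs 0 and 1 are linked, and each serves ten leaf nodes.
graph = {0: {1}, 1: {0}}
for leaf in range(2, 12):
    graph[leaf] = {0}
    graph[0].add(leaf)
for leaf in range(12, 22):
    graph[leaf] = {1}
    graph[1].add(leaf)

rng = random.Random(0)
intact = largest_component(graph, set())
after_random = largest_component(graph, random_failure(graph, 2, rng))
after_attack = largest_component(graph, targeted_attack(graph, 2))
```

In this example, removing the two hubs isolates every leaf, whereas two random failures are most likely to hit leaves and leave the rest connected. A highly skewed link distribution makes a network tolerant of random failure but exposed to targeted attack in exactly this way.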

Random failure

Targeted attack

To do • Variable disk/bandwidth capacity • if you build it, will they come? • Participants leaving and re-entering • File lifetimes • “lifetime” is relative • relationship between ease of retrieval and popularity, size • impact of splitting and combining

Conclusions • Local approximations can be good enough • Small-world model provides useful framework • Metrics to consider: • query pathlength • clustering coefficient • link distribution • Issues to consider: • scalability • fault tolerance under various scenarios

For more information • “Performance” chapter in Peer-to-Peer • I. Clarke, O. Sandberg, B. Wiley, T.W. Hong, “Freenet: a distributed anonymous information storage and retrieval system,” in Workshop on Design Issues in Anonymity and Unobservability, ed. by H. Federrath. Springer: New York (2001)
