170 likes | 442 Views
Scalable Content-Addressable Networks. Prepared by Kuhan Paramsothy March 5, 2007. High-Level Overview. Hash tables (map keys to values) are heavily used in building software applications
E N D
Scalable Content-Addressable Networks Prepared by Kuhan Paramsothy March 5, 2007
High-Level Overview • Hash tables (map keys to values) are heavily used in building software applications • The concept of a Content-Addressable Network (CAN) provides hash table-like functionality on Internet-like scales. • CAN is: • Scalable • Robust/Fault-tolerant • Self-organizing • Low-latency ECE 1770 – Content-Addressable Networks
Hash Tables and CAN • A data structure that efficiently maps keys onto values • CANs are a form of distributed, Internet-scale hash tables. ECE 1770 – Content-Addressable Networks
What CAN would do for us • CAN would improve peer-to-peer systems • Napster: the process of locating a file is centralized • Expensive to scale the central repository, single point of failure • Gnutella: decentralized the file location process (network self-organizes into an application layer mesh) • Requests for files are done through flooding, not scalable, may not find content • Conclusion: P2P systems need a scalable indexing mechanism • CAN would improve large data repositories • These systems need efficient insertion and retrieval • CAN would create large-scale name resolution services that don’t use a naming scheme (ie. Not DNS) • No more location-dependent naming schemes ECE 1770 – Content-Addressable Networks
Basic Operations Performed On CANs • Basic Operations • Insertion (of key,value pairs) • Lookup (of key,value pairs) • Deletion (of key,value pairs) • Each CAN stores • A piece (called a zone) of the entire hash table • Holds information about a small number of adjacent zones in the table • Routing in a CAN • Done by intermediate CAN nodes towards the CAN node whose zone contains that key • CAN Design is • Distributed (requires no centralized control or coordination) • Scalable (nodes hold only a small about of information that doesn’t grow with the network) • Fault-tolerant (nodes can route around failures) • Doesn’t require a naming hierarchy • Is entirely Application Layer ECE 1770 – Content-Addressable Networks
CAN Design • Centers around a virtual d-dimensional Cartesian coordinate space on a d-torus • At any time, the entire coordinate space is dynamically partitioned among all the nodes in the system • Each node owns a distinct zone ECE 1770 – Content-Addressable Networks
CAN Design (2) • To store a pair, key K1 is mapped to P via a uniform hash function • The pair is then stored at the node that owns the zone where P lies • To retrieve an entry corresponding to K1, any node can apply the same hash function to map K1 to P and get the corresponding value • A node learns and maintains the IP addresses of those nodes that hold adjoining coordinate zones Efficient routing is critical to a useful CAN ECE 1770 – Content-Addressable Networks
Routing in a CAN • Routing in a Content Addressable Networks works by following the straight line path through the Cartesian space from source to destination coordinates. • A CAN node maintains a coordinate routing table that holds the IP address and virtual coordinate zone of each of its immediate neighbors in the coordinate space. Average Path Length = (d/4)(n1/d) Individual Nodes Have 2d Neighbors Average Path Length Grows As O(n1/d) ECE 1770 – Content-Addressable Networks
Construction of a CAN Overlay • The entire CAN space is divided amongst the nodes currently in the system • Incremental construction process takes three steps • The new node finds a node already in the CAN • Using the CAN routing mechanisms, finds a node whose zone will be split • The neighbors of the split zone must be notified so that routing can include the new node • Bootstrapping: There are CAN bootstrap nodes associated to a DNS domain name Node Insertion Affects Only O(number of dimensions) existing nodes ECE 1770 – Content-Addressable Networks
Maintenance of a CAN Overlay • Node Graceful Departure: node explicitly hands over its zone and the associated (key,value) database to one of its neighbors • Node Abrupt Disappearance: An immediate takeover algorithm ensures one of the “failed” node’s neighbors takes over the zone • Under normal conditions, a node sends periodic update messages to each of its neighbors and a list of neighbors and their zone coordinates. • Prolonged absence of an update message from a neighbor signals it’s failure ECE 1770 – Content-Addressable Networks
Design Improvements • Basic CAN algorithm provides • Low per-node state (O(d) for a d-dimensional space) • Short path lengths (O(dn1/d) hops for d dimensions and n nodes) • The problem is that there are application-layer hops, not IP-layer hops • Latency of each hop might be substantial ECE 1770 – Content-Addressable Networks
Design Improvements (2) • Improvement: Multi-dimensioned Coordinate Spaces • Increasing the dimensions of the CAN coordinate space reduces the routing path length and path latency for a small increase in the size of the coordinate routing table • Path Length scales as O(d(n1/d)) • Fault-tolerance improves • Improvement: Multiple Coordinate Spaces (a.k.a. Multiple Realities) • Maintain multiple independent coordinate spaces with each node in the system being assigned a different zone in the coordinate space (each coordinate space is a reality) • Fault-tolerance improves • Low per-node state (O(d) for a d-dimensional space) • Short path lengths (O(dn1/d) hops for d dimensions and n nodes) • Which is better? • Increasing the dimensions ECE 1770 – Content-Addressable Networks
Design Improvements (3) • Improvement: Better CAN Routing Metrics • Have each node measure the network-level round-trip-time RTT to each of its neighbors. Then route messages accordingly. • Favors lower latency paths and avoids unnecessarily long hops • Improvement: Caching and Replication • A CAN node can maintain a cache of the data keys it recently accessed • A CAN node can replicate the data key at each of its neighboring nodes • Both schemes need an associated time-to-live field, to eventually expire from the cache ECE 1770 – Content-Addressable Networks
Related Systems • Domain Name System • CANs are more general than the DNS because DNS closely ties the naming scheme to the manner in which a name is resolved to an IP address • Peer-to-Peer • A simple example is keys being analogous to a URL • Will improve robustness • Key difference is that content within the CAN can always be located by any other node because there is a clear “home” (point) in the CAN for that content and every other node knows what the home is how to reach it ECE 1770 – Content-Addressable Networks
Discussion • Security? • Better or worse with CAN? • Any Other Design Improvement? • Is The Communication Overhead Significant? ECE 1770 – Content-Addressable Networks
References • A Scalable Content-Addressable Network, Ratnasamy, University of California – Berkeley, http://www.sigcomm.org/sigcomm2001/p13-ratnasamy.pdf ECE 1770 – Content-Addressable Networks