1 / 14

Lecture 12 Distributed Hash Tables

Lecture 12 Distributed Hash Tables. CPE 401/601 Computer Network Systems. slides are modified from Jennifer Rexford. Hash Table. Name-value pairs (or key-value pairs) E.g,. “Mehmet Hadi Gunes” and mgunes@cse.unr.edu E.g., “http://cse.unr.edu/” and the Web page

gusty
Download Presentation

Lecture 12 Distributed Hash Tables

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 12Distributed Hash Tables CPE 401/601 Computer Network Systems slides are modified from Jennifer Rexford

  2. Hash Table • Name-value pairs (or key-value pairs) • E.g,. “Mehmet Hadi Gunes” and mgunes@cse.unr.edu • E.g., “http://cse.unr.edu/” and the Web page • E.g., “HitSong.mp3” and “12.78.183.2” • Hash table • Data structure that associates keys with values value lookup(key) key value

  3. Distributed Hash Table • Hash table spread over many nodes • Distributed over a wide area • Main design goals • Decentralization • no central coordinator • Scalability • efficient even with large # of nodes • Fault tolerance • tolerate nodes joining/leaving

  4. Distributed Hash Table • Two key design decisions • How do we map names on to nodes? • How do we route a request to that node?

  5. Hash Functions • Hashing • Transform the key into a number • And use the number to index an array • Example hash function • Hash(x) = x mod 101, mapping to 0, 1, …, 100 • Challenges • What if there are more than 101 nodes? Fewer? • Which nodes correspond to each hash value? • What if nodes come and go over time?

  6. Consistent Hashing • “view” = subset of hash buckets that are visible • For this conversation, “view” is O(n) neighbors • But don’t need strong consistency on views • Desired features • Balanced: in any one view, load is equal across buckets • Smoothness: little impact on hash bucket contents when buckets are added/removed • Spread: small set of hash buckets that may hold an object regardless of views • Load: across views, # objects assigned to hash bucket is small

  7. Consistent Hashing • Construction • Assign each of C hash buckets to random points on mod 2n circle; hash key size = n • Map object to random position on circle • Hash of object = closest clockwise bucket 0 14 Bucket 12 4 8 • Desired features • Balanced: No bucket responsible for large number of objects • Smoothness: Addition of bucket does not cause movement among existing buckets • Spread and load: Small set of buckets that lie near object • Similar to that later used in P2P Distributed Hash Tables (DHTs) • In DHTs, each node only has partial view of neighbors

  8. Consistent Hashing • Large, sparse identifier space (e.g., 128 bits) • Hash a set of keys x uniformly to large id space • Hash nodes to the id space as well 2128-1 0 1 Id space represented as a ring Hash(name)  object_id Hash(IP_address)  node_id

  9. Where to Store (Key, Value) Pair? • Mapping keys in a load-balanced way • Store the key at one or more nodes • Nodes with identifiers “close” to the key • where distance is measured in the id space • Advantages • Even distribution • Few changes as nodes come and go… Hash(name)  object_id Hash(IP_address)  node_id

  10. Joins and Leaves of Nodes • Maintain a circularly linked list around the ring • Every node has a predecessor and successor pred node succ

  11. Joins and Leaves of Nodes • When an existing node leaves • Node copies its <key, value> pairs to its predecessor • Predecessor points to node’s successor in the ring • When a node joins • Node does a lookup on its own id • And learns the node responsible for that id • This node becomes the new node’s successor • And the node can learn that node’s predecessor • which will become the new node’s predecessor

  12. Nodes Coming and Going • Small changes when nodes come and go • Only affects mapping of keys mapped to the node that comes or goes Hash(name)  object_id Hash(IP_address)  node_id

  13. How to Find the Nearest Node? • Need to find the closest node • To determine who should store (key, value) pair • To direct a future lookup(key) query to the node • Strawman solution: walk through linked list • Circular linked list of nodes in the ring • O(n) lookup time when n nodes in the ring • Alternative solution: • Jump further around ring • “Finger” table of additional overlay links

  14. Links in the Overlay Topology • Trade-off between # of hops vs. # of neighbors • E.g., log(n) for both, where n is the number of nodes • E.g., such as overlay links 1/2, 1/4 1/8, … around the ring • Each hop traverses at least half of the remaining distance 1/2 1/4 1/8

More Related