
Increasing the Reliability of DHTs using MultiRouting

James Newell. CS598ig Scattered Systems. Jan 28, 2005.


Presentation Transcript


  1. Increasing the Reliability of DHTs using MultiRouting • James Newell • CS598ig Scattered Systems • Jan 28, 2005

  2. Distributed Hash Tables • A DHT's ID space spans many nodes • Stores objects on nodes (files, addresses, caches) • Routes a key to the correct node storing a replica (Figure: ID space partitioned across nodes into the ranges 0x00 – 0x1F, 0x20 – 0x3F, 0x40 – 0x5F, 0x60 – 0x7F)
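
The key-to-node mapping on this slide can be sketched as follows. This is only an illustration: the node names, the use of SHA-1, and the 7-bit truncation are assumptions, and only the four ID ranges come from the slide.

```python
import hashlib

# The four ID ranges from the slide; the node names are hypothetical.
RANGES = [
    (0x00, 0x1F, "node-A"),
    (0x20, 0x3F, "node-B"),
    (0x40, 0x5F, "node-C"),
    (0x60, 0x7F, "node-D"),
]

def key_to_id(key: str) -> int:
    """Hash a key into the 0x00-0x7F ID space (SHA-1 is an assumption)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return digest[0] & 0x7F  # keep the low 7 bits of the first byte

def owner_of(object_id: int) -> str:
    """Return the node whose ID range contains object_id."""
    for low, high, node in RANGES:
        if low <= object_id <= high:
            return node
    raise ValueError("ID outside the 0x00-0x7F space: %#x" % object_id)
```

A lookup for a key would then be routed to `owner_of(key_to_id(key))`; a real DHT reaches that node over several overlay hops rather than through a local table.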

  3. Overcoming Churn • Routing tables are susceptible to churn, node and link failures • Standard techniques to mitigate problem • Retrying alternate routes • Adding additional replicas • Result: High complexity with mediocre improvement • Constrained to underlying ID space • Routing infrastructure is difficult to modify

  4. MultiRouting • Build an additional layer on top of multiple DHT substrates • Independent replica placement • Differing routing behavior • Predict which substrate will most likely succeed • Adaptive properties to increase availability during transient failures • Increased lookup performance by exploiting opposing DHT strengths

  5. Design Description • User application interfaces with MultiRouter API • Multiple underlying DHTs • Transparent to application • Run concurrently and independently • Customizable DHT combinations • “Plug-and-Play” ensures compatibility with traditional networks (Figure: layered stack, top to bottom: User App, MultiRouter, DHT 1 / DHT 2 / DHT 3, Physical Layer)

  6. MultiRouter User Operations • Simple user operations • Join network: the MultiRouter simultaneously joins all networks; once the join is finished, it can handle inserts and queries • Insert object key: the MultiRouter replicates object keys on all DHTs to improve availability • Lookup object key: the MultiRouter uses the DHT(s) that it predicts will perform the best • Remove object key
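
The four user operations can be sketched as a thin layer over several substrates. This is a minimal sketch, not the prototype's API: `DictDHT` is a hypothetical in-memory stand-in for a real DHT, `predict` is a hypothetical callback that picks which DHT(s) to query, and removing from every DHT is an assumption (the slide only says inserts go to all DHTs).

```python
class DictDHT:
    """Hypothetical stand-in substrate; a real DHT would route over the network."""

    def __init__(self, name):
        self.name = name
        self.joined = False
        self.store = {}

    def join(self):
        self.joined = True

    def insert(self, key, address):
        self.store[key] = address

    def lookup(self, key):
        return self.store.get(key)

    def remove(self, key):
        self.store.pop(key, None)


class MultiRouter:
    def __init__(self, dhts, predict):
        self.dhts = list(dhts)
        self.predict = predict  # hypothetical: predict(key, dhts) -> DHTs to query

    def join(self):
        # Join every underlying network.
        for dht in self.dhts:
            dht.join()

    def insert(self, key, address):
        # Replicate the object key on all DHTs to improve availability.
        for dht in self.dhts:
            dht.insert(key, address)

    def lookup(self, key):
        # Query only the DHT(s) predicted to perform best.
        for dht in self.predict(key, self.dhts):
            address = dht.lookup(key)
            if address is not None:
                return address
        return None

    def remove(self, key):
        # Assumption: removal mirrors insertion and touches every DHT.
        for dht in self.dhts:
            dht.remove(key)
```

For example, `MultiRouter([DictDHT("pastry"), DictDHT("kelips")], lambda key, dhts: dhts[:1])` always queries the first substrate; a real predictor would consult the StatsTable.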

  7. MultiRouter Lookups • How does the MultiRouter decide which DHT(s) to use? • Previous object metrics are maintained in a StatsTable • Stats are entered into a cost function $C_{DHT}(x)$ • A set of rules R interprets the results of the cost functions • Each rule returns a Bool; together the rules indicate the set of DHT(s) that should be included

  8. Formal Notation • Given an array of statistical data $d_i$ • M is a set of metric functions $(m_1, m_2, \dots, m_n) \in M$ where $m_j(d_i, t) \to \mathbb{R}$ • The cost function for $DHT_i$ is $C_i = w_1 (m_1) + w_2 (m_2) + \dots + w_n (m_n)$, where each $w_j$ is the weight of metric $m_j$ • R is the set of rules s.t. $R_i(C, C_i) \to \text{Bool}$, where C is the set of all cost functions • The union of all rules is the final set of DHT(s)
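
The notation above can be sketched in code as a weighted sum per DHT followed by a union over rules. Everything concrete here is an assumption made for illustration: the two example rules, the 10% margin, the DHT names used as dictionary keys, and the convention that a lower cost is better.

```python
def cost(metric_values, weights):
    """Weighted sum of metric values: C_i = w_1*m_1 + ... + w_n*m_n."""
    return sum(w * m for w, m in zip(weights, metric_values))

def lowest_cost_rule(costs, name):
    """Hypothetical rule: include the DHT with the minimum cost."""
    return costs[name] == min(costs.values())

def within_margin_rule(costs, name, margin=0.1):
    """Hypothetical rule: include any DHT within 10% of the minimum cost."""
    return costs[name] <= min(costs.values()) * (1 + margin)

def select_dhts(costs, rules):
    """Union of all rules: a DHT is selected if any rule returns True for it.

    costs maps a DHT name to its cost; each rule sees all costs plus one name.
    """
    return {name for name in costs if any(rule(costs, name) for rule in rules)}
```

With costs `{"a": 1.0, "b": 1.05, "c": 2.0}` and both rules, the selected set contains `a` and `b`, so the lookup would be sent to two substrates.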

  9. MultiRouter StatsTable • Information stored in the StatsTable is customizable to the application • Metrics are updated upon a query response or timeout • StatsTable is initially cold but can optionally be warmed up by a neighbor (Table: Example Metrics)
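
One possible shape for such a table is sketched below. The field names, the per-(object key, DHT) indexing, and the copy-based warm-up are all assumptions; the slide states only that metrics are updated on a query result or timeout and that a neighbor can warm up a cold table.

```python
from dataclasses import dataclass, replace

@dataclass
class Stats:
    # Hypothetical fields; the slide says the stored information is customizable.
    latency_ms: float = 0.0
    successes: int = 0
    timeouts: int = 0

class StatsTable:
    def __init__(self):
        self.entries = {}  # (object_key, dht_name) -> Stats

    def _entry(self, object_key, dht_name):
        return self.entries.setdefault((object_key, dht_name), Stats())

    def record_response(self, object_key, dht_name, latency_ms):
        """Update metrics when a query is answered."""
        entry = self._entry(object_key, dht_name)
        entry.successes += 1
        entry.latency_ms = latency_ms

    def record_timeout(self, object_key, dht_name):
        """Update metrics when a query times out."""
        self._entry(object_key, dht_name).timeouts += 1

    def warm_up(self, neighbor):
        """Copy a neighbor's entries for pairs this table has not seen yet."""
        for pair, stats in neighbor.entries.items():
            self.entries.setdefault(pair, replace(stats))
```

Copying entries rather than sharing them keeps later updates local, so a warmed-up node does not alter its neighbor's statistics.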

  10. Implementation • Fully developed MultiRouter prototype • Java 1.4.2 • Link-layer discrete-event simulator • FreePastry derivation (including filetuple storage system) • Kelips derivation • Simple MultiRouter overlay • Objects are simple filetuples <key, address>

  11. Network Simulator • Discrete-event, link-layer simulator • Nodes pass full messages after a certain delay • Time measured in discrete rounds (10 msec) • Routing uses an ideal shortest-path algorithm (no routing failures or queuing effects) • Transient and permanent node and link failures • Uses GT-ITM generated transit-stub topologies and an event file
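
The round-based, discrete-event style described here can be sketched with a priority queue of pending deliveries. This is a generic illustration, not the author's simulator: the class and method names are hypothetical, and only the 10 msec round length comes from the slide.

```python
import heapq

ROUND_MS = 10  # one simulation round, as stated on the slide

class Simulator:
    def __init__(self):
        self.round = 0
        self._queue = []  # heap of (due_round, sequence, src, dst, payload)
        self._seq = 0     # tie-breaker that preserves send order within a round
        self.delivered = []

    def send(self, src, dst, payload, delay_rounds):
        """Schedule a full message for delivery after delay_rounds rounds."""
        due = self.round + delay_rounds
        heapq.heappush(self._queue, (due, self._seq, src, dst, payload))
        self._seq += 1

    def run(self):
        """Deliver messages in time order, advancing the clock to each event."""
        while self._queue:
            due, _, src, dst, payload = heapq.heappop(self._queue)
            self.round = due
            self.delivered.append((due * ROUND_MS, src, dst, payload))
```

Jumping the clock straight to the next event, instead of ticking through every round, is what makes the simulation discrete-event rather than time-stepped.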

  12. DHT Substrates • Pastry • Uses routing table, leaf set, neighborhood set, and file table • Pastry routes messages to the node with the closest ID to the key [in $\lceil \log_{2^b} N \rceil$ hops] • Low message overhead but higher latency and failure rate • Kelips • Uses $\sqrt{N}$ affinity groups and contacts • Actively gossips “heartbeats” to maintain up-to-date information • Constant time lookup and high fault-tolerance, but high message overhead and slow steady-state
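
As a worked example of the two scaling figures on this slide, the helpers below compute the Pastry hop bound and the approximate Kelips group count for a given network size. The function names and the choice b = 4 are assumptions made for illustration.

```python
import math

def pastry_hops(n: int, b: int = 4) -> int:
    """Compute ceil(log base 2^b of n) with integer arithmetic only."""
    hops, reach = 0, 1
    while reach < n:
        reach *= 2 ** b  # each hop resolves one more base-2^b digit
        hops += 1
    return hops

def kelips_groups(n: int) -> int:
    """Approximate number of affinity groups, about sqrt(n)."""
    return round(math.sqrt(n))
```

For a 500-node network with b = 4, `pastry_hops(500)` is 3 (16² = 256 < 500 ≤ 4096 = 16³) and `kelips_groups(500)` is 22.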

  13. MultiRouter Prototype • Similar in design to Kelips and Pastry but interacts only with the DHTs • Inserts filetuples on all DHTs • Uses old information on rejoins • Maintains two metrics: latency and success rate • Common to most applications • Encompass long-term and short-term performance

  14. Metric Functions • Use aging function to smooth out variations: $\text{latency}_i(l) = \alpha \cdot l + (1 - \alpha) \cdot \text{latency}_{i-1}$ • Use decaying function to effectively ignore old failures: $\text{failure}_i(f, t) = f + \beta \cdot \text{failure}_{i-1}$ • Strong spikes of failure allow the MultiRouter to respond to failures promptly • Quick decay of failure prevents transient problems from having long-term effects on MultiRouter behavior
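
The two update rules can be sketched as follows. The coefficient values (0.5 for both the smoothing factor and the decay factor) are assumptions chosen so the arithmetic is easy to follow, and the failure update here decays once per observation rather than as a function of elapsed time.

```python
def update_latency(previous: float, sample: float, alpha: float = 0.5) -> float:
    """Aging function: blend the new latency sample with the running value."""
    return alpha * sample + (1 - alpha) * previous

def update_failure(previous: float, failed: bool, decay: float = 0.5) -> float:
    """Decaying function: a failure adds 1, and older failures shrink each step."""
    return (1.0 if failed else 0.0) + decay * previous
```

Starting from a failure score of 0, one failure raises it to 1.0, and two later successes bring it down to 0.5 and then 0.25, which shows the sharp spike and quick decay the slide describes.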

  15. Cost and Rule Functions • Each metric holds equal weight • Use rule sets to “interpret” the cost results • Prototype rules • A DHT is cold when • Filetuple was recently added • currenttime - lastsent > threshold
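
The "cold" test from the prototype rules can be sketched as a small predicate. Combining the two conditions with a logical OR, the `recent_window` parameter, and the argument names are assumptions; the slide gives only the two conditions themselves.

```python
def is_cold(current_time, inserted_at, last_sent, recent_window, threshold):
    """Return True if a DHT should be treated as cold for this filetuple.

    All times share one unit (for example simulation rounds).
    """
    recently_added = (current_time - inserted_at) < recent_window
    idle_too_long = (current_time - last_sent) > threshold
    # Assumption: either condition alone is enough to mark the DHT cold.
    return recently_added or idle_too_long
```

A cold DHT has no trustworthy statistics for the object, so a rule built on this predicate could, for instance, force that DHT to be included in the next lookup to refresh its entry.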

  16. Experimental Results • Use simulation runs and trace-based experiments • Micro-benchmark: “proof-of-concept” • Generic churn • Overnet trace-based churn • Message Overhead Analysis • DHT parameters are consistent across experiments

  17. Micro-benchmark • “Proof-of-concept” of MultiRouter’s adaptability properties • Inserted one filetuple and queried the filetuple every 0.5 sec for 2 mins • MultiRouter uses Pastry until Kelips disseminates the information through the affinity group • MultiRouter has generally lower latency than both DHTs

  18. Generic Churn • Two queries per second over 100 inserted filetuples, with two members leaving and joining every second • MultiRouter's success rate is 10% better than Kelips and 35% better than Pastry • Latency is not sacrificed for increased success rate

  19. Overnet-trace Churn • Mapped Overnet trace files to 500 nodes with 100 inserted filetuples • Same querying behavior • Scaled down and varied interval between trace files • Similar results as generic churn

  20. Message Overhead • MultiRouter's message overhead is equivalent to the sum of its substrates' overheads • Unavoidable due to “plug-and-play” aspect • An intelligent choice of DHT combinations does not drastically increase overhead

  21. Conclusion • MultiRouter is an “über-overlay” • Improves both success rate and performance under stressful conditions • Takes advantage of differing properties of independent DHT substrates • Calculates which DHT(s) are most likely to succeed • Does not drastically increase message overhead • Future work includes an expressive metric/rule language and a mechanism to easily “bridge” multiple DHT networks.
