Searching via Your Neighbor’s Neighbor: The Power of Lookahead in P2P Networks

Gurmeet Manku, Moni Naor, Udi Wieder (The Weizmann Institute of Science; Stanford)

The Small World Phenomenon: a very brief history
• Folklore: people are connected via short chains; the graph of social networks has small diameter.
• Barabási: the belief may have originated from a 1929 story by Frigyes Karinthy.
• Quantitative approach initiated by Milgram in the 1960s: “the six degrees of separation”.
• Mathematical modeling: model a social network by some distribution on graphs.
• A precursor of P2P: the need to locate a resource in a ‘natural’ network based on partial information.
• P2P = Peer-to-Peer = a highly dynamic network.

Routing in a Small World
• Common question: do short paths exist?
• Kleinberg’s algorithmic question: assuming short paths exist, how do people find them?

Modeling Small Worlds
• Kleinberg’s model [2000]:
  • People are points on a two-dimensional grid.
  • Grid edges (short range).
  • One long-range contact chosen with the harmonic distribution: the probability of (u,v) is proportional to 1/d(u,v)^2.
• Naturally generalizes to k long-range links (Symphony [MBR03], [ADS02]).
• Naturally generalizes to any dimension.
• Captures the intuitive notion that people know people who are close to them.
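The sampling step of the model can be sketched as follows. This is a minimal illustration, assuming an m × m grid without wraparound; the names `l1` and `sample_long_range_contact` are hypothetical and not from the slides.

```python
import random

def l1(u, v):
    """L1 (Manhattan) distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def sample_long_range_contact(u, m, rng):
    """Pick one long-range contact for u on an m x m grid.

    Every node v != u is chosen with probability proportional
    to 1 / d(u, v)^2 (the harmonic distribution in two dimensions).
    """
    candidates = [(x, y) for x in range(m) for y in range(m) if (x, y) != u]
    weights = [1.0 / l1(u, v) ** 2 for v in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(0)
u = (5, 5)
contacts = [sample_long_range_contact(u, 11, rng) for _ in range(2000)]
near = sum(1 for v in contacts if l1(u, v) in (1, 2))   # distance 1 or 2
far = sum(1 for v in contacts if l1(u, v) >= 8)         # distance 8 or more
print("near:", near, "far:", far)
```

Nearby nodes are sampled far more often than distant ones, which is the "people know people who are close to them" bias, while distant contacts still appear occasionally.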

Modeling Small Worlds
• Small World Percolation:
  • People are points on a two-dimensional grid.
  • Grid edges (short range).
  • Each edge appears independently with probability equal to the inverse of its distance squared.
  • Degree of each node is Θ(log n).
• Originates from the long-range percolation model.
• Shares structural properties with some popular randomized P2P networks: R-Chord, R-Hypercube, Skip Lists…
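A back-of-the-envelope check of the logarithmic degree, assuming an interior node u of a √n × √n grid and ignoring boundary effects. It uses the fact that exactly 4d grid points lie at L1 distance d from u; D denotes the largest distance considered.

```latex
\mathbb{E}[\deg(u)]
  \;=\; \sum_{d=1}^{D} 4d \cdot \frac{1}{d^{2}}
  \;=\; 4 \sum_{d=1}^{D} \frac{1}{d}
  \;=\; 4 H_{D}
  \;=\; \Theta(\log D)
  \;=\; \Theta(\log n),
  \qquad D = \Theta(\sqrt{n}).
```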

Routing in Small Worlds
• Greedy algorithm: move to the node that minimizes the L1 distance to the target.
[Table of greedy path-length bounds garbled in extraction; the surviving fragments mention k ≤ log n, Θ(log n) and O(log n).]
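Greedy routing can be sketched on a one-dimensional analogue. The ring topology, the 1/distance link distribution (the 1-D harmonic law) and all function names here are assumptions made for illustration; the slide itself speaks of the L1 distance on a grid.

```python
import random

def ring_dist(a, b, n):
    """Distance between positions a and b on a ring of n nodes."""
    d = abs(a - b)
    return min(d, n - d)

def build_small_world_ring(n, k, rng):
    """Each node links to its two ring neighbors plus k long-range
    contacts, drawn with probability proportional to 1/distance."""
    offsets = list(range(2, n - 1))  # offsets 1 and n-1 are the ring edges
    weights = [1.0 / min(o, n - o) for o in offsets]
    adj = {}
    for u in range(n):
        links = {(u - 1) % n, (u + 1) % n}
        for off in rng.choices(offsets, weights=weights, k=k):
            links.add((u + off) % n)
        adj[u] = links
    return adj

def greedy_route(adj, src, dst, n):
    """Repeatedly move to the neighbor closest to dst.

    The ring edges guarantee that some neighbor is strictly closer,
    so the loop always terminates.
    """
    path = [src]
    while path[-1] != dst:
        nxt = min(adj[path[-1]], key=lambda v: ring_dist(v, dst, n))
        path.append(nxt)
    return path

n = 1024
adj = build_small_world_ring(n, k=3, rng=random.Random(1))
path = greedy_route(adj, 0, n // 2, n)
print(len(path) - 1, "hops")
```

Each step only needs the current node's own neighbor list, which is what makes Greedy local and easy to implement.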

Properties of Greedy
• Simple: to understand and to implement.
• Local: if source and target are close, the path remains within a small area.
• In some cases (Hypercube, Chord) the best we can do.
• Not optimal with respect to the degree.
• Can Greedy routing be shortened without compromising the good properties?
[Path-length bounds table not recoverable from the transcript.]

Neighbor of Neighbor (NoN) Routing
• Each node has a list of its neighbors’ neighbors.
• The message is routed greedily to the closest neighbor of neighbor (2 hops):
  • Let w1, w2, …, wk be the neighbors of the current node u.
  • For each wi, find zi, the neighbor of wi closest to the target t.
  • Let j be such that zj is the closest to the target t.
  • Route the message from u via wj to zj.
• Effectively it is Greedy routing on the squared graph.
• The first hop may not be a greedy choice.
• Previous incarnations of the approach:
  • Coppersmith, Gamarnik and Sviridenko [2002]: proved an upper bound on the diameter of a small world graph. No routing algorithm.
  • Manku, Bawa and Raghavan [2003]: a heuristic routing algorithm in ‘Symphony’, a Small-World-like P2P network.
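The routing steps above can be sketched as follows. The toy graph (a line of 21 nodes with three hand-picked long edges), the shortcut of delivering directly when the target is a neighbor, and the function names are assumptions chosen to make the "first hop may not be greedy" point visible.

```python
def greedy_route(adj, src, dst, dist):
    """Plain greedy: always move to the neighbor closest to dst."""
    path = [src]
    while path[-1] != dst:
        path.append(min(adj[path[-1]], key=lambda v: dist(v, dst)))
    return path

def non_step(adj, u, t, dist):
    """One NoN step from u toward t: returns the hops to take."""
    if t in adj[u]:
        return [t]
    candidates = []
    for w in adj[u]:
        z = min(adj[w], key=lambda x: dist(x, t))   # z_i for neighbor w_i
        candidates.append((dist(z, t), w, z))
    _, w, z = min(candidates)                       # the z_j closest to t
    return [w, z]

def non_route(adj, src, dst, dist, max_hops=1000):
    """Repeat NoN steps; max_hops guards against graphs with no progress."""
    path = [src]
    while path[-1] != dst and max_hops >= len(path):
        path.extend(non_step(adj, path[-1], dst, dist))
    return path

# Line 0..20 plus long edges 0-12, 0-3 and 3-19 (undirected).
adj = {i: set() for i in range(21)}
edges = [(i, i + 1) for i in range(20)] + [(0, 12), (0, 3), (3, 19)]
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

dist = lambda a, b: abs(a - b)
print("greedy:", greedy_route(adj, 0, 20, dist))
print("NoN:   ", non_route(adj, 0, 20, dist))
```

Greedy jumps to node 12 and then walks the line (9 hops). NoN first moves to node 3, which is farther from the target than 12, because node 3 has a neighbor right next to the target (3 hops in total).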

What can we show about NoN Greedy
• PSW, R-Chord, R-Hypercube are degree optimal w.h.p.
• Skip Lists: degree optimal in expectation.
• Kleinberg’s model and P2P variations: improved.
• Lower bounds for algorithms based on neighbor lists only (Greedy is a special case).
[Table of bounds not recoverable from the transcript.]

Degree Optimal P2P Routing
• Different routing schemes:
  • Viceroy [MNR02]: emulates the butterfly network. Constant degree; O(log n) hops for routing.
  • Constructions emulating De Bruijn graphs: can achieve any degree/number-of-hops tradeoff, in particular degree O(log n) and O(log n / log log n) hops.
• Routing is not greedy.
  • Recent construction [AM] fixes that.
  • Even if target and source are close in label space, the message might be routed away.
  • No (natural) prefix search.
  • Random keys are necessary.
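One way to see why O(log n / log log n) hops is the best possible at degree O(log n) is a standard counting argument (not taken from the slides). Here Δ ≥ 2 denotes the maximum degree and h the number of hops; a node can reach at most the following number of nodes within h hops:

```latex
\sum_{i=0}^{h} \Delta^{i} \;=\; \frac{\Delta^{h+1}-1}{\Delta-1} \;\le\; \Delta^{h+1}
\quad\Longrightarrow\quad
\Delta^{h+1} \ge n
\quad\Longrightarrow\quad
h \;\ge\; \frac{\log n}{\log \Delta} - 1 .
% With \Delta = \log n this gives h = \Omega(\log n / \log\log n).
```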

Skip Graphs [AS02], [HDJ+03]
• Each node (resource) has a name.
• Nodes are arranged on a line sorted by name.
• Each node chooses a random string of bits.
• An edge is established if two nodes share a prefix which is not shared by the nodes between them.
• Allows prefix search.
[Figure: six nodes a to f arranged on a line, each labeled with a random bit string.]
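The edge rule can be sketched directly. This is an illustrative construction, assuming nodes are identified by their sorted position 0..n-1 and bit strings have a fixed length; the function names are hypothetical.

```python
import random

def common_prefix_len(a, b):
    """Length of the longest common prefix of two bit tuples."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def has_edge(bits, i, j):
    """Edge rule: i and j share a prefix that no node strictly between
    them shares. Checking the longest common prefix is enough, because
    a node that shares it also shares every shorter prefix."""
    lo, hi = min(i, j), max(i, j)
    k = common_prefix_len(bits[lo], bits[hi])
    prefix = bits[lo][:k]
    return all(bits[m][:k] != prefix for m in range(lo + 1, hi))

def build_skip_graph(n, num_bits, rng):
    """Nodes 0..n-1 are already sorted by name; draw random bit strings
    and connect every pair that satisfies the edge rule."""
    bits = [tuple(rng.randint(0, 1) for _ in range(num_bits)) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if has_edge(bits, i, j):
                adj[i].add(j)
                adj[j].add(i)
    return bits, adj

bits, adj = build_skip_graph(64, 16, random.Random(7))
print("average degree:", sum(len(v) for v in adj.values()) / len(adj))
```

Adjacent nodes are always connected (they share the empty prefix and nothing lies between them), which gives the base-level sorted list; longer shared prefixes give the longer edges.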

Routing in Skip Graphs
• Greedy routing: use the longest edge possible.
• Path length is Θ(log n) w.h.p.
• The NoN algorithm optimizes over two hops.
Theorem: Using the NoN algorithm, the expected path length of any lookup is O(log n / log log n).

Skip Graphs – degree optimality
Notation: X is the number of two-hop paths between d and the interval [0, d / log d]; D is the event that a message reached the node d.
Lemma: Pr[X ≥ 1 | D] is bounded below by a positive constant.
Sufficiency of lemma:
• Call a NoN 2-hop successful if it reduces the distance from d to d / log d.
• Need O(log d / log log d) successful 2-hops to get to distance 1.
• From the Lemma, this would take O(log n / log log n) 2-hops in expectation.

Skip Graphs – degree optimality
Notation: X is the number of two-hop paths between d and the interval [0, d / log d]; A_{i,j} is the event that there exists an edge between i and j.
• Want to show that Pr[X ≥ 1 | D] is bounded below by a positive constant. Ignore the dependence on D.
• Proof: for a prefix of length k, the probability of an edge is 2^(−k) · (1 − 2^(−k))^(|i−j|−1). Let k be log(|i−j|); then c_1 / |i−j| ≤ Pr[A_{i,j}] ≤ c_2 / |i−j|.
• Lemma: E[X] ≥ 5 (by the choice of constants).
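The lower bound on the edge probability can be worked out explicitly. A sketch, writing m = |i − j| and assuming for simplicity that m is a power of two so that k = log₂ m is an integer:

```latex
\Pr[A_{i,j}]
  \;\ge\; 2^{-k}\,\bigl(1 - 2^{-k}\bigr)^{m-1}
  \;=\; \frac{1}{m}\Bigl(1 - \frac{1}{m}\Bigr)^{m-1}
  \;\ge\; \frac{1}{e\,m},
  \qquad k = \log_2 m .
% The last step uses (1 - 1/m)^{m-1} \ge 1/e for every integer m \ge 1.
```

The first factor is the probability that i and j agree on a k-bit prefix; the second is the probability that none of the m − 1 nodes between them shares it.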

Skip Graphs – degree optimality
Notation: X is the number of two-hop paths between d and the interval [0, d / log d]; A_{i,j} is the event that there exists an edge between i and j.
• Careful calculation: deal with the dependencies (covariances between the edge events) to get var[X] ≤ E[X] + ½ E[X]^2.
• Which implies: Pr[X = 0] ≤ var[X] / E[X]^2 ≤ (E[X] + ½ E[X]^2) / E[X]^2 = 1 / E[X] + ½ ≤ 0.7, using E[X] ≥ 5.

The Cost/Performance of NoN
• Cost of Neighbor of Neighbor lists:
  • Memory: O(log^2 n), which is marginal.
  • Communication: is it tantamount to squaring the degree?
• Neighbor lists should be maintained (open connection, pinging, etc.).
• NoN lists should only be kept up-to-date.
• Reduce communication by piggybacking updates on top of the maintenance protocol.
• Lazy updates: updates occur only when communication load is low; supported by simulations.
• Networks of size 2^17 show a 30-40% improvement.
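To get a feel for the O(log² n) memory figure, here is a small count for a network of size 2^17. It uses a plain hypercube as a stand-in topology with degree log₂ n = 17; that choice, and the function names, are assumptions for illustration only.

```python
def hypercube_neighbors(u, dim):
    """Neighbors of node u in a dim-dimensional hypercube: flip one bit."""
    return [u ^ (1 << b) for b in range(dim)]

def non_table(u, dim):
    """NoN list of u: each neighbor w mapped to w's own neighbor list."""
    return {w: hypercube_neighbors(w, dim) for w in hypercube_neighbors(u, dim)}

dim = 17                      # n = 2**17 nodes, degree log2(n) = 17
table = non_table(0, dim)
entries = sum(len(nbrs) for nbrs in table.values())
distinct = {z for nbrs in table.values() for z in nbrs} - {0}
print(entries, "stored entries,", len(distinct), "distinct two-hop nodes")
```

A few hundred entries per node is small compared with the cost of keeping live connections, which is why the slide treats memory as marginal and focuses on communication.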

Simulation Results
[Figure panels: one-dimensional small world; skip graphs.]

Simulation Results
[Figure panels: 2-dimensional small world; 1-dimensional small world where each edge fails with probability 1/2.]

A Case for Randomized Topology
• Average diameter of the hypercube is Θ(log n).
• Average diameter of a ‘perfect’ skip graph is Θ(log n).
• Average diameter of Chord is Θ(log n).
• Conclusion: the randomization of edges reduces the average path lengths.
• Common design rule: reduce randomization in topology.
• The long edges are at just the right density, so that NoN finds them without increasing the degree.
• Other advantages: security, fault tolerance, …

Do People Use the NoN Algorithm?
• Experiment based on email [DRW03].
• About 25% sent the mail because:
  • The recipient traveled to the target’s geographical region.
  • The recipient’s family originates from the target’s geographical region.

Lower Bounds – A Probing Model
• Goal: find a path between two nodes in an unknown graph.
• The algorithm may probe a node. If the probing reveals a neighborhood of radius k, then the algorithm is k-local.
• A lower bound on the number of probes implies a lower bound on the sequential running time of routing.
• The Greedy algorithm is 1-local. NoN is 2-local.
Theorem: Every 1-local algorithm requires Ω(log n) probes w.h.p., both in small worlds and in skip graphs.
Conclusion: some extra information is necessary.
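What a k-local probe reveals can be sketched as a bounded breadth-first search. The adjacency-dict representation and the function name `probe` are assumptions for illustration; the path graph is only a toy input.

```python
from collections import deque

def probe(adj, v, k):
    """Return the set of nodes within k hops of v,
    i.e. the neighborhood a k-local probe of v reveals."""
    seen = {v}
    frontier = deque([(v, 0)])
    while frontier:
        x, d = frontier.popleft()
        if d == k:
            continue  # do not expand beyond radius k
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                frontier.append((y, d + 1))
    return seen

# Toy input: a path 0 - 1 - 2 - 3 - 4 - 5.
adj = {i: {j for j in (i - 1, i + 1) if j in range(6)} for i in range(6)}
print(sorted(probe(adj, 2, 1)))  # what a 1-local algorithm (Greedy) sees
print(sorted(probe(adj, 2, 2)))  # what a 2-local algorithm (NoN) sees
```

The lower bound counts calls to such a probe with k = 1; NoN escapes it because each of its probes has radius 2.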

Greedy algorithm dominates 1-local algorithms
• Let A be a 1-local algorithm. Denote by f_d (g_d) the r.v. counting the number of probes it takes A (Greedy) to find a path between 0 and d.
Lemma: For all k, d > 0: Pr[g_d ≤ k] ≥ Pr[f_d ≤ k].
• If a probe finds node i, reveal all edges (prefixes) in [d; i]. This only increases Pr[f_d ≤ k].
• The ‘best chance’ of getting close to 0 is by probing the node closest to 0.
[Figure: nodes 0, i and d on a line, with the revealed interval marked.]

Lower Bounds on Greedy
• Partition the nodes into balls B_0, B_1, …, B_{log d}.
• Define X_i: the indicator of the event “Greedy probed a node in B_i”.
• The probe complexity is at least Σ_i X_i.
Lemma: Both for skip graphs and small worlds, there exists a constant c such that Pr[X_i = 1 | X_0, X_1, …, X_{i−1}] ≥ c.
• Hence E[Σ_i X_i] ≥ c log n.
• Azuma’s inequality: Σ_i X_i ≥ ε c log n w.h.p.
[Figure: balls B_0 to B_3 between nodes 0 and d.]

Lower Bounds on Greedy
Lemma: Both for skip graphs and small worlds, there exists a constant c such that Pr[X_i = 1 | X_0, X_1, …, X_{i−1}] ≥ c.
• X_i depends only on the last ball visited.
• When a ball is visited, skip to the last node.
• Assume X_0 = 1, X_1 = 0.
• The probability the dangling edge would skip over B_2 is at most 1 − c.
[Figure: balls B_0 to B_3 between nodes 0 and d.]

Conclusions
• NoN Greedy seems like an almost free tweak that is a good idea in many settings.
• Do not be perfect (all the time): randomization helps.
• What is more important:
  • Prefix search.
  • Easy and ‘natural’ degree optimality.
• Better understanding of the ‘small world’ phenomena.
