COP 5725 Advanced Database System Presentation

COP 5725 Advanced Database System Presentation Sheng Tan

PageRank on an Evolving Graph
Bahman Bahmani, Ravi Kumar, Mohammad Mahdian, and Eli Upfal. In KDD, pages 24-32, 2012.

Algorithmic paradigms
Classical
▪ Stationary data set
▪ Unrestricted access to data
Contemporary
▪ Online algorithms
▪ Streaming algorithms
▪ Sub-linear time algorithms
▪ Algorithmic game theory

Motivating examples
Web pages
▪ Millions of hyperlinks modified each day
Social networks
▪ Millions of social links modified each day
Public opinion

Evolving data
The data changes over time
▪ Slow changes
▪ Changes can occur anywhere
Keep up with the changes and adjust the solution based on new observations
▪ Not feasible to track all the changes
▪ We care about the quality of the solution

Algorithms for evolving data
A general model for algorithm design on evolving data
▪ Data changes slowly over time
▪ Goal: compute a function of the data at each point in time
▪ The algorithm can probe specific parts of the data to learn about some of the changes
First introduced by A. Anagnostopoulos et al., ICALP 2009, and applied to sorting

Formal description
At time t, the true data is X_t
The goal is to compute y_t = f(X_t)
The input changes slowly (stochastically), such that d(X_{t+1}, X_t) is small
The algorithm can make a limited number of probes into X_t at each time t to compute a solution y'_t
The goal is to have y'_t ≈ y_t

Evolving graph model
A sequence of directed graphs over time
▪ G_t = (V, E_t) = graph at time t
▪ Nodes do not change (for simplicity)
Assume |E_{t+1} - E_t| is small
▪ Choose t fine enough
▪ No specific model of change is assumed
At time t, the algorithm can probe a node u ∈ V to get N(u), i.e., all edges in E_t of the form (u, v)
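The probe interface above can be sketched in a few lines of Python. This is only an illustration: the class name `EvolvingGraph`, the `step` method, and the random edge-toggling change process are assumptions made for the example (the model itself assumes no particular change process). The only operation the algorithm is meant to see is `probe(u)`, which returns N(u).

```python
import random


class EvolvingGraph:
    """Hypothetical true graph G_t over a fixed node set 0..num_nodes-1."""

    def __init__(self, num_nodes, edges, seed=0):
        self.num_nodes = num_nodes
        self.out = {u: set() for u in range(num_nodes)}
        for u, v in edges:
            self.out[u].add(v)
        self._rng = random.Random(seed)

    def step(self, num_changes=1):
        """Move from G_t to G_{t+1} by toggling a few random edges.

        The toggling rule is an assumption for this sketch only.
        """
        for _ in range(num_changes):
            u = self._rng.randrange(self.num_nodes)
            v = self._rng.randrange(self.num_nodes)
            if u == v:
                continue  # skip self-loops
            if v in self.out[u]:
                self.out[u].remove(v)
            else:
                self.out[u].add(v)

    def probe(self, u):
        """Return a copy of N(u): all v such that (u, v) is in E_t."""
        return set(self.out[u])
```

Keeping `num_changes` small relative to the number of edges mirrors the assumption that |E_{t+1} - E_t| is small.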

Example: Evolving graph
[Figure: graph G_1 and the algorithm's view G'_1]

Example (contd.)
[Figure: graph G_2 and the algorithm's view G'_2]

PageRank: Recap
Stationary distribution of the following random walk on a directed graph G = (V, E)
If u is the current node
▪ With probability 1 - ε, pick one of the out-neighbors v of u uniformly at random and move to v
▪ With probability ε, move to a node chosen uniformly at random from V
π = PageRank of G, π(u) = PageRank of u ∈ V
Based on the assumption that authoritative pages usually link to good pages
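The random walk above can be turned into a small power-iteration routine. The following is a minimal sketch, not the implementation used in the paper: the function name `pagerank`, the default `eps=0.15`, the fixed iteration count, and the handling of nodes without out-links (they always jump to a uniform node) are all assumptions of this example.

```python
def pagerank(out_neighbors, eps=0.15, iterations=100):
    """Approximate PageRank by power iteration.

    out_neighbors: dict mapping each node to the set of its out-neighbors;
    every neighbor must itself be a key of the dict.
    Returns a dict mapping each node to its PageRank score.
    """
    n = len(out_neighbors)
    pi = {u: 1.0 / n for u in out_neighbors}
    for _ in range(iterations):
        jump = 0.0  # total probability mass that jumps to a uniform node
        nxt = {u: 0.0 for u in out_neighbors}
        for u, nbrs in out_neighbors.items():
            if nbrs:
                jump += eps * pi[u]
                share = (1.0 - eps) * pi[u] / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
            else:
                # Assumption: a node with no out-links always jumps.
                jump += pi[u]
        pi = {u: nxt[u] + jump / n for u in out_neighbors}
    return pi
```

On a directed 3-cycle every node ends up with the same score, while a node that receives links from all the others scores higher than the nodes pointing to it.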

PageRank on evolving graph
Probing depends on the function and the metric
▪ Function = PageRank
▪ Error metric = L∞
A high-PageRank node has more effect on the PageRank of other nodes
▪ If π_t(v) is large and v adds or deletes an edge at time t, then |π_{t+1}(u) - π_t(u)| is large
Reminder: we do not have access to π_t
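For concreteness, the L∞ error between two PageRank vectors is the largest per-node difference. A minimal sketch, assuming both vectors are stored as dicts from node to score (the function name `linf_error` is made up for this example):

```python
def linf_error(pi_true, pi_est):
    """Largest absolute per-node gap between two PageRank vectors.

    Nodes missing from pi_est are treated as having score 0.0.
    """
    return max(abs(pi_true[u] - pi_est.get(u, 0.0)) for u in pi_true)
```

In a simulation where the true graph is known, this compares the true vector with the algorithm's estimate; the algorithm itself never sees the true vector.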

Proportional probing
Embodies the previous intuition
Pretend π'_t looks like π_t
At time t
▪ Pick a node v to probe with probability proportional to π'_{t-1}(v), and obtain N(v)
▪ Update G'_t using G'_{t-1} and N(v)
▪ Output π'_t, the PageRank vector of G'_t
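One time step of proportional probing might look like the sketch below. It is illustrative only: the names `proportional_probing_step`, `image` (standing for G'), and `probe`, as well as the simple `pagerank` helper with its parameters, are assumptions of this example, and recomputing PageRank from scratch at every step is chosen for clarity rather than efficiency.

```python
import random


def pagerank(out_neighbors, eps=0.15, iterations=50):
    """Helper: power iteration on a dict node -> set of out-neighbors."""
    n = len(out_neighbors)
    pi = {u: 1.0 / n for u in out_neighbors}
    for _ in range(iterations):
        jump = 0.0
        nxt = {u: 0.0 for u in out_neighbors}
        for u, nbrs in out_neighbors.items():
            if nbrs:
                jump += eps * pi[u]
                for v in nbrs:
                    nxt[v] += (1.0 - eps) * pi[u] / len(nbrs)
            else:
                jump += pi[u]  # assumption: dangling nodes always jump
        pi = {u: nxt[u] + jump / n for u in out_neighbors}
    return pi


def proportional_probing_step(image, pi_prev, probe, rng):
    """Probe one node drawn with probability proportional to pi_prev.

    image:   the algorithm's graph G' (dict node -> set), updated in place
    pi_prev: PageRank estimate from the previous step
    probe:   callable returning the true N(v) for a node v
    rng:     a random.Random instance
    Returns the probed node and the new estimate.
    """
    nodes = list(image)
    weights = [pi_prev[u] for u in nodes]
    v = rng.choices(nodes, weights=weights, k=1)[0]
    image[v] = set(probe(v))    # refresh the out-edges of v in G'
    return v, pagerank(image)   # PageRank vector of the updated G'
```

A caller would start from `pi_prev = pagerank(image)` and feed each returned estimate into the next step.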

Priority probing
A deterministic variant of proportional probing
Initialize: priority_u = 0.0 for all u ∈ V
At time t
▪ v = argmax_u priority_u
▪ Probe v and obtain N(v)
▪ Update G'_t using G'_{t-1} and N(v)
▪ Output π'_t, the PageRank vector of G'_t
▪ priority_v = 0.0
▪ priority_u = priority_u + π'_t(u) for all u ≠ v
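The steps above translate almost line by line into code. The sketch below is an illustration under stated assumptions: the names `priority_probing_step`, `image` (for G'), and `probe` are made up, the `pagerank` helper is a plain power iteration with an assumed `eps`, and ties in the argmax are broken by dictionary order, which the slides do not specify.

```python
def pagerank(out_neighbors, eps=0.15, iterations=50):
    """Helper: power iteration on a dict node -> set of out-neighbors."""
    n = len(out_neighbors)
    pi = {u: 1.0 / n for u in out_neighbors}
    for _ in range(iterations):
        jump = 0.0
        nxt = {u: 0.0 for u in out_neighbors}
        for u, nbrs in out_neighbors.items():
            if nbrs:
                jump += eps * pi[u]
                for v in nbrs:
                    nxt[v] += (1.0 - eps) * pi[u] / len(nbrs)
            else:
                jump += pi[u]  # assumption: dangling nodes always jump
        pi = {u: nxt[u] + jump / n for u in out_neighbors}
    return pi


def priority_probing_step(image, priority, probe):
    """Run one step of priority probing, updating image and priority in place.

    image:    the algorithm's graph G' (dict node -> set of out-neighbors)
    priority: dict node -> accumulated priority
    probe:    callable returning the true N(v) for a node v
    Returns the probed node and the new PageRank estimate.
    """
    v = max(priority, key=priority.get)  # argmax; ties go to the first key
    image[v] = set(probe(v))             # refresh N(v) in G'
    pi = pagerank(image)                 # PageRank vector of the updated G'
    for u in priority:
        priority[u] = 0.0 if u == v else priority[u] + pi[u]
    return v, pi
```

Because every unprobed node keeps accumulating priority, a node with a small PageRank is still probed eventually, just less often than a high-PageRank node.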

Baseline probing methods
Random probing
▪ Probe a node v chosen uniformly at random at each time step
Round-robin probing
▪ Cycle through all nodes and probe each node in a round-robin manner
We can vary the ratio of the change rate to the probing rate
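Both baselines only decide which node to probe next, so they can be sketched as endless node generators. The function names below are hypothetical and the sketch assumes the node set is given as a Python iterable.

```python
import itertools
import random


def random_probe_order(nodes, rng):
    """Yield an endless stream of nodes chosen uniformly at random."""
    nodes = list(nodes)
    while True:
        yield rng.choice(nodes)


def round_robin_probe_order(nodes):
    """Yield the nodes in a fixed order, repeating forever."""
    return itertools.cycle(list(nodes))
```

In a simulation, drawing several nodes from one of these generators per change step (or one node every few change steps) is one way to vary the ratio of the change rate to the probing rate.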

Experiments

Conclusion
The evolving graph model is an interesting and useful model for dynamic and massive data
By applying the priority probing technique, we can compute an approximate PageRank
Experiments confirm that priority probing achieves better results than random and round-robin probing

Future Direction
Can we improve the accuracy of priority probing by using existing PageRank information?
Can we apply the evolving graph model to other algorithms, such as clustering?

Reference
Bahman Bahmani, Ravi Kumar, Mohammad Mahdian, and Eli Upfal. PageRank on an Evolving Graph. In KDD, pages 24-32, 2012.

Thank you! Questions/Comments?
