Section 12: Graph Algorithms
Shortest Paths • Given a weighted graph, the shortest path between two vertices is the path between them whose edge weights have the lowest sum. Useful for: • Finding cheap airline itineraries • Finding fast airline itineraries • Network routing
Shortest Path • Types of shortest path problems: • Single Pair: Given vertices v, w find the shortest path from v to w. • Single Source: Given a source vertex s, find the shortest paths from s to every other vertex. • All Pairs: Find the shortest paths between any two vertices. • Dijkstra’s Algorithm efficiently solves single source, which in turn solves single pair
Dijkstra’s Algorithm • The “standard” solution to the single-source problem is Dijkstra’s Algorithm. • Shortest paths are computed with a greedy algorithm: we repeatedly visit the nearest vertex we haven’t seen yet • On a disconnected graph, some vertices are simply unreachable (if you can’t get from u to v, there is no shortest path, and v’s distance stays infinite) • Edge weights must be non-negative
Dijkstra’s Algorithm
• Suppose that the shortest distances from s to the other nodes are d(s, s) < d(s, s1) < d(s, s2) < … < d(s, sn)
• In the example graph: d(s, s) = 0 < d(s, s1) = 5 < d(s, s2) = 6 < d(s, s3) = 8 < d(s, s4) = 15
• A shortest path from s to si can’t possibly contain any sj with j > i. It can only use the vertices {s, s1, …, si-1}.
• Dijkstra’s Algorithm starts with s, then finds s1, then s2, and so on.
[Figure: example graph with source s and vertices s1–s4; edge weights shown on the slide]
Dijkstra’s Algorithm • Greedy selection rule: • Assume s, s1, s2, …, si-1 have already been selected, and their shortest distances have been stored • Select node si and save d(s, si) such that out of all nodes not yet selected, si has the shortest path to s using only nodes s, s1, …, si-1. • To do this efficiently, we need to keep track of the distance of the shortest path from s to v using only s, s1, …, si-1, for each unselected v. Call this D[v].
Dijkstra’s Algorithm – Example
Solution = {(s, 0)}
D[s1] = 5 by path [s, s1]
D[s2] = infinity (no direct edge [s, s2])
D[s3] = 10 by path [s, s3]
D[s4] = 15 by path [s, s4]

Solution = {(s, 0), (s1, 5)}
D[s2] = 6 by path [s, s1, s2]
D[s3] = 9 by path [s, s1, s3]
D[s4] = 15 by path [s, s4]

Solution = {(s, 0), (s1, 5), (s2, 6)}
D[s3] = 8 by path [s, s1, s2, s3]
D[s4] = 15 by path [s, s4]

Solution = {(s, 0), (s1, 5), (s2, 6), (s3, 8), (s4, 15)}
[Figure: the same example graph as on the previous slide]
Dijkstra’s Algorithm – updating D
• Suppose we add some node near to the solution. How does this change D?
• All nodes adjacent to near can now be reached through near – when does using near give us a shorter path?

for all neighbors v of near not yet selected:
    if (D[near] + w(near, v) < D[v])
        D[v] = D[near] + w(near, v);
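This update rule maps directly onto code. Below is a minimal Java sketch of the relaxation step, assuming an adjacency matrix weight[][] where Integer.MAX_VALUE means “no edge”, a dist[] array holding D[v], and a selected[] flag array (all of these names are ours, not from the slides):

    // Relax the tentative distances of near's unselected neighbors.
    // weight[u][v] is the edge weight, or Integer.MAX_VALUE if there is no edge.
    static void relaxNeighbors(int near, int[][] weight, int[] dist, boolean[] selected) {
        for (int v = 0; v < dist.length; v++) {
            if (!selected[v]
                    && weight[near][v] != Integer.MAX_VALUE
                    && dist[near] != Integer.MAX_VALUE
                    && dist[near] + weight[near][v] < dist[v]) {
                // Going through near is cheaper than the best path known so far.
                dist[v] = dist[near] + weight[near][v];
            }
        }
    }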
Dijkstra’s Algorithm – Update Example
Solution = {(s, 0)}
D[s1] = 5, D[s2] = infinity, D[s3] = 10, D[s4] = 15

Solution = {(s, 0), (s1, 5)}
D[s2] = D[s1] + w(s1, s2) = 5 + 1 = 6
D[s3] = D[s1] + w(s1, s3) = 5 + 4 = 9
D[s4] remains 15

Solution = {(s, 0), (s1, 5), (s2, 6)}
D[s3] = D[s2] + w(s2, s3) = 6 + 2 = 8
D[s4] remains 15

Solution = {(s, 0), (s1, 5), (s2, 6), (s3, 8), (s4, 15)}
[Figure: the same example graph as before]
Dijkstra’s Algorithm
[For each vertex, we will store a “distance” field d and a “selected” field k. Initially, dv = infinity and kv = 0 for all v. Let s designate the source node.]

Step 1: Set ds = 0.
Step 2: for (i = 0; i < |V|; i++) {
    Out of all vertices with selected field = 0, choose the one with the lowest distance field. Call it x.
    kx = 1;
    for all neighbors y of x:
        if (dy > dx + w(x,y))
            dy = dx + w(x,y);
}
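For reference, here is a hedged Java translation of the steps above, using a plain array scan to pick the closest unselected vertex (the class and method names, DijkstraArray and dijkstra, are ours). The main method runs it on the edges recoverable from the worked example (s–s1 = 5, s–s3 = 10, s–s4 = 15, s1–s2 = 1, s1–s3 = 4, s2–s3 = 2) and prints the distances 0, 5, 6, 8, 15 from the earlier slides:

    import java.util.Arrays;

    public class DijkstraArray {
        static final int INF = Integer.MAX_VALUE;

        // weight[u][v] = edge weight, or INF if there is no edge u-v.
        // Returns d[v] = shortest distance from s to v (INF if unreachable).
        static int[] dijkstra(int[][] weight, int s) {
            int n = weight.length;
            int[] d = new int[n];
            boolean[] selected = new boolean[n];
            Arrays.fill(d, INF);
            d[s] = 0;                                     // Step 1

            for (int i = 0; i < n; i++) {                 // Step 2
                // Choose the unselected vertex x with the smallest distance field.
                int x = -1;
                for (int v = 0; v < n; v++) {
                    if (!selected[v] && (x == -1 || d[v] < d[x])) x = v;
                }
                if (d[x] == INF) break;                   // remaining vertices are unreachable
                selected[x] = true;                       // kx = 1

                // Relax every neighbor y of x (assumes distances fit in an int).
                for (int y = 0; y < n; y++) {
                    if (weight[x][y] != INF && d[x] + weight[x][y] < d[y]) {
                        d[y] = d[x] + weight[x][y];
                    }
                }
            }
            return d;
        }

        static void addEdge(int[][] w, int u, int v, int wt) { w[u][v] = wt; w[v][u] = wt; }

        public static void main(String[] args) {
            int[][] w = new int[5][5];
            for (int[] row : w) Arrays.fill(row, INF);
            int s = 0, s1 = 1, s2 = 2, s3 = 3, s4 = 4;
            addEdge(w, s, s1, 5);  addEdge(w, s, s3, 10); addEdge(w, s, s4, 15);
            addEdge(w, s1, s2, 1); addEdge(w, s1, s3, 4); addEdge(w, s2, s3, 2);
            System.out.println(Arrays.toString(dijkstra(w, s)));  // [0, 5, 6, 8, 15]
        }
    }

Scanning the whole array for the minimum on every iteration is exactly the O(|V|²) behavior analyzed on the next slides.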
Dijkstra’s Algorithm • The tricky part is choosing the x with the lowest dx. We also have to subsequently update dy for all neighbors y of x. How can we do this efficiently? • Use an array to store the nodes • Use a heap to store the nodes, with the key being the distance field
Dijkstra’s Algorithm – Array • Store all of the vertices in an array. • Choosing the vertex with the smallest distance now requires searching the array; this takes time O(|V|). • Updating the distances takes only constant time for each edge. • Total running time is now O(|V|² + |E|), which is really just O(|V|²). • This is good for dense graphs.
Dijkstra’s Algorithm – Heap • Store all of the vertices in a min-heap, where the key is the distance field. • Choosing the vertex with the smallest distance amounts to a deleteMin(); this takes time O(log |V|). • Updating the distances requires a call to decreaseKey() each time, and so takes time O(log |V|) for each edge. • Total running time is now O(|V| * log |V| + |E| * log |V|). • This is good for sparse graphs.
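A hedged Java sketch of the heap-based version, using an adjacency list. java.util.PriorityQueue has no decreaseKey(), so this sketch uses the common lazy-deletion workaround: push a fresh (vertex, distance) entry on every improvement and skip stale entries when they are popped. Since at most one entry is pushed per edge, the bound is still O((|V| + |E|) log |V|). The names here (DijkstraHeap, adj, and the {vertex, weight} pair encoding) are our choices:

    import java.util.*;

    public class DijkstraHeap {
        // adj.get(u) holds {v, weight} pairs; returns the shortest distance
        // from s to every vertex (Long.MAX_VALUE if unreachable).
        static long[] dijkstra(List<List<int[]>> adj, int s) {
            int n = adj.size();
            long[] d = new long[n];
            Arrays.fill(d, Long.MAX_VALUE);
            d[s] = 0;

            // Heap entries are {vertex, distance}, ordered by distance.
            PriorityQueue<long[]> pq = new PriorityQueue<long[]>((a, b) -> Long.compare(a[1], b[1]));
            pq.add(new long[]{s, 0});

            while (!pq.isEmpty()) {
                long[] top = pq.poll();              // the deleteMin() step
                int x = (int) top[0];
                if (top[1] > d[x]) continue;         // stale entry: x was already finalized with a smaller distance
                for (int[] e : adj.get(x)) {         // relax each neighbor y of x
                    int y = e[0], w = e[1];
                    if (d[x] + w < d[y]) {
                        d[y] = d[x] + w;
                        pq.add(new long[]{y, d[y]}); // stands in for decreaseKey()
                    }
                }
            }
            return d;
        }
    }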
New Terminology • Given a graph G(V, E), we say that a graph G’(V’, E’) is a subgraph of G if V’ is a subset of V and E’ is a subset of E. • A tree is a graph T(V, E) with these properties (any two of the below imply the third): • T is connected • T is acyclic (i.e., T has no cycles) • |E| = |V| - 1 • A subgraph of G that contains all of its vertices, and is a tree, is called a spanning tree of G.
Minimum Spanning Trees • Back to undirected, weighted graphs • Define the cost of a spanning tree to be the sum of the weights of all edges involved • The Minimum Spanning Tree (MST) of a graph G is the spanning tree of G with the lowest cost • Two greedy algorithms to find MSTs: • Prim’s Algorithm • Kruskal’s Algorithm
Prim’s Algorithm • Start with two sets of vertices, S and N.S will represent the set of all vertices currently in the MST; N will represent the set of all vertices not in the MST. • Choose some start vertex s. It doesn’t matter which one. • Set S = {s}, N = {[everything else]}, and let the solution MST contain only s and no edges.
Prim’s Algorithm
• We’re going to repeatedly add vertices and edges to the MST, until we’ve added every vertex.

for (i = 0; i < |V| - 1; i++) {
    Consider every edge with one endpoint in S and one in N. Choose the one with lowest weight.
    Add that edge and its endpoint in N to the tree.
    Move that endpoint from N to S.
}

• On each iteration, we add a new vertex to the tree in the cheapest way possible.
Prim’s Algorithm
[For each vertex, we will store a “distance” field d and a “selected” field k. Initially, dv = infinity and kv = 0 for all v. Let s designate the source node.]

Step 1: Set ds = 0.
Step 2: for (i = 0; i < |V|; i++) {
    Out of all vertices with selected field = 0, choose the one with the lowest distance field. Call it x.
    kx = 1;
    for all neighbors y of x:
        if (dy > w(x,y))
            dy = w(x,y);
}
Prim’s Algorithm • So, Prim’s Algorithm is basically just Dijkstra’s. The difference is that with Dijkstra's algorithm, the "distance" to a node was its distance from the source; with Prim's, it's the distance from anything in the tree. • Store N as a heap, with the key being the distance field. • Choosing the next vertex requires a deleteMin() call, and takes time O(log |V|) • Updating each distance requires a decreaseKey() call, and takes time O(log |V|) • Total running time for the algorithm is now O(|V| * log |V| + |E| * log |V|) • Again, pretty good for sparse graphs
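A hedged Java sketch of heap-based Prim’s, in the same style as the Dijkstra sketch above (again using lazy deletion in place of decreaseKey(); the names PrimMST, mstCost, and adj are ours). Note the one-line difference: a vertex’s priority is the weight of the single cheapest edge connecting it to the tree, not its distance from the source:

    import java.util.*;

    public class PrimMST {
        // adj.get(u) holds {v, weight} pairs for an undirected graph.
        // Returns the total cost of an MST, assuming the graph is connected.
        static long mstCost(List<List<int[]>> adj, int s) {
            int n = adj.size();
            boolean[] inTree = new boolean[n];
            // Heap entries are {vertex, weight of the edge that would connect it to the tree}.
            PriorityQueue<int[]> pq = new PriorityQueue<int[]>((a, b) -> Integer.compare(a[1], b[1]));
            pq.add(new int[]{s, 0});
            long cost = 0;
            while (!pq.isEmpty()) {
                int[] top = pq.poll();
                int x = top[0];
                if (inTree[x]) continue;        // stale entry: x was already added more cheaply
                inTree[x] = true;
                cost += top[1];                 // weight of the edge that added x
                for (int[] e : adj.get(x)) {
                    // Unlike Dijkstra, the key is just the edge weight w(x, y).
                    if (!inTree[e[0]]) pq.add(new int[]{e[0], e[1]});
                }
            }
            return cost;
        }
    }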
Kruskal’s Algorithm
• Initially, include all of the vertices and no edges. All edges are marked “unconsidered”.

for (i = 0; i < |V| - 1; ) {
    Out of all unconsidered edges, choose the one with the lowest weight. Mark this edge “considered”.
    If adding the edge to the tree won’t cause a cycle, add it and i++.
}

• The idea is to repeatedly add the cheapest possible edge that keeps the graph acyclic. Once we’ve added |V| - 1 edges, we must have a spanning tree (why?)
Kruskal’s Algorithm
Step 1: Sort all of the edges by weight.
Step 2: for (i = 0; i < |V| - 1; ) {
    Choose the smallest unconsidered edge
    Mark it considered
    if (adding the edge doesn’t cause a cycle) {
        add it
        i++
    }
}
Kruskal’s Algorithm • How do we know whether adding an edge to the tree will cause a cycle? • Store all vertices in a disjoint set structure. • When we’re considering edge uv, run a Find() on u and v. If they’re in the same set, then they’re in the same subtree, so adding the edge would cause a cycle. • When we do add an edge uv, union the trees they belong to.
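Putting the sorted-edge loop and the cycle test together, a hedged Java sketch (KruskalMST, find, union, and the {u, v, weight} edge encoding are our choices; find() here uses path compression as a simplified stand-in for the disjoint set structure described above):

    import java.util.*;

    public class KruskalMST {
        // Minimal disjoint-set: parent[v] points toward the representative of v's set.
        static int[] parent;

        static int find(int v) {
            if (parent[v] != v) parent[v] = find(parent[v]);   // path compression
            return parent[v];
        }

        static void union(int u, int v) { parent[find(u)] = find(v); }

        // edges[i] = {u, v, weight}; returns the MST edges of a connected graph.
        static List<int[]> mst(int n, int[][] edges) {
            parent = new int[n];
            for (int v = 0; v < n; v++) parent[v] = v;

            // Step 1: sort all of the edges by weight.
            Arrays.sort(edges, (a, b) -> Integer.compare(a[2], b[2]));

            // Step 2: take the cheapest edge that does not close a cycle.
            List<int[]> tree = new ArrayList<>();
            for (int[] e : edges) {
                if (find(e[0]) != find(e[1])) {   // endpoints in different subtrees, so no cycle
                    union(e[0], e[1]);
                    tree.add(e);
                    if (tree.size() == n - 1) break;
                }
            }
            return tree;
        }
    }

The cycle test is exactly the Find() comparison from this slide: the edge is kept only when its endpoints currently lie in different sets.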
Kruskal’s Algorithm Analysis • Sorting the edges at the beginning takes O(|E| * log |E|) time • The loop iterates no more than |E| times, since each edge will be considered at most once • Each edge requires two Find() calls • We’ll need |V| - 1 Union() calls in total, because we merge all |V| sets into one • Running time is dominated by the edge sorting, so the whole thing is O(|E| * log |E|)