Disjoint Sets and Advanced Tree Topics. HKOI Training 2007, 14 Apr 2007. Acknowledgement: Presentation modified from “Minimum Spanning Trees” by Liu Chi Man (cx), 25 Mar 2006.
Prerequisites • Asymptotic complexity • Set theory • Elementary graph theory • Priority queues (or heaps)
Graphs • A graph is a set of vertices and a set of edges • G = (V, E) • Number of vertices = |V| • Number of edges = |E| • We assume a simple graph, so |E| = O(|V|^2)
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
Trees in graph theory • In graph theory, a tree is an acyclic, connected graph • Acyclic means “without cycles”
Properties of trees • |E| = |V| - 1 • |E| = Θ(|V|) • Between any pair of vertices, there is a unique path • Adding an edge between a pair of non-adjacent vertices creates exactly one cycle • Removing an edge from the tree breaks the tree into two smaller trees
Definition • The following four conditions are equivalent: • G is connected and acyclic • G is connected and |E| = |V| - 1 • G is acyclic and |E| = |V| - 1 • Between any pair of vertices in G, there exists a unique path • G is a tree if at least one of the above conditions is satisfied
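The second condition (connected and |E| = |V| - 1) gives a cheap test. Below is a minimal sketch, assuming vertices are numbered 0..n-1, n ≥ 1, and the graph is simple; the name `is_tree` is made up for illustration.

```python
from collections import deque

def is_tree(n, edges):
    """Return True if the simple undirected graph on vertices 0..n-1 is a tree.

    Uses the condition "connected and |E| = |V| - 1".
    """
    if n < 1 or len(edges) != n - 1:
        return False
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Breadth-first search from vertex 0 to count reachable vertices.
    seen = [False] * n
    seen[0] = True
    queue = deque([0])
    reached = 1
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                reached += 1
                queue.append(v)
    return reached == n
```

A triangle plus an isolated vertex has |E| = |V| - 1 but is not connected, so the check rejects it.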
Recall the Terminology • root • parent • children • siblings • ancestors • descendants [Figure: a rooted tree with these terms labelled]
Other properties of trees • Bipartite • Planar • A tree with at least two vertices has at least two leaves (vertices of degree 1)
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
The Union-Find problem • N balls initially, each ball in its own bag • Label the balls 1, 2, 3, ..., N • Two kinds of operations: • Pick two bags, put all balls in these bags into a new bag (Union) • Given a ball, find the bag containing it (Find)
The Union-Find problem • An example with 4 balls • Initial: {1}, {2}, {3}, {4} • Union {1}, {3} → {1, 3}, {2}, {4} • Find 3. Answer: {1, 3} • Union {4}, {1, 3} → {1, 3, 4}, {2} • Find 2. Answer: {2} • Find 1. Answer: {1, 3, 4}
Disjoint sets • Disjoint-set data structures can be used to solve the union-find problem • Each bag has its own representative ball • {1, 3, 4} is represented by ball 3 (for example) • {2} is represented by ball 2
Implementation 1: Naive arrays • Bag[x] := representative of the bag containing x • <O(N), O(1)> • Union takes O(N) and Find takes O(1) • Slight modifications give <O(U), O(1)> • U is the size of the union • Worst case: O(MN) for M operations
Implementation 1: Naive arrays • How to union Bag[x] and Bag[y]? • Z := Bag[x]; for each ball v with Bag[v] = Z do Bag[v] := Bag[y] • Can I update the balls in Bag[y] instead? • Rule: Update the balls in the smaller bag • O(MlgN) for M union operations
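The smaller-bag rule can be sketched as follows. This is an illustration under stated assumptions, not the original presenter's code: the class name `ArrayBags` and the `members` lists (used to enumerate the balls in a bag without scanning all N balls) are additions.

```python
class ArrayBags:
    """Naive-array disjoint sets that always relabel the smaller bag."""

    def __init__(self, n):
        # Balls are labelled 1..n; index 0 is unused.
        self.bag = list(range(n + 1))                # bag[x] = representative of x
        self.members = [[v] for v in range(n + 1)]   # balls held by each representative

    def find(self, x):
        return self.bag[x]                           # O(1)

    def union(self, x, y):
        a, b = self.bag[x], self.bag[y]
        if a == b:
            return                                   # already in the same bag
        if len(self.members[a]) > len(self.members[b]):
            a, b = b, a                              # make a the smaller bag
        for v in self.members[a]:                    # relabel only the smaller bag
            self.bag[v] = b
        self.members[b].extend(self.members[a])
        self.members[a] = []
```

Each time a ball is relabelled, the bag containing it at least doubles in size, so a ball is relabelled at most lg N times.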
Implementation 2: Forest • A forest is a collection of trees • Each bag is represented by a rooted tree, with the root being the representative ball • Example: two bags, {1, 3, 5} and {2, 4, 6, 7} [Figure: the two trees]
Implementation 2: Forest • Find(x) • Traverse from x up to the root • Union(x, y) • Merge the two trees containing x and y
Implementation 2: Forest • Example with 4 balls: Initial, Union 1 3, Union 2 4, Find 4 [Figure: the forest after each step]
Implementation 2: Forest • Example continued: Union 1 4, Find 4 [Figure: the forest after each step]
Implementation 2: Forest • How to represent the trees? • Leftmost-Child-Right-Sibling (LCRS)? • Too complicated • Parent array • Parent[x] := parent of x • If x is a tree root, set Parent[x] := x
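A parent-array forest without any heuristics might look like the sketch below. The function names are illustrative, and the choice of which root is attached under which is arbitrary here.

```python
def make_forest(n):
    """Balls 1..n, each in its own bag. parent[x] == x marks a root."""
    return list(range(n + 1))     # index 0 is unused

def find(parent, x):
    """Walk from x up to the root of its tree."""
    while parent[x] != x:
        x = parent[x]
    return x

def union(parent, x, y):
    """Merge the trees containing x and y by linking one root under the other."""
    rx, ry = find(parent, x), find(parent, y)
    if rx != ry:
        parent[rx] = ry           # arbitrary choice: no rank, no compression
```

Usage: `parent = make_forest(4)`, then `union(parent, 1, 3)` and `find(parent, 1)`.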
Implementation 2: Forest • The worst case is still O(MN) for M operations • What is the worst case? • Improvements • Union-by-rank • Path compression
Union-by-rank • We should avoid tall trees • When taking a union, the root of the taller tree becomes the new root • So, keep track of tree heights (ranks) [Figure: a bad union and a good union]
Path compression • See also the solution for Symbolic Links (HKOI2005 Senior Final) • Find(x): traverse from x up to root • Compress the x-to-root path at the same time
Path compression • Find(4): the root is 3 [Figure: the tree before and after the path from 4 to the root is compressed]
U-by-rank + Path compression • We ignore the effect of path compression on tree heights to simplify U-by-rank • U-by-rank alone gives O(MlgN) • U-by-rank + path compression gives O(Mα(N)) • α: inverse Ackermann function • α(N) ≤ 5 for practically large N
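Both heuristics together can be sketched as below. The class name `DisjointSets` and the two-pass iterative `find` are choices made for this illustration; ranks are not adjusted when paths are compressed, as the slide suggests.

```python
class DisjointSets:
    """Disjoint-set forest with union-by-rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n + 1))   # balls 1..n; index 0 unused
        self.rank = [0] * (n + 1)          # upper bound on tree height

    def find(self, x):
        # First pass: locate the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every vertex on the x-to-root path at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, x, y):
        """Merge the bags of x and y. Return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx                # rx is now the taller (or equal) root
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1             # equal heights: the result is one taller
        return True
```

The iterative `find` avoids deep recursion on long paths, which matters in Python where the recursion limit is low by default.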
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
Minimum spanning trees • Given a connected graph G = (V, E), a spanning tree of G is a graph T such that • T is a subgraph of G • T is a tree • T contains every vertex of G • A connected graph must have at least one spanning tree (why?)
Minimum spanning trees • Given a weighted connected graph G, a minimum spanning tree T* of G is a spanning tree of G with minimum total edge weight • Is it unique? • Application: Minimizing the total length of wires needed to connect up a collection of computers
Minimum spanning trees • Two algorithms • Kruskal’s algorithm • Prim’s algorithm
Kruskal’s algorithm • Choose edges in ascending weight greedily, while preventing cycles
Kruskal’s algorithm • Algorithm • T is an empty set • Sort the edges in G by their weights • For (in ascending weight) each edge e do • If T ∪ {e} is acyclic then • Add e to T • Return T
Kruskal’s algorithm • How to detect a cycle? • Depth-first search (DFS) • O(V) per check • O(VE) overall • Disjoint set • Vertices are balls, connected components are bags
Kruskal’s algorithm • Algorithm (using disjoint-set) • T is an empty set • Create bags {1}, {2}, …, {V} • Sort the edges in G by their weights • For (in ascending weight) each edge e do • Suppose e connects vertices x and y • If Find(x) ≠ Find(y) then • Add e to T, then Union(Find(x), Find(y)) • Return T
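The disjoint-set version of Kruskal's algorithm can be sketched as follows. Assumptions for this illustration: vertices are 1..n, edges are given as (weight, x, y) tuples, and `find` uses path halving (a simple variant of path compression) rather than full compression.

```python
def kruskal(n, edges):
    """Return (total weight, list of chosen edges) of a minimum spanning tree.

    edges: iterable of (weight, x, y) with vertices numbered 1..n.
    The graph is assumed to be connected.
    """
    parent = list(range(n + 1))
    rank = [0] * (n + 1)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total, tree = 0, []
    for w, x, y in sorted(edges):           # ascending weight
        rx, ry = find(x), find(y)
        if rx != ry:                        # e joins two components: no cycle
            if rank[rx] < rank[ry]:
                rx, ry = ry, rx
            parent[ry] = rx                 # union by rank
            if rank[rx] == rank[ry]:
                rank[rx] += 1
            tree.append((x, y, w))
            total += w
    return total, tree
```

For example, `kruskal(4, [(1, 1, 2), (2, 2, 3), (3, 1, 3), (4, 3, 4)])` skips the weight-3 edge because vertices 1 and 3 are already in the same bag.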
Kruskal’s algorithm • The improved time complexity is O(ElgV) • The bottleneck is sorting
Prim’s algorithm • In Kruskal’s algorithm, the MST-in-progress scatters around • Prim’s algorithm grows the MST from a “seed” • Prim’s algorithm iteratively chooses the lightest grow-able edge • A grow-able edge connects a grown vertex and a non-grown vertex
Prim’s algorithm • Algorithm • Let seed be any vertex, and Grown := {seed} • Initially T is an empty set • Repeat |V|-1 times • Let e=(x,y) be the lightest grow-able edge • Add e to T • Add x and y to Grown • Return T
Prim’s algorithm • How to find the lightest grow-able edge? • Check all (grown, non-grown) vertex pairs • Too slow • Each non-grown vertex x keeps a value nearest[x], which is the weight of the lightest edge connecting x to some grown vertex • nearest[x] = ∞ if no such edge
Prim’s algorithm • How to use nearest? • Grow the vertex (x) with the minimum nearest-value • Which edge? Keep track of it! • Since x has just been grown, we need to update the nearest-values of all non-grown vertices • Only need to consider edges incident to x
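One way to put the nearest array to work is sketched below. Assumptions for this illustration: vertices are 0..n-1, the graph is connected and given as a symmetric weight matrix with `math.inf` for missing edges, vertex 0 is the seed, and the `via` array (not named on the slides) remembers which edge achieves each nearest-value.

```python
import math

def prim_dense(n, weight):
    """Prim's algorithm with a nearest[] array on a weight matrix.

    Returns (total weight, list of tree edges as (grown, new) pairs).
    """
    grown = [False] * n
    nearest = [math.inf] * n
    via = [-1] * n            # grown endpoint of the lightest edge into each vertex
    nearest[0] = 0            # vertex 0 is the seed
    total, tree = 0, []
    for _ in range(n):        # the first round grows the seed itself
        # Pick the non-grown vertex with the minimum nearest-value.
        x = min((v for v in range(n) if not grown[v]), key=lambda v: nearest[v])
        grown[x] = True
        if via[x] != -1:
            tree.append((via[x], x))
            total += nearest[x]
        # Only edges incident to x can improve a nearest-value.
        for y in range(n):
            if not grown[y] and weight[x][y] < nearest[y]:
                nearest[y] = weight[x][y]
                via[y] = x
    return total, tree
```

The loop runs |V| times rather than |V| - 1 because the first iteration grows the seed without adding an edge.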
Prim’s algorithm • Try to program Prim’s algorithm • You may find that it’s very similar to Dijkstra’s algorithm for finding shortest paths! • Almost only a one-line difference
Prim’s algorithm • Per round... • Finding minimum nearest-value: O(V) • Updating nearest-values: O(V) (Overall O(E)) • Overall: O(V^2 + E) = O(V^2) time • Using a binary heap, • O(lgV) per Finding minimum • O(lgV) per Updating • Overall: O(ElgV) time
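A heap-based version can be sketched with Python's `heapq`. Since `heapq` offers no decrease-key operation, this sketch pushes a fresh entry for every candidate edge and skips stale entries on pop (lazy deletion), which is a substitute for the "Updating" step above. The heap then holds up to O(E) entries, and the bound stays O(ElgV) because lg E = O(lg V) in a simple graph. The adjacency-list format and the name `prim_heap` are assumptions.

```python
import heapq

def prim_heap(n, adj, seed=0):
    """Prim's algorithm with a binary heap and lazy deletion.

    adj[x] is a list of (weight, y) pairs; vertices are 0..n-1.
    Returns (total weight, list of tree edges as (grown, new) pairs).
    """
    grown = [False] * n
    total, tree = 0, []
    heap = [(0, seed, seed)]          # (weight, grown endpoint, candidate vertex)
    while heap:
        w, x, y = heapq.heappop(heap)
        if grown[y]:
            continue                  # stale entry: y was grown via a lighter edge
        grown[y] = True
        if x != y:                    # the seed entry carries no real edge
            tree.append((x, y))
            total += w
        for w2, z in adj[y]:
            if not grown[z]:
                heapq.heappush(heap, (w2, y, z))
    return total, tree
```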
MST Extensions • Second-best MST • We don’t want the best! • Online MST • See IOI2003 Path Maintenance • Minimum bottleneck spanning tree • The bottleneck of a spanning tree is the weight of its maximum weight edge • An algorithm that runs in O(V+E) exists
MST Extensions (NP-Hard) • Minimum Steiner Tree • No need to connect all vertices, but at least a given subset B ⊆ V • Degree-bounded MST • Every vertex of the spanning tree must have degree not greater than a given value K
Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics
Various tree topics (List) • Center, eccentricity, radius, diameter • Lowest common ancestor (LCA) • Tree isomorphism • Canonical representation • Prüfer code • Counting spanning trees
Supplementary readings • Advanced: • Disjoint set forest (Lecture slides) • Prim’s algorithm • Kruskal’s algorithm • Center and diameter • Post-advanced (so-called Beginners): • Lowest common ancestor • Maximum branching