Disjoint Sets and Advanced Tree Topics

HKOI Training 2007 • 14 Apr 2007 • Acknowledgement: Presentation modified from “Minimum Spanning Trees” by Liu Chi Man (cx), 25 Mar 2006

Prerequisites • Asymptotic complexity • Set theory • Elementary graph theory • Priority queues (or heaps)

Graphs • A graph is a set of vertices and a set of edges • G = (V, E) • Number of vertices = |V| • Number of edges = |E| • We assume a simple graph, so |E| = O(|V|²)

Roadmap • What is a tree? • Disjoint sets • Minimum spanning trees • Various tree topics

Trees in graph theory • In graph theory, a tree is an acyclic, connected graph • Acyclic means “without cycles”

Properties of trees • |E| = |V| - 1 • |E| = Θ(|V|) • Between any pair of vertices, there is a unique path • Adding an edge between a pair of non-adjacent vertices creates exactly one cycle • Removing an edge from the tree breaks the tree into two smaller trees

Definition • The following four conditions are equivalent: • G is connected and acyclic • G is connected and |E| = |V| - 1 • G is acyclic and |E| = |V| - 1 • Between any pair of vertices in G, there exists a unique path • G is a tree if at least one of the above conditions is satisfied
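The second condition (connected and |E| = |V| - 1) is easy to check in code. A minimal Python sketch, assuming vertices are labelled 0 to num_vertices - 1 and the graph is simple; the name is_tree and the choice to reject an empty graph are assumptions made for this illustration:

```python
def is_tree(num_vertices, edges):
    """Return True if the undirected simple graph is a tree.

    Checks the condition "connected and |E| = |V| - 1".
    Vertices are assumed to be labelled 0 .. num_vertices - 1.
    """
    # Assumption: a graph with no vertices is not counted as a tree.
    if num_vertices == 0 or len(edges) != num_vertices - 1:
        return False

    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Iterative depth-first search from vertex 0 to test connectivity.
    seen = [False] * num_vertices
    seen[0] = True
    stack = [0]
    reached = 1
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if not seen[v]:
                seen[v] = True
                reached += 1
                stack.append(v)
    return reached == num_vertices
```

For example, is_tree(4, [(0, 1), (1, 2), (1, 3)]) holds, while a triangle plus an isolated vertex has the right edge count but fails the connectivity test.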

Recall the Terminology • root • parent • children • siblings • ancestors • descendants

Other properties of trees • Bipartite • Planar • A tree with at least two vertices has at least two leaves (vertices of degree 1)

The Union-Find problem • N balls initially, each ball in its own bag • Label the balls 1, 2, 3, ..., N • Two kinds of operations: • Pick two bags, put all balls in these bags into a new bag (Union) • Given a ball, find the bag containing it (Find)

The Union-Find problem • An example with 4 balls • Initial: {1}, {2}, {3}, {4} • Union {1}, {3} → {1, 3}, {2}, {4} • Find 3. Answer: {1, 3} • Union {4}, {1, 3} → {1, 3, 4}, {2} • Find 2. Answer: {2} • Find 1. Answer: {1, 3, 4}

Disjoint sets • Disjoint-set data structures can be used to solve the union-find problem • Each bag has its own representative ball • {1, 3, 4} is represented by ball 3 (for example) • {2} is represented by ball 2

Implementation 1: Naive arrays • Bag[x] := representative of the bag containing x • <O(N), O(1)> • Union takes O(N) and Find takes O(1) • Slight modifications give <O(U), O(1)> • U is the size of the union • Worst case: O(MN) for M operations

Implementation 1: Naive arrays • How to union Bag[x] and Bag[y]? • Z := Bag[x]; for each ball v in bag Z do Bag[v] := Bag[y] • Can I update the balls in Bag[y] instead? • Rule: Update the balls in the smaller bag • O(MlgN) for M union operations
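A minimal Python sketch of the naive-array idea with the "update the smaller bag" rule. The class name ArrayBags and the extra members lists (used to enumerate the balls of a bag without scanning the whole array) are assumptions added for this illustration:

```python
class ArrayBags:
    """Naive-array disjoint sets over balls 1..n (index 0 is unused)."""

    def __init__(self, n):
        # bag[x] is the representative of the bag containing ball x.
        self.bag = list(range(n + 1))
        # members[r] lists the balls of the bag represented by r.
        self.members = [[x] for x in range(n + 1)]

    def find(self, x):
        # O(1): the representative is stored directly.
        return self.bag[x]

    def union(self, x, y):
        a, b = self.bag[x], self.bag[y]
        if a == b:
            return
        # Rule: update the balls in the smaller bag.
        if len(self.members[a]) > len(self.members[b]):
            a, b = b, a
        for v in self.members[a]:
            self.bag[v] = b
        self.members[b].extend(self.members[a])
        self.members[a] = []
```

Each ball is relabelled only when its bag at least doubles in size, which is where the lg N factor comes from.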

Implementation 2: Forest • A forest is a collection of trees • Each bag is represented by a rooted tree, with the root being the representative ball • Example: Two bags, {1, 3, 5} and {2, 4, 6, 7} (figure not reproduced)

Implementation 2: Forest • Find(x) • Traverse from x up to the root • Union(x, y) • Merge the two trees containing x and y

Implementation 2: Forest • Example on balls 1 to 4 (figures not reproduced) • Initial • Union 1 3 • Union 2 4 • Find 4

Implementation 2: Forest • Example continued (figures not reproduced) • Union 1 4 • Find 4

Implementation 2: Forest • How to represent the trees? • Leftmost-Child-Right-Sibling (LCRS)? • Too complicated • Parent array • Parent[x] := parent of x • If x is a tree root, set Parent[x] := x
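A minimal Python sketch of the parent-array forest, before any improvement. The function names make_forest, find and union are illustrative, and attaching the first root under the second is an arbitrary choice made here:

```python
def make_forest(n):
    """Parent array for balls 1..n; parent[x] == x marks x as a root."""
    # Index 0 is unused so that ball labels match array indices.
    return list(range(n + 1))


def find(parent, x):
    """Traverse from x up to the root and return the root."""
    while parent[x] != x:
        x = parent[x]
    return x


def union(parent, x, y):
    """Merge the trees containing x and y (no balancing yet)."""
    root_x, root_y = find(parent, x), find(parent, y)
    if root_x != root_y:
        # Arbitrary choice: hang the tree of x under the root of y.
        parent[root_x] = root_y
```

Usage mirrors the four-ball example: after union(parent, 1, 3) and union(parent, 2, 4), a call to find(parent, 2) walks one step up to the root 4.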

Implementation 2: Forest • The worst case is still O(MN) for M operations • What is the worst case? • Improvements • Union-by-rank • Path compression

Union-by-rank • We should avoid tall trees • On a union, the root of the taller tree becomes the new root • So, keep track of tree heights (ranks) • Figure: a bad union versus a good union (not reproduced)

Path compression • See also the solution for Symbolic Links (HKOI2005 Senior Final) • Find(x): traverse from x up to root • Compress the x-to-root path at the same time

Path compression • Example: Find(4) • The root is 3 (figure not reproduced)

U-by-rank + Path compression • We ignore the effect of path compression on tree heights to simplify U-by-rank • U-by-rank alone gives O(MlgN) • U-by-rank + path compression gives O(M α(N)) • α: inverse Ackermann function • α(N) ≤ 5 for practically large N
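Both improvements fit in a few lines on top of the parent array. A minimal Python sketch, assuming balls are labelled 1 to n; the class name DisjointSets and the boolean return value of union are choices made for this illustration. As the slide suggests, ranks are not adjusted when paths are compressed:

```python
class DisjointSets:
    """Disjoint-set forest with union-by-rank and path compression."""

    def __init__(self, n):
        # Balls are 1..n; index 0 is unused.
        self.parent = list(range(n + 1))
        # rank[r] is an upper bound on the height of the tree rooted at r.
        self.rank = [0] * (n + 1)

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every vertex on the x-to-root path at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, x, y):
        """Merge the bags of x and y; return False if already merged."""
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False
        # The root of the taller tree becomes the new root.
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1
        return True
```

The two-pass find avoids recursion, which matters in Python when a tree is tall before its first compression.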

Minimum spanning trees • Given a connected graph G = (V, E), a spanning tree of G is a graph T such that • T is a subgraph of G • T is a tree • T contains every vertex of G • A connected graph must have at least one spanning tree (why?)

Minimum spanning trees • Given a weighted connected graph G, a minimum spanning tree T* of G is a spanning tree of G with minimum total edge weight • Is it unique? • Application: Minimizing the total length of wires needed to connect up a collection of computers

Minimum spanning trees • Two algorithms • Kruskal’s algorithm • Prim’s algorithm

Kruskal’s algorithm • Choose edges in ascending weight greedily, while preventing cycles

Kruskal’s algorithm • Algorithm • T is an empty set • Sort the edges in G by their weights • For (in ascending weight) each edge e do • If T ∪ {e} is acyclic then • Add e to T • Return T

Kruskal’s algorithm • How to detect a cycle? • Depth-first search (DFS) • O(V) per check • O(VE) overall • Disjoint set • Vertices are balls, connected components are bags

Kruskal’s algorithm • Algorithm (using disjoint-set) • T is an empty set • Create bags {1}, {2}, …, {V} • Sort the edges in G by their weights • For (in ascending weight) each edge e do • Suppose e connects vertices x and y • If Find(x) ≠ Find(y) then • Add e to T, then Union(Find(x), Find(y)) • Return T

Kruskal’s algorithm • The improved time complexity is O(ElgV) • The bottleneck is sorting
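A minimal Python sketch of Kruskal’s algorithm on top of a disjoint-set forest. The edge format (weight, x, y), the 1-based vertex labels and the return value are assumptions for this illustration; find uses path halving, a simple variant of path compression:

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, tree_edges) of a minimum spanning forest.

    edges is a list of (weight, x, y) tuples; vertices are 1..num_vertices.
    """
    parent = list(range(num_vertices + 1))
    rank = [0] * (num_vertices + 1)

    def find(x):
        while parent[x] != x:
            # Path halving: skip over every other vertex on the way up.
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    tree = []
    # Sorting is the bottleneck.
    for weight, x, y in sorted(edges):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            # Adding e creates no cycle: keep it and merge the two bags.
            if rank[root_x] < rank[root_y]:
                root_x, root_y = root_y, root_x
            parent[root_y] = root_x
            if rank[root_x] == rank[root_y]:
                rank[root_x] += 1
            total += weight
            tree.append((weight, x, y))
    return total, tree
```

On a connected graph the returned tree has |V| - 1 edges; on a disconnected graph the same loop yields a minimum spanning forest instead.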

Prim’s algorithm • In Kruskal’s algorithm, the MST-in-progress scatters around • Prim’s algorithm grows the MST from a “seed” • Prim’s algorithm iteratively chooses the lightest grow-able edge • A grow-able edge connects a grown vertex and a non-grown vertex

Prim’s algorithm • Algorithm • Let seed be any vertex, and Grown := {seed} • Initially T is an empty set • Repeat |V|-1 times • Let e=(x,y) be the lightest grow-able edge • Add e to T • Add x and y to Grown • Return T

Prim’s algorithm • How to find the lightest grow-able edge? • Check all (grown, non-grown) vertex pairs • Too slow • Each non-grown vertex x keeps a value nearest[x], which is the weight of the lightest edge connecting x to some grown vertex • nearest[x] = ∞ if no such edge

Prim’s algorithm • How to use nearest? • Grow the vertex (x) with the minimum nearest-value • Which edge? Keep track of it! • Since x has just been grown, we need to update the nearest-values of all non-grown vertices • Only need to consider edges incident to x
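A minimal Python sketch of Prim’s algorithm with the nearest array. The adjacency-matrix input (math.inf where there is no edge), the 0-based labels, the seed vertex 0 and the helper array via, which remembers the edge behind each nearest-value, are assumptions for this illustration:

```python
import math


def prim_dense(weight):
    """Return (total_weight, tree_edges) of an MST.

    weight is a symmetric n x n matrix; weight[x][y] is math.inf if
    x and y are not adjacent. Vertex 0 is used as the seed.
    """
    n = len(weight)
    grown = [False] * n
    nearest = [math.inf] * n
    via = [-1] * n  # grown endpoint of the lightest edge reaching each vertex
    if n > 0:
        nearest[0] = 0  # the seed is grown first, at no cost
    total = 0
    tree = []
    for _ in range(n):
        # Pick the non-grown vertex with the minimum nearest-value.
        x = -1
        for v in range(n):
            if not grown[v] and (x == -1 or nearest[v] < nearest[x]):
                x = v
        if nearest[x] == math.inf:
            raise ValueError("graph is not connected")
        grown[x] = True
        if via[x] != -1:
            total += nearest[x]
            tree.append((via[x], x))
        # Only edges incident to x can improve a nearest-value.
        for v in range(n):
            if not grown[v] and weight[x][v] < nearest[v]:
                nearest[v] = weight[x][v]
                via[v] = x
    return total, tree
```

Both inner loops scan all vertices once per round, so this version suits dense graphs stored as a matrix.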

Prim’s algorithm • Try to program Prim’s algorithm • You may find that it’s very similar to Dijkstra’s algorithm for finding shortest paths! • Almost only a one-line difference

Prim’s algorithm • Per round... • Finding minimum nearest-value: O(V) • Updating nearest-values: O(V) (Overall O(E)) • Overall: O(V² + E) = O(V²) time • Using a binary heap, • O(lgV) per Finding minimum • O(lgV) per Updating • Overall: O(ElgV) time
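A minimal Python sketch of the heap-based variant using the standard heapq module. Because heapq has no decrease-key operation, this sketch swaps the "Updating" step for lazy deletion: improved candidates are pushed as new entries and stale ones are skipped when popped. The heap may then hold up to O(E) entries, which still gives O(ElgV) on a simple graph. The adjacency-list format and the name prim_heap are assumptions for this illustration:

```python
import heapq


def prim_heap(num_vertices, adjacency):
    """Return the total weight of an MST.

    adjacency[x] is a list of (weight, y) pairs; vertices are
    0..num_vertices - 1 and vertex 0 is used as the seed.
    """
    grown = [False] * num_vertices
    heap = [(0, 0)]  # (candidate nearest-value, vertex); the seed costs 0
    total = 0
    grown_count = 0
    while heap and grown_count < num_vertices:
        w, x = heapq.heappop(heap)
        if grown[x]:
            continue  # stale entry: x was grown through a lighter edge
        grown[x] = True
        grown_count += 1
        total += w
        # Only edges incident to x can offer new candidates.
        for weight, y in adjacency[x]:
            if not grown[y]:
                heapq.heappush(heap, (weight, y))
    if grown_count != num_vertices:
        raise ValueError("graph is not connected")
    return total
```

Replacing the pushed key weight with w + weight turns this loop into Dijkstra’s algorithm, which is the near one-line difference mentioned earlier.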

MST Extensions • Second-best MST • We don’t want the best! • Online MST • See IOI2003 Path Maintenance • Minimum bottleneck spanning tree • The bottleneck of a spanning tree is the weight of its maximum weight edge • An algorithm that runs in O(V+E) exists

MST Extensions (NP-Hard) • Minimum Steiner Tree • No need to connect all vertices, but at least a given subset B ⊆ V • Degree-bounded MST • Every vertex of the spanning tree must have degree not greater than a given value K

Various tree topics (List) • Center, eccentricity, radius, diameter • Lowest common ancestor (LCA) • Tree isomorphism • Canonical representation • Prüfer code • Counting spanning trees

Supplementary readings • Advanced: • Disjoint set forest (Lecture slides) • Prim’s algorithm • Kruskal’s algorithm • Center and diameter • Post-advanced (so-called Beginners): • Lowest common ancestor • Maximum branching
