Advanced Data Structure. Hackson Leung 2009-01-10. Outline. Review General Array Linked List Sorted Array Main dish Binary Search Tree Heap (aka Priority Queue) Hash Table Dessert?. Review. Something you should have known. Review. Data Structure – way to store data
Advanced Data Structure Hackson Leung 2009-01-10
Outline • Review • General Array • Linked List • Sorted Array • Main dish • Binary Search Tree • Heap (aka Priority Queue) • Hash Table • Dessert?
Review Something you should have known
Review • Data Structure – way to store data • GOOD or BAD? • No absolute answer • Operations Supported • Find an element T inside • Find the minimum element inside • Insert a new element T in it • Remove the Tth element inside • Remove the minimum element inside
Review • GOOD or BAD? • No absolute answer • WHY? • Carrying cash (small change) is troublesome • But most shops accept cash • Octopus is convenient for payment • But not many shops accept it • Cash OR Octopus? It depends • Which structure is the best? • It also depends!
Review • GOOD or BAD? • No absolute answer IF we consider many operations • A definite answer IF we consider only one operation • For one operation, compare the time complexity • Still remember that? • O(1), O(lgN), O(N), O(NlgN), O(N^2)...
Review - Array • If you don’t know that, you won’t be here • Find an element T • Search the whole array, O(N) • Find the minimum • Search the whole array, O(N) • Insert an element T • A[cnt] = T; cnt = cnt + 1; O(1) • Remove the Tth element • O(1). Why? • Remove the minimum element • O(N). Why?
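The two "Why?" questions above can be answered with a small sketch. It assumes the order of elements does not matter, so the Tth element can be removed by overwriting it with the last one. The class and method names are illustrative only, not from the slides.

```python
class UnorderedArray:
    """Unsorted array; a Python list plays the role of A[] and cnt."""

    def __init__(self):
        self.a = []

    def insert(self, t):
        # O(1): write at the end, like A[cnt] = T; cnt = cnt + 1.
        self.a.append(t)

    def find(self, t):
        # O(N): scan the whole array; return the index or -1.
        for i, x in enumerate(self.a):
            if x == t:
                return i
        return -1

    def remove_at(self, i):
        # O(1): order does not matter, so move the last element
        # into slot i and shrink the array by one.
        self.a[i] = self.a[-1]
        self.a.pop()

    def remove_min(self):
        # O(N): the minimum has to be located by a full scan first.
        i = min(range(len(self.a)), key=self.a.__getitem__)
        m = self.a[i]
        self.remove_at(i)
        return m
```

Removal by position is cheap only because no shifting is needed; removing the minimum stays O(N) because finding it is O(N).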
Review – Linked List • Find an element T • Search the whole list, O(N) • Find the minimum • Search the whole list, O(N) • Insert an element T • O(1). Why? • Remove the element with pointer T • O(1), with better implementation. • Remove the minimum element • O(N). Why?
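One possible reading of "better implementation" is a doubly linked list with a sentinel node, sketched below under that assumption. Given a pointer to a node, removal only rewires two links. All names here are hypothetical.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class LinkedList:
    def __init__(self):
        # Circular sentinel: avoids special cases for empty list / ends.
        self.head = Node(None)
        self.head.prev = self.head.next = self.head

    def insert(self, value):
        # O(1): always insert right after the sentinel (at the front).
        node = Node(value)
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node
        return node  # the caller keeps this "pointer" for later removal

    def remove(self, node):
        # O(1): unlink the node using its own prev/next pointers.
        node.prev.next = node.next
        node.next.prev = node.prev

    def to_list(self):
        out, cur = [], self.head.next
        while cur is not self.head:
            out.append(cur.value)
            cur = cur.next
        return out
```

With a singly linked list the predecessor would have to be found first, which is why the O(1) bound depends on the implementation.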
Review – Sorted Array • Find an element T • Binary search, O(lgN) • Find the minimum • Still need to search?!?! No way! O(1) • Insert an element T • Where to add? O(lgN) • Maintain sorted, O(N) shift • Remove the Tth element • Delete + Shift, O(N) • Remove the minimum element • Delete + Shift, O(N) (Can it be O(1)???)
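A minimal sketch of the sorted array, using Python's standard `bisect` module for the binary search. The `start` index is one possible answer to "Can it be O(1)???": instead of shifting, the minimum is skipped by moving a start marker. That trick is an assumption of this sketch, not something the slides state.

```python
import bisect


class SortedArray:
    def __init__(self):
        self.a = []
        self.start = 0  # index of the current minimum (assumed trick)

    def find(self, t):
        # O(lgN): binary search in the live part a[start:].
        i = bisect.bisect_left(self.a, t, self.start)
        return i if i < len(self.a) and self.a[i] == t else -1

    def insert(self, t):
        # O(lgN) to locate the position, O(N) to shift elements.
        bisect.insort(self.a, t, self.start)

    def find_min(self):
        # O(1): the minimum is always the first live element.
        return self.a[self.start]

    def remove_min(self):
        # O(1): advance the start marker instead of shifting.
        m = self.a[self.start]
        self.start += 1
        return m
```

The skipped cells are never reclaimed here; a real implementation would compact the array now and then.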
Review – Summary • If we perform only a few operations, fine • If we perform a lot...... • NONE ARE EFFICIENT!
Binary Search Tree I suppose you have learnt tree before
Binary Search Tree • Make use of Binary Tree • Binary Tree? At most 2 children! • One more property • Left Subtree < Node < Right Subtree • (Figure: root 11; its children are 8 and 15; 8 has children 4 and 9; 15 has right child 20)
Binary Search Tree • Insert 11, 8, 15, 9, 20, 4 • (Figure: resulting tree with root 11; its children are 8 and 15; 8 has children 4 and 9; 15 has right child 20)
Binary Search Tree • Find 9 • (Figure: search path 11 → 8 → 9, found)
Binary Search Tree • Find 14 • (Figure: search path 11 → 15, then 15 has no left child, so 14 is not found)
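Insert and Find from the slides above can be sketched as follows. Duplicates are silently ignored, which is an assumption of this sketch; the function and class names are illustrative.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return BSTNode(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = BSTNode(key)
                return root
            cur = cur.left
        elif key > cur.key:
            if cur.right is None:
                cur.right = BSTNode(key)
                return root
            cur = cur.right
        else:
            return root  # key already present: ignore (assumption)


def bst_find(root, key):
    """Walk down from the root; return the node or None."""
    cur = root
    while cur is not None and cur.key != key:
        cur = cur.left if key < cur.key else cur.right
    return cur
```

Both functions follow a single root-to-leaf path, so their cost is proportional to the height of the tree.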
Binary Search Tree • Remove • Leaf Node: very easy • Single Child: still easy • Push that child upward • 2 Children? • Find the Max (Min) of the Left (Right) Subtree • Replace the node with it • Then remove that Max (Min) node... • Leaf? • Single? • 2 Children?
Binary Search Tree • Remove • Leaf Node: very easy • Single Child: still easy • Push that child upward • 2 Children? • Replace by the Max (Min) of the Left (Right) Subtree • Removing that node is either the leaf or the single child case • Alternative: mark the node as deleted instead of actually deleting it (Lazy Deletion)
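Binary Search Tree • Remove • Mark the node as deleted instead of actually deleting it (Lazy Deletion) • Delete 11 • (Figure: tree before deletion, with root 11; its children are 8 and 15; 8 has children 4 and 9; 15 has right child 20)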
Binary Search Tree • Remove • Mark the node as deleted instead of actually deleting it (Lazy Deletion) • Delete 11 • (Figure: same tree shape; the root is marked as deleted, the nodes 8, 15, 4, 9, 20 are unchanged)
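Lazy Deletion can be sketched with a `deleted` flag on each node. The flag name, the class names, and the choice to revive a node when its key is inserted again are all assumptions of this sketch.

```python
class LazyNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.deleted = False  # lazy-deletion mark


class LazyBST:
    def __init__(self):
        self.root = None

    def _locate(self, key):
        # Ordinary BST walk; deleted nodes still guide the search.
        cur = self.root
        while cur is not None and cur.key != key:
            cur = cur.left if key < cur.key else cur.right
        return cur

    def insert(self, key):
        if self.root is None:
            self.root = LazyNode(key)
            return
        cur = self.root
        while True:
            if key == cur.key:
                cur.deleted = False  # revive a lazily deleted key
                return
            side = "left" if key < cur.key else "right"
            child = getattr(cur, side)
            if child is None:
                setattr(cur, side, LazyNode(key))
                return
            cur = child

    def find(self, key):
        node = self._locate(key)
        return node is not None and not node.deleted

    def remove(self, key):
        # No restructuring: just mark the node.
        node = self._locate(key)
        if node is not None:
            node.deleted = True
```

The tree never shrinks under this scheme, so it suits workloads where removals are rare or the tree is rebuilt periodically.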
Binary Search Tree • Summary • Insert is similar to Find • Find Minimum: Left Most • Remove Min: Find Min + Remove • Complexities • Find: O(lgN) • Find Min: O(lgN) • Remove Min: O(lgN) • Insert: O(lgN) • Remove: O(lgN)
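Binary Search Tree • World of lgN! • But it is only supposed to be lgN • Upper bound? O(N) per operation • Example? Insert 1, 2, 3, ..., N in this order • Reason? The tree degenerates into a chain of height N • Solution • Balanced trees: AVL Tree, Red-Black Tree • They keep the height O(lgN) by using rotations • Rarely used + difficult to implement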
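The gap between "supposed to be lgN" and the real upper bound can be checked with a small experiment: build a plain (unbalanced) BST from sorted input and measure its height. This is only an illustrative sketch; the helper names are made up, and height is counted in nodes.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    # Plain BST insert without any rebalancing.
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def height(root):
    # Number of nodes on the longest root-to-leaf path.
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


print(height(build(range(1, 101))))         # sorted input: a chain
print(height(build([11, 8, 15, 9, 20, 4])))  # the tree from the slides
```

Sorted input makes every new key the right child of the previous one, so each operation walks the whole chain.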
Heap Priority is given to...
Heap (Priority Queue) • Binary Tree (usually complete) • The queue only supports • Enqueue (Insert) • Dequeue (Remove Min) • One property • Each node's value is no smaller (no larger) than the values of all its descendants • Max (Min) Heap
Heap (Priority Queue) • How to implement? Array! • Suppose a 1-based array • For a node at the Tth position • Parent = floor(T/2) • Left = 2T • Right = 2T+1
Heap (Priority Queue) • Enqueue • Just insert at the end • Shift up until property is satisfied
Heap (Priority Queue) • Dequeue • Replace top element with the last • Shift down until property is satisfied
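Enqueue and Dequeue on the 1-based array layout can be sketched as a min heap. Index 0 is left unused so that Parent = T // 2, Left = 2T and Right = 2T + 1 hold exactly as on the slide. The class and method names are illustrative.

```python
class MinHeap:
    def __init__(self):
        self.a = [None]  # index 0 unused: the array is 1-based

    def __len__(self):
        return len(self.a) - 1

    def find_min(self):
        return self.a[1]  # the root is always the minimum

    def enqueue(self, x):
        # Insert at the end, then shift up while smaller than the parent.
        self.a.append(x)
        t = len(self.a) - 1
        while t > 1 and self.a[t] < self.a[t // 2]:
            self.a[t], self.a[t // 2] = self.a[t // 2], self.a[t]
            t //= 2

    def dequeue(self):
        # Replace the top with the last element, then shift down.
        top = self.a[1]
        last = self.a.pop()
        if len(self.a) > 1:
            self.a[1] = last
            self._shift_down(1)
        return top

    def _shift_down(self, t):
        n = len(self.a) - 1
        while 2 * t <= n:
            c = 2 * t                      # left child
            if c + 1 <= n and self.a[c + 1] < self.a[c]:
                c += 1                     # right child is smaller
            if self.a[t] <= self.a[c]:
                break                      # heap property holds
            self.a[t], self.a[c] = self.a[c], self.a[t]
            t = c
```

Both operations move along one root-to-leaf path of a complete tree, which is where the O(lgN) bound comes from.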
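Heap (Priority Queue) • Build a heap • Each node's height: at most lgN • For each node, do shift down • O(NlgN)??? • For a binary tree with N nodes • There are at most $\lceil N / 2^{h+1} \rceil$ nodes with height h • Shift down costs O(h) for a node of height h • Total: $\sum_{h=0}^{\lfloor \lg N \rfloor} \lceil N / 2^{h+1} \rceil \cdot O(h) = O\left(N \sum_{h \ge 0} h / 2^{h}\right) = O(2N)$ • Therefore it is O(N)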
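The bottom-up build can be sketched as follows: shift down every non-leaf node, starting from the last one and moving toward the root. It uses the same 1-based layout; the function names are illustrative.

```python
def shift_down(a, t, n):
    # Restore the min-heap property below position t in a[1..n].
    while 2 * t <= n:
        c = 2 * t
        if c + 1 <= n and a[c + 1] < a[c]:
            c += 1
        if a[t] <= a[c]:
            break
        a[t], a[c] = a[c], a[t]
        t = c


def build_heap(values):
    # 1-based array: index 0 is a placeholder.
    a = [None] + list(values)
    n = len(a) - 1
    # Positions n//2 + 1 .. n are leaves and already valid heaps.
    for t in range(n // 2, 0, -1):
        shift_down(a, t, n)
    return a
```

Most nodes sit near the bottom and need only a short shift down, which is why the total is O(N) rather than O(NlgN).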
Heap (Priority Queue) • Summary • Find is not supported • O(N) search whole array • Is remove supported? • Any subtree is also a heap • Remove Min on that subtree • Complexity • Find: O(N) • Find Min: O(1) • Remove Min: O(lgN) • Insert: O(lgN) • Remove: O(lgN)
Hash Table Memory is finite, but input range is NOT
Hash Table • Simple Problem • Mark Six result: 6 integers in [1..49] • Check if each number matches my bet • Most efficient approach? • Answer • Use boolean array with 49 cells • Check if that cell is set for each number • O(1)
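The boolean-array answer can be sketched as follows. The function name and the sample numbers are made up for illustration.

```python
def count_matches(result, bet):
    """Count how many numbers of the bet appear in the draw result."""
    drawn = [False] * 50          # cells 1..49 are used, cell 0 is ignored
    for n in result:
        drawn[n] = True           # mark each drawn number
    # Each check is a single array lookup: O(1).
    return sum(1 for n in bet if drawn[n])


print(count_matches([3, 11, 19, 27, 35, 49], [1, 11, 20, 27, 40, 49]))
```

This works only because the range 1..49 is small enough to give every possible value its own cell.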
Hash Table • Simple Problem...? • Mark Six result: 6 integers in [1..2^32] • Check if each number matches my bet • Most efficient approach? • Answer...? • Use a boolean array with 2^32 cells • 4GB of memory already (at 1 byte per cell) • Even with O(1) checking... Too much memory required
Hash Table • Suppose we are not checking numbers but strings instead... • Suppose we are not checking numbers but images instead... • Solution • Compress the range by Hash Function • Convert the range into an integer
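For strings, one common choice of hash function is a polynomial hash, sketched below. The base 31 and the function name are assumptions of this sketch; the slides do not prescribe any particular function.

```python
def string_hash(s, cells, base=31):
    """Map a string to a cell index in [0, cells - 1]."""
    h = 0
    for ch in s:
        # Treat the string as a number in the given base,
        # reducing modulo the table size at every step.
        h = (h * base + ord(ch)) % cells
    return h
```

Any input, however large, ends up as a small integer that can index a table; different inputs may still land on the same cell.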
Hash Table • Idea • We need to store values between 0 and 99, but we have only 10 cells • We can compress the range [0, 99] to [0, 9] by taking the value modulo 10. The result is called the Hash Value • Insert, Find and Delete are O(1)
Hash Table • Problem • What if we insert both 11 and 31? Both have Hash Value 1 • Collision problem • Solutions • Chaining (Open Hashing) • Open Addressing (Closed Hashing)
Hash Table - Chaining • Each cell becomes a linked list • On average, Insert/Find/Remove takes O(1+α), where α is the Load Factor = # stored / # cells • With a hash function that behaves randomly, the average case is what usually happens
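Chaining can be sketched with the modulo hash from the earlier slide. Python lists stand in for the linked lists here, and the class and method names are illustrative.

```python
class ChainedHashTable:
    def __init__(self, cells=10):
        self.cells = [[] for _ in range(cells)]  # one chain per cell
        self.count = 0

    def _chain(self, key):
        # Hash Value = key modulo the number of cells.
        return self.cells[key % len(self.cells)]

    def insert(self, key):
        chain = self._chain(key)
        if key not in chain:      # walk the chain: O(1 + alpha) on average
            chain.append(key)
            self.count += 1

    def find(self, key):
        return key in self._chain(key)

    def remove(self, key):
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)
            self.count -= 1

    def load_factor(self):
        # alpha = number stored / number of cells
        return self.count / len(self.cells)
```

Colliding keys such as 11 and 31 simply share the chain in cell 1, so no key is ever lost.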
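Hash Table – Open Addressing • On collision, probe the following cells one by one • Find: continue until the element or a blank cell is reached • Remove must use Lazy Deletion, or Find may fail: a blanked cell would cut the probe sequence short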
Hash Table – Open Addressing • Previous method is called Linear Probing • Quadratic Probing is also popular • Table size MUST be prime • In OI, Linear Probing is enough • Complexities • Find: O(1 / (1-α)) • Insert: O(1 / (1-α)) • Remove: O(ln(1 / (1-α)) / α + 1/α)
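Linear Probing with Lazy Deletion can be sketched as follows. The two sentinel objects and all names are assumptions of this sketch; the table has a fixed size and never resizes.

```python
EMPTY = object()    # cell never used
DELETED = object()  # cell lazily deleted: probing must continue past it


class LinearProbingTable:
    def __init__(self, cells=10):
        self.cells = [EMPTY] * cells

    def _probe(self, key):
        # Visit cells key % n, key % n + 1, ... wrapping around once.
        n = len(self.cells)
        start = key % n
        for i in range(n):
            yield (start + i) % n

    def find(self, key):
        for i in self._probe(key):
            cell = self.cells[i]
            if cell is EMPTY:
                return False          # a blank cell ends the search
            if cell is not DELETED and cell == key:
                return True
        return False

    def insert(self, key):
        if self.find(key):
            return True
        for i in self._probe(key):
            if self.cells[i] is EMPTY or self.cells[i] is DELETED:
                self.cells[i] = key   # reuse the first free cell
                return True
        return False                  # table is full

    def remove(self, key):
        for i in self._probe(key):
            cell = self.cells[i]
            if cell is EMPTY:
                return False
            if cell is not DELETED and cell == key:
                self.cells[i] = DELETED   # mark only, do not blank
                return True
        return False
```

If `remove` wrote `EMPTY` instead of `DELETED`, a later search for a key stored further along the same probe sequence would stop too early.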
Hash Table – Summary • Find Min and Remove Min are usually not supported unless you search the whole table • Complexities (Chaining) • Find: O(1 + α) • Insert: O(1 + α) • Remove: O(1 + α) • Complexities (Open Addressing) • Find: O(1 / (1-α)) • Insert: O(1 / (1-α)) • Remove: O(ln(1 / (1-α)) / α + 1/α) • If α is small (<50%), almost O(1)
Summary Compare what we have learnt
Summary • The choice of data structure depends on the nature of the problem • Which operations are used most intensively is the critical factor in choosing one
Extras Read only if you have digested main dishes
Extra Topics • Double Hashing • Use of two hash functions • Binary Indexed Tree • IOI 2001 Mobiles • Segment Tree • IOI 1998 Picture • Quad Tree • Rarely used
Tasks Practice makes Perfect
Tasks • HKOJ 1020 – Left Join • HKOJ 1021 – Inner Join • HKOJ 1019 – Addition II • HKOJ 1090 – Dilligent • HKOJ 3061 – Tappy World • NOI 2004 Day 1 - Cashier
Q & A Curiosity?