Explore the concept of heaps and heap sort algorithms, their implementation, properties, and applications in computer science. Learn how to represent heaps in arrays and understand priority queues.
Cairo University Faculty of Computers and Information Data Structures CS 214 2nd Term 2012-2013 Heap and Heap Sort Sections 6.9 and 9.3.2 in Adam Drozdek
Agenda • Introduction to Binary Trees • Implementing Binary Trees • Searching Binary Search Trees • Tree Traversal: 1. Breadth-First 2. Depth-First • Insertion • Deletion: 1. By Merging 2. By Copying • AVL Trees • Heaps / Heap Sort
Lecture Outline • Heap Definition • Priority Queues • Heapifying an Array • Heap Sort
"A servant may utter a word that pleases Allah without giving it any thought, and Allah raises him by it in Paradise; and a servant may utter a word that angers Allah without giving it any thought, and because of it he falls into Hell."
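1- The Titanic sank because of a word, so do not let a word sink you 2- The afflictions of the tongue are fourteen 3- "May your mother be bereaved of you, O Mu'adh! Are people thrown into the Fire on their faces, or on their noses, by anything other than the harvest of their tongues?"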
Readings • Book sections: • Section 6.9, Heaps • Section 9.3.2, Heap Sort • Links: • Code: http://www.mathcs.duq.edu/drozdek/DSinCpp/sorts.h • Animations: http://www.cs.usfca.edu/~galles/visualization/HeapSort.html and http://www.cs.usfca.edu/~galles/visualization/Heap.html
What is a “heap”? • Definitions of heap: • A large area of memory from which the programmer can allocate blocks as needed (malloc / new: "give me a chunk of memory"), and deallocate them (or allow them to be garbage collected) when no longer needed • A balanced, left-justified binary tree in which no node has a value greater than the value in its parent • These two definitions have little in common • Heapsort uses the second definition
Balanced binary trees • Recall: • The depth of a node is its distance from the root • The depth of a tree is the depth of the deepest node • A binary tree of depth d is balanced if all the nodes at depths 0 through d-2 have two children • [Figure: three trees with levels d-2, d-1 and d marked; the first two are balanced, the third is not balanced]
Left-justified binary trees • A balanced binary tree is left-justified if: • all the leaves are at the same depth, or • all the leaves at depth d+1 are to the left of all the nodes at depth d • [Figure: one tree that is left-justified and one that is not left-justified]
1. Heap • A max heap is a binary tree with the following properties: • The value stored in each node is not less than the value stored in each of its children. • All levels are full, with the possible exception of the bottom level, whose nodes are all in the leftmost positions. • A min heap is similar, except that the value in every node is not greater than the values in its children. • The heap property means meeting the first condition above.
Implementing Heaps • A heap can be represented by an array. • Nodes are stored in the array cells level by level, from top to bottom and from left to right. • Node i has its children at 2*i + 1 and 2*i + 2. • heap[i] ≥ heap[2*i + 1] for 0 ≤ i < (n-1)/2 • heap[i] ≥ heap[2*i + 2] for 0 ≤ i < (n-2)/2 • The array as a whole is not sorted, but the values along any path from the root down to a leaf are in non-increasing order.
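The index arithmetic above can be sketched in C++ as follows. This is only an illustrative sketch: the helper names leftChild, rightChild, parent and isMaxHeap are hypothetical (they are not from the slides or from sorts.h), and it assumes a 0-based array of int.

```cpp
#include <cstddef>
#include <vector>

// Index arithmetic for a heap stored in a 0-based array.
// (Helper names are assumptions made for this sketch.)
inline std::size_t leftChild(std::size_t i)  { return 2 * i + 1; }
inline std::size_t rightChild(std::size_t i) { return 2 * i + 2; }
inline std::size_t parent(std::size_t i)     { return (i - 1) / 2; }  // only valid for i > 0

// Returns true if every node is >= each of its children (max-heap property).
// Checking each non-root node against its parent covers every parent/child pair.
bool isMaxHeap(const std::vector<int>& heap) {
    for (std::size_t i = 1; i < heap.size(); ++i) {
        if (heap[parent(i)] < heap[i]) return false;
    }
    return true;
}
```

For example, isMaxHeap accepts the array 15, 12, 6, 11, 10, 2, 3, 1, 8 even though that array is not sorted, which illustrates the last bullet of the slide.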
Remember Queues • FIFO: First-In, First-Out • Some contexts where this seems right? • Some contexts where some things should be allowed to skip ahead in the line?
2. Priority Queues • Some applications • ordering CPU jobs • simulating events • picking the next search site • Problems? • short jobs should go first • earliest (simulated time) events should go first • most promising sites should be searched first
Applications of the Priority Queue • Hold jobs for a printer in order of length • Store packets on network routers in order of urgency • Simulate events • Select symbols for compression • Sort numbers • Anything greedy
Queues that Allow Line Jumping • Need a new ADT • Operations: • enqueue: insert an item • dequeue: remove the “best” item • [Figure: a collection holding 6, 2, 15, 23, 12, 18, 45, 3, 7, with enqueue putting items in and dequeue taking the best item out]
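Potential Implementations (cost of enqueue / cost of dequeue)
• Unsorted array: enqueue O(1), or O(N) in the worst case when the array is full and has to be copied to a larger one (alternatively, reject the insert when full) / dequeue O(N) to find the value
• Unsorted linked list: enqueue O(1) / dequeue O(N) to find the value
• Sorted array: enqueue O(log N) to find the location with binary search, but O(N) to move values / dequeue O(1) to find the value, but O(N) to move values (or O(1) if stored in reverse order)
• Sorted linked list: enqueue O(N) to find the location, O(1) to do the insert / dequeue O(1)
• Binary search tree: enqueue O(N) / dequeue O(N)
• Binary heap: enqueue O(log N) in the worst case, close to O(1) on average (about 1.67 levels) / dequeue O(log N); plus good memory usage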
Remember ….. • Use an ADT that corresponds to your needs • The right ADT is efficient, while an overly general ADT provides functionality you aren’t using, but are paying for anyways • Heaps provide O(log n) worst case for both enqueue and dequeue
Priority Queues • In a heap, reaching any leaf is O(log n) • It can be represented by an array. • Nodes are stored in the array cells level by level, from top to bottom and from left to right. • Node i has its children at 2*i + 1 and 2*i + 2. • heap[i] ≥ heap[2*i + 1] for 0 ≤ i < (n-1)/2 • heap[i] ≥ heap[2*i + 2] for 0 ≤ i < (n-2)/2 • The array as a whole is not sorted, but the values along any path from the root down to a leaf are in non-increasing order.
Heap Enqueue • To enqueue, we add the new node as the last leaf and then sift it up, moving it toward the root until it reaches the right place.
heapEnqueue(e1)
    put e1 at the end of the heap
    while e1 is not the root and e1 > parent(e1)
        swap e1 with its parent
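One way the heapEnqueue pseudocode might look in C++ is sketched below. It assumes a max heap of int kept in a std::vector with the 0-based layout described earlier; the signature is an assumption for illustration and is not taken from the course code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Insert `value` into a max heap stored in `heap` (0-based array layout).
// Sketch only: element type and signature are assumptions.
void heapEnqueue(std::vector<int>& heap, int value) {
    heap.push_back(value);                 // put the element at the end (the last leaf)
    std::size_t i = heap.size() - 1;
    while (i > 0) {                        // stop when the element reaches the root
        std::size_t p = (i - 1) / 2;       // index of the parent
        if (heap[i] <= heap[p]) break;     // parent is large enough: heap property holds
        std::swap(heap[i], heap[p]);       // otherwise swap with the parent and continue
        i = p;
    }
}
```

The loop climbs at most one level per iteration, so an enqueue costs O(log n) in the worst case.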
Heap Dequeue • To dequeue, we remove the top element and put the last leaf in its place. • Then we have to restore the heap property.
heapDequeue()
    extract the root element
    put the last leaf in place of the root
    p = root
    while p is not a leaf and p < any of its children
        swap p with the larger child
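The heapDequeue pseudocode could be sketched in C++ like this, again assuming a max heap of int in a std::vector. Throwing std::out_of_range on an empty heap is a design choice made for this sketch; the slides do not say what should happen in that case.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Remove and return the largest element of a max heap stored in `heap`.
// Sketch only: the empty-heap behaviour is an assumption.
int heapDequeue(std::vector<int>& heap) {
    if (heap.empty()) throw std::out_of_range("heapDequeue: empty heap");
    int top = heap.front();                // extract the root element
    heap.front() = heap.back();            // put the last leaf in place of the root
    heap.pop_back();
    const std::size_t n = heap.size();
    std::size_t p = 0;
    while (2 * p + 1 < n) {                // while p is not a leaf
        std::size_t child = 2 * p + 1;     // left child
        if (child + 1 < n && heap[child + 1] > heap[child]) ++child;  // pick the larger child
        if (heap[p] >= heap[child]) break; // heap property restored
        std::swap(heap[p], heap[child]);
        p = child;
    }
    return top;
}
```

Calling heapDequeue repeatedly returns the elements in non-increasing order, which is the idea heap sort builds on.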
3. Heapifying an Array • Sometimes we have an array and we need to convert it in place to a heap. • Williams’ top-down approach • Floyd’s bottom-up approach
Williams’ Algorithm for Heapifying an Array • [Figure: the heap after each step, drawn as a tree and as an array] • The elements 2, 8, 6, 1, 10 are added one at a time and each new element is sifted up:
after 2:              | 2 |
after 8 (sifted up):  | 8 | 2 |
after 6:              | 8 | 2 | 6 |
after 1:              | 8 | 2 | 6 | 1 |
10 added as a leaf:   | 8 | 2 | 6 | 1 | 10 |
10 swapped with 2:    | 8 | 10 | 6 | 1 | 2 |
10 swapped with 8:    | 10 | 8 | 6 | 1 | 2 |
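A minimal in-place sketch of the top-down (Williams) approach, under the assumption that the heap grows from the left end of the array: the first `last` cells already form a heap, and data[last] is sifted up into it. The function name williamsHeapify is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Top-down heap construction: treat data[0..last-1] as a heap and
// sift data[last] up into it, for last = 1, 2, ..., n-1.
// Sketch only: name and element type are assumptions.
void williamsHeapify(std::vector<int>& data) {
    for (std::size_t last = 1; last < data.size(); ++last) {
        std::size_t i = last;
        while (i > 0) {
            std::size_t p = (i - 1) / 2;       // parent of i
            if (data[i] <= data[p]) break;     // already in the right place
            std::swap(data[i], data[p]);
            i = p;
        }
    }
}
```

Each of the n elements may climb up to log n levels, so this approach is O(n log n) in the worst case.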
Floyd’s Algorithm for Heapifying an Array • It does better than Williams’ algorithm • It works bottom-up • Small heaps are built and then merged
FloydAlgorithm(data[])
    for (i = index of the last non-leaf; i >= 0; i--)
        restore the heap property for the tree whose root is data[i] by calling moveDown(data, i, n-1)
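The loop above might be written in C++ as follows. The name moveDown and its (data, first, last) parameters follow the pseudocode; the body shown here is a sketch written for illustration and is not copied from sorts.h, and floydHeapify is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Restore the heap property for the subtree rooted at `first`,
// looking only at indices up to and including `last`.
void moveDown(std::vector<int>& data, std::size_t first, std::size_t last) {
    std::size_t largest = 2 * first + 1;                   // left child
    while (largest <= last) {
        if (largest < last && data[largest] < data[largest + 1])
            ++largest;                                     // right child is larger
        if (data[first] >= data[largest]) break;           // heap property holds
        std::swap(data[first], data[largest]);
        first = largest;                                   // continue one level down
        largest = 2 * first + 1;
    }
}

// Bottom-up heap construction: fix every non-leaf, from the last one to the root.
void floydHeapify(std::vector<int>& data) {
    const std::size_t n = data.size();
    if (n < 2) return;
    for (std::size_t i = n / 2; i-- > 0; ) {               // i = n/2 - 1, ..., 1, 0
        moveDown(data, i, n - 1);
    }
}
```

The loop is written as `i-- > 0` because i is unsigned, so the usual `i >= 0` test from the pseudocode would never become false.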
Floyd’s Algorithm for Heapifying an Array (example) • [Figure: the tree and the array after each call to moveDown]
start:                       | 2 | 8 | 6 | 1 | 10 | 15 | 3 | 12 | 11 |
i = 3 (1 moves below 12):    | 2 | 8 | 6 | 12 | 10 | 15 | 3 | 1 | 11 |
i = 2 (6 moves below 15):    | 2 | 8 | 15 | 12 | 10 | 6 | 3 | 1 | 11 |
i = 1 (8 moves down twice):  | 2 | 12 | 15 | 11 | 10 | 6 | 3 | 1 | 8 |
i = 0 (2 moves down twice):  | 15 | 12 | 6 | 11 | 10 | 2 | 3 | 1 | 8 |
Analysis of Floyd’s Algorithm • Assume a complete binary tree with $n = 2^k - 1$ nodes • moveDown is called $(n-1)/2$ times, once for each non-leaf • In the worst case, moveDown moves a node all the way down to the last level (to become a leaf) • The $(n+1)/4$ nodes on the second-to-last level need at most 1 move each, so $(n+1)/4$ moves • The $(n+1)/8$ nodes on the third-to-last level need at most 2 moves each, so $2 \cdot (n+1)/8$ moves • For the next level up, it is $3 \cdot (n+1)/16$ moves
Analysis of Floyd’s Algorithm • In total this is $(n+1) \sum_{i=1}^{\lg(n+1)-1} \frac{1}{2} \cdot \frac{i}{2^i}$ moves • But $\sum_{i=1}^{\infty} \frac{i}{2^i}$ converges to 2 • Hence the total is at most $(n+1) \cdot \frac{1}{2} \cdot 2 = n + 1$, which is $O(n)$
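The linear bound can be checked empirically with a small counting sketch. The function below (countFloydMoves, a hypothetical name) runs the bottom-up construction on an array and returns how many swaps it performed; it is meant as an experiment, not as part of the course code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Runs Floyd's bottom-up heap construction on `data` (max heap, 0-based array)
// and returns the number of swaps ("moves") it performed.
std::size_t countFloydMoves(std::vector<int>& data) {
    const std::size_t n = data.size();
    std::size_t moves = 0;
    for (std::size_t i = n / 2; i-- > 0; ) {        // every non-leaf, last one first
        std::size_t p = i;
        while (2 * p + 1 < n) {                     // while p has at least one child
            std::size_t child = 2 * p + 1;
            if (child + 1 < n && data[child + 1] > data[child]) ++child;
            if (data[p] >= data[child]) break;
            std::swap(data[p], data[child]);
            ++moves;
            p = child;
        }
    }
    return moves;
}
```

For the ascending input 1, 2, ..., 7 (so n = 7 and k = 3) every non-leaf moves down to the last level, giving 1 + 1 + 2 = 4 moves, which matches $(n+1)(1/4 + 2/8) = 4$ from the formula.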
Sorting by Divide and Conquer • Base Case, solve the problem directly if it is small enough • Divide the problem into two or more similar and smaller subproblems • Recursively solve the subproblems • Combine solutions to the subproblems
The family of sorting methods • Main sorting themes:
• Address-based sorting: Proxmap Sort, RadixSort
• Comparison-based sorting:
    • Transposition sorting: BubbleSort
    • Insert and keep sorted: Insertion sort, Tree sort
    • Priority queue sorting: Selection sort, Heap sort
    • Divide and conquer: MergeSort, QuickSort
    • Diminishing increment sorting: ShellSort
4. Heap Sort • Merge sort time is O(n log n), but it still requires n extra storage items temporarily • Heapsort does not require any additional storage • Quick sort is O(n log n) in the average case but O(n^2) in the worst case (which is rare in practice) • Heap sort: • + O(n log n) in the worst case • + In-place • - Its constant factor is large, so it is slower than quick sort in practice • - Not a stable algorithm • - Not suitable for external sorting or parallelization
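To make the "in-place, O(n log n)" claims concrete, here is a minimal heapsort sketch in C++. It assumes the standard two-phase scheme (build a max heap bottom-up, then repeatedly swap the root with the last element of the shrinking heap); the names siftDown and heapSort are chosen for this sketch and the code is not taken from sorts.h.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Restore the max-heap property for the subtree rooted at `root`,
// considering only the first `size` elements of `data`.
void siftDown(std::vector<int>& data, std::size_t root, std::size_t size) {
    while (2 * root + 1 < size) {
        std::size_t child = 2 * root + 1;
        if (child + 1 < size && data[child + 1] > data[child]) ++child;
        if (data[root] >= data[child]) return;
        std::swap(data[root], data[child]);
        root = child;
    }
}

// Sorts `data` in ascending order using no storage beyond the array itself.
void heapSort(std::vector<int>& data) {
    const std::size_t n = data.size();
    // Phase 1: turn the array into a max heap (bottom-up).
    for (std::size_t i = n / 2; i-- > 0; ) siftDown(data, i, n);
    // Phase 2: move the current maximum to the end, shrink the heap, repair it.
    for (std::size_t end = n; end > 1; --end) {
        std::swap(data[0], data[end - 1]);
        siftDown(data, 0, end - 1);
    }
}
```

Phase 2 performs n - 1 repairs of at most log n steps each, which is where the O(n log n) worst case comes from. Equal elements can be reordered by the long-distance swaps, which is why the algorithm is not stable.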
Why study Heapsort? • It is a well-known, traditional sorting algorithm that you are expected to know • Heapsort is always O(n log n) • Quicksort is usually O(n log n) but in the worst case slows to O(n^2) • Quicksort is generally faster, but Heapsort is better in time-critical applications because its worst case is guaranteed • Heapsort is a really cool algorithm!
Plan of attack • First, we will learn how to turn a binary tree into a heap • Next, we will learn how to turn a binary tree back into a heap after it has been changed in a certain way • Finally (this is the cool part) we will see how to use these ideas to sort an array
The (max) heap property • A node has the heap property if the value in the node is as large as or larger than the values in its children • All leaf nodes automatically have the heap property • A binary tree is a heap if all nodes in it have the heap property • [Figure: three trees with the root shown in blue: 12 with children 8 and 3 (blue node has the heap property); 12 with children 8 and 12 (blue node has the heap property); 12 with children 8 and 14 (blue node does not have the heap property)]
siftUp • Given a node that does not have the heap property, you can give it the heap property by exchanging its value with the value of the larger child • This is sometimes called sifting up • Notice that the child may have lost the heap property • [Figure: 12 with children 8 and 14 (blue node does not have the heap property) becomes 14 with children 8 and 12 (blue node has the heap property)]
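A single exchange of this kind can be sketched in C++ as below, assuming the tree is stored in a 0-based array with the children of node i at 2*i + 1 and 2*i + 2. The function name exchangeWithLargerChild is hypothetical. It returns the index of the child that received the smaller value, since that is the node that may now have lost the heap property.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Give node `i` the heap property by exchanging it with its larger child.
// Returns the index of that child if an exchange happened (it may now
// violate the heap property), or `i` itself if nothing had to change.
std::size_t exchangeWithLargerChild(std::vector<int>& tree, std::size_t i) {
    const std::size_t n = tree.size();
    std::size_t larger = 2 * i + 1;                      // left child
    if (larger >= n) return i;                           // a leaf always has the heap property
    if (larger + 1 < n && tree[larger + 1] > tree[larger]) ++larger;  // right child is larger
    if (tree[i] >= tree[larger]) return i;               // node already has the heap property
    std::swap(tree[i], tree[larger]);
    return larger;
}
```

With the tree 12, 8, 14 from the figure, the call on index 0 exchanges 12 and 14 and returns index 2.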
Constructing a heap I • A tree consisting of a single node is automatically a heap • We construct a heap by adding nodes one at a time: • Add the node just to the right of the rightmost node in the deepest level • If the deepest level is full, start a new level • Examples: [Figure: two trees, each with the position of the next node marked “Add a new node here”]
Constructing a heap II • Each time we add a node, we may destroy the heap property of its parent node • To fix this, we sift up • But each time we sift up, the value of the topmost node in the sift may increase, and this may destroy the heap property of its parent node • We repeat the sifting up process, moving up in the tree, until either • We reach nodes whose values don’t need to be swapped (because the parent is still larger than both children), or • We reach the root
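[Figure: constructing a heap in four steps by adding 8, 10, 5, 12, shown here level by level]
1. 8
2. 10 is added below 8 and sifted up: 8 / 10 becomes 10 / 8
3. 5 is added: 10 / 8 5
4. 12 is added below 8 and sifted up twice: 10 / 8 5 / 12 becomes 10 / 12 5 / 8 and then 12 / 10 5 / 8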
Other children are not affected • [Figure: 14 is sifted up twice, shown level by level: 12 / 10 5 / 8 14 becomes 12 / 14 5 / 8 10 and then 14 / 12 5 / 8 10] • The node containing 8 is not affected because its parent gets larger, not smaller • The node containing 5 is not affected because its parent gets larger, not smaller • The node containing 8 is still not affected because, although its parent got smaller, its parent is still greater than it was originally
A sample heap • Here’s a sample binary tree after it has been heapified • Notice that heapified does not mean sorted • Heapifying does not change the shape of the binary tree; this binary tree is balanced and left-justified because it started out that way • [Figure: the heap, level by level: 25 / 22 17 / 19 22 14 15 / 18 14 21 3 9 11]