
BST Data Structure


oralee

Presentation Transcript


  1. BST Data Structure • A BST node contains: • A key (used to search) • The data associated with that key • Pointers to children, parent • Leaf nodes have NULL pointers for children • A BST contains • A pointer to the root of the tree.
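
The node layout described above could be sketched in C++ roughly as follows. The `int` key, the `std::string` payload, and the names `Node`, `BST` and `is_leaf` are assumptions made for illustration; the slides do not fix any of them.

```cpp
#include <string>
#include <utility>

// Hypothetical node layout; key and data types are assumptions.
struct Node {
    int key;                 // used to search
    std::string data;        // data associated with the key
    Node* left = nullptr;    // both child pointers are null for a leaf
    Node* right = nullptr;
    Node* parent = nullptr;  // null for the root

    Node(int k, std::string d) : key(k), data(std::move(d)) {}
};

// The tree itself only stores a pointer to its root.
struct BST {
    Node* root = nullptr;    // an empty tree has a null root
};

// True when the node has no children.
bool is_leaf(const Node* n) {
    return n->left == nullptr && n->right == nullptr;
}
```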

  2. BST Operations: Insert • BST property must be maintained • Algorithm sketch: • To insert data with key k • Compare k to root.key • If k < root.key, go left • If k > root.key, go right • Repeat until you reach a leaf. That's where the new node should be inserted. • Note: keep track of prospective parent along the way.
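
The insert sketch above might look like this in C++. This is a minimal version under stated assumptions: keys are `int`, a key that is already present is ignored (the slides discuss duplicates separately), and memory cleanup is left out.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    Node* parent = nullptr;
    explicit Node(int k) : key(k) {}
};

// Inserts key k and returns the (possibly new) root.
// Assumption of this sketch: a key that is already present is ignored.
Node* insert(Node* root, int k) {
    Node* parent = nullptr;  // prospective parent, tracked on the way down
    Node* cur = root;
    while (cur != nullptr) {
        if (k == cur->key) return root;
        parent = cur;
        cur = (k < cur->key) ? cur->left : cur->right;
    }
    Node* fresh = new Node(k);
    fresh->parent = parent;
    if (parent == nullptr) return fresh;  // the tree was empty
    if (k < parent->key) parent->left = fresh;
    else parent->right = fresh;
    return root;
}
```

Tracking the prospective parent is what makes the final step constant time: once the walk falls off the tree, the new node is linked under the last node visited.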

  3. BST Operations: Insert • Running time: • The new node is inserted at a leaf position, so this depends on the height of the tree. • Worst case: • Inserting keys 1, 2, 3, ... in this order will result in a tree that looks like a chain: • Tree has degenerated to a list • Height: linear • Note also that such a tree is worse than a linked list, since it takes up more space (more pointers) • [Figure: chain 1, 2, 3, each node the right child of the previous one]

  4. BST Operations: Insert • Running time: • The new node is inserted at a leaf position, so this depends on the height of the tree. • Best case: • The top levels of the tree are filled up completely • The height is then log n, where n is the number of nodes in the tree. • [Figure: BST with root 12; 4 and 14 on the second level; 2, 8 and 16 on the third]
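
To make the two cases concrete, the following sketch measures the height of a tree after a sorted insertion order and after a more favorable one. The helper names are hypothetical, and height is counted in edges (a single node has height 0), which is an assumption of this sketch.

```cpp
#include <algorithm>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Iterative insert; duplicates are ignored. Returns the root.
Node* insert(Node* root, int k) {
    Node** link = &root;  // the pointer that will receive the new node
    while (*link != nullptr) {
        if (k == (*link)->key) return root;
        link = (k < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    *link = new Node(k);
    return root;
}

// Height in edges: a single node has height 0, an empty tree has -1.
int height(const Node* n) {
    if (n == nullptr) return -1;
    return 1 + std::max(height(n->left), height(n->right));
}
```

Inserting 1, 2, ..., 100 in that order gives a chain of height 99, while inserting 12, 4, 14, 2, 8, 16 gives a tree of height 2.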

  5. BST Operations: Insert • The height of a complete (i.e. all levels filled up) BST with n nodes is logarithmic. Why? • Level i has $2^i$ nodes, for i = 0 (top level) through h (= height) • The total number of nodes, n, is then: $n = 2^0 + 2^1 + \dots + 2^h = (2^{h+1} - 1)/(2 - 1) = 2^{h+1} - 1$ • Solving for h gives $h = \log_2(n + 1) - 1 \approx \log_2 n$

  6. BST Operations: Insert • Analysis conclusion • An insert operation consists of two parts: • Search for the position • best case logarithmic • worst case linear • Physically insert the node • constant

  7. BST Operations: Insert • What if we allow duplicate keys? • Idea #1 : Always insert in the right subtree • Results in very unbalanced tree • Idea #2 : Insert in alternate subtrees • Makes it difficult to search for all occurrences • Idea #3 : All elements with the same key are inserted in a single node • Good idea! • Easy to search, does not affect balance any more than non-duplicate insertion.
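
Idea #3 can be sketched by keeping a counter in each node. This assumes that equal keys carry no distinct payload; if they did, the counter would have to become a list of payloads. All names here are hypothetical.

```cpp
struct Node {
    int key;
    int count = 1;  // number of times this key has been inserted
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Inserts k; a repeated key only bumps the counter of its existing node.
Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k == root->key) ++root->count;
    else if (k < root->key) root->left = insert(root->left, k);
    else root->right = insert(root->right, k);
    return root;
}

// Number of occurrences of k: one ordinary search, then read the counter.
int occurrences(const Node* root, int k) {
    while (root != nullptr && root->key != k)
        root = (k < root->key) ? root->left : root->right;
    return root != nullptr ? root->count : 0;
}
```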

  8. BST Operations: Insert • What if we allow variable number of children? (n-ary tree) • Idea : Use a vector/list of pointers to children.
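
A node with a variable number of children might be declared as below. `NaryNode` and `count_nodes` are hypothetical names, and `std::vector` is just one possible container for the child pointers.

```cpp
#include <cstddef>
#include <vector>

struct NaryNode {
    int key;
    std::vector<NaryNode*> children;  // empty for a leaf
    explicit NaryNode(int k) : key(k) {}
};

// Counts the nodes in the subtree rooted at n.
std::size_t count_nodes(const NaryNode* n) {
    if (n == nullptr) return 0;
    std::size_t total = 1;
    for (const NaryNode* c : n->children) total += count_nodes(c);
    return total;
}
```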

  9. BST Operations: Search • Take advantage of the BST property. • Algorithm sketch: • Compare target to root • If equal, return success • If target < root, search left • If target > root, search right • Running time: • Similar to insert
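
The search sketch translates almost directly into an iterative loop. As before, `int` keys and the function name are assumptions.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Returns the node holding target, or nullptr if the key is absent.
const Node* search(const Node* root, int target) {
    while (root != nullptr) {
        if (target == root->key) return root;  // success
        root = (target < root->key) ? root->left : root->right;
    }
    return nullptr;  // fell off the tree: not found
}
```

Each iteration moves one level down, so the number of comparisons is bounded by the height of the tree plus one.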

  10. BST Operations: Delete • The Delete operation consists of two parts: • Search for the node to be deleted • best case constant (deleting the root) • worst case linear • Delete the node • best case? • worst case?

  11. BST Operations: Delete • CASE #1 • The node to be deleted is a leaf node. • Easy! • Physically remove the node. • Constant time • We are just resetting its parent's child pointer and deallocating memory

  12. BST Operations: Delete • CASE #2 • The node to be deleted has exactly one child • Easy! • Physically remove the node. • Constant time • We are just resetting its parent's child pointer, its child's parent pointer and deallocating memory

  13. BST Operations: Delete • CASE #3 • The node to be deleted has two children • Not so easy • If we physically delete the node, we'll have to place its two children somewhere. This seems to require too much tree restructuring. • But we know it's easy to delete a node that has at most one child. What if we find such a node whose contents can be copied over without violating the BST property and then physically delete that node?

  14. BST Operations: Delete • CASE #3, continued • The node to be deleted, x, has two children • Idea: • Find x's immediate successor, y. It is guaranteed to have at most one child • Copy y's contents over to x • Physically delete y.

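  15. BST Operations: Delete • Finding the immediate successor: • We know that the node has two children. Due to the BST property, the immediate successor will be in the right subtree. • In particular, the immediate successor will be the smallest element in the right subtree. • The smallest element in a BST is always the leftmost node: the one reached by following left pointers until there are none. It has no left child, but it may have a right child, which is why the successor has at most one child.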

  16. BST Operations: Delete • Finding the immediate successor: • Since it requires traveling down the tree from the current node to the leftmost node of its right subtree, it may take up to linear time in the worst case. • When the tree is balanced, it takes at most logarithmic time. • The time to perform the copy and delete the successor is constant.
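
The three delete cases can be put together as in the sketch below. It is a simplified version under stated assumptions: nodes have no parent pointer (each recursive call returns the new subtree root instead), only the key is copied in case 3, and the names `erase_key` and `leftmost` are hypothetical.

```cpp
struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Plain insert (duplicates ignored), included so the sketch is usable alone.
Node* insert(Node* root, int k) {
    if (root == nullptr) return new Node(k);
    if (k < root->key) root->left = insert(root->left, k);
    else if (k > root->key) root->right = insert(root->right, k);
    return root;
}

// Leftmost node of a non-empty subtree, i.e. the one with the smallest key.
Node* leftmost(Node* n) {
    while (n->left != nullptr) n = n->left;
    return n;
}

// Removes key k if present and returns the new root of this subtree.
Node* erase_key(Node* root, int k) {
    if (root == nullptr) return nullptr;  // key not found
    if (k < root->key) {
        root->left = erase_key(root->left, k);
    } else if (k > root->key) {
        root->right = erase_key(root->right, k);
    } else if (root->left != nullptr && root->right != nullptr) {
        // Case 3: copy the successor's key here, then delete the successor,
        // which has no left child and therefore falls under case 1 or 2.
        Node* succ = leftmost(root->right);
        root->key = succ->key;
        root->right = erase_key(root->right, succ->key);
    } else {
        // Cases 1 and 2: splice out a node with at most one child.
        Node* child = (root->left != nullptr) ? root->left : root->right;
        delete root;
        return child;
    }
    return root;
}
```

With parent pointers, as in the node layout from the first slide, the second search for the successor in case 3 could be avoided by unlinking `succ` directly.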

  17. Binary Search Trees • Traversing a tree = visiting its nodes • Three major ways to traverse a binary tree: • preorder • visit root • visit left subtree • visit right subtree • postorder • visit left subtree • visit right subtree • visit root • inorder • visit left subtree • visit root • visit right subtree • When applied to a BST, inorder traversal visits the nodes in order from smaller to larger

  18. Binary Search Trees void print_inorder(Node *subroot) { if (subroot != NULL) { print_inorder(subroot->left); cout << subroot->data; print_inorder(subroot->right); } } How long does this take? There is exactly one call to print_inorder() for each node of the tree. There are n nodes, so the running time of this operation is Θ(n)

  19. Binary Search Trees • A tree may also be traversed one "level" at a time (top to bottom, left to right). This is usually called a level-order traversal. • It requires the use of a temporary queue: enqueue root while (queue is not empty) { get the front element, f print f enqueue f's children dequeue }
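
The queue-based loop above could be written with `std::queue` as follows. Instead of printing, this sketch collects the keys into a vector so that the result is easy to inspect; that change and the function name are assumptions.

```cpp
#include <queue>
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Returns the keys level by level, top to bottom and left to right.
std::vector<int> level_order(const Node* root) {
    std::vector<int> out;
    if (root == nullptr) return out;
    std::queue<const Node*> q;
    q.push(root);                    // enqueue root
    while (!q.empty()) {
        const Node* f = q.front();   // get the front element
        q.pop();                     // dequeue it
        out.push_back(f->key);       // "print" f
        if (f->left != nullptr) q.push(f->left);   // enqueue f's children
        if (f->right != nullptr) q.push(f->right);
    }
    return out;
}
```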

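  20. Binary Search Trees • [Figure: BST with root 12; 12 has children 4 and 14; 4 has children 2 and 8; 14 has right child 16; 8 has children 6 and 10] • in-order: 2 - 4 - 6 - 8 - 10 - 12 - 14 - 16 • pre-order: 12 - 4 - 2 - 8 - 6 - 10 - 14 - 16 • post-order: 2 - 6 - 10 - 8 - 4 - 16 - 14 - 12 • level-order: 12 - 4 - 14 - 2 - 8 - 16 - 6 - 10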
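
The three depth-first orders can be checked against this example tree with a short sketch. The functions below append keys to a vector rather than printing them, which is an assumption made so the orders can be compared programmatically.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Root, then left subtree, then right subtree.
void preorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    out.push_back(n->key);
    preorder(n->left, out);
    preorder(n->right, out);
}

// Left subtree, then root, then right subtree: sorted order for a BST.
void inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->key);
    inorder(n->right, out);
}

// Left subtree, then right subtree, then root.
void postorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->key);
}
```

For the tree with root 12, the in-order result contains all eight keys in increasing order, from 2 up to 16.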

  21. Binary Search Trees • Idea for sorting algorithm: • Given a sequence of integers, insert each one in a BST • Perform an inorder traversal. The elements will be accessed in sorted order. • Running time: • In the worst case, the tree will degenerate to a list. Creation will take quadratic time and traversal will be linear. Total: O(n^2) • On average, the tree will be mostly balanced. Creation will take O(n log n) and traversal will again be linear. Total: O(n log n)
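
The sorting idea might be sketched as below. Two choices here are assumptions rather than part of the slides: equal keys are sent to the right subtree so that no element is lost, and the tree is freed once the sorted sequence has been collected. The name `tree_sort` is hypothetical.

```cpp
#include <vector>

struct Node {
    int key;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int k) : key(k) {}
};

// Inserts k; equal keys go to the right subtree so duplicates are kept.
Node* insert(Node* root, int k) {
    Node** link = &root;
    while (*link != nullptr)
        link = (k < (*link)->key) ? &(*link)->left : &(*link)->right;
    *link = new Node(k);
    return root;
}

// Inorder traversal that appends the keys to out.
void collect_inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    collect_inorder(n->left, out);
    out.push_back(n->key);
    collect_inorder(n->right, out);
}

// Frees every node of the tree.
void destroy(Node* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}

// Builds a BST from the input, then reads it back in sorted order.
std::vector<int> tree_sort(const std::vector<int>& input) {
    Node* root = nullptr;
    for (int k : input) root = insert(root, k);
    std::vector<int> sorted;
    sorted.reserve(input.size());
    collect_inorder(root, sorted);
    destroy(root);
    return sorted;
}
```

Already sorted input is the worst case for this sketch: every insert walks the whole chain, which is where the quadratic bound comes from.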

  22. BSTs vs. Lists • Time • In the worst case, all dictionary operations are linear. • On average, BSTs are expected to do better. • Space • BSTs store an additional pointer per node. • The BST seemed like a good idea, but in the end it doesn't offer much improvement. • We must find a way to keep the tree balanced and guarantee logarithmic height.

  23. Balanced Trees • There are several ways to define balance • Examples: • Force the subtrees of each node to have almost equal heights • Place upper and lower bounds on the heights of the subtrees of each node. • Force the subtrees of each node to have similar sizes (=number of nodes)
