
CSE 326: Data Structures Trees

CSE 326: Data Structures Trees. Lecture 8: Friday, Jan 24, 2003. Today: Splay Trees. Fast both in worst-case amortized analysis and in practice. Used in the kernel of NT to keep track of process information. Invented by Sleator and Tarjan (1985).



  1. CSE 326: Data Structures Trees Lecture 8: Friday, Jan 24, 2003

  2. Today: Splay Trees • Fast both in worst-case amortized analysis and in practice • Used in the kernel of NT to keep track of process information! • Invented by Sleator and Tarjan (1985) • Details: • Weiss 4.5 (basic splay trees) • 11.5 (amortized analysis) • 12.1 (better “top down” implementation)

  3. Basic Idea “Blind” rebalancing – no height info kept! • Worst-case time per operation is O(n) • Worst-case amortized time is O(log n) • Insert/find always rotates node to the root! • Good locality: • Most commonly accessed keys move high in tree – become easier and easier to find

  4. Idea • You’re forced to make a really deep access: since you’re down there anyway, fix up a lot of deep nodes! • Move n to the root by a series of zig-zag and zig-zig rotations, followed by a final single rotation (zig) if necessary [Figure: a deep access in a tree with keys 10, 17, 5, 2, 9, 3]

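  5. Zig-Zag* • [Figure: node n, its parent p and grandparent g, with subtrees W, X, Y, Z, before and after the rotation. Legend: Helped, Unchanged, Hurt. Depth labels on the figure: up 2, down 1, down 1, up 1.] • *This is just a double rotation

  6. Zig-Zig • [Figure: node n, its parent p and grandparent g, with subtrees W, X, Y, Z, before and after the rotation]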

  7. Why Splaying Helps • Node n and its children are always helped (raised) • Except for last step, nodes that are hurt by a zig-zag or zig-zig are later helped by a rotation higher up the tree! • Result: • shallow nodes may increase depth by one or two • helped nodes decrease depth by a large amount • If a node n on the access path is at depth d before the splay, it’s at about depth d/2 after the splay • Exceptions are the root, the child of the root, and the node splayed

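  8. Splaying Example • Find(6) • First step: zig-zig [Figure: tree on keys 1, 2, 3, 4, 5, 6 before and after the step]

  9. Still Splaying • Second step: zig-zig [Figure: tree on keys 1, 2, 3, 4, 5, 6 before and after the step]

  10. Almost There, Stay on Target • Last step: zig [Figure: tree before and after the step; 6 ends up at the root]

  11. Splay Again • Find(4) • First step: zig-zag [Figure: tree rooted at 6 before and after the step]

  12. Example Splayed Out • Second step: zig-zag [Figure: tree before and after the step; 4 ends up at the root]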
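The example above can be replayed in code. What follows is a minimal bottom-up sketch in Python, assuming every node keeps a parent pointer and that the starting tree is the chain 1, 2, ..., 6 in which each node is the right child of the previous one. The names Node, rotate_up, splay and preorder are illustrative; they are not taken from the lecture or from Weiss.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(n):
    """Single rotation that lifts n above its parent, keeping BST order."""
    p, g = n.parent, n.parent.parent
    if p.left is n:                      # n is a left child: rotate right
        p.left, n.right = n.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                # n is a right child: rotate left
        p.right, n.left = n.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, n.parent = n, g
    if g is not None:                    # hook n in where p used to hang
        if g.left is p:
            g.left = n
        else:
            g.right = n

def splay(n):
    """Move n to the root with zig-zig / zig-zag steps and a final zig."""
    while n.parent is not None:
        p, g = n.parent, n.parent.parent
        if g is None:
            rotate_up(n)                 # zig: n is a child of the root
        elif (g.left is p) == (p.left is n):
            rotate_up(p)                 # zig-zig: rotate the parent first,
            rotate_up(n)                 # then the node itself
        else:
            rotate_up(n)                 # zig-zag: rotate the node twice
            rotate_up(n)
    return n

def preorder(t):
    return [] if t is None else [t.key] + preorder(t.left) + preorder(t.right)

# Assumed starting tree: the chain 1 -> 2 -> ... -> 6 of right children.
nodes = {k: Node(k) for k in range(1, 7)}
for k in range(1, 6):
    nodes[k].right = nodes[k + 1]
    nodes[k + 1].parent = nodes[k]

root = splay(nodes[6])                   # Find(6)
after_find_6 = preorder(root)
root = splay(nodes[4])                   # Find(4)
after_find_4 = preorder(root)
```

Under these assumptions the pre-order traversal is 6 1 3 2 5 4 after Find(6) and 4 1 3 2 6 5 after Find(4), which is the key order left over from the figures on slides 10 and 12.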

  13. Locality • “Locality”: if an item is accessed, it is likely to be accessed again soon • Why? • Assume m ≥ n accesses in a tree of size n • Total worst-case time is O(m log n) • O(log n) per access amortized time • Suppose only k distinct items are accessed in the m accesses • Time is O(n log n + m log k) • The n log n term pays for getting those k items near the root; the m log k term holds because those k items are then all at the top of the tree • Compare with O(m log n) for an AVL tree

  14. Splay Operations: Insert • To insert, could do an ordinary BST insert • but would not fix up tree • A BST insert followed by a find (splay)? • Better idea: do the splay before the insert! • How?

  15. Split • Split(T, x) creates two BSTs L and R: • All elements of T are in either L or R • All elements in L are ≤ x • All elements in R are ≥ x • L and R share no elements • Then how do we do the insert?

  16. Split • Split(T, x) creates two BSTs L and R: • All elements of T are in either L or R • All elements in L are ≤ x • All elements in R are > x • L and R share no elements • Then how do we do the insert? • Insert as root, with children L and R

  17. Splitting in Splay Trees • How can we split? • We have the splay operation • We can find x or the parent of where x would be if we were to insert it as an ordinary BST • We can splay x or the parent to the root • Then break one of the links from the root to a child

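  18. Split • split(x): splay T; the new root could be x, or what would have been the parent of x • If the root is ≤ x: L is the root with its left subtree (≤ x), R is the right subtree (> x) • OR, if the root is > x: L is the left subtree (< x), R is the root with its right subtree (> x)

  19. Back to Insert • Insert(x): • Split on x • Join subtrees using x as root [Figure: split(x) yields L (≤ x) and R (> x); x becomes the root with children L and R]

  20. Insert Example • Insert(5): split(5), then make 5 the root [Figure: tree on keys 1, 2, 4, 6, 7, 9 before the split, the two trees produced by split(5), and the final tree rooted at 5]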

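  21. Splay Operations: Delete • find(x) brings x to the root, with subtrees L (< x) and R (> x) • Delete x, leaving L and R • Now what?

  22. Join • Join(L, R): given two trees such that L < R, merge them • Splay on the maximum element in L, then attach R [Figure: L after the splay, with R attached to its root]

  23. Delete Completed • find(x): x is at the root of T, with subtrees L (< x) and R (> x) • Delete x • Join(L, R) gives T - x

  24. Delete Example • Delete(4): find(4), remove the root, find the max of the left tree, join [Figure: tree on keys 1, 2, 4, 6, 7, 9 at each step; the final tree is rooted at 2]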
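The split-based insert and the join-based delete described above can be sketched together. This is one possible Python rendering, not the lecture's code: it assumes parent pointers and a bottom-up splay, it treats split as "L gets keys ≤ x, R gets keys > x", and it silently ignores duplicate inserts. All names (Node, link, rotate_up, splay, access, detach, split, insert, join, delete, inorder) are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def link(p, side, c):
    """Attach c (possibly None) as the "left" or "right" child of p."""
    setattr(p, side, c)
    if c is not None:
        c.parent = p

def rotate_up(n):
    """Single rotation lifting n above its parent."""
    p, g = n.parent, n.parent.parent
    if p.left is n:
        link(p, "left", n.right)
        link(n, "right", p)
    else:
        link(p, "right", n.left)
        link(n, "left", p)
    n.parent = g
    if g is not None:
        link(g, "left" if g.left is p else "right", n)

def splay(n):
    """Bottom-up splay: bring n to the root of its tree."""
    while n.parent is not None:
        p, g = n.parent, n.parent.parent
        if g is None:
            rotate_up(n)                      # zig
        elif (g.left is p) == (p.left is n):
            rotate_up(p)                      # zig-zig
            rotate_up(n)
        else:
            rotate_up(n)                      # zig-zag
            rotate_up(n)
    return n

def access(root, x):
    """Splay x, or the last node on x's search path, to the root."""
    node = root
    while node.key != x:
        child = node.left if x < node.key else node.right
        if child is None:
            break
        node = child
    return splay(node)

def detach(*trees):
    """Clear parent pointers so each tree is a root of its own."""
    for t in trees:
        if t is not None:
            t.parent = None
    return trees

def split(root, x):
    """Return (L, R): keys in L are <= x, keys in R are > x."""
    if root is None:
        return None, None
    root = access(root, x)
    if root.key <= x:
        left, right, root.right = root, root.right, None
    else:
        left, right, root.left = root.left, root, None
    return detach(left, right)

def insert(root, x):
    left, right = split(root, x)
    if left is not None and left.key == x:    # x already present: undo split
        link(left, "right", right)
        return left
    node = Node(x)                            # x becomes the new root
    link(node, "left", left)
    link(node, "right", right)
    return node

def join(left, right):
    """Merge trees, assuming every key in left < every key in right."""
    if left is None:
        return right
    node = left
    while node.right is not None:
        node = node.right
    left = splay(node)                        # max of L has no right child
    link(left, "right", right)
    return left

def delete(root, x):
    if root is None:
        return None
    root = access(root, x)
    if root.key != x:                         # x absent: nothing to delete
        return root
    return join(*detach(root.left, root.right))

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

Starting from `root = None`, every call returns the new root, so the usage pattern is `root = insert(root, 5)` and `root = delete(root, 4)`. The shape of the tree depends on the insertion order, so only the root and the sorted key order should be relied on.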

  25. Splay Trees, Summary • Splay trees are arguably the most practical kind of self-balancing trees • If number of finds is much larger than n, then locality is crucial! • Example: word-counting • Also supports efficient Split and Join operations – useful for other tasks • E.g., range queries

  26. Dictionary & Search ADTs • Dictionary ADT (aka map ADT): stores values associated with user-specified keys • keys may be any (homogeneous) comparable type • values may be any (homogeneous) type • Search ADT (aka Set ADT): stores keys only

  27. Dictionary & Search ADTs • create : → dictionary • insert : dictionary × key × values → dictionary • find : dictionary × key → values • delete : dictionary × key → dictionary • Example: insert(kohlrabi, upscale tuber); find(kreplach) returns kreplach: tasty stuffed dough

  28. Dictionary Implementations • Arrays: • Unsorted • Sorted • Linked lists • BST • Random • AVL • Splay
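To make the ADT concrete, here is a small sketch of one implementation from the list above, the sorted array, written in Python with the standard bisect module. The class name SortedArrayDict and the choice to return None for a missing key are assumptions made for this illustration.

```python
import bisect

class SortedArrayDict:
    """Dictionary ADT on two parallel lists kept sorted by key."""

    def __init__(self):                       # create
        self._keys = []
        self._values = []

    def _locate(self, key):
        """Return (index, found) using binary search: O(log n)."""
        i = bisect.bisect_left(self._keys, key)
        return i, i < len(self._keys) and self._keys[i] == key

    def insert(self, key, value):
        i, found = self._locate(key)
        if found:
            self._values[i] = value           # overwrite an existing key
        else:
            self._keys.insert(i, key)         # O(n): later items shift right
            self._values.insert(i, value)

    def find(self, key):
        i, found = self._locate(key)
        return self._values[i] if found else None

    def delete(self, key):
        i, found = self._locate(key)
        if found:
            del self._keys[i]                 # O(n): later items shift left
            del self._values[i]

d = SortedArrayDict()
d.insert("kohlrabi", "upscale tuber")
d.insert("kreplach", "tasty stuffed dough")
```

With this layout find costs O(log n) but insert and delete cost O(n), which is the trade-off the tree-based implementations in the list are meant to avoid.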

  29. Dictionary Implementations

  30. The last dictionary we discuss: B-Trees • Suppose we want to store the data on disk • A disk access is a lot more expensive than one CPU operation • Example • 1,000,000 entries in the dictionary • An AVL tree requires log₂(1,000,000) ≈ 20 disk accesses, which is expensive • Idea in B-Trees: • Increase the fan-out, decrease the height • Make 1 node = 1 block

  31. B-Trees Basics • All keys are stored at leaves • Nonleaf nodes have guidance keys, to help the search • Parameter d = the degree (the book uses the order M = 2d+1) • Rules for keys: • The root is either a leaf, or has between 1 and 2d keys • All other nodes (except the root) have between d and 2d keys • Rule for number of children: • Each node (except leaves) has one more child than it has keys • Balance rule: • The tree is perfectly balanced!

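  32. B-Trees Basics • A non-leaf node: keys 30, 120, 240, with one subtree for each of the ranges k < 30, 30 <= k < 120, 120 <= k < 240, 240 <= k • A leaf node: keys 40, 50, 60, each pointing to its record (record with key 40, record with key 50, record with key 60), plus a pointer to the next leaf • Then called a B+ tree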

  33. B+Tree Example • d = 2 (M = 5) • Find the key 40: 40 ≤ 80, then 20 < 40 ≤ 60, then 30 < 40 ≤ 40 [Figure: B+ tree whose leaves hold 10 15 18 20 30 40 50 60 65 80 85 90]
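The search can be sketched in a few lines of Python. The tree below is a hypothetical two-level tree with d = 2 built over the leaf keys of the example; its internal layout is an assumption, since the figure did not survive. Routing follows the convention of slide 32 (a key equal to a guidance key goes to the right subtree), and the classes Leaf and Inner and the function search are illustrative names.

```python
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = keys
        # Stand-ins for the disk records the leaf would point to.
        self.records = [f"record {k}" for k in keys]

class Inner:
    def __init__(self, keys, children):
        assert len(children) == len(keys) + 1   # one more child than keys
        self.keys = keys                        # guidance keys
        self.children = children

def search(node, key):
    """Return the record stored under key, or None if it is absent."""
    while isinstance(node, Inner):
        # children[i] covers keys[i-1] <= k < keys[i]
        node = node.children[bisect.bisect_right(node.keys, key)]
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.records[i]
    return None

# Assumed tree: one root with guidance keys 20, 60, 80 over four leaves.
root = Inner([20, 60, 80], [
    Leaf([10, 15, 18]),
    Leaf([20, 30, 40, 50]),
    Leaf([60, 65]),
    Leaf([80, 85, 90]),
])
```

Each iteration of the loop corresponds to reading one block, so the number of disk accesses equals the number of levels on the path from the root to the leaf.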

  34. B+Tree Design • How large is d? • Example: • Key size = 4 bytes • Pointer size = 8 bytes • Block size = 4096 bytes • 2d × 4 + (2d+1) × 8 <= 4096 • d = 170
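The arithmetic behind d = 170 can be checked with a short Python helper. The function name max_degree is made up for this sketch; it simply solves the inequality from the slide, 2d × key + (2d+1) × pointer <= block, for the largest integer d.

```python
def max_degree(block_size, key_size, pointer_size):
    """Largest d such that 2d keys and 2d+1 pointers fit in one block."""
    # 2d*key + (2d+1)*ptr <= block   <=>   d <= (block - ptr) / (2*(key + ptr))
    return (block_size - pointer_size) // (2 * (key_size + pointer_size))

d = max_degree(block_size=4096, key_size=4, pointer_size=8)
bytes_used = 2 * d * 4 + (2 * d + 1) * 8
```

For the numbers on the slide this gives d = 170, using 4088 of the 4096 bytes; d = 171 would need 4112 bytes and no longer fits.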

  35. B+ Trees Depth • Assume d = 170 • How deep is the B-tree? • Depth = 0 (just the root) → at least 170 keys • Depth = 1 → at least 170 + 170×171 ≈ 30×10^3 keys • Depth = 2 → 170 + 170×171 + 170×171^2 ≈ 5×10^6 keys • Depth = 3 → 170 + ... + 170×171^3 ≈ 860×10^6 keys • Depth = 4 → 170 + ... + 170×171^4 ≈ 147×10^9 keys • Nobody has more keys! • With a B-tree we can find any data item with at most 5 disk accesses!
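The sums on the slide are a geometric series, which a few lines of Python can evaluate exactly. The helper name min_keys is an assumption; it reproduces the slide's own counting (d keys at depth 0, and d + 1 times as many keys on each further level) rather than deriving a different bound.

```python
def min_keys(depth, d):
    """Sum of d * (d+1)**level for level = 0..depth, as on the slide."""
    return sum(d * (d + 1) ** level for level in range(depth + 1))

totals = [min_keys(depth, 170) for depth in range(5)]
```

The series telescopes to (d+1)^(depth+1) - 1, so for d = 170 and depth 4 the exact total is 171^5 - 1, roughly 1.46 × 10^11, in line with the rounded figure quoted above.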

  36. Insertion in a B+ Tree • Insert (K, P) • Find leaf where K belongs, insert • If no overflow (2d keys or less), halt • If overflow (2d+1 keys), split node, insert in parent: • [Figure: the overfull node is split and K3 is inserted into the parent] • If leaf, keep K3 too in right node • When root splits, new root has 1 key only

  37. Insertion in a B+ Tree • Insert K = 19 [Figure: B+ tree whose leaves hold 10 15 18 20 30 40 50 60 65 80 85 90]

  38. Insertion in a B+ Tree • After insertion [Figure: leaves hold 10 15 18 19 20 30 40 50 60 65 80 85 90]

  39. Insertion in a B+ Tree • Now insert 25 [Figure: leaves hold 10 15 18 19 20 30 40 50 60 65 80 85 90]

  40. Insertion in a B+ Tree • After insertion [Figure: leaves hold 10 15 18 19 20 25 30 40 50 60 65 80 85 90]

  41. Insertion in a B+ Tree • But now have to split! [Figure: leaves hold 10 15 18 19 20 25 30 40 50 60 65 80 85 90]

  42. Insertion in a B+ Tree • After the split [Figure: leaves hold 10 15 18 19 20 25 30 40 50 60 65 80 85 90]
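The leaf split can be sketched on plain Python lists. This assumes the leaf that overflowed after inserting 25 held the keys 20 25 30 40 50, which is what the overflow rule with d = 2 implies but is not recoverable from the figure itself; split_leaf is a hypothetical helper and it ignores record pointers and the update of the parent node.

```python
def split_leaf(keys, d):
    """Split an overfull leaf holding 2d+1 sorted keys.

    Returns (left, right, separator): the first d keys stay in the left
    node, the remaining d+1 keys go to the right node, and the middle key
    (K3 when d = 2) is both kept in the right node and copied to the parent.
    """
    if len(keys) != 2 * d + 1:
        raise ValueError("only a leaf with exactly 2d+1 keys needs a split")
    left, right = keys[:d], keys[d:]
    return left, right, right[0]

left, right, separator = split_leaf([20, 25, 30, 40, 50], d=2)
```

Both halves end up with at least d keys, so the occupancy rule from the basics slide still holds after the split.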

  43. Deletion from a B+ Tree • Delete 30 [Figure: leaves hold 10 15 18 19 20 25 30 40 50 60 65 80 85 90]

  44. Deletion from a B+ Tree • After deleting 30 • Note on the figure: may change to 40, or not [Figure: leaves hold 10 15 18 19 20 25 40 50 60 65 80 85 90]

  45. Deletion from a B+ Tree • Now delete 25 [Figure: leaves hold 10 15 18 19 20 25 40 50 60 65 80 85 90]

  46. Deletion from a B+ Tree • After deleting 25 • Need to rebalance • Rotate [Figure: leaves hold 10 15 18 19 20 40 50 60 65 80 85 90]

  47. Deletion from a B+ Tree • Now delete 40 [Figure: leaves hold 10 15 18 19 20 40 50 60 65 80 85 90]

  48. Deletion from a B+ Tree • After deleting 40 • Rotation not possible • Need to merge nodes [Figure: leaves hold 10 15 18 19 20 50 60 65 80 85 90]

  49. Deletion from a B+ Tree • Final tree [Figure: leaves hold 10 15 18 19 20 50 60 65 80 85 90]
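The two repair steps used in the deletion example, rotate and merge, can be sketched on plain Python lists. This is a simplification under stated assumptions: the underfull leaf is always the right one of a pair of adjacent siblings, keys are borrowed only from the left sibling, and the guidance keys in the parent are not updated. The leaf contents used below are inferred from the rules with d = 2, not read off the figures, and rebalance is a hypothetical name.

```python
def rebalance(left, right, d):
    """Repair an underfull right leaf (fewer than d keys).

    Returns the list of leaves that replace the pair: two leaves after a
    rotation, or a single leaf after a merge.
    """
    if len(right) >= d:
        return [left, right]                      # nothing to repair
    if len(left) > d:
        # Rotate: the left sibling can spare its largest key.
        return [left[:-1], [left[-1]] + right]
    # Merge: the sibling has only d keys, so combine the two leaves.
    return [left + right]

after_delete_25 = rebalance([10, 15, 18, 19], [20], d=2)   # rotate
after_delete_40 = rebalance([19, 20], [50], d=2)           # merge
```

A merge removes one child from the parent, so in a full implementation the same check would then be repeated one level up.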
