150 likes | 152 Views
Concurrent Search Structure Algorithms. Dennis Shasha. What is a Search Structure?. Data structure (typically a B tree, hash structure, R-tree, etc.) that supports a dictionary. Operations are insert key-value pair, delete key-value pair, and search for a key-value pair.
E N D
Concurrent Search Structure Algorithms Dennis Shasha
What is a Search Structure? • Data structure (typically a B tree, hash structure, R-tree, etc.) that supports a dictionary. • Operations are insert key-value pair, delete key-value pair, and search for a key-value pair.
How to make a search structure algorithm concurrent? • Naïve approach: use two phase locking (but then the root is at least read-locked so conflicts occur) • Semi-naïve: use hierarchical tree locking: lock root; afterwards lock node n (Still tends to hold locks high in tree.)
How can we do better? • Fundamental insight: In a search structure algorithm, all that we really care about is that we implement the dictionary operations correctly. • Operations on structure need not even be serializable provided they maintain certain constraints.
Train your intuition: parable of the library • Bob goes to a library (with books) and looks in the catalogue for a great puzzle book P. He is told it is on stack G. • He walks towards G but sees a friend and they have a chat. • Alice the librarian moves some books from G to H and leaves a note. She then changes the catalogue. • Bob goes to G, sees the note, and finds P on stack H.
Observe: • In no serial execution would Bob visit two stacks – if he had gone after Alice, he would have visited only H; if before, only G. • But this is still ok. • Why? • Intuition: the search is always pointed towards a correct final position.
Using this for search structures • KeySpace = the set of all possible keys (e.g. all possible integers) • Inset of a node n = subset of KeySpace that is either in n, a node reachable from n or nowhere in the data structure. • Outset of n towards n’ = The subset of KeySpace associated with the edge from n to n’. Depends on the data structure. • Keyset of n = inset(n) – U outset(n,n’)
Binary search tree • Root node is 20 • Left child is 5 • And maybe the left child has descendants. • What is the inset of the left child? All values less than 20. • Right child of 5 has value 13. What is the outset(5,13) and what is the inset(13)
Example • Suppose the root of a binary search tree has the value 20 and a left child L and right child R. • Inset(root) = KeySpace • Outset(root,L) = {x| x < 20} • Outset(root, R) = {x| x > 20} • Keyset(root) = {20}
Let’s Return to the Library • Suppose that, to start, the shelf G has as its inset all books with last name starting with “S”. • Alice moves those between “S” and “Si” to shelf H and leaves a note to that effect. • Then the keyset of G becomes {x | x begins with “S”} – {x|x <= “Si”} • The keyset of G represents the set of books that are in G or nowhere in the structure.
The Key Invariants • If x is in node n, then x is in keyset(n) • The keysets partition the KeySpace. • If the search for an item x is at node n, then x is in keyset(n) or there is a path from n to node m such that x is in keyset(m) and every edge along the path has x in its outset. • The invariants assure that the search, if it terminates, will terminate in the correct place.
Application: link algorithm • Recall splits in B trees. Split n into n and n’ and adjust the parent p to reflect the change. • Here is how to do a split: lock(n), move the values from n to n’ as desired and leave a forward pointer to that effect, unlock(n), lock(p), and adjust p, then unlock(p). • This is exactly the library scenario.
Application: give-up algorithm • Instead of including a forwarding pointer, just asset explicitly the inset of each node. • When splitting, reduce the asserted inset(n), denoted assertinset(n) and establish assertinset(n’). • If a search for x arrives at n and x is not in assertinset(n), the search starts over. • Happens rarely enough that performance is very good.
Application: multi-rooted structure • Imagine a link B-tree structure with a root at the top of the tree and a root to the left of the leftmost leaf node. • Same invariants hold and search can proceed from anywhere.
Conclusion • Simple framework for all search structures. • Handful of concepts: KeySpace, inset, outset, keyset. • Performs well.