Algorithmic Aspects of Searching in the Past (Lecture 13: Persistence and Oblivious Data Structures). Advanced Algorithms & Data Structures. Thomas Ottmann, Institut für Informatik, Universität Freiburg, Germany, ottmann@informatik.uni-freiburg.de
Overview • Motivation: Oblivious and persistent structures • Examples: Arrays, search trees, Z-stratified search trees, relaxation • Making structures persistent: Structure-copying, path-copying, and DSST method • Application: Point location • Application: Time-evolving data: Capture and replay of whiteboard data, in particular handwriting traces • Oblivious structures: Randomized and uniquely represented structures, c-level jump lists
Motivation: A structure storing a set of keys is called oblivious if it is not possible to infer its generation history from its current shape. A structure is called persistent if it supports access to multiple versions. Partially persistent: All versions can be accessed, but only the newest version can be modified. Fully persistent: All versions can be accessed and modified. Confluently persistent: Two or more old versions can be combined into one new version.
Example: Arrays. Sorted array: 2 4 8 15 17 43 47 … Uniquely represented structure, hence oblivious! Access: in time O(log n) by binary search. Update (insertion, deletion): Ω(n). Caution: The storage structure may still depend on the generation history!
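The array example can be made concrete with a minimal sketch. It assumes the keys are kept in one sorted Python list; the class name `SortedArraySet` and its methods are illustrative and not part of the lecture. Lookup uses binary search, while an insertion or deletion may shift a linear number of elements.

```python
import bisect


class SortedArraySet:
    """Hypothetical helper: a key set stored as one sorted list."""

    def __init__(self):
        self.keys = []

    def contains(self, key):
        # Binary search: O(log n) comparisons.
        i = bisect.bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key

    def insert(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            self.keys.insert(i, key)  # may shift up to n elements

    def delete(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.keys.pop(i)  # may shift up to n elements


# Two different generation histories for the same key set.
a = SortedArraySet()
for k in [43, 2, 17, 8]:
    a.insert(k)

b = SortedArraySet()
for k in [8, 99, 17, 2, 43]:
    b.insert(k)
b.delete(99)
```

Both histories end in the same list, so the logical shape reveals nothing about the order of insertions or about the deleted key 99. The caution on the slide still applies: the memory layout chosen by the runtime is outside the scope of this sketch.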
Example: Natural search trees. Only partially oblivious! • Insertion history can sometimes be reconstructed. • Deleted keys are not visible. Access, insertion, and deletion of keys may take time Ω(n). [Figure: the two different natural search trees obtained from the insertion orders 1, 3, 5, 7 and 5, 1, 3, 7]
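A short sketch of why natural search trees are only partially oblivious: the two insertion orders named on the slide produce different shapes for the same key set. The helper names (`build`, `shape`, `height`) are assumptions made for this illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain (unbalanced) search tree insertion."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def shape(node):
    """Nested tuple (key, left, right) describing the tree shape."""
    if node is None:
        return None
    return (node.key, shape(node.left), shape(node.right))


def height(node):
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


chain = build([1, 3, 5, 7])  # sorted input degenerates into a path
bushy = build([5, 1, 3, 7])
```

The sorted order yields a path of height 4, which is the reason operations can take linear time, while the second order yields a tree of height 3 with root 5.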
Example: Balanced search tree. [Figure: example of a balanced search tree] Problem: Updates come in sudden bursts (example: recording ink-traces from pen input). There is not enough time to serialize insertions and rebalancing transformations. Solution: Relaxed balancing: Carry out updates and rebalancing transformations concurrently!
Stratified search trees [Figure: schematic of a stratified search tree]
Insertion: Insert the new key among the leaves at the expected position and deposit a "push-up-request". [Figure: schematic of the insertion, with nodes labelled x and p]
Handling of push-up-requests (1) • A push-up-request either leads to a local structural change and halt, which can be carried out in time O(1) (Case 1), • or (exclusively) to a recursive shift of the push-up-request to the next higher stratum without any structural change (Case 2). Case 1 [There is still room on the next higher stratum] [Figure: local transformations for Case 1]
Handling of push-up-requests (2) Case 2 [Next higher stratum is full] [Figure: shift of the push-up-request to the next higher stratum] Append a new apex if a node is pushed over the topmost stratum border.
Deletion: Locate x among the leaves. Deposit a removal request at x. Handle the removal request. [Figure: schematic of a stratified search tree]
Handling removal requests. Case 1 [Enough nodes at bottommost stratum]. Case 2 [Bottommost stratum too sparse]: Deposit a "pull-down-request". [Figure: the two cases, with nodes labelled p and q]
Handling of pull-down-requests (1) Case 1 [There are enough nodes on the next higher stratum]: Finite structural change and halt! [Figure: local transformations for Case 1 at node p]
Handling of pull-down-requests (2) Case 2 [Not enough nodes on the next higher stratum]: Recursively shift the pull-down-request to the next higher stratum, but no structural change! [Figure: shift of the request, with nodes labelled p and q]
Z-stratified search trees: Observations. Insertions, deletions, and rebalancing transformations (removal of the deposited requests) can be arbitrarily interleaved. The amortized restructuring costs per insertion or deletion are constant. The generation history of a current version may be partially reconstructed (the sequence of insertions and deletions is partially visible). But: • Update operations are always applied to the current version. • Z-stratified search trees are not persistent.
Simple methods for making structures persistent • Structure-copying method: Copy the structure and apply the update operation to the copy. This yields full persistence at the price of Ω(n) time per update and Ω(m·n) space for m updates applied to structures of size n. • Do nothing, but store a log file of all updates! In order to access version i, first carry out i updates, starting with the initial structure, and generate version i. Θ(i) time per access, Θ(m) space for m operations. • Hybrid method: Store the complete sequence of updates and additionally each k-th version for a suitably chosen k. Result: Time and space requirements still increase by a factor of at least √m! Are there any better methods? … for search trees …
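The hybrid method can be sketched as follows, assuming the structure is a plain key set and updates are pairs `("ins", key)` or `("del", key)`; the class `CheckpointedSet` and this update encoding are invented for the illustration. Every k-th version is stored as a full copy, and any other version is regenerated by replaying fewer than k logged updates on top of the nearest earlier copy.

```python
class CheckpointedSet:
    """Hypothetical sketch: update log plus a full copy of every k-th version."""

    def __init__(self, k):
        self.k = k
        self.log = []                    # complete sequence of updates
        self.snapshots = [frozenset()]   # snapshots[j] is version j * k
        self.current = set()

    def update(self, op, key):
        if op == "ins":
            self.current.add(key)
        else:
            self.current.discard(key)
        self.log.append((op, key))
        if len(self.log) % self.k == 0:
            self.snapshots.append(frozenset(self.current))  # Omega(n) copy
        return len(self.log)             # number of the new version

    def version(self, i):
        """Regenerate version i from the nearest earlier snapshot."""
        base = i // self.k
        keys = set(self.snapshots[base])
        for op, key in self.log[base * self.k:i]:  # fewer than k replay steps
            if op == "ins":
                keys.add(key)
            else:
                keys.discard(key)
        return keys


s = CheckpointedSet(k=2)
for op, key in [("ins", 5), ("ins", 1), ("ins", 3), ("del", 1), ("ins", 7)]:
    s.update(op, key)
```

With m updates there are about m/k stored copies and up to k - 1 replay steps per access, so choosing k near √m balances the two. That trade-off is where the √m factor on the slide comes from.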
Persistent search trees (1): Path-copying method. Version 0: search tree with the keys 5, 1, 7, 3. Version 1: Insert (2). Version 2: Insert (4). [Figure: roots 0, 1, 2 of the three versions; each insertion copies the nodes 5, 1, 3 on the search path and shares all other nodes with the previous version] Restructuring costs: O(log n) per update operation.
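The path-copying example with the keys 5, 1, 7, 3 can be reproduced with a small sketch. It uses an unbalanced tree for brevity, so the O(log n) bound would additionally require a balanced tree; the function names are illustrative only.

```python
class Node:
    __slots__ = ("key", "left", "right")

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def insert(root, key):
    """Return the root of a new version; the old version stays untouched."""
    if root is None:
        return Node(key)
    if key < root.key:
        # Copy this node, share the untouched right subtree.
        return Node(root.key, insert(root.left, key), root.right)
    if key > root.key:
        # Copy this node, share the untouched left subtree.
        return Node(root.key, root.left, insert(root.right, key))
    return root  # key already present: nothing to copy


def keys(root):
    """In-order key list of one version."""
    if root is None:
        return []
    return keys(root.left) + [root.key] + keys(root.right)


v0 = None
for k in [5, 1, 7, 3]:
    v0 = insert(v0, k)
v1 = insert(v0, 2)  # version 1: Insert (2)
v2 = insert(v1, 4)  # version 2: Insert (4)
```

Each update copies only the nodes on the search path (here 5, 1, 3) and creates one new leaf. Node 7 is shared by all three versions, and the leaf 2 created for version 1 is reused by version 2.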
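Persistent search trees (2): DSST method (Driscoll, Sarnak, Sleator, Tarjan): Extend each node by a time-stamped modification box. Modification boxes • are initially empty • are filled bottom up. [Figure: a node with key k, left pointer lp, right pointer rp, and a modification box holding a time stamp t and a new pointer rp; all versions before time t follow the original pointer, all versions after time t follow the pointer in the box]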
DSST method. Version 0: search tree with the keys 5, 1, 7, 3. Version 1: Insert (2). Version 2: Insert (4). [Figure: for version 1 the modification box of node 3 receives the entry (1, lp, 2); for version 2 the box of node 3 is already full, so node 3 is copied with the children 2 and 4, and the modification box of node 1 receives the entry (2, rp, 3)] The amortized costs (time and space) per update operation are O(1).
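The modification-box idea can be sketched for insertions into an unbalanced tree with partial persistence. This is a simplified illustration rather than the full DSST construction: each node has exactly one box, no rebalancing is done, and the names `PersistentTree`, `child`, and `roots` are assumptions. If a box is already full, the node is copied with its newest pointers and the pointer change is passed on to the parent.

```python
class Node:
    """Search tree node with one time-stamped modification box."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.mod = None  # (time, field, child) once the box is filled

    def child(self, field, version):
        """Pointer `field` ("left" or "right") as seen by `version`."""
        if self.mod and self.mod[1] == field and self.mod[0] <= version:
            return self.mod[2]
        return getattr(self, field)


class PersistentTree:
    def __init__(self, root=None):
        self.roots = [root]  # roots[v] is the root of version v

    def insert(self, key):
        latest, now = len(self.roots) - 1, len(self.roots)
        path, node = [], self.roots[latest]
        while node is not None:  # search in the newest version
            if key == node.key:
                self.roots.append(self.roots[latest])
                return now
            field = "left" if key < node.key else "right"
            path.append((node, field))
            node = node.child(field, latest)
        new_child = Node(key)
        while path:  # install the pointer change bottom up
            node, field = path.pop()
            if node.mod is None:  # box empty: record the change and halt
                node.mod = (now, field, new_child)
                self.roots.append(self.roots[latest])
                return now
            # Box full: copy the node with its newest pointers and pass
            # the pointer change on to the parent.
            copy = Node(node.key, node.child("left", latest),
                        node.child("right", latest))
            setattr(copy, field, new_child)
            new_child = copy
        self.roots.append(new_child)  # root was copied, or tree was empty
        return now

    def keys(self, version):
        """In-order key list of the given version."""
        out = []

        def walk(node):
            if node is not None:
                walk(node.child("left", version))
                out.append(node.key)
                walk(node.child("right", version))

        walk(self.roots[version])
        return out


tree = PersistentTree(Node(5, Node(1, None, Node(3)), Node(7)))  # version 0
tree.insert(2)  # version 1: fills the box of node 3
tree.insert(4)  # version 2: copies node 3, fills the box of node 1
```

In this run the second insertion finds the box of node 3 full, creates one copy of node 3, and halts at node 1, whose box is still empty. The root is never copied, so all three versions share the same root node.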
Application: Planar point location. Suppose that the Euclidean plane is subdivided into polygons by n line segments that intersect only at their endpoints. Given such a polygonal subdivision and an on-line sequence of query points in the plane, the planar point location problem is to determine for each query point the polygon containing it. Measure an algorithm by three parameters: 1) The preprocessing time. 2) The space required for the data structure. 3) The time per query.
Solving planar point location (Cont.) Partition the plane into vertical slabs by drawing a vertical line through each endpoint. Within each slab the lines are totally ordered. Allocate a search tree per slab containing the lines at the leaves, and with each line associate the polygon above it. Allocate another search tree on the x-coordinates of the vertical lines.
Solving planar point location (Cont.) To answer a query, first find the appropriate slab, then search the slab to find the polygon.
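The slab method described above can be sketched with sorted lists standing in for the search trees. The sketch makes several assumptions that are not in the lecture: segments are given as pairs of endpoints with strictly increasing x (no vertical segments), and a query returns the segment directly below the point instead of a polygon label. All function names are illustrative.

```python
import bisect


def y_at(seg, x):
    """y-coordinate of segment seg at abscissa x (seg must not be vertical)."""
    (x1, y1), (x2, y2) = seg
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


def build_slabs(segments):
    """One bottom-to-top sorted list of crossing segments per slab."""
    xs = sorted({p[0] for seg in segments for p in seg})
    slabs = []
    for lo, hi in zip(xs, xs[1:]):
        mid = (lo + hi) / 2
        crossing = [s for s in segments if s[0][0] <= lo and s[1][0] >= hi]
        crossing.sort(key=lambda s: y_at(s, mid))
        slabs.append(crossing)
    return xs, slabs


def segment_below(xs, slabs, x, y):
    """Segment directly below (or through) the point (x, y), or None."""
    i = bisect.bisect_right(xs, x) - 1  # first search: find the slab
    if i < 0 or i >= len(slabs):
        return None
    slab = slabs[i]
    lo, hi = 0, len(slab)
    while lo < hi:  # second search: count the segments not above the point
        mid = (lo + hi) // 2
        if y_at(slab[mid], x) <= y:
            lo = mid + 1
        else:
            hi = mid
    return slab[lo - 1] if lo > 0 else None


A = ((0, 0), (4, 0))
B = ((0, 2), (2, 3))
C = ((2, 3), (4, 2))
xs, slabs = build_slabs([A, B, C])
```

A query performs two binary searches, one over the slab boundaries and one inside the slab. The sketch stores every slab list explicitly, which is exactly the source of the space problem discussed on the following slides.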
Planar point location -- analysis. Query time is O(log n). How about the space? Θ(n²) in the worst case, and so could be the preprocessing time.
Planar point location -- bad example. The total number of lines is O(n), and the number of lines in a single slab can also be linear in n.
Planar point location & persistence. So how do we improve the space bound? Key observation: The lists of the lines in adjacent slabs are very similar. Create the search tree for the first slab. Then obtain the next one by deleting the lines that end at the corresponding vertex and adding the lines that start at that vertex. How many insertions/deletions are there altogether? 2n
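The key observation can be checked on a small instance. The sketch below builds a hypothetical bad example of n nested horizontal segments (an assumption chosen for this illustration, not the figure from the lecture), sweeps from left to right, and compares the number of updates with the total size of all explicit slab lists.

```python
def sweep_updates(segments):
    """Updates (x, op, segment) in sweep order, deletions first at equal x."""
    events = []
    for seg in segments:
        (x1, _), (x2, _) = seg
        events.append((x1, 1, "insert", seg))
        events.append((x2, 0, "delete", seg))
    events.sort(key=lambda e: (e[0], e[1]))
    return [(x, op, seg) for x, _, op, seg in events]


def slab_sizes(segments):
    """Number of segments crossing each slab, from left to right."""
    updates = sweep_updates(segments)
    active, sizes = 0, []
    for i, (x, op, _) in enumerate(updates):
        active += 1 if op == "insert" else -1
        # A slab starts after the last update at x if a later x follows.
        if i + 1 < len(updates) and updates[i + 1][0] != x:
            sizes.append(active)
    return sizes


n = 4
# Segment i lies at height i and spans x = i .. 2n - 1 - i.
nested = [((i, i), (2 * n - 1 - i, i)) for i in range(n)]
updates = sweep_updates(nested)
sizes = slab_sizes(nested)
```

For this instance the sweep performs 2n = 8 updates, while the explicit slab lists hold 1 + 2 + 3 + 4 + 3 + 2 + 1 = 16 = n² entries in total. A persistent structure only has to pay for the 8 updates.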
Planar point location & persistence (cont.) Updates should be persistent since we need all search trees at the end. Partial persistence is enough. Well, we already have the path-copying method, so let's use it. What do we get? O(n log n) space and O(n log n) preprocessing time. We can improve the space bound to O(n) by using the DSST method.
Lightweight content creation. Time-evolving data: Presentation recording • Input media: • Whiteboard • Touch screen • Tablet PC [Figure labels: Data sources, Author, Audience, Recorded learning module, Document]
Pen input, large display Eye contact with audience Cintiq Tablet (Wacom)
Random access facility: Access to an ink-object s_j corresponding to time t_j requires the immediate presentation of s_j and of all ink-objects since t_0.
Whiteboard data. The whiteboard data stream requires • fast insertion and deletion of graphical objects (lines, circles, pen-traces, …) in large quantities, • partially persistent storage which allows: • fast access (display and "rendering") of all data for a given time stamp, • synchronisability (as slave) with the audio stream (master). Problem: Find a suitable method for storing the whiteboard-action stream!
Postprocessing: The whiteboard stream is made persistent by the structure-copying method: For each time stamp t a complete list of all objects visible on the board at time t is (pre-)computed and stored for random access. Disadvantage: Highly redundant, very large data volume. Advantage: Visible scrolling. Storage and representation of freehand ink-traces: Find a suitable compromise between conflicting goals: • Data volume • Access cost (time) and dynamic replay (visible scrolling) • Individual, personal style • Scalability (vector- vs. raster-based representation)
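The structure-copying postprocessing can be illustrated with a small sketch. The event format `(t, op, obj)` with `op` being `"add"` or `"remove"`, the object names, and both function names are assumptions made for this example and do not describe the actual recording system.

```python
import bisect


def build_snapshots(events):
    """Precompute the complete list of visible objects for every time stamp.

    events: list of (t, op, obj) sorted by t, with op "add" or "remove".
    Returns parallel lists: time stamps and tuples of visible objects.
    """
    visible, times, snapshots = [], [], []
    for t, op, obj in events:
        if op == "add":
            visible.append(obj)
        else:
            visible.remove(obj)
        if times and times[-1] == t:
            snapshots[-1] = tuple(visible)  # several events at one time stamp
        else:
            times.append(t)
            snapshots.append(tuple(visible))  # full copy: redundant but direct
    return times, snapshots


def visible_at(times, snapshots, t):
    """Random access: everything visible on the board at time t."""
    i = bisect.bisect_right(times, t) - 1
    return snapshots[i] if i >= 0 else ()


events = [
    (0, "add", "line1"),
    (3, "add", "trace1"),
    (5, "remove", "line1"),
    (9, "add", "circle1"),
]
times, snaps = build_snapshots(events)
stored_entries = sum(len(s) for s in snaps)
```

Random access needs only one binary search over the time stamps, which is the advantage named on the slide. The redundancy is visible even here: four events lead to six stored entries, and the gap grows with every object that stays on the board for a long time.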
Methods for making structures oblivious. Unique representation of the structure: • Set/size uniqueness: For each set of n keys there is exactly one structure which can store such a set. • Order uniqueness: The storage is order unique, i.e. the nodes of the structure are ordered and the keys are stored in ascending order in nodes with ascending numbers. Randomise the structure: Ensure that the probability of obtaining a particular structure storing a set M of keys is independent of how M was generated. Observation: The address assignment of pointers also has to be randomised!
Example of a randomised structure: Z-stratified search tree. On each stratum, randomly choose the distribution of trees from Z. Insertion? Deletion? [Figure: schematic of a Z-stratified search tree]
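Uniquely represented structures. (a) Generation history determines structure: the insertion orders 5, 1, 3, 7 and 1, 3, 5, 7 yield different trees. (b) Set-uniqueness: the set determines the structure: the key set 1, 3, 5, 7 is always stored in the same tree. [Figure: the trees for (a) and (b)]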
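Set-uniqueness can be illustrated by a construction whose result depends only on the key set. The rule used below (always place the middle key of the sorted sequence at the root) is just one possible canonical choice assumed for this sketch; it is not necessarily the rule behind the figure, and rebuilding after every update would cost linear time.

```python
def canonical_tree(keys):
    """Build the same tree for the same key set, whatever the input order.

    Returns nested tuples (key, left, right); None is the empty tree.
    """
    ks = sorted(set(keys))

    def build(lo, hi):
        if lo >= hi:
            return None
        mid = (lo + hi) // 2  # fixed rule: middle key becomes the root
        return (ks[mid], build(lo, mid), build(mid + 1, hi))

    return build(0, len(ks))


t1 = canonical_tree([5, 1, 3, 7])
t2 = canonical_tree([1, 3, 5, 7])
t3 = canonical_tree([7, 9, 5, 3, 1, 9])  # same set plus a key that is dropped below
t3_after_delete = canonical_tree([k for k in [7, 9, 5, 3, 1, 9] if k != 9])
```

Both insertion orders from case (a) give the identical tree here, and a key that was once present and later removed leaves no trace, which is the property required of an oblivious structure.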