
Space Efficient Data Structures for Dynamic Orthogonal Range Counting



Presentation Transcript


  1. Space Efficient Data Structures for Dynamic Orthogonal Range Counting Meng He and J. Ian Munro University of Waterloo

  2. Dynamic Orthogonal Range Counting • A fundamental geometric query problem • Definitions • Data sets: a set P of n points in the plane • Query: given an axis-aligned query rectangle R, compute the number of points in P∩R • Update: insertion or deletion of a point • Applications • Geometric data processing (GIS, CAD) • Databases
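The problem definition on this slide can be pinned down with a brute-force baseline. This is only a reference sketch for checking answers, not the structure from the talk: it stores the points in a Python set and scans all of them per query, so a query costs O(n) time. The function name and the sample points are made up for illustration.

```python
def range_count_bruteforce(points, x1, x2, y1, y2):
    """Count the points (x, y) with x1 <= x <= x2 and y1 <= y <= y2."""
    return sum(1 for (x, y) in points if x1 <= x <= x2 and y1 <= y <= y2)


# Hypothetical data set P; a set gives O(1) expected insertion and deletion.
points = {(1, 1), (2, 5), (4, 3), (6, 2), (7, 7)}

print(range_count_bruteforce(points, 2, 6, 2, 5))  # 3: (2,5), (4,3), (6,2)

points.add((3, 4))       # update: insertion of a point
print(range_count_bruteforce(points, 2, 6, 2, 5))  # 4

points.remove((2, 5))    # update: deletion of a point
print(range_count_bruteforce(points, 2, 6, 2, 5))  # 3
```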

  3. Example

  4. Classic Solutions and Our Result * For integer coordinates. • Matches the lower bound under the group model (Pătraşcu 2007)

  5. Background: Succinct Data Structures • What are succinct data structures (Jacobson 1989) • Representing data structures using ideally information-theoretic minimum space • Supporting efficient navigational operations • Why succinct data structures • Large data sets in modern applications: textual, genomic, spatial or geometric • A novel and unusual way of using succinct data structures (this paper) • Matching the storage cost of standard data structures • Improving the time efficiency

  6. Dynamic Range Sum • Data • A 2D array A[1..r, 1..c] of numbers • Operations • range_sum(i1, j1, i2, j2): the sum of numbers in A[i1..i2, j1..j2] • modify(i, j, δ): A[i, j] ← A[i, j] + δ • insert(j): insert a 0 between A[i, j-1] and A[i, j] for i = 1, 2, …, r. • delete(j): delete A[i, j] for i = 1, 2, …, r. To perform this, A[i, j] must be 0 for all i. • Restrictions on r, c and δ and operations supported may apply.
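The semantics of the four operations can be made concrete with a naive list-of-lists sketch. It is not the structure from the talk (every operation here takes time linear in the touched region), and the class name and the small sample array are assumptions chosen for illustration. Indices are 1-based, as on the slide.

```python
class NaiveRangeSum2D:
    """Reference semantics only; indices are 1-based as on the slide."""

    def __init__(self, rows):
        self.a = [list(row) for row in rows]

    def range_sum(self, i1, j1, i2, j2):
        # Sum of A[i1..i2, j1..j2], both ends inclusive.
        return sum(sum(row[j1 - 1:j2]) for row in self.a[i1 - 1:i2])

    def modify(self, i, j, delta):
        self.a[i - 1][j - 1] += delta

    def insert(self, j):
        # Insert a 0 between A[i, j-1] and A[i, j] in every row i.
        for row in self.a:
            row.insert(j - 1, 0)

    def delete(self, j):
        # Column j may only be deleted when all its entries are 0.
        if any(row[j - 1] != 0 for row in self.a):
            raise ValueError("column must be all zeros before deletion")
        for row in self.a:
            del row[j - 1]


# Hypothetical 2 x 3 array (not the array from the talk's example).
arr = NaiveRangeSum2D([[1, 2, 3], [4, 5, 6]])
print(arr.range_sum(1, 2, 2, 3))  # 16 = 2 + 3 + 5 + 6
arr.insert(2)                     # rows become [1, 0, 2, 3] and [4, 0, 5, 6]
arr.modify(2, 2, 5)               # second row becomes [4, 5, 5, 6]
print(arr.range_sum(1, 2, 2, 4))  # 21
arr.modify(2, 2, -5)
arr.delete(2)                     # allowed again: column 2 is all zeros
print(arr.range_sum(1, 2, 2, 3))  # 16
```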

  7. Dynamic Range Sum: An Example • [Figure: a 2D array of numbers; the layout of the cell values is not recoverable from the transcript] • Operations shown: range_sum(2, 3, 3, 6) = 25 • insert(6) • modify(2, 6, 5) • modify(2, 6, -5) • delete(6) • range_sum(2, 3, 3, 7) = 30

  8. Dynamic Range Sum in a Small 2D Array • Assumptions and restrictions • Word size w: Ω(lg n) • Each number: nonnegative, O(lg n) bits • rc = O(lg^λ n), 0 < λ < 1 • modify(i, j, δ): |δ| ≤ lg n • insert and delete: no support • Our solution • Space: O(lg^{1+λ} n) bits, with an o(n)-bit universal table • Time: modify and range_sum in O(1) time • Generalization of the 1D array version (Raman et al. 2001) • Deamortization is interesting

  9. Range Sum in a Narrow 2D Array • Assumptions and restrictions • b = O(w): number of bits required to encode each number • “Narrow”: r = O(lg^γ c), 0 < γ < 1 • |δ| ≤ lg c • Our results • Space: O(rcb + w) bits, with an O(c lg c)-bit buffer • Operations: O(lg c / lglg c) time • A generalization of the solution to the CSPSI problem based on B-trees (He and Munro 2010), using our small 2D array structure on each B-tree node

  10. Range Counting in Dynamic Integer Sequences • Notation • Integer range: [1..σ] • Sequence: S[1..n] • Operations: • access(x): S[x] • rank(α, x): number of occurrences of α in S[1..x] • select(α, r): position of the rth occurrence of α in S • range_count(p1, p2, v1, v2): number of entries in S[p1.. p2] whose values are in the range [v1.. v2]. • insert(α, i): insert α between S[i-1] and S[i] • delete(i): delete S[i] from S

  11. Range Counting in Integer Sequences: An Example S = 5,5,2,5,3,1,3,4,7,6,4,1,2,2,5,8 rank(5, 8) = 3 select(2, 3) = 14 range_count(6, 12, 2, 6) = 4
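The three values in this example can be reproduced with a naive reference implementation of the sequence operations. This sketch stores S in a plain Python list and scans it, so it shows only what the operations return, not how the succinct structure from the talk computes them. The class name is an assumption; positions are 1-based, as on the slides.

```python
class NaiveSequence:
    """Reference semantics for a dynamic integer sequence S[1..n]."""

    def __init__(self, values):
        self.s = list(values)

    def access(self, x):
        return self.s[x - 1]

    def rank(self, alpha, x):
        # Number of occurrences of alpha in S[1..x].
        return self.s[:x].count(alpha)

    def select(self, alpha, r):
        # Position of the r-th occurrence of alpha in S.
        seen = 0
        for pos, value in enumerate(self.s, start=1):
            if value == alpha:
                seen += 1
                if seen == r:
                    return pos
        raise ValueError("fewer than r occurrences of alpha")

    def range_count(self, p1, p2, v1, v2):
        # Entries of S[p1..p2] whose values lie in [v1..v2].
        return sum(1 for v in self.s[p1 - 1:p2] if v1 <= v <= v2)

    def insert(self, alpha, i):
        # Insert alpha between S[i-1] and S[i].
        self.s.insert(i - 1, alpha)

    def delete(self, i):
        del self.s[i - 1]


seq = NaiveSequence([5, 5, 2, 5, 3, 1, 3, 4, 7, 6, 4, 1, 2, 2, 5, 8])
print(seq.rank(5, 8))                # 3
print(seq.select(2, 3))              # 14
print(seq.range_count(6, 12, 2, 6))  # 4
```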

  12. Range Counting in Sequences of Small Integers • Restrictions • σ = O(lg^ρ n) for any constant 0 < ρ < 1 • Our result • Space: nH_0 + o(n lg σ) + O(w) bits • Time: O(lg n / lglg n) • This is achieved by combining: • Our solution to range sum on narrow 2D arrays • A succinct dynamic string representation (He and Munro 2010)

  13. Dynamic Range Counting: An Augmented Red Black Tree • Tx: A red black tree storing all the x-coordinates • Each node also stores the number of its descendants • Purpose: conversions between real x-coordinates and rank space in O(lg n) time
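The conversion between real x-coordinates and rank space can be sketched with a sorted list and binary search. This is a stand-in, not the augmented red black tree from the slide: `bisect` finds ranks in O(lg n) time, but inserting into or deleting from a Python list costs O(n), which is exactly what the balanced tree with subtree counts avoids. The class and method names are assumptions, and x-coordinates are assumed distinct.

```python
import bisect


class RankSpace:
    """Sorted-list stand-in for the tree Tx (distinct x-coordinates assumed)."""

    def __init__(self):
        self.xs = []

    def insert(self, x):
        bisect.insort(self.xs, x)

    def delete(self, x):
        i = bisect.bisect_left(self.xs, x)
        if i == len(self.xs) or self.xs[i] != x:
            raise KeyError(x)
        del self.xs[i]

    def to_rank_range(self, x1, x2):
        # 1-based ranks of the stored coordinates falling in [x1, x2];
        # the range is empty when lo > hi.
        lo = bisect.bisect_left(self.xs, x1) + 1
        hi = bisect.bisect_right(self.xs, x2)
        return lo, hi


tx = RankSpace()
for x in (30, 10, 50, 20, 40):
    tx.insert(x)
print(tx.to_rank_range(15, 40))  # (2, 4): the coordinates 20, 30, 40
```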

  14. Dynamic Range Counting: A Range Tree • Ty: A weight balanced B-tree (Arge and Vitter 2003) constructed over all the y-coordinates • Branching factor d = Θ(lg^ε n) for constant 0 < ε < 1 • Leaf parameter: 1 • The levels are numbered 0, 1, … from top to bottom • Essentially a range tree • Each node represents a range of y-coordinates • Choice of weight balanced B-tree: amortizing a rebuilding cost
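The choice of branching factor determines the number of levels of Ty. A short derivation, assuming the height of the tree over n points is within a constant factor of log_d n:

```latex
\[
d = \Theta(\lg^{\varepsilon} n)
\;\Longrightarrow\;
\operatorname{height}(T_y) = O(\log_d n)
= O\!\left(\frac{\lg n}{\lg d}\right)
= O\!\left(\frac{\lg n}{\varepsilon \lg\lg n}\right)
= O\!\left(\frac{\lg n}{\lg\lg n}\right),
\]
```

where the last step uses the fact that ε is a constant.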

  15. Dynamic Range Counting: A Wavelet Tree • Ideas from generalized wavelet trees (Ferragina et al. 2006) • For each node v of Ty, construct a sequence Sv: • Each entry of Sv corresponds to a point whose y-coordinate is in the range represented by node v • Sv[i] corresponds to the point with the ith smallest x-coordinate among all these points • Sv[i] indicates which child of v contains the y-coordinate of the above point • For each level m, construct a sequence Lm[1..n] of integers from [1..4d] by concatenating all the Sv’s constructed at level m • Lm: stored as dynamic sequences of small integers • Space: O(n lg d + w) bits per level, O(n) words overall

  16. Range Counting Queries • Query range: [x1..x2] × [y1..y2] • Use Tx to convert the query x-range to a range in rank space • Perform a top-down traversal to locate the (up to two) leaves in Ty whose ranges contain y1 and y2 • Perform range_count on Sv for each node v visited in the above traversal • Sum up the query results to get the answer • Time: O(lg n / lglg n) per level, O(lg n / lglg n) levels
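The query procedure can be illustrated with a static, simplified sketch. Several things here are assumptions rather than the talk's construction: the tree is built once over y-ranks with evenly split children instead of being a weight balanced B-tree, each Sv is a plain Python list with rank computed by scanning, coordinates are assumed distinct, and no updates are supported. What the sketch does share with the slides is the idea: Sv lists child indices in x-order, the x-range is converted to a position range, and only children whose y-range is cut by the query boundary are descended into.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class Node:
    lo: int                                   # y-ranks covered: [lo, hi)
    hi: int
    seq: list = field(default_factory=list)   # Sv: child index per point, in x-order
    kids: list = field(default_factory=list)


def build(yranks, lo, hi, d):
    """yranks: y-ranks of this node's points, ordered by x-coordinate."""
    node = Node(lo, hi)
    if hi - lo <= 1:
        return node                           # leaf: a single y-rank
    width = -(-(hi - lo) // d)                # ceil((hi - lo) / d)
    node.seq = [(y - lo) // width for y in yranks]
    for start in range(lo, hi, width):
        end = min(start + width, hi)
        sub = [y for y in yranks if start <= y < end]
        node.kids.append(build(sub, start, end, d))
    return node


def count(node, a, b, y1, y2):
    """Entries at positions [a, b) of node's sequence with y-rank in [y1, y2]."""
    if a >= b or y2 < node.lo or y1 >= node.hi:
        return 0
    if y1 <= node.lo and node.hi - 1 <= y2:
        return b - a                          # node's y-range lies inside the query
    total = 0
    for c, kid in enumerate(node.kids):
        ka = node.seq[:a].count(c)            # rank(c, a) on Sv
        kb = node.seq[:b].count(c)            # rank(c, b) on Sv
        total += count(kid, ka, kb, y1, y2)
    return total


class StaticRangeCounter:
    def __init__(self, points, d=4):
        pts = sorted(points)                  # x-order
        self.xs = [x for x, _ in pts]
        self.ys = sorted(y for _, y in pts)
        yrank = {y: r for r, y in enumerate(self.ys)}
        self.root = build([yrank[y] for _, y in pts], 0, len(pts), d)

    def range_count(self, x1, x2, y1, y2):
        a = bisect.bisect_left(self.xs, x1)   # x-range to positions in rank space
        b = bisect.bisect_right(self.xs, x2)
        r1 = bisect.bisect_left(self.ys, y1)  # y-range to y-ranks
        r2 = bisect.bisect_right(self.ys, y2) - 1
        return 0 if r1 > r2 else count(self.root, a, b, r1, r2)


pts = [(1, 1), (2, 5), (4, 3), (6, 2), (7, 7), (3, 6), (5, 4)]
rc = StaticRangeCounter(pts, d=3)
print(rc.range_count(2, 6, 2, 5))  # 4: (2,5), (4,3), (5,4), (6,2)
```

A child fully inside the query returns `b - a` immediately, which plays the role of counting its entries directly on the parent's Sv; in the talk's structure that count comes from range_count on the dynamic sequence rather than from list scans.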

  17. Insertions and Deletions • More complicated: splits and merges; changes to child ranks • The choice of storing Ty as weight balanced B-tree allows us to amortize the updating cost of subsequences of Lm’s • Additional techniques supporting batch updating of integer sequences are also developed

  18. Our Results • Dynamic Orthogonal Range Counting • Space: O(n) words • Time: O((lg n / lglg n)^2) • Points on a U×U grid • Space: O(n) words • Time (worst-case): O(lg n lg U / (lglg n)^2) • Succinct representations of dynamic integer sequences • Space: nH_0 + o(n lg σ) + O(w) bits • Time (including range_count): O((lg n / lglg n) (lg σ / lglg n + 1))

  19. Conclusions • Results • The best result for dynamic orthogonal range counting • Same problem for points on a grid • The first succinct representations of dynamic integer sequences supporting range counting • Two preliminary results on dynamic range sum • Techniques • The first that combines wavelet trees with range trees • Deamortization on 2D arrays • Future work • Lower bound • Use techniques from succinct data structures to improve standard data structures

  20. Thank you!
