1 / 31

COSC 6114

COSC 6114. Prof. Andy Mirzaian. Orthogonal Range Searching. References :. [M. de Berge et al] chapter 5 Applications : Data Base GIS, Graphics: crop-&-zoom, windowing. Orthogonal Range Search: Data Base Query. salary. 14,000. Mr. G. O. Meter born: Nov. 6, 1988 Salary: $13,600.

cicada
Download Presentation

COSC 6114

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. COSC 6114 Prof. Andy Mirzaian Orthogonal Range Searching

  2. References: • [M. de Berge et al] chapter 5 • Applications: • Data Base • GIS, Graphics: crop-&-zoom, windowing

  3. Orthogonal Range Search:Data Base Query salary 14,000 Mr. G. O. Meterborn: Nov. 6, 1988Salary: $13,600. 13,000 date of birth 1980,00,00 1989,99,99 2D Query Rectangle [1980,00,00 : 1989,99,99]  [13,000 : 14,000] salary 14,000 rank 13,000 4 2 date of birth 1980,00,00 1989,99,99 3D Query Orthogonal Range [1980,00,00 : 1989,99,99]  [13,000 : 14,000]  [2 : 4]

  4. 1D-Tree: 1-Dimensional Range Searching x axis x x’ Static: Binary Search in a sorted array. Dynamic: Store data points in some balanced Binary Search Tree T. Let the data points be P = { p1, p2 , …, pn }  . T is a balanced BST where the data appear at its leaves sorted left to right. The internal nodes are used to split left & right subtrees. Assume x(v) = max x(L), where L is any leaf in the left subtree of internal node v. root[T] 23 9 49 5 17 37 62 73 31 41 62 2 6 13 23 85 31 37 41 49 73 2 5 6 9 13 17 85 91 Query Range: [7 : 49]

  5. Query Range [x : x’]: Call 1DRangeQuery(root[T ],x,x’) ALGORITHM 1DRangeQuery (v, x, x’)if v is a leaf then if x  x(v)  x’ then report data stored at v else do if x  x(v) then 1DRangeQuery ( leftchild(v) , x, x’ ) if x(v) < x’ then 1DRangeQuery ( rightchild(v), x, x’ ) od end Complexities: Query Time O( K + log n) T ,[x,x’]  output Construction Time O(n log n) P T Space O(n) store T [These are optimal] root[T ] vsplit K leaves reported

  6. 2D-Tree y Consider dimension d=2: point p=( x(p) , y(p) ) , range R = [x1 : x2]  [y1 : y2]p R  x(p)  [x1 : x2] and y(p)  [y1 : y2] . R y2 y(p) p y1 x x1x(p) x2 L L 2D-tree Pleft Pright Pright Pleft OR L = vertical/horizontal median split. Alternate between vertical & horizontal splitting at even and odd depths. (Assume: no 2 points have equal x or y coordinates.) Pright L Pleft

  7. Constructing 2D-Tree Input: P = { p1, p2 , …, pn }  2 off-line. Output: 2D-tree storing P. Step 1: Pre-sort P on x & on y, i.e., 2 sorted lists Û = (Xsorted(P), Ysorted(P)). Step 2: root[T ]  Build2DTree ( Û , 0) end Procedure Build2DTree ( Û , depth ) if Û contains one point then return a leaf storing this point else do if depth is even then x-median split Û, i.e., split data points in half by a vertical line L through x-median of Û and reconfigure Ûleft and Ûright . else y-median split Û, … by a horizontal lineL, and reconfigure Ûleft and Ûright . v  a newly created node storing line L leftchild(v)  Build2DTree ( Ûleft, 1+depth) rightchild(v)  Build2DTree ( Ûright, 1+depth) return v end T(n) = 2 T(n/2) + O(n) = O(n log n) time.

  8. 2D-Tree Example L1 L9 L5 p10 p2 p8 p5 p6 p3 L6 L2 L4 p9 p7 p1 p4 L1 L8 L3 L7 L2 L6 L3 L5 L7 L9 L4 p4 p2 p5 L8 p9 p8 p10 p1 p3 p7 p6

  9. Query Point Search in 2D-Tree L1 L9 L5 p10 p2 p8 p5 p6 p3 L6 L2 L4 p9 p7 p1 p4 L1 L8 q L3 L7 L2 L6 L3 L5 L7 L9 L4 p4 p2 p5 L8 p9 p8 p10 p1 p3 p7 p6

  10. 2D-Tree node regions region(v) = rectangular region (possibly unbounded) covered by the subtree rooted at v. region (root[T ]) = (- : +  )  (- : +  ) Suppose region(v) =  x1 : x2  y1 : y2  what are region(leftchild(v)) and region(rightchild(v))?With x-split: region(lc(v)) =  x1 : x(L)] y1 : y2  region(rc(v)) = ( x(L) : x2  y1 : y2  With y-split: region(lc(v)) =  x1 : x2  y1 : y(L) ]region(rc(v)) =  x1 : x2 ( y(L) : y2  L lc(v) rc(v) rc(v) L lc(v)

  11. 2D-Tree Range Search For range R = [x1 : x2]  [y1 : y2] call Search2DTree (root[T ] , R ) • ALGORITHM Search2DTree ( v , R ) • if v is a leaf then if p(v)  R then report p(v) • else if region(lc(v))  R • thenReportSubtree (lc(v)) • else if region(lc(v))  R   • then Search2DTree ( lc(v) , R ) • if region(rc(v))  R • thenReportSubtree (rc(v)) • else if region(rc(v))  R   • then Search2DTree ( rc(v) , R ) • end • region(v) can either be passed as input parameter, or explicitly stored at node v, vT. • ReportSubtree(v) is a simple linear-time in-order traversal that reports every leaf descendent of node v.

  12. Running Time of Search2DTree • K = # of points reported. • Lines 3 & 7 take O(K) time over all recursive calls. • Total # nodes visited (reported or not) is proportional to # times conditions of lines 4 & 8 are true. • region(v)R & region(v)  R  a bounding edge e of R intersects region(v). • R has  4 bounding edges. Let e (assume vertical) be one of them. • Define H(n) (resp. V(n)) = worst-case number of nodes v that intersect e for a 2D-tree of n leaves, assuming root corresponds to an x-split (resp. y-split). e e H(n) L V(n) L

  13. dD-Tree Complexities 2D-Tree • Query Time : O( K + n ) worst-case, O( K + log n) average • Construction Time : O(n log n) • Storage Space: O(n) • dD-Tree d-dimensionsUse round-robin splitting at successive levels on the d dimensions x1 , x2 , … , xd . • Query Time: O(dK + d n1–1/d ) • Construction Time: O(d n log n) • Space: O(dn) How can we improve the query time?

  14. Range Trees 2D Range Tree • Query Time: O( K + log2n )O(K + log n) by Fractional Cascading • Construction Time: O(n log n) • Space: O(n log n) Range R = [x : x’]  [y : y’] 1D Range Tree on x-coordinates: y O(log n) x x’ x x’ O(log n) canonical sub-trees Each x-range [x : x’] can be expressed as the disjoint union of O(log n) canonical x-ranges.

  15. Range Trees 2-level data structure: root[T ] Primary Level:BST onx-coordinates Tassoc(v) v Secondary level: BST on y-coord. P(v) P(v) max(v) min(v)

  16. Range Tree Construction • ALGORITHM Build 2D Range Tree (P) • Input: P = { p1, p2 , …, pn }  2, P = (Px , Py) represented by pre-sorted list on x (named Px) and on y (named Py). • Output: pointer to the root of 2D range tree for P. • Construct Tassoc , bottom up, based on Py ,but store in each leaf the points, not just their y-coordinates. • if |P| > 1 then do • Pleft  { pP | px  xmed of P } (* both lists Px and Pyshould split *) • Pright  { pP | px > xmed of P } • lc(v)  Build 2D Range Tree (Pleft ) • rc(v)  Build 2D Range Tree (Pright ) • od • min(v)  min (Px ); max(v)  max(Px ) • Tassoc(v)  Tassoc • return v • end T(n) = 2 T(n/2) + O(n) = O(n log n) time. This includes time for pre-sorting.

  17. 2D Range Query • ALGORITHM 2DRangeQuery ( v, [x : x’]  [y : y’] ) • if x  min(v) & max(v)  x’ • then 1DRangeQuery (Tassoc(v) , [y : y’] ) • else if v is not a leaf do • if x  max(lc(v)) • then 2DRangeQuery ( lc(v), [x : x’]  [y : y’] ) • if min(rc(v)) x’ • then 2DRangeQuery ( rc(v), [x : x’]  [y : y’] ) • od • end T x x’ • Line 2 called at roots of red canonical sub-trees, a total of O(log n) times. Each call takes O(Kv + log | Tassoc(v) | ) = O(Kv + log n) time. • Lines 5 & 7 called at blue shoulder paths. Total cost O(log n). • Total Query Time = O(log n + v(Kv + log n)) = O(vKv + log2 n) = O(K + log2 n). Query Time: O( K + log2n ) will be improved to O(K + log n) by Fractional Cascading Construction Time: O(n log n) Space: O(n log n)

  18. Higher Dimensional Range Trees P = { p1, p2 , …, pn }  d, pi = (xi1 , xi2 , … , xid ) , i=1..n. root[T ] Primary Level:BST on the 1stcoordinate Tassoc(v) (d-1)-dimensional Range Tree on coord’s 2..d. v P(v) P(v)

  19. Higher Dimensional Range Trees d-level data structure

  20. Higher Dimensional Range Trees Query Time: Qd(n) = O( K + logdn)improved to O(K + logd-1n) by Frac. Casc. Construction Time: Td(n) = O(n logd-1 n) Space: Sd(n) = O(n logd-1 n)

  21. General Sets of Points What if 2 points have the same coordinate value at some coordinate axis? • Composite Numer Space: (lexicographic order) (a,b)  (a | b) (a | b) < (a’ | b’)  a<a’ or (a=a’ & b<b’) • p = (px , py )  p’ = ((px | py ) , (py | px ) ) R=[x:x’][y:y’]  R’ = [ (x | -) : (x’ | +) ]  [ (y | -) : (y’ | +) ] • p  R  p’  R’ x  px x’ (x | -) ((px | py )  (x’ | +) & y  py y’ & (y | -) ((py | px )  (y’ | +) • Note: no two points in the composite space have the same value at any coordinate (unless they are identical points).

  22. Fractional Cascading IDEA: Save repeated cost of binary search in many sorted lists for the same range [y : y’] if the list contents for one are a subset of the other. • A2 A1 • Binary search for y in A1 to get to A1[i]. • Follow pointer to A2 to get to A2[j]. • Now walk to the right in each list. A1 1 3 5 7 9 13 15 19 23 26 31 36 45 63 92 A2 5 13 26 36 45 nil nil

  23. Fractional Cascading A1 1 3 5 7 9 13 15 19 23 26 31 36 nil 1 7 13 23 26 36 3 5 9 15 19 31 A2 A3 • A2 A1 , A3 A1 . • No binary search in A2 and A3 is needed. • Do binary search in A1. • Follow blue and red pointers from there to A2 and A3. • Now we have the starting point in each sorted list. Walk to the right & report.

  24. Layered 2D Range Tree Tassoc(v) T v lc(v) Tassoc(lc(v)) Tassoc(rc(v)) rc(v) P(lc(v)) P(rc(v)) P(lc(v))  P(v)P(rc(v))  P(v) P(v)

  25. Layered 2D Range Tree Associated Structures at the secondary level byFractional Cascading T

  26. Layered 2D Range Tree (by Fractional Cascading) Query Time: Q2(n) = O(log n + v (Kv + log n)) = O(v Kv + log2 n) = O(K + log2 n) improves to: Q2(n) = O(log n + v (Kv + 1)) = O(v Kv + log n) = O(K + log n). For d-dimensional range tree query time improves to:

  27. Exercises

  28. Show the following implication on the worst-case query time on 2D-Tree: • Describe algorithms to insert and delete points from a 2D-Tree. You don’t need totake care of rebalancing the structure. • dD-Trees can also be used for partial match queries. A 2D partial match query specifies one of the coordinates and asks for all points that have the specifiedcoordinate value. In higher dimensions we specify values for a subset of thecoordinates. Here we allow multiple points to have equal values for coordinates.(a) Show that 2D-Trees can answer partial match queries in O(K+n) time, where K is the number of reported answers. (b) Describe a data structure that uses O(n) storage and answers 2D partial match queries in O(K + log n) time.(c) Show that a dD-Tree can solve a partial match query in O(K + n1-s/d) time, where s is the number of specified coordinates.(d) Show that, when we allow for O(d 2d n) storage, dD partial match queries can be answered in O(K + d log n) time.

  29. Describe algorithms to insert and delete points from a Range Tree. You don’t need to take care of rebalancing the structure. • One can use 2D-Trees and Range Trees to search for a particular point (a,b) by performing a range query with the range [a:a] [b:b].(a) Prove this takes O(log n) time on 2D-Trees.(b) Derive the time bound on Range Trees. • In many applications one wants to do range searching among objects other than points.(a) Let P be a set of n axis-parallel rectangles in the plane. We want to be able to report all rectangles in P that are completely contained in a query rectangle [x : x’]  [y: y’]. Describe a data structure for this problem that uses O(n log3 n) storage and has O(K + log4 n) query time, where K is the number of reported answers. [Hint: Transform the problem to an orthogonal range searching problem in some higher dimensional space.](b) Let P now consist of a set of n simple polygons in the plane. Describe a data structure that uses O(n log3 n) space (excluding space needed to externally • store the polygons) and has O(K + log4 n) query time, where K is the number of polygons completely contained in the query rectangle that are reported.(c) Improve the query time to O(K + log3 n).

  30. For this problem, assume for simplicity that n is a power of 2. Consider a 3D-Tree for a set of n points in 3D. Consider a line that is parallel to the x-axis. What is the maximum number of leaf cells that can intersect by such a line. • We showed that a 2D-tree could answer orthogonal range queries for a set of n points in the plane in O( n1/2 + K) time, where K was the number of points reported. Generalize this to show that in dimension 3, the query time is O(n2/3 + K).

  31. END

More Related