
An Algorithm for Streaming XPath Processing with Forward and Backward Axes

An Algorithm for Streaming XPath Processing with Forward and Backward Axes. Charles Barton, Philippe Charles, Deepak Goyal, Mukund Raghavachari (IBM T. J. Watson Research Center); Marcus Fontoura, Vanja Josifovski (IBM Almaden Research Center). Published at ICDE 2003.


Presentation Transcript


  1. An Algorithm for Streaming XPath Processing with Forward and Backward Axes
     Charles Barton, Philippe Charles, Deepak Goyal, Mukund Raghavachari (IBM T. J. Watson Research Center); Marcus Fontoura, Vanja Josifovski (IBM Almaden Research Center)
     Published at ICDE 2003
     Presented by Amir Bar-or, Technion

  2. Overview
     • Background Information
     • Evolution of query processing
     • XML processing
     • Example Document
     • Used Concepts
     • X-tree
     • X-dag
     • XAOS Algorithm
     • Filtering Events
     • Building Matching-Structures
     • Emitting Output
     • Walk through
     • Experimental results

  3. The evolution of query processing
     Classical
     • Update model: transactional; low to medium update rate; disk-resident data
     • Query model: transactional; instant; accurate; static optimizations; index
     Publish/subscribe
     • Update model: transactional; low to medium update rate; disk-resident data
     • Query model: transactional or non-transactional; continuous; accurate; static optimizations; index

  4. The evolution of query processing
     Streaming
     • Update model: non-transactional; high update rate; data is too big to be stored efficiently on disk
     • Query model: non-transactional; continuous; approximate; dynamic optimizations; limited buffering
     The close relatives of streaming algorithms are the one-pass algorithms.

  5. XML processing
     • DOM approach (XML parser → build DOM tree → process DOM tree with XPath, XQuery, ...)
       • Build in-core representations
       • Process as needed through a standard API
       • Disadvantages:
         • Scalability: cannot process large documents
         • Locality: multiple traversals
         • Algorithmic inefficiency: APIs perform unnecessary traversals
     • SAX approach
       • Use a streaming, event-based API for on-the-fly parsing of XML
       • Disadvantages:
         • Programmability: low-level event handling
         • Lack of support for XPath (especially with parent/ancestor axes)

  6. XAOS Approach
     XAOS (pronounced "chaos"): an acronym for XML Analysis, Optimization, and Stuff.
     Diagram labels: XML Doc; XML Parser; parsing events (SAX, DOM, custom); specialized XPath processor (Filter, Match); XPath Expression; Results

  7. Background Information
     • Restricted XPath set:
       • Location path: / step
       • Predicate: [ ]
       • Node test
       • Axis specifier: ancestor, parent, child, descendant
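The restricted subset above is small enough to parse by hand. The following is a minimal sketch of such a parser, assuming the grammar is exactly "/ step / step ..." with steps of the form axis::nodetest followed by zero or more bracketed predicates. The names `tokenize` and `parse` and the tuple layout are illustrative assumptions, not taken from the paper.

```python
import re

# The four axes allowed by the restricted XPath subset on this slide.
AXES = {"ancestor", "parent", "child", "descendant"}
TOKEN = re.compile(r"\s*(::|[/\[\]]|[A-Za-z_]\w*)")


def tokenize(expr):
    """Split an expression into '/', '[', ']', '::' and name tokens."""
    expr = expr.rstrip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(expr):
    """Parse '/step/step...' into a list of (axis, nodetest, predicates).

    Each predicate is itself a list of steps (a relative location path).
    """
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {tok!r}")
        pos += 1
        return tok

    def step():
        axis = take()
        if axis not in AXES:
            raise ValueError(f"unsupported axis {axis!r}")
        take("::")
        test = take()
        if not (test[0].isalpha() or test[0] == "_"):
            raise ValueError(f"expected a node test, got {test!r}")
        predicates = []
        while peek() == "[":
            take("[")
            predicates.append(rel_path())
            take("]")
        return (axis, test, predicates)

    def rel_path():
        steps = [step()]
        while peek() == "/":
            take("/")
            steps.append(step())
        return steps

    take("/")
    steps = rel_path()
    if peek() is not None:
        raise ValueError(f"trailing token {peek()!r}")
    return steps


parsed = parse("/descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]")
print(parsed)
```

The nested-list result mirrors the expression: the main path has two steps, and each predicate holds its own relative path.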

  8. Example document
     XML source:
         <X>
           <Y>
             <Z>
               <V/>
               <V/>
               <W>
                 <W/>
               </W>
             </Z>
             <U/>
           </Y>
           <Y>
             <Z>
               <W/>
             </Z>
           </Y>
         </X>
     Element tree, labeled Nodename(id, level):
         Root(0,0)
           X(1,1)
             Y(2,2)
               Z(3,3)
                 V(4,4)
                 V(5,4)
                 W(6,4)
                   W(7,5)
               U(8,3)
             Y(9,2)
               Z(10,3)
                 W(11,4)
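The (id, level) labels can be assigned in a single streaming pass over SAX events. The sketch below assumes ids are handed out in document order starting at 1, with id 0 and level 0 reserved for the virtual Root, as the labels on the slide suggest. The class name `NumberingHandler` and the helper `number_elements` are hypothetical.

```python
import xml.sax

DOC = "<X><Y><Z><V/><V/><W><W/></W></Z><U/></Y><Y><Z><W/></Z></Y></X>"


class NumberingHandler(xml.sax.ContentHandler):
    """Assigns (id, level) to each element as its start tag is seen."""

    def __init__(self):
        super().__init__()
        self.next_id = 1  # id 0 is reserved for the virtual Root
        self.level = 0    # the virtual Root sits at level 0
        self.labels = [("Root", 0, 0)]

    def startElement(self, name, attrs):
        self.level += 1
        self.labels.append((name, self.next_id, self.level))
        self.next_id += 1

    def endElement(self, name):
        self.level -= 1


def number_elements(xml_text):
    """Return a list of (name, id, level) tuples in document order."""
    handler = NumberingHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.labels


labels = number_elements(DOC)
for name, ident, level in labels:
    print(f"{name}({ident},{level})")
```

Only a counter and the current depth are kept between events, so memory use does not grow with document size (apart from the `labels` list, which is collected here only for display).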

  9. X-Tree
     Example expression: /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]
     • An XPath expression is transformed into a rooted tree, the X-tree
     • Vertices of an X-tree are called X-nodes
     • Node tests in the expression are translated into X-nodes
     • Each X-node has a unique incoming edge, labeled with the specified axis
     • One X-node is marked as the output X-node
     X-tree for the example (edges labeled by axis):
         Root
           descendant → Y
             child → U
             descendant → W
               ancestor → Z
                 child → V
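One possible in-memory representation of the X-tree for the example expression is sketched below. The `XNode` class and its field names are assumptions made for illustration. W is marked as the output X-node because it is the last step of the main location path.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class XNode:
    test: str                   # node test, e.g. "Y"
    axis: Optional[str] = None  # axis labeling the unique incoming edge
    output: bool = False        # True for the output X-node
    children: List["XNode"] = field(default_factory=list)

    def add(self, axis, test, output=False):
        """Attach a child X-node reached via `axis` and return it."""
        child = XNode(test, axis, output)
        self.children.append(child)
        return child


def build_example_xtree():
    """X-tree for /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]."""
    root = XNode("Root")
    y = root.add("descendant", "Y")
    y.add("child", "U")
    w = y.add("descendant", "W", output=True)
    z = w.add("ancestor", "Z")
    z.add("child", "V")
    return root


def edges(node):
    """Yield (parent test, axis, child test) for every edge, in preorder."""
    for child in node.children:
        yield (node.test, child.axis, child.test)
        yield from edges(child)


xtree = build_example_xtree()
xtree_edges = list(edges(xtree))
for parent, axis, child in xtree_edges:
    print(f"{parent} -{axis}-> {child}")
```

Storing the axis on the child rather than on a separate edge object works here because every X-node has exactly one incoming edge.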

  10. X-Dag
      • The X-dag is generated from the X-tree by rewriting reverse axes as forward axes:
        • Reverse the direction of the edge
        • Ancestor → Descendant
        • Parent → Child
      • Handle orphan nodes:
        • Add a descendant edge from Root to each orphan node
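The two rewriting rules can be sketched on a plain edge list. This example assumes X-nodes can be identified by their node-test label, which holds for the running example but not in general (a real implementation would use node objects or ids). The function name `xtree_to_xdag` is hypothetical.

```python
# Each reverse axis and the forward axis that replaces it.
REVERSE_TO_FORWARD = {"ancestor": "descendant", "parent": "child"}

# X-tree edges for /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]
XTREE_EDGES = [
    ("Root", "descendant", "Y"),
    ("Y", "child", "U"),
    ("Y", "descendant", "W"),
    ("W", "ancestor", "Z"),
    ("Z", "child", "V"),
]


def xtree_to_xdag(edges):
    """Return X-dag edges as (source, forward axis, target) tuples."""
    dag = []
    for source, axis, target in edges:
        if axis in REVERSE_TO_FORWARD:
            # Reverse the edge direction and switch to the forward axis.
            dag.append((target, REVERSE_TO_FORWARD[axis], source))
        else:
            dag.append((source, axis, target))

    # Orphans are non-Root nodes left without an incoming edge.
    nodes = {n for source, _, target in dag for n in (source, target)}
    with_incoming = {target for _, _, target in dag}
    for orphan in sorted(nodes - with_incoming - {"Root"}):
        dag.append(("Root", "descendant", orphan))
    return dag


xdag = xtree_to_xdag(XTREE_EDGES)
for source, axis, target in xdag:
    print(f"{source} -{axis}-> {target}")
```

In the result, W has two incoming edges (from Y and from Z), which is why the structure is a dag rather than a tree, and Z gains a descendant edge from Root because reversing W's ancestor edge left it without a parent.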

  11. X-tree and X-dag for /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V]
      X-tree edges:
          Root -descendant-> Y
          Y -child-> U
          Y -descendant-> W
          W -ancestor-> Z
          Z -child-> V
      X-dag edges:
          Root -descendant-> Y
          Root -descendant-> Z
          Y -child-> U
          Y -descendant-> W
          Z -descendant-> W
          Z -child-> V

  12. Matching
      • A matching for an X-tree X is a partial mapping from the X-nodes to the elements of document D such that:
        • All mapped vertices satisfy their node tests
        • The edge between two mapped vertices describes the relationship between the mapped elements in the document
      • A total matching exists if all the nodes of the X-tree are mapped.
      • It is easy to show that an element e is in the result of evaluating the XPath expression iff there is a total matching for the corresponding X-tree that maps the output X-node to e. The same can be proven for an X-dag.
      • A total matching at an X-tree node v is composed of total matchings at each of the children of v.
      • This is not true for an X-dag node.
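The compositional property of X-tree matchings can be illustrated with a brute-force, non-streaming check on the example document. This is not the XAOS algorithm, only a sketch of the definition: a total matching at an X-node exists when the node test holds and every child X-node has a total matching at some related element. The document encoding, the tuple layout of `XTREE`, and the function names are assumptions.

```python
# Example document: id -> (name, parent id); id 0 is the virtual Root.
DOC = {
    0: ("Root", None), 1: ("X", 0), 2: ("Y", 1), 3: ("Z", 2),
    4: ("V", 3), 5: ("V", 3), 6: ("W", 3), 7: ("W", 6),
    8: ("U", 2), 9: ("Y", 1), 10: ("Z", 9), 11: ("W", 10),
}

# X-node = (node test, is_output, [(axis, child X-node), ...])
XTREE = ("Root", False, [
    ("descendant", ("Y", False, [
        ("child", ("U", False, [])),
        ("descendant", ("W", True, [
            ("ancestor", ("Z", False, [
                ("child", ("V", False, [])),
            ])),
        ])),
    ])),
])


def ancestors(elem):
    """Ids of all ancestors of elem, nearest first."""
    found, parent = [], DOC[elem][1]
    while parent is not None:
        found.append(parent)
        parent = DOC[parent][1]
    return found


def related(axis, elem):
    """Ids of elements reachable from elem along the given axis."""
    if axis == "parent":
        return ancestors(elem)[:1]
    if axis == "ancestor":
        return ancestors(elem)
    if axis == "child":
        return [e for e in DOC if DOC[e][1] == elem]
    if axis == "descendant":
        return [e for e in DOC if elem in ancestors(e)]
    raise ValueError(f"unsupported axis {axis!r}")


def solve(xnode, elem):
    """Return None if the subtree at xnode has no total matching with
    xnode mapped to elem; otherwise the set of elements that the output
    X-node takes in such matchings (empty if it lies outside the subtree)."""
    test, is_output, children = xnode
    if DOC[elem][0] != test:
        return None
    outputs = {elem} if is_output else set()
    for axis, child in children:
        results = [solve(child, e) for e in related(axis, elem)]
        results = [r for r in results if r is not None]
        if not results:
            return None  # this child X-node cannot be matched
        for r in results:
            outputs |= r
    return outputs


result = sorted(solve(XTREE, 0))
print(result)
```

Each child X-node is solved independently of its siblings, which is valid for a tree. In an X-dag, an X-node with several parents (W in the example) must map to the same element for all of them, so this independent composition no longer applies.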

  13. X-tree and X-dag for /descendant::Y[child::U]/descendant::W[ancestor::Z/child::V] (same diagram as slide 11)
      Example documents:
          <Y>
            <U/>
            <W/>
            <Z>
              <W/>
              <V/>
            </Z>
          </Y>

          <Y>
            <Z>
              <W></W>
              <Q/>
            </Z>
          </Y>

  14. XAOS properties
      Streaming
      • Update model: non-transactional; high update rate; data is too big to be stored efficiently on disk
      • Query model: non-transactional; continuous; approximate; dynamic optimizations; limited buffering
