Piazza: Data Management Infrastructure for Semantic Web Applications


  1. Piazza: Data Management Infrastructure for Semantic Web Applications
     Alon Y. Halevy, Zachary G. Ives, Peter Mork, Igor Tatarinov
     Speaker: Sergey Chernov. Tutor: Jens Graupmann
     Peer-to-Peer Information Systems – WS 03/04

  2. Outline
     • INTRODUCTION. SEMANTIC WEB.
     • PIAZZA: SYSTEM OVERVIEW
     • IMPLEMENTATION DETAILS
       3.1 MAPPING LANGUAGE
       3.2 QUERY ANSWERING ALGORITHM
     • CONCLUSIONS

  3. Introduction
     • Goal: data integration and knowledge management
     • Problem: Web data lacks machine-understandable semantics
     • Solution: the Semantic Web?

  4. The Semantic Web*
     • Web sites include structural annotations.
     • You can pose meaningful queries on them.
     • Ontologies provide the semantic glue.
     • The internal implementation of web sites is left open.
     • Agents perform tasks:
       • Query one or more web sites
       • Perform updates (e.g., set schedules)
       • Coordinate actions
       • Trust each other (or not)
     • I.e., agents operate on a gigantic heterogeneous distributed database.
     (*View by A. Halevy)

  5. General requirements
     • Robust infrastructure for querying
       • Peer data management systems
     • Facilitate mapping between different structures. Need tools for:
       • Locating relevant structures
       • Easily joining the Semantic Web
     • Get data into structured form
       • Should we worry about the legacy web?

  6. Using views for specifying mappings
     • Local-As-View (LAV): data sources are described as views over the mediated schema.
     • Global-As-View (GAV): the mediated schema is described as a set of views over the data sources.
     [Figure: two diagrams, each showing a mediated schema above Site A, Site B, and Site C]
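
The difference between the two styles can be sketched in a few lines of Python. Everything named here is hypothetical: the mediated relation `Book(title, year)`, the site names, the rows, and the declared year ranges. Real LAV query answering relies on rewriting queries using views; this toy version replaces that with a simple relevance test on each source's declared range.

```python
# Hypothetical source data: each site stores (title, year) rows.
SOURCES = {
    "site_a": [("Title 1", 2001), ("Title 2", 2003)],
    "site_b": [("Title 3", 1995), ("Title 4", 1998)],
}

# GAV: the mediated relation Book(title, year) is defined as a view
# (here simply a union) over the source relations.
def book_gav():
    return [row for rows in SOURCES.values() for row in rows]

def answer_gav(lo, hi):
    """Answer 'books published in [lo, hi]' by unfolding the view definition."""
    return sorted(row for row in book_gav() if lo <= row[1] <= hi)

# LAV: each source is described as a view over the mediated schema.
# Assumed descriptions: site_a holds Book rows with 2000 <= year <= 2004,
# site_b holds Book rows with 1990 <= year <= 1999.
LAV_YEAR_RANGE = {"site_a": (2000, 2004), "site_b": (1990, 1999)}

def relevant_sites(lo, hi):
    """Sites whose declared year range overlaps the queried range."""
    return sorted(
        site for site, (s_lo, s_hi) in LAV_YEAR_RANGE.items()
        if s_lo <= hi and lo <= s_hi
    )

def answer_lav(lo, hi):
    """Answer the same query by contacting only the relevant sources."""
    rows = []
    for site in relevant_sites(lo, hi):
        rows.extend(r for r in SOURCES[site] if lo <= r[1] <= hi)
    return sorted(rows)

print(answer_gav(1996, 2002))
print(relevant_sites(2000, 2004), answer_lav(2000, 2004))
```

In the GAV version, adding a site means editing the definition of the mediated relation; in the LAV version, a new site only adds its own description, which is one reason LAV is usually considered easier to extend.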

  7. Mapping
     • A mapping "AB" specifies how structured data in the schema of node A is represented in the schema of node B.
     [Figure: Site A, Site B, and Site C with a mediated schema. Mappings shown: "A-MS", "MS-A", "C-MS", "MS-C", "AB", "BA", "BC", "CB"]

  8. Piazza: Peer Data-Management System
     • Goal: large-scale autonomous sharing of structured data
     • Peer data management system (PDMS)
       • Autonomous peers export data in their own schemas
       • Pair-wise mappings between peers
     • Generalization of a data integration system
     • NOT a P2P file sharing system

  9. Relationship of PDMS to…
     • P2P overlay networks (the "Structured World")
     • Data integration systems (no central logical mediated schema)
     • Federated databases (scale, ad-hoc nature)
     • Distributed databases (no central administration)

  10. Representing Data
      • A spectrum of possibilities:
        • Relational tables, some integrity constraints
        • XML: can encode relational and hierarchical data
          • XQuery – emerging standard query language (SQL for XML)
        • RDF: "XML on drugs"
          • Sees only the logic; ignores other aspects
        • DAML+OIL
          • Full-blown knowledge representation language
      • They all have semantics, just different expressive powers.
      • We keep the data simple. Mappings between data at different peers are more complex.

  11. Peer Data Management
      [Figure: relational schemas at the peers DB Projects, MIT, UW, Stanford, and UCB. Relations shown, one schema per line:]
      Area(areaID, name, descr), Project(projID, name, sponsor), ProjArea(projID, areaID), Pubs(pubID, projName, title, venue, year), Author(pubID, author), Member(projName, member)
      Members(memID, name), Projects(projID, name, startDate), ProjFaculty(projID, facID), ProjStudents(projID, studID), …
      Direction(dirID, name), Project(pID, dirID, name), …
      Project(projID, name, descr), Student(studID, name, status), Faculty(facID, name, rank, office), Advisor(facID, studID), ProjMember(projID, memberID), Paper(papID, title, forum, year), Author(authorID, paperID)
      Area(areaID, name, descr), Project(projID, areaID, name), Pub(pubID, title, venue, year), PubAuthor(pubID, authorID), PubProj(pubID, projID), Member(memID, projID, name, pos), Alumn(name, year, thesis)
      • Mappings are query expressions:
        • DbResearcher(x) [relation symbol missing] Researcher(x), Area(x, DB)
        • DbResearcher(x), Office(x, DBLab) = DbLabMember(x)

  12. Piazza mapping language (1)
      • XML/XML example:
          <pubs>
            <book>
              {: $a IN document("source.xml")/authors/author,
                 $t IN $a/publication/title,
                 $typ IN $a/publication/pub-type
                 WHERE $typ = "book" :}
              <title> { $t } </title>
              <author>
                <name> {: $a/full-name :} </name>
              </author>
            </book>
          </pubs>
      • Target schema: pubs { book* { title, author* { name }, publisher* { name } } }
      • Source schema: authors { author* { full-name, publication* { title, pub-type } } }
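
To see what such an XML-to-XML mapping produces, the following Python sketch emulates it with the standard `xml.etree.ElementTree` module. The sample document, the author names, and the function name `map_to_pubs` are hypothetical. One simplifying assumption: the sketch takes the title and the pub-type from the same publication element, whereas the mapping as written binds `$t` and `$typ` independently.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document following the source schema
# authors / author* / (full-name, publication* / (title, pub-type)).
SOURCE_XML = """<authors>
  <author>
    <full-name>A. Author</full-name>
    <publication><title>Book One</title><pub-type>book</pub-type></publication>
    <publication><title>Paper One</title><pub-type>article</pub-type></publication>
  </author>
  <author>
    <full-name>B. Writer</full-name>
    <publication><title>Book One</title><pub-type>book</pub-type></publication>
  </author>
</authors>"""

def map_to_pubs(source_xml):
    """Build the target <pubs> tree: one <book> per (author, book publication) pair."""
    authors = ET.fromstring(source_xml)
    pubs = ET.Element("pubs")
    for a in authors.findall("author"):             # $a IN .../authors/author
        for pub in a.findall("publication"):
            if pub.findtext("pub-type") != "book":  # WHERE $typ = "book"
                continue
            book = ET.SubElement(pubs, "book")
            ET.SubElement(book, "title").text = pub.findtext("title")
            author = ET.SubElement(book, "author")
            ET.SubElement(author, "name").text = a.findtext("full-name")
    return pubs

pubs = map_to_pubs(SOURCE_XML)
print(ET.tostring(pubs, encoding="unicode"))
```

Without further hints, a book written by two authors appears as two separate `book` elements, one per author.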

  13. Piazza mapping language (2)
      • The piazza:id attribute:
          <pubs>
            <book piazza:id={$t}>
              {: $a IN document("source.xml")/authors/author,
                 $t IN $a/publication/title,
                 $typ IN $a/publication/pub-type
                 WHERE $typ = "book" :}
              <title piazza:id={$t}> { $t } </title>
              <author piazza:id={$t}>
                <name> {: $a/full-name :} </name>
              </author>
            </book>
          </pubs>
      • Target schema: pubs { book* { title, author* { name }, publisher* { name } } }
      • Source schema: authors { author* { full-name, publication* { title, pub-type } } }
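
The slide does not spell out what `piazza:id` does. The Python sketch below assumes that target elements with the same tag and the same `piazza:id` value denote one and the same element, so their contents are merged. This reading, the sample bindings, and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical (full-name, title) bindings, as the query part of the mapping
# might produce them for publications of type "book".
BINDINGS = [
    ("A. Author", "Book One"),
    ("B. Writer", "Book One"),
    ("A. Author", "Book Two"),
]

def build_pubs(bindings):
    """Create the target tree, reusing elements that share a tag and an id."""
    pubs = ET.Element("pubs")
    by_id = {}  # (tag, id) -> element already created for that pair

    def get_or_create(parent, tag, ident):
        key = (tag, ident)
        if key not in by_id:
            by_id[key] = ET.SubElement(parent, tag)
        return by_id[key]

    for full_name, title in bindings:
        book = get_or_create(pubs, "book", title)       # <book piazza:id={$t}>
        get_or_create(book, "title", title).text = title
        author = get_or_create(book, "author", title)   # <author piazza:id={$t}>
        ET.SubElement(author, "name").text = full_name  # <name> has no id
    return pubs

pubs = build_pubs(BINDINGS)
print(ET.tostring(pubs, encoding="unicode"))
```

Under this assumption each title yields a single `book` with a single `title`, and all author names for that book are collected under one `author` element.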

  14. Piazza mapping language (3)
      • Partial mapping:
          <pubs>
            <book piazza:id={$t}>
              {: $a IN document("source.xml")/authors/author,
                 $t IN $a/publication/title,
                 $typ IN $a/publication/pub-type
                 WHERE $typ = "book"
                 PROPERTY $t >= 'A' AND $t < 'B' :}
              [: <publisher>
                   <name>
                     {: PROPERTY $this IN {"PrintersInc", "PubsInc"} :}
                   </name>
                 </publisher> :]
            </book>
          </pubs>
      • Target schema: pubs { book* { title, author* { name }, publisher* { name } } }
      • Source schema: authors { author* { full-name, publication* { title, pub-type } } }

  15. Query Answering Algorithm
      • Problem
        • Evaluate query Q at P1 given a network of mappings
        • Reformulate the query over all relevant peers
      • Chaining of mappings using a combination of query composition and query rewriting
        • Q_P1(x) :- DbResearcher(x)
      • Query composition
        • M: DbResearcher(x) [relation symbol missing] Researcher(x), Area(x, DB) ⇒ Q_P2(x) :- Researcher(x), Area(x, DB)
      • Query rewriting
        • M: DbResearcher(x), Office(x, DBLab) = DbLabMember(x) ⇒ Q_P3(x) :- DbLabMember(x)
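
The composition step can be illustrated with a small Python sketch. It treats the first mapping as a definition of `DbResearcher` in terms of P2's relations, which is an assumption, since the relation symbol in the mapping is not legible in the transcript. The facts, the researcher identifiers, and the function names are hypothetical, and the sketch handles only queries over the single variable `x`.

```python
# A query is a list of atoms; each atom is (predicate, argument tuple).
Q_P1 = [("DbResearcher", ("x",))]

# Assumed reading of the mapping: DbResearcher(x) is defined by this body.
MAPPING = {"DbResearcher": [("Researcher", ("x",)), ("Area", ("x", "DB"))]}

# Hypothetical facts stored at peer P2.
FACTS_P2 = {
    "Researcher": {("r1",), ("r2",), ("r3",)},
    "Area": {("r1", "DB"), ("r2", "AI"), ("r3", "DB")},
}

def compose(query, mapping):
    """Unfold every atom that has a mapping body (no variable renaming needed here)."""
    unfolded = []
    for pred, args in query:
        unfolded.extend(mapping.get(pred, [(pred, args)]))
    return unfolded

def evaluate(query, facts):
    """Return every constant c such that all atoms hold when x is replaced by c."""
    constants = {row[0] for rows in facts.values() for row in rows}
    return sorted(
        c for c in constants
        if all(
            tuple(c if a == "x" else a for a in args) in facts.get(pred, set())
            for pred, args in query
        )
    )

Q_P2 = compose(Q_P1, MAPPING)
print(Q_P2)
print(evaluate(Q_P2, FACTS_P2))
```

The query posed at P1 mentions only `DbResearcher`; after composition it mentions only relations stored at P2 and can be evaluated there.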

  16. Query Reformulation (1)
      • Mapping:
          <S2>
            <people> {: $people=/S1/people :}
              <faculty> {: $name=$people/faculty/name/text() :}
                { $name }
              </faculty>
              <student> {: $student=$people/student/text() :}
                <name> { $student } </name>
                <advisor> {: $faculty=$people/faculty,
                             $name=$faculty/name/text(),
                             $advisee=$faculty/advisee/text()
                             where $advisee=$student :}
                  { $name }
                </advisor>
              </student>
            </people>
          </S2>
      • Query:
          <result> {
            for $faculty in /S1/people/faculty,
                $name in $faculty/name/text(),
                $advisee in $faculty/advisee/text()
            where $name = "Ullman"
            return <student> {$advisee} </student>
          } </result>

  17. Query Reformulation (2)
      [Figure: query tree pattern (S1 / people / faculty / name, advisee; condition $name = "Ullman"; output <result> / <student> {$advisee}) next to the mapping tree pattern (S1 / people / faculty / name, advisee, and student; condition $advisee=$student; output <S2> / <people> / <faculty> {$name}, <student> / <name> {$student}, <advisor> {$name})]
      • Query:
          <result> {
            for $faculty in /S1/people/faculty,
                $name in $faculty/name/text(),
                $advisee in $faculty/advisee/text()
            where $name = "Ullman"
            return <student> {$advisee} </student>
          } </result>

  18. Query Reformulation (3)
      [Figure: the same query tree pattern and mapping tree pattern as on the previous slide]
      • Reformulated query:
          <result> {
            for $student in /S2/people/student,
                $advisor in $student/advisor/text(),
                $name in $student/name/text()
            where $advisor = "Ullman"
            return <student> { $name } </student>
          } </result>
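
The effect of the reformulation can be checked on a toy document. The Python sketch below applies the S1-to-S2 mapping from the example to a hypothetical S1 instance and then runs both the original query over S1 and the reformulated query over S2. It illustrates what reformulation must preserve; it is not the tree-pattern algorithm itself. The sample data and function names are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical S1 instance: faculty list their advisees, students are plain names.
S1_XML = """<S1><people>
  <faculty><name>Ullman</name><advisee>Student A</advisee><advisee>Student B</advisee></faculty>
  <faculty><name>Prof X</name><advisee>Student C</advisee></faculty>
  <student>Student A</student>
  <student>Student B</student>
  <student>Student C</student>
</people></S1>"""

def apply_mapping(s1):
    """Build the S2 instance that the mapping derives from an S1 instance."""
    s2 = ET.Element("S2")
    for people in s1.findall("people"):
        out = ET.SubElement(s2, "people")
        for fac in people.findall("faculty"):
            ET.SubElement(out, "faculty").text = fac.findtext("name")
        for st in people.findall("student"):
            student = ET.SubElement(out, "student")
            ET.SubElement(student, "name").text = st.text
            for fac in people.findall("faculty"):
                # where $advisee = $student
                if any(adv.text == st.text for adv in fac.findall("advisee")):
                    ET.SubElement(student, "advisor").text = fac.findtext("name")
    return s2

def query_s1(s1):
    """Original query: advisees of the faculty member named Ullman, over S1."""
    return [adv.text
            for fac in s1.findall("people/faculty")
            if fac.findtext("name") == "Ullman"
            for adv in fac.findall("advisee")]

def query_s2(s2):
    """Reformulated query: students whose advisor is Ullman, over S2."""
    return [st.findtext("name")
            for st in s2.findall("people/student")
            for adv in st.findall("advisor")
            if adv.text == "Ullman"]

s1 = ET.fromstring(S1_XML)
s2 = apply_mapping(s1)
print(query_s1(s1), query_s2(s2))
```

Both queries return the same students on this instance, which is the property the reformulated query is meant to have.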

  19. Reformulation times
      • Table 1: The test queries and their respective running times.

  20. Current and the Future
      • Current status
        • Demo scenario using XML
        • Looking at real domains (bio DBs, NASA DBs)
      • Future work
        • More efficient reformulation algorithm
        • Semantic network analysis: eliminate redundant and inconsistent mappings
        • Query caching to speed up query evaluation

  21. Conclusions
      • A mapping language for mapping between sets of XML source nodes with different document structures
      • An architecture that uses the transitive closure of mappings to answer queries
      • An algorithm for query answering over this transitive closure of mappings, which can follow mappings in both forward and reverse directions
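
As a rough illustration of the transitive closure of mappings, the Python sketch below computes which peers a query posed at one peer could reach when every mapping may be followed in both directions. The peer names and mapping pairs are hypothetical, and the sketch covers only reachability, not the reformulation of the query along each path.

```python
from collections import deque

# Hypothetical pair-wise mappings, written as (source peer, target peer).
MAPPINGS = [("A", "B"), ("B", "C"), ("D", "C"), ("E", "F")]

def reachable_peers(start, mappings):
    """Breadth-first search over mappings, usable forward and in reverse."""
    neighbours = {}
    for src, dst in mappings:
        neighbours.setdefault(src, set()).add(dst)  # forward direction
        neighbours.setdefault(dst, set()).add(src)  # reverse direction
    seen = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for nxt in neighbours.get(peer, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable_peers("A", MAPPINGS)))
```

Peer D is reached from A only because the mapping from D to C can be followed in reverse; with forward-only traversal the search from A would stop at C.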

  22. Thank You!

