Efficiently Incorporating User Feedback into Information Extraction and Integration Programs

Xiaoyong Chai, Ba-Quy Vuong, AnHai Doan, Jeffrey F. Naughton. University of Wisconsin-Madison.

Presentation Transcript


  1. Efficiently Incorporating User Feedback into Information Extraction and Integration Programs
     Xiaoyong Chai, Ba-Quy Vuong, AnHai Doan, Jeffrey F. Naughton
     University of Wisconsin-Madison

  2. The Need for Incorporating User Feedback
     (Figure; surviving label: Panels Chair)

  3. Current Approach
     (Figure; surviving labels: Code, Data, …)

  4. This Is Not Just For DBLife
     • A growing number of applications use IE and II
       • Avatar@IBM Almaden
       • AliBaba@Humboldt Univ. of Berlin
       • YAGO@MPI
       • Kylin@Univ. of Washington
       • …
     • A systematic user-feedback solution could significantly benefit them

  5. What User Feedback To Incorporate?
     • Types of User Feedback (diagram labels): Flagging an Error, Fixing an Error, Editing Code, Editing Data, Input, Output, Intermediate Results

  6. Challenges
     • How to expose program data for user feedback?
     • How to incorporate user feedback?
     • How to efficiently execute a program?

  7. Exposing Program Data for User Feedback
     • Extracting conference services
     • Workflow (figure): dataSources → crawl → {extractConf, extractNames → findRoles → roles} → services
     • Views and their user interfaces (figure):
       • services(name, conf, role), shown in a Wiki; sample tuple: (Joe Hellerstein, CIDR 2009, PC Chair)
       • roles(name, role, page), shown in a Spreadsheet
       • dataSources(url, date), shown in a Form; sample tuple: (http://.../cidr09/, 09/01/2008)

  8. Writing User-Feedback Rules to Expose Program Data
     • Write extraction program, e.g., in xlog [Shen et al, 07]
       R1: pages(page) : dataSources(url, date), crawl(url, page)
       R2: conferences(conf, page) : pages(page), extractConf(page, conf)
       R3: names(name, page) : pages(page), extractNames(page, name)
       R4: roles(name, role, page) : names(name, page), findRoles(name, page, role)
       R5: services(name, conf, role) : conferences(conf, page), roles(name, role, page)
     • Write user-feedback rules to specify views and user interfaces
       R6: dataSourcesForUserFeedback(url)#form-UI : dataSources(url, date), date >= “01/01/2009”
       R7: rolesForUserFeedback(pos, page#no-edit)#spreadsheet-UI : roles(role, page)
       R8: servicesForUserFeedback(name, conf, role)#wiki-UI : services(name, conf, role)
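
A minimal Python sketch of how rules R1 to R5 could be read as joins over relations, with R6 as a filtered view. This is not xlog and not the authors' implementation: the blackbox operators (`crawl`, `extract_conf`, `extract_names`, `find_roles`) are hypothetical stubs, `FAKE_WEB` is an invented stand-in for the Web, and dates are assumed to be in MM/DD/YYYY form.

```python
from datetime import datetime

# Hypothetical stand-in for the Web: one page per URL (URL kept elided as in the slides).
FAKE_WEB = {"http://.../cidr09/": "CIDR 2009. PC Chair: Joe Hellerstein"}

# Hypothetical stubs for the blackbox operators named in the rules.
def crawl(url):
    return [FAKE_WEB[url]] if url in FAKE_WEB else []

def extract_conf(page):
    return ["CIDR 2009"] if "CIDR 2009" in page else []

def extract_names(page):
    return ["Joe Hellerstein"] if "Joe Hellerstein" in page else []

def find_roles(name, page):
    return ["PC Chair"] if f"PC Chair: {name}" in page else []

def run_program(data_sources):
    """Evaluate R1-R5 bottom-up; each relation is a set of tuples."""
    # R1: pages(page) : dataSources(url, date), crawl(url, page)
    pages = {p for (url, _date) in data_sources for p in crawl(url)}
    # R2: conferences(conf, page) : pages(page), extractConf(page, conf)
    conferences = {(c, p) for p in pages for c in extract_conf(p)}
    # R3: names(name, page) : pages(page), extractNames(page, name)
    names = {(n, p) for p in pages for n in extract_names(p)}
    # R4: roles(name, role, page) : names(name, page), findRoles(name, page, role)
    roles = {(n, r, p) for (n, p) in names for r in find_roles(n, p)}
    # R5: services(name, conf, role): join conferences and roles on page
    services = {(n, c, r)
                for (c, p1) in conferences
                for (n, r, p2) in roles if p1 == p2}
    return {"dataSources": set(data_sources), "roles": roles, "services": services}

def data_sources_for_user_feedback(data_sources, cutoff="01/01/2009"):
    """R6: expose only the URLs of sources dated on or after the cutoff."""
    fmt = "%m/%d/%Y"  # assumption: the slides do not state the date format
    limit = datetime.strptime(cutoff, fmt)
    return {url for (url, date) in data_sources
            if datetime.strptime(date, fmt) >= limit}

sources = {("http://.../cidr09/", "09/01/2008")}
tables = run_program(sources)
print(tables["services"])
print(data_sources_for_user_feedback(sources))
```

Under the assumed date format, the sample source dated 09/01/2008 falls before the R6 cutoff, so it would not appear in the form view even though it still feeds the extraction workflow.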

  9. Program Semantics
     (Figure: the conference-services workflow, dataSources → crawl → extractConf / extractNames → findRoles → roles → services, with the Form, Spreadsheet, and Wiki user interfaces attached to the dataSources, roles, and services views.)

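  10. Incorporating Previous User Feedback
      • Feedback on operator p: t → t’ (figure: p maps input I to output O; with the feedback applied, the output becomes O’)
      • Interpretation: for operator p, if t is in the output, change t into t’
      • Example on extractNames: Change “A. Smith” to “D. Smith”
        • Input: … Dr. A. Smith is … 
        • Output after feedback: … D. Smith, A. Jones, …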

  11. Interpreting User Feedback Based On Tuple Provenance
      • Provenance of output tuple t: the set of input tuples that operator p used to produce t
      • Example on extractNames over pages p1 and p2: Change “A. Smith” to “D. Smith”
      • If the operator produces {“A. Smith”, “A. Jones”} from {p1}, then replace {“A. Smith”, “A. Jones”} with {“D. Smith”, “A. Jones”}
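
The provenance-based interpretation can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: `extract_names` is a hypothetical regex-based stand-in for the real operator, the `Feedback` record and `run_with_feedback` are invented names, and the provenance of each extracted name is assumed to be the single page it came from.

```python
import re
from dataclasses import dataclass

# Hypothetical extractor: matches names of the form "A. Smith".
NAME = re.compile(r"\b[A-Z]\. [A-Z][a-z]+")

def extract_names(text):
    return set(NAME.findall(text))

@dataclass(frozen=True)
class Feedback:
    provenance: frozenset  # ids of the input tuples the operator used
    old: str               # tuple t the user wants changed
    new: str               # replacement tuple t'

def run_with_feedback(pages, feedback):
    """Run the operator per page, then reapply stored feedback whose
    provenance matches and whose old tuple is still in the output."""
    result = {}
    for page_id, text in pages.items():
        out = extract_names(text)
        prov = frozenset({page_id})  # assumption: one page per output tuple
        for fb in feedback:
            if fb.provenance == prov and fb.old in out:
                out = (out - {fb.old}) | {fb.new}
        result[page_id] = out
    return result

pages = {
    "p1": "Dr. A. Smith is working with A. Jones.",
    "p2": "A. Smith gave a talk.",
}
feedback = [Feedback(frozenset({"p1"}), "A. Smith", "D. Smith")]
result = run_with_feedback(pages, feedback)
print(result)
```

Because the correction is keyed on provenance, the "A. Smith" extracted from p2 is left alone; only the tuple derived from p1 is rewritten.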

  12. Challenges
      • How to expose program data for user feedback?
      • How to incorporate user feedback?
      • How to efficiently execute a program?
        • Incremental execution
        • Improved concurrency control

  13. Incrementally Executing the Program
      • extractNames(I + ΔI) = extractNames(I) + extractNames(ΔI)?
      • (Figure: extractNames applied to an input page table containing p1, p2, p3, producing an output table with columns name, …, page)
      • Similar problem in incremental view maintenance
      • Incremental-update properties
        • Closed-form insertion
        • Closed-form deletion
        • Input partitionability
        • Partition correlation
        • Attribute independence
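
The question on this slide can be illustrated with a small Python sketch. The slides do not define the incremental-update properties, so the sketch makes its own assumption: the operator processes each page independently, so that inserting or deleting a page only affects the output tuples derived from that page. `extract_names` is a hypothetical regex-based stand-in, and `IncrementalExtractNames` is an invented class name.

```python
import re

# Hypothetical extractor: matches names of the form "A. Smith".
NAME = re.compile(r"\b[A-Z]\. [A-Z][a-z]+")

def extract_names(text):
    return set(NAME.findall(text))

def full_run(pages):
    """Recompute the whole (name, page) output from scratch."""
    return {(n, pid) for pid, text in pages.items() for n in extract_names(text)}

class IncrementalExtractNames:
    """Keeps per-page results so a change to one page touches only that page."""

    def __init__(self):
        self.names_by_page = {}
        self.operator_calls = 0  # counts how often the blackbox actually runs

    def insert(self, page_id, text):
        self.operator_calls += 1
        self.names_by_page[page_id] = extract_names(text)

    def delete(self, page_id):
        # No operator call needed: drop the tuples derived from this page.
        self.names_by_page.pop(page_id, None)

    def output(self):
        return {(n, pid) for pid, ns in self.names_by_page.items() for n in ns}

pages = {
    "p1": "Dr. A. Smith is working with A. Jones.",
    "p2": "B. Lee chairs the panel.",
}
op = IncrementalExtractNames()
for pid, text in pages.items():
    op.insert(pid, text)

calls_before = op.operator_calls
pages["p3"] = "C. Wu joined."      # the delta: one new page
op.insert("p3", pages["p3"])
calls_for_delta = op.operator_calls - calls_before
print(op.output() == full_run(pages), calls_for_delta)
```

When the independence assumption does not hold (for example, if an operator looks across pages), this shortcut would give wrong results and the operator would have to be rerun on a larger part of the input.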

  14. Concurrently Executing Transactions
      • (Figure: transactions T1 and T2 running over the workflow dataSources → crawl → extractConf / extractNames → findRoles → roles → services; T1 is drawn at the crawl operator, T2 at the roles table)
      • Table-Locking: locks only the input and output tables of the crawl operator
      • Operator-Skipping: skips executing the join operator after updating the roles table
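
A rough Python sketch of the table-locking idea: a transaction that reruns one operator takes locks only on that operator's input and output tables, so a transaction working on other tables is not blocked. The table names follow the rules on slide 8; the helper `lock_tables` and the choice to acquire locks in sorted order (to avoid deadlock) are assumptions of this sketch, not details given in the slides.

```python
import threading
from contextlib import ExitStack, contextmanager

# One lock per table in the workflow.
TABLES = ["dataSources", "pages", "conferences", "names", "roles", "services"]
LOCKS = {name: threading.Lock() for name in TABLES}

@contextmanager
def lock_tables(tables):
    """Hold the locks of the given tables for the duration of the block.
    Locks are acquired in sorted order so two transactions cannot deadlock."""
    with ExitStack() as stack:
        for name in sorted(set(tables)):
            stack.enter_context(LOCKS[name])
        yield

# T1 reruns crawl: lock only its input (dataSources) and output (pages).
with lock_tables(["dataSources", "pages"]):
    t1_holds_pages = LOCKS["pages"].locked()
    roles_free_during_t1 = not LOCKS["roles"].locked()
    # T2 can still work on roles and services while T1 is in progress.
    with lock_tables(["roles", "services"]):
        t2_holds_roles = LOCKS["roles"].locked()

all_released = not any(lock.locked() for lock in LOCKS.values())
print(t1_holds_pages, roles_free_during_t1, t2_holds_roles, all_released)
```

The nested blocks only simulate two overlapping transactions in one thread; in a real setting each transaction would run in its own thread or process.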

  15. Experiment Setup
      • Testbed
        • A 5-stage DBLife workflow
        • 13 blackbox operators: 6 IE operators and 3 II operators
        • Wrote xlog program and user-feedback rules in < 1 hr
      • Simulated user-feedback transactions
        • On each stage of the workflow
        • Each transaction randomly deletes, inserts, or modifies 1/10 of the tuples in a table

  16. Incremental-Update Properties are Broadly Applicable

  17. Incremental Update Reduces Execution Time

  18. Table-Locking and Operator-Skipping Improve Concurrency Degree
      • Increase transaction throughput by 50% and 500%
      • Reduce transaction response time by 43% and 98%

  19. Related Work
      • User feedback in IE and II: [Doan et al, 01], [Chiticariu et al, 08], [Jeffery et al, 08]
        • Leveraging user feedback to improve results of individual operations
      • Provenance: [Woodruff & Stonebraker, 97], [Cui & Widom, 01], [Buneman et al, 01], [Bohannon et al, 08], [Huang et al, 08]
      • Incremental execution
        • View maintenance: [Blakeley et al, 86], [Griffin & Libkin, 95], [Gupta & Mumick, 95]
        • Schema matching: [Bernstein et al, 06]
        • IE: [Chen et al, 07]

  20. Conclusions and Future Work
      • Incorporating user feedback into IE and II programs is important
      • Identify key issues and provide initial solutions:
        • Write user-feedback rules to expose program data to UIs
        • Model and incorporate user feedback
        • Efficiently execute program to process user feedback
      • Future work:
        • Handle unreliable user feedback
        • Propagate user feedback down in the workflow
        • Conduct user study
