
The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?

Illustrations of different approaches. Peter Clark and John Thompson, Boeing Research, 2004.


Presentation Transcript


  1. The Knowledge Acquisition Bottleneck Revisited: How can we build large KBs?
     Illustrations of different approaches
     Peter Clark and John Thompson, Boeing Research, 2004

  2. Premise
     • Intelligent machines need lots of knowledge, for
       • question-answering
       • intelligent search
       • information integration
       • natural language understanding
       • decision support
       • modeling
       • etc.
     • Much of this knowledge can be drawn from some general repository of reusable knowledge
       • e.g., WordNet
     • How does one build such a repository?
     “No one considers hand-building a large KB to be a realistic proposition these days” [paraphrase of Daphne Koller, 2004]

  3. 1. Build it by Hand
     • “Let’s roll up our sleeves and get on with it!”
     • But: it’s a daunting task
     • Our own work
     • Cyc
       + Lots in it, (relatively) well-designed ontology
       - 650 person-years of effort so far
       - Still patchy coverage (why?)
       - Difficult to use outside Cycorp

  4. 1. Build it by Hand (cont)
     • WordNet
       + Easy to use
       + Comprehensive
       - Little inference-supporting knowledge in it
       - Ad hoc ontology

  5. 1. Build it by Hand (cont)
     • The Component Library
       Claim: can bound the required knowledge by working at a coarse-grained level
       + Large, more doable
       - Hard to use, still very incomplete

  6. 2. Extract from Dictionaries
     • MindNet
       + Automatically built
       - Unusable?
     • Extended WordNet
       + Won TREC competition
       - Still somewhat incoherent
       - A lot of manual labor

  7. 3. Corpus-based Text/Web Mining
     • Schubert’s system
       + Automatic
       + Lots of knowledge
       - Noisy
       - No word senses
       - Only grabs certain kinds of knowledge
       30M entries…

  8. 3. Corpus-based Text/Web Mining (cont)
     • KnowIt (Etzioni)
       + Automatic
       - Only factoids

  9. 4. Community-Based Acquisition
     • Knowledge entry by the masses
     • OpenMind
       + Large
       - Full of junk, unusable (?)
     • Would this work with better acquisition tools? (see next slide for illustration)

  10. 5. Use Existing Resources
      • e.g.,
        • databases
        • CIA World Factbook
        • Web data/services
      • e.g., SRI/ISI’s ARDA QA system
      + Syntactically simple
      + Available
      - Largely limited to factoids
      - Information integration is a major challenge
        • different ontologies, contradictory data

  11. Where to?
      • Can we bound the knowledge needed
        • for a particular application?
        • for a useful, sharable, general resource?
      • Which of these approaches seems most realistic?
        • build by hand
        • extract from dictionaries
        • mine text corpora
        • community knowledge entry
        • use existing resources
