Named Entity Tagging with Conditional Random Fields

Ryan McDonald, Fernando Pereira and Fei Sha. Computer and Information Science, University of Pennsylvania.

Presentation Transcript


  1. Named Entity Tagging with Conditional Random Fields
     Ryan McDonald, Fernando Pereira and Fei Sha
     Computer and Information Science, University of Pennsylvania

  2. Goals
     • Improve on the results of the current NE tagger used by UPenn ACE
     • Accomplish this through a Conditional Random Field model (Lafferty et al. 2001)
     • Compare MaxEnt and CRFs in a controlled environment

  3. ACE Definition
     • Find entities and classify them as Person, GPE, Organization, Location and/or Facility
     • “Bush took over the White House from the Clinton Administration”
       • Bush: Person
       • White House: Facility, GPE
       • The Clinton Administration: Organization
       • Clinton: Person
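
The example on this slide has two properties that a flat one-tag-per-token scheme does not capture directly: a mention can carry more than one type (White House), and mentions can nest (Clinton inside the Clinton Administration). A minimal sketch of one way to hold such annotations follows; the `Mention` class, its field names, and the whitespace tokenization are illustrative assumptions, not part of the ACE specification or of the authors' system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    # Hypothetical container; field names are assumptions for illustration.
    start: int        # index of the first token (inclusive)
    end: int          # index one past the last token (exclusive)
    types: frozenset  # one or more entity types from the slide

# Whitespace tokenization is an assumption made to keep the sketch short.
tokens = "Bush took over the White House from the Clinton Administration".split()

mentions = [
    Mention(0, 1, frozenset({"Person"})),           # Bush
    Mention(4, 6, frozenset({"Facility", "GPE"})),  # White House (two types)
    Mention(7, 10, frozenset({"Organization"})),    # the Clinton Administration
    Mention(8, 9, frozenset({"Person"})),           # Clinton (nested mention)
]

def mention_text(m):
    """Return the surface string covered by a mention."""
    return " ".join(tokens[m.start:m.end])

for m in mentions:
    print(mention_text(m), sorted(m.types))
```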

  4. MaxEnt vs. CRFs
     • Ran an MEMM tagger and a CRF tagger with:
       • The exact same features
       • The exact same training algorithm (limited-memory quasi-Newton)
       • The exact same training data and test data
     • Have not yet used the Sept. test data, since more improvements are on the way
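
With features, training algorithm, and data held fixed, what remains different between the two taggers is how the model is normalized. The standard linear-chain forms are sketched below for reference. The notation (weight vector $w$, feature function $f$, sequence length $T$) is an assumption introduced here and does not appear in the slides.

```latex
% MEMM: one normalizer per position, conditioned on the previous label
p_{\text{MEMM}}(y \mid x) = \prod_{t=1}^{T}
  \frac{\exp\big(w \cdot f(y_{t-1}, y_t, x, t)\big)}
       {\sum_{y'} \exp\big(w \cdot f(y_{t-1}, y', x, t)\big)}

% CRF: a single normalizer over all label sequences of length T
p_{\text{CRF}}(y \mid x) =
  \frac{\exp\big(\sum_{t=1}^{T} w \cdot f(y_{t-1}, y_t, x, t)\big)}
       {\sum_{y'_{1:T}} \exp\big(\sum_{t=1}^{T} w \cdot f(y'_{t-1}, y'_t, x, t)\big)}
```

Both models score the same local features; the MEMM normalizes each transition separately, while the CRF normalizes once per sentence.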

  5. Features
     • Word: Unigram*
     • 1-suffix, 2-suffix, 3-suffix and 4-suffix: Unigram and Bigram
     • Word length bins: Unigram and Bigram
     • Word features defined by Tom's script: Caps, Numeric, etc.*
     * used in original ACE system
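
The per-token predicates listed on this slide can be sketched roughly as follows. The length-bin edges and the shape tests are assumptions: the slides give neither the actual bins nor the contents of "Tom's script", so `length_bin` and `shape_features` are hypothetical stand-ins. How each predicate is combined into unigram or bigram features is not spelled out in the slides and is left out of the sketch.

```python
def length_bin(word):
    # Hypothetical bin edges; the slides do not state the real ones.
    n = len(word)
    if n <= 2:
        return "1-2"
    if n <= 5:
        return "3-5"
    if n <= 9:
        return "6-9"
    return "10+"

def shape_features(word):
    # Stand-in for the "Caps, Numeric, etc." features; the exact set is assumed.
    feats = []
    if word[:1].isupper():
        feats.append("init_cap")
    if word.isupper():
        feats.append("all_caps")
    if any(c.isdigit() for c in word):
        feats.append("has_digit")
    if word.isdigit():
        feats.append("numeric")
    return feats

def token_features(tokens, i):
    """Build the observation predicates for the token at position i."""
    word = tokens[i]
    feats = ["w=" + word]
    # 1- to 4-character suffixes, skipping suffixes longer than the word.
    for k in range(1, 5):
        if len(word) >= k:
            feats.append(f"suf{k}=" + word[-k:])
    feats.append("len=" + length_bin(word))
    feats.extend("shape=" + s for s in shape_features(word))
    return feats

print(token_features(["Bush", "took", "over"], 0))
```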

  6. MEMM vs. CRF
     • Same feature set
     • Same training algorithm

  7. ACE vs. CRF
     • Different feature sets (CRF is richer)

  8. Summary
     • These results and (Sha 2002) show that CRFs perform slightly better than MEMMs
     • Richer feature set leads to larger improvement
     • Portable CRF, MEMM code
       • Conjugate Gradient, Limited Memory Quasi-Newton, Perceptron

  9. Future and Current Work
     • “Person” and “Organization” recall
     • Multilayer taggers
     • Name lists
     • Document class information

  10. Multilayer Taggers
     • If entity information is known, it can lead to a 10-20% increase in F-Score
     • First layer of the tagger attempts to find generic entities
       • Can achieve an F-Score of around 0.87
     • Second layer uses the entity information as a feature for each category classifier
       • Leads to about a 2-5% increase in F-Score
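
The hand-off between the two layers can be sketched as follows: the first layer's generic entity tags are appended to each token's feature list before the second-layer classifiers see it. This is a minimal sketch under assumptions. The tag names (`ENT`, `O`), the feature string format, and the use of the previous token's tag are all hypothetical, and the taggers themselves are not shown.

```python
def add_entity_layer_features(base_features, generic_tags):
    """Append first-layer generic entity tags to each token's features.

    base_features: list of feature lists, one per token
    generic_tags:  first-layer output, one tag per token (e.g. "ENT" or "O")
    """
    if len(base_features) != len(generic_tags):
        raise ValueError("need exactly one generic tag per token")
    augmented = []
    for i, (feats, tag) in enumerate(zip(base_features, generic_tags)):
        # "<s>" marks the start of the sentence (an assumed convention).
        prev_tag = generic_tags[i - 1] if i > 0 else "<s>"
        augmented.append(feats + ["ent=" + tag, "ent_prev=" + prev_tag])
    return augmented

base = [["w=Bush"], ["w=took"], ["w=over"]]
first_layer_output = ["ENT", "O", "O"]  # hypothetical first-layer tags
print(add_entity_layer_features(base, first_layer_output))
```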

  11. Name Lists
     • Aim is to increase recall for the person and organization categories
     • Name list size: 80,000
     • Organization list size: 30,000
     • Binary feature: is token in name list?
       • Increases Person F-Score to 0.793 (from 0.755)
     • Binary feature: is token in organization list?
       • Increases Organization F-Score to 0.601 (from 0.569)
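
The two binary features on this slide amount to set-membership tests. A minimal sketch follows, assuming case-insensitive matching on single tokens; the slides do not say how lookups were normalized, and the tiny sets here are placeholders for the 80,000-name and 30,000-organization lists.

```python
def name_list_features(tokens, name_list, org_list):
    """Return one dict of binary list-membership features per token."""
    feats = []
    for tok in tokens:
        key = tok.lower()  # case-insensitive lookup is an assumption
        feats.append({
            "in_name_list": key in name_list,
            "in_org_list": key in org_list,
        })
    return feats

# Placeholder lists for illustration only.
names = {"bush", "clinton"}
orgs = {"administration"}

print(name_list_features(["Bush", "took", "over"], names, orgs))
```

Storing each list as a Python `set` keeps every lookup at average constant time, which matters once the lists grow to the sizes quoted on the slide.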

  12. Name Lists
     • Small name lists can lead to a substantial improvement in F-Score
       • Even though the features were simplistic
     • Investigating better name lists
       • MT name list of 500,000 names and 50,000 orgs
     • Investigating more sophisticated features
       • frequency

  13. Document Class Features
     • “Atlanta defeated Florida in extra innings ...”
       • Atlanta and Florida should be tagged as organizations
       • Mistakenly tagged as GPE
     • If the document is classified as SPORTS, the NE classifier may recognize that entities normally tagged GPE should be orgs
     • Currently beginning to look at state-of-the-art document classification algorithms
     • Could provide a richer source of knowledge
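
One simple way to expose a document class to the tagger is to conjoin it with each existing token feature, so that the model can learn different weights for "Atlanta" in SPORTS documents than elsewhere. The sketch below is only one possible encoding, not the authors' design; the `&doc=` string format and the class label are hypothetical, and the document classifier itself is not shown.

```python
def with_document_class(token_feats, doc_class):
    """Add a copy of every token feature conjoined with the document class."""
    return [
        feats + [f"{f}&doc={doc_class}" for f in feats]
        for feats in token_feats
    ]

sentence_feats = [["w=Atlanta", "shape=init_cap"], ["w=defeated"]]
print(with_document_class(sentence_feats, "SPORTS"))
```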
