1 / 19

KBP2014 Entity Linking Scorer

KBP2014 Entity Linking Scorer. Xiaoman Pan, Qi Li, Heng Ji , Xiaoqiang Luo , Ralph Grishman jih @ rpi.edu. Overview. We will apply two steps to evaluate the KBP2014 Entity Discovery and Linking results. 1. Clustering score 2. Entity Linking ( Wikification ) F-score. Overview.

talli
Download Presentation

KBP2014 Entity Linking Scorer

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. KBP2014 Entity Linking Scorer Xiaoman Pan, Qi Li, HengJi, XiaoqiangLuo, Ralph Grishman jih@rpi.edu

  2. Overview • We will applytwostepstoevaluatetheKBP2014EntityDiscoveryandLinkingresults. 1.Clusteringscore 2.Entity Linking (Wikification)F-score

  3. Overview • We use a tuple ⟨doc-id, start, end, entity-type, kb-id⟩to represent each entity mention, where a special type of kb-id is NIL. • Let s be an entity mention in the system output, g be an corresponding gold-standard. An output mention s matches a reference mention giff: 1. s.doc-id = g.doc-id, 2. s.start = g.start, s.end = g.end, 3. s.entity-type = g.entity-type, 4. and s.kb-id = g.kb-id.

  4. Clustering score • Inthisstep,weonlyconcernclusteringperformance.Thus,change all mentions’ kb_id to NIL. • We will apply three metrics to evaluate the clustering score • B-Cubed metric • CEAF metric • Graph Edit Distance (G-Edit)

  5. B-Cubed

  6. B-Cubed

  7. CEAF • CEAF (Luo 2005) • Idea: a mention or entity should not be credited more than once • Formulated as a bipartite matching problem A special ILP problem efficient algorithm: Kuhn-Munkres • We will use CEAFm as the official scoring metric because it’s more sensitive to cluster size than CEAFe

  8. CEAF (Luo, 2005)

  9. G-Edit • The first step, evaluate the name mention tagging F-measure • The second step, evaluate the overlapped mentions by using graph-based edit distance • To do: • Find a theoretical proof that the connected component based method is the unique optimal solution • Try to make the matching more sensitive to cluster size

  10. WikificationF-score • Existing Entity Linking (Wikification)F-score

More Related