A Method of Rating the Credibility of News Documents on the Web
Ryosuke Nagura, Yohei Seki (Toyohashi University of Technology, Aichi 441-8580, Japan) nagu@kde.ics.tut.ac.jp, seki@ics.tut.ac.jp
Noriko Kando (National Institute of Informatics, Tokyo 101-8430, Japan) kando@nii.ac.jp
Masaki Aono (Toyohashi University of Technology, Aichi 441-8580, Japan) aono@ics.tut.ac.jp
SIGIR’06, August 6–11, 2006, Seattle, Washington, USA.
ABSTRACT • INTRODUCTION • METHOD (Commonality, Numerical Agreement, Objectivity) • EVALUATION • CONCLUSION
INTRODUCTION • When a user collects information from the Web, the information is not always correct. • Many false statements are posted on electronic bulletin boards and blogs by malicious persons. • The purpose of this study is to propose a method for rating the credibility of the information in news articles on the Web.
Commonality • The more news publishers that delivered articles with content similar to the target article being assessed, the higher its credibility was rated. • We computed cosine similarities between sentences of the target article and sentences in articles from other news publishers. • If the similarity exceeded a threshold, the sentences were defined as “similar”.
Example • Score = (1/3 + 1/3 + 2/3 + 0) / 4 = 0.33 (1) • [Slide diagram: sentences of the articles from Site A, Site B, and Site C compared with the document under assessment from Site D. The target article has four sentences: two are each matched by a similar sentence on one of the three other sites (1/3 each), one is matched on two sites (2/3), and one is matched on none (0).]
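The commonality computation above can be sketched in a few lines of Python. This is a minimal illustration, assuming whitespace tokenization, a bag-of-words cosine similarity, and a threshold of 0.5; the actual tokenizer, threshold value, and data structures are not specified in the slides.

```python
# Minimal sketch of the commonality score: for each sentence of the target
# article, the fraction of other publishers carrying a "similar" sentence,
# averaged over all target sentences. Tokenization and the threshold value
# are illustrative assumptions, not the authors' exact implementation.
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Cosine similarity between two whitespace-tokenized sentences."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def commonality(target_sentences, other_articles, threshold=0.5):
    """other_articles: one list of sentences per other news publisher."""
    per_sentence = []
    for s in target_sentences:
        # number of other publishers with at least one "similar" sentence
        hits = sum(any(cosine(s, t) >= threshold for t in article)
                   for article in other_articles)
        per_sentence.append(hits / len(other_articles))
    # average over the sentences of the target article, as in the example
    return sum(per_sentence) / len(per_sentence)
```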
Numerical Agreement • Numerical expressions such as “100 passengers” or “three tracks” occur in news reports. When such expressions contradicted those in articles from other news publishers, the credibility was rated lower.
Example • Score = (3 - 1 × 2) / 5 = 0.2 (2) • [Slide diagram: numbered numerical expressions in the articles from Site A, Site B, and Site C compared with the document under assessment from Site D. The target article contains five numerical expressions: three agree with expressions in the other sites and one is contradicted, with contradictions penalized twice.]
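A similar sketch of the numerical-agreement score, under the reading of the example that each agreement adds 1, each contradiction subtracts 2, and the result is divided by the number of numerical expressions in the target article. Treating each expression as a (subject, value) pair, and the sample data below, are illustrative assumptions rather than the paper's extraction method.

```python
# Sketch of the numerical-agreement score from the example above:
# (agreements - 2 * contradictions) / number of numerical expressions.

def numerical_agreement(target_exprs, other_exprs):
    """target_exprs: (subject, value) pairs from the target article.
    other_exprs: (subject, value) pairs gathered from other publishers."""
    others = dict(other_exprs)  # subject -> value reported elsewhere
    agreements = contradictions = 0
    for subject, value in target_exprs:
        if subject not in others:
            continue  # no comparable expression in the other articles
        if others[subject] == value:
            agreements += 1
        else:
            contradictions += 1
    return (agreements - 2 * contradictions) / len(target_exprs)

# Hypothetical data matching the slide: 3 agreements, 1 contradiction,
# 5 numerical expressions -> (3 - 1*2) / 5 = 0.2
target = [("passengers", 100), ("tracks", 3), ("injured", 12), ("dead", 2), ("cars", 8)]
others = [("passengers", 100), ("tracks", 3), ("injured", 12), ("dead", 5)]
print(numerical_agreement(target, others))  # 0.2
```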
Commonality and Numerical Agreement • When the sum of the scores (1) and (2) exceeded the threshold σ (= 0.5), we categorized the article as a candidate credible article. • In the example above: 0.33 + 0.2 = 0.53 > σ (= 0.5)
Objectivity • For candidate credible articles, the credibility score was further rated using a list of speculative clue phrases and the news sources indicated in the articles. • Every sentence in the article was scored according to the speculative clue phrases it contained.
Objectivity • We categorized news source types into four: (a) news agencies and publishers, (b) government agencies, (c) police, and (d) TV/radio.
Objectivity • We raised the objectivity score when news sources were indicated in the article. • Citing a news agency such as “Associated Press” raised the objectivity more than citing “TV/radio”.
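How the objectivity rating could combine these two signals is sketched below; the clue-phrase list, the penalty and source weights, and the function names are illustrative assumptions, not the lists or scores used in the paper.

```python
# Rough sketch of objectivity scoring: sentences containing speculative clue
# phrases lower the score, and indicating a news source raises it, with news
# agencies weighted above TV/radio. All phrases and weights here are invented
# for illustration only.

SPECULATIVE_CLUES = {"reportedly": 0.5, "allegedly": 0.5, "it is said": 0.3, "appears to": 0.3}
SOURCE_WEIGHTS = {"news_agency": 1.0, "government": 0.8, "police": 0.8, "tv_radio": 0.5}

def objectivity(sentences, cited_source_types):
    # average per-sentence score after penalizing speculative wording
    scores = []
    for s in sentences:
        penalty = max((w for clue, w in SPECULATIVE_CLUES.items()
                       if clue in s.lower()), default=0.0)
        scores.append(1.0 - penalty)
    sentence_score = sum(scores) / len(scores)
    # bonus for indicating a news source, scaled by the source type
    source_bonus = max((SOURCE_WEIGHTS.get(t, 0.0) for t in cited_source_types),
                       default=0.0)
    return sentence_score + source_bonus
```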
EVALUATION • 55,994 news stories on the Web (published from October to November 2005) were collected every 30 minutes from seven news sites (four newspaper sites, one TV news site, one overseas news agency site, and one evening daily site); all stories were written in Japanese.
EVALUATION • From the collected articles, 52 articles were selected manually for human assessment. • We manually categorized the 52 articles into 8 topic groups as shown in Table 2 and divided them into high (4, 3) and low (2, 1) groups according to the credibility scores calculated by our proposed method.
EVALUATION • We combined them into 26 pairs of articles published in the same time period and then asked three assessors to compare their credibility. • Agreement rate = # of agreements / # of article pairs
Example • Suppose there are 3 pairs with method scores A(4, 3), B(1, 3), and C(4, 4). • If the assessor judges A1: high, A2: high (O); B1: high, B2: low (X); C1: high, C2: low (X), where O marks agreement with the method and X marks disagreement, then Agreement rate = 1 / 3 = 0.33
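The agreement-rate computation in this example can be expressed directly; representing each method score and assessor judgement as a high/low flag is an assumption made for illustration.

```python
# Agreement rate over article pairs: a pair counts as an agreement when the
# assessor's high/low judgements match the method's high/low split (scores
# of 4 or 3 count as "high", 2 or 1 as "low").

def agreement_rate(pairs):
    """pairs: ((method_high_1, method_high_2), (human_high_1, human_high_2))."""
    agreements = sum(1 for method, human in pairs if method == human)
    return agreements / len(pairs)

pairs = [
    ((True, True), (True, True)),    # A(4, 3) vs assessor high/high -> O
    ((False, True), (True, False)),  # B(1, 3) vs assessor high/low  -> X
    ((True, True), (True, False)),   # C(4, 4) vs assessor high/low  -> X
]
print(agreement_rate(pairs))  # 1 / 3 = 0.33
```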
CONCLUSION • In an experiment with three assessors, the average agreement between our proposed method and the human assessments was 69.1%. • However, the method tends to rate a “scoop” rather low when applied to developing news stories.