Detecting Web Spam through Backward Propagation of Distrust

doris

Presentation Transcript


  1. Detecting Web Spam through Backward Propagation of Distrust (CS315: Web Search and Mining)

  2. And Now For Something Completely(?) Different
     • Propaganda: Attempt to modify human behavior, and thus influence people’s actions, in ways beneficial to propagandists
     • Theory of Propaganda
       • Developed by the Institute for Propaganda Analysis, 1938-42
     • Propagandistic Techniques (and ways of detecting propaganda)
       • Word games - associate good/bad concept with social entity
         • Glittering Generalities, Name Calling
       • Transfer - use special privileges (e.g., office) to breach trust
       • Testimonial - famous non-experts’ claims
       • Plain Folk - people like us think this way
       • Bandwagon - everybody’s doing it, jump on the wagon
       • Card Stacking - use of bad logic

  3. Web Spammers as Propagandists
     • Web Spammers can be seen as employing propagandistic techniques in order to modify the Web Graph
     • There is a pattern on how to spam!

  4. Anti-Spam Lessons from Society
     • What would you do if you realized that you should not trust a member of your trust network?
     [Figure: a trust network around YOU, with nodes Famous Actress, Democracy, Your Boss, Partner, Joe (a plumber), NYTimes, US Pres., Prof. X, Mom, Rev. Y, and The Coffee Joint; several nodes are marked with "?"]

  5. Anti-Propagandistic Lessons for Web
     • How do you deal with propaganda in real life?
       • Backwards propagation of distrust
       • The recommender of an untrustworthy message becomes untrustworthy
     • Can you transfer this technique to the web?

  6. An Anti-Propagandistic Algorithm
     • Start from untrustworthy site s
     • S = {s}
     • Using BFS for depth D do:
       • Find the set U of sites linking to sites in S (using the Google API for up to B b-links/site)
       • Ignore blogs, directories, edu’s
       • S = S + U
     • Find the bi-connected component BCC of U that includes s
     • BCC shows multiple paths to boost the reputation of s

  7. Backwards Propagation of Distrust
     • Start from untrustworthy site s
     • S = {s}
     • Using BFS for depth D do:
       • Find the set U of sites linking to sites in S (using the Google API for up to B b-links/site)
       • Ignore blogs, directories, edu’s
       • S = S + U
     • Find the bi-connected component BCC of U that includes s
     • BCC shows multiple paths to boost the reputation of s
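
     The steps on this slide can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation. The `backlinks` dictionary is a hypothetical stand-in for the backlink queries, `skip` is a hypothetical filter for blogs, directories and edu's, and applying the cap B after filtering is an assumption. A site can belong to more than one bi-connected component, so the sketch picks the largest one containing s, which is also an assumption.

```python
def backward_bfs(start, backlinks, depth, max_backlinks, skip=lambda site: False):
    """Follow links backwards from `start` for `depth` levels.

    Returns (sites reached, set of (linking site, linked site) edges).
    `backlinks` maps a site to the sites linking to it (hypothetical data source).
    """
    seen = {start}
    edges = set()
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for target in frontier:
            # Assumption: filter first, then keep at most B backlinks per site.
            sources = [u for u in backlinks.get(target, [])
                       if u != target and not skip(u)]
            for source in sources[:max_backlinks]:
                edges.add((source, target))
                if source not in seen:
                    seen.add(source)
                    next_frontier.append(source)
        frontier = next_frontier
    return seen, edges


def biconnected_components(edges):
    """Biconnected components (as vertex sets) of the graph, ignoring direction.

    Recursive Hopcroft-Tarjan; fine for small neighborhoods, but very deep
    graphs could hit Python's recursion limit.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    index, low, stack, comps = {}, {}, [], []

    def visit(v, parent):
        index[v] = low[v] = len(index)
        for w in adj[v]:
            if w == parent:
                continue
            if w not in index:
                stack.append((v, w))
                visit(w, v)
                low[v] = min(low[v], low[w])
                if low[w] >= index[v]:
                    # v separates w's subtree: pop one whole component.
                    comp = set()
                    while True:
                        edge = stack.pop()
                        comp.update(edge)
                        if edge == (v, w):
                            break
                    comps.append(comp)
            elif index[w] < index[v]:
                # Back edge to an ancestor.
                stack.append((v, w))
                low[v] = min(low[v], index[w])

    for v in adj:
        if v not in index:
            visit(v, None)
    return comps


def bcc_of(start, edges):
    """Largest biconnected component containing `start` (assumed tie-break)."""
    containing = [c for c in biconnected_components(edges) if start in c]
    return max(containing, key=len) if containing else {start}


# Toy neighborhood: a and b link to s and to each other; c links to a; d links to c.
toy_backlinks = {"s": ["a", "b"], "a": ["b", "c"], "b": ["a"], "c": ["d"]}
sites, edges = backward_bfs("s", toy_backlinks, depth=3, max_backlinks=10)
bcc = bcc_of("s", edges)
periphery = sites - bcc
```

     In the toy neighborhood, s, a and b are connected by more than one path, so they form the BCC, while c and d hang off it through a single chain of links and fall into the periphery.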

  8. BCC vs Periphery
     • Since the BCC reveals multiple paths to boost the reputation of s, we expect it to contain a higher percentage of untrustworthy sites
     • The Periphery of the BCC, on the other hand, should have a significantly lower percentage of untrustworthy sites
     [Figure: the BCC and its Periphery]
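
     One way to check this expectation is to compare the share of untrustworthy sites in the two sets. The sketch below assumes a hypothetical `labels` dictionary of manual judgments; the site names and labels are made-up toy data, not results from the presentation.

```python
def untrustworthy_share(sites, labels):
    """Fraction of the labeled sites in `sites` that are judged untrustworthy.

    Sites without a label are left out; returns None if none are labeled.
    """
    rated = [site for site in sites if site in labels]
    if not rated:
        return None
    bad = sum(1 for site in rated if labels[site] == "untrustworthy")
    return bad / len(rated)


# Toy data only (hypothetical labels).
labels = {
    "s": "untrustworthy", "a": "untrustworthy", "b": "untrustworthy",
    "c": "trustworthy", "d": "trustworthy", "e": "trustworthy",
    "f": "untrustworthy",
}
bcc_sites = {"s", "a", "b", "c"}
periphery_sites = {"d", "e", "f", "g"}  # "g" has no label and is ignored

bcc_share = untrustworthy_share(bcc_sites, labels)
periphery_share = untrustworthy_share(periphery_sites, labels)
```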

  9. Explored neighborhoods

  10. Evaluated Experimental Results
      • The trustworthiness of the starting site is a very good predictor of the trustworthiness of BCC sites
      • The BCC is significantly more predictive of untrustworthiness than the Periphery
      [Figure: results for BCC and Periphery]

  11. Link Farms vs MAS
