
Spam: An Analysis of Spam Filters

Spam: An Analysis of Spam Filters. Joe Chiarella Jason O’Brien. Advisors: Professor Wills and Professor Claypool. Project Goals. To analyze the effectiveness of different kinds of spam filters. Focused on SpamAssassin and Bogofilter. SpamAssassin. Rule-based filter – over 400 rules.


Presentation Transcript


  1. Spam: An Analysis of Spam Filters. Joe Chiarella, Jason O’Brien. Advisors: Professor Wills and Professor Claypool.

  2. Project Goals
     • To analyze the effectiveness of different kinds of spam filters.
     • Focused on SpamAssassin and Bogofilter.

  3. SpamAssassin
     • Rule-based filter – over 400 rules.
     • Each rule has an associated weight.
     • The score of an email is the sum of the weights of all matching rules.
     • User-adjustable threshold.
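The scoring scheme on this slide can be sketched in a few lines. This is only an illustration of the idea (weighted rules, summed score, adjustable threshold): the three rules, their names, their weights, and the threshold value below are invented for the example and are not taken from SpamAssassin's actual rule set.

```python
import re

# Hypothetical rules for illustration only: (name, pattern, weight).
# These are NOT real SpamAssassin rules or weights.
RULES = [
    ("SUBJECT_ALL_CAPS", re.compile(r"^Subject: [^a-z]+$", re.MULTILINE), 1.5),
    ("MENTIONS_FREE_MONEY", re.compile(r"free money", re.IGNORECASE), 2.5),
    ("MANY_EXCLAMATIONS", re.compile(r"!{3,}"), 1.0),
]


def score_email(text, rules=RULES):
    """Return (score, matched rule names); the score is the sum of matching weights."""
    matched = [(name, weight) for name, pattern, weight in rules if pattern.search(text)]
    score = sum(weight for _, weight in matched)
    return score, [name for name, _ in matched]


def is_spam(text, threshold=3.0):
    """Flag the email when its score reaches the user-chosen threshold (assumed value)."""
    score, _ = score_email(text)
    return score >= threshold


spam_example = "Subject: WIN NOW\n\nGet free money today!!!"
ham_example = "Subject: Lunch tomorrow\n\nSee you at noon."
```

Raising the threshold makes the filter more conservative: the same message can be flagged at one setting and passed at another, which is why the threshold is left to the user.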

  4. Bogofilter
     • Bayesian filter.
     • Calculates the probability that an email is spam using past email.
     • Looks at the frequency of words (not the order of words).
     • Accuracy should improve over time.
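To make the Bayesian idea concrete, here is a minimal naive Bayes sketch that learns from past email and ignores word order. It is a textbook simplification, not Bogofilter's actual calculation; the class name, the tokenizer, and the add-one smoothing are all assumptions made for the example.

```python
import math
import re
from collections import Counter


class WordFrequencyFilter:
    """Toy naive Bayes classifier (hypothetical; not Bogofilter's implementation)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}  # messages containing each word
        self.messages = {"spam": 0, "ham": 0}                # messages seen per class

    @staticmethod
    def tokenize(text):
        # A set of lowercase words: word order is deliberately discarded.
        return set(re.findall(r"[a-z']+", text.lower()))

    def train(self, text, label):
        """Learn from one past email; label is 'spam' or 'ham'."""
        self.messages[label] += 1
        self.counts[label].update(self.tokenize(text))

    def spam_probability(self, text):
        """Estimate P(spam | words) from the emails seen so far."""
        if self.messages["spam"] + self.messages["ham"] == 0:
            return 0.5  # nothing learned yet
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for word in sorted(self.tokenize(text)):
            # Add-one smoothing keeps unseen words from producing zero probabilities.
            p_spam = (self.counts["spam"][word] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][word] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
        return 1 / (1 + math.exp(-log_odds))
```

Because every call to `train` updates the word counts, the estimates depend on how much mail the filter has seen, which is the sense in which accuracy is expected to improve over time.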

  5. Data Collection
     • Email collected from students, professors, small business employees, and free email accounts.
     • 4626 ham emails and 5010 spam emails, separated into ham and spam mailboxes for each user.

  6. Methodology
     • Compared the accuracy of SpamAssassin and Bogofilter on each user’s email.
     • Tested the same number of ham emails and spam emails from each user.
     • Ignored results from the first 50 emails to allow Bogofilter to learn.
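One way such an evaluation could be organized is sketched below. The slides do not say how the emails were ordered or how the filters were invoked, so the function names, the alternating ham/spam order, and the `classify`/`learn` callbacks are all assumptions for illustration.

```python
from itertools import chain


def balanced_stream(ham, spam):
    """Use the same number of ham and spam emails, alternating ham, spam, ham, ...

    The alternating order is an assumption; the slides do not specify one.
    """
    n = min(len(ham), len(spam))
    pairs = zip(((text, False) for text in ham[:n]), ((text, True) for text in spam[:n]))
    return list(chain.from_iterable(pairs))


def evaluate(stream, classify, learn=None, warmup=50):
    """Score a filter on (text, is_spam) pairs, ignoring the first `warmup` results.

    classify(text) -> bool is the filter under test; learn(text, is_spam), if
    given, lets a learning filter train on each email after it is classified.
    """
    seen = {False: 0, True: 0}
    correct = {False: 0, True: 0}
    for index, (text, is_spam) in enumerate(stream):
        prediction = classify(text)
        if index >= warmup:  # results during the warm-up period are discarded
            seen[is_spam] += 1
            correct[is_spam] += int(prediction == is_spam)
        if learn is not None:
            learn(text, is_spam)

    def rate(label):
        return correct[label] / seen[label] if seen[label] else None

    return {"ham_accuracy": rate(False), "spam_accuracy": rate(True)}
```

Reporting ham and spam accuracy separately matches the way the results are presented on the following slides, one comparison for ham and one for spam.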

  7. Comparison of Bogofilter and SpamAssassin on Ham (chart; legend: CP = Company Person, PR = Professor, ST = Student, FE = Free Email)

  8. Comparison of Bogofilter and SpamAssassin on Spam (chart; legend: CP = Company Person, PR = Professor, ST = Student, FE = Free Email)

  9. SpamAssassin Score Analysis

  10. Conclusion
      • The effectiveness of Bogofilter and SpamAssassin depends greatly on the user.
      • Neither filter outperformed the other in all cases.
      • Filtering spam is hard.

  11. Questions?
