Detecting spam in a Twitter network (project seminar), 2012.5.11

Information Assurance Lab, Master's program, 23rd cohort, Yoon Su-jin (윤수진)

Contents
• Prior work
• Experimental design
• Results
• Conclusion: relevance to the project

Prior work
• Spam detection in e-mail
  • (Feamster, 2008) IP-based blacklists built from message inspection and IP inspection
  • (Ramachandran, 2007) Behavioral blacklisting
  • (Feamster, 2008) Spatial and temporal traffic patterns

Hypotheses
• High following to friend ratios
• Retweeting while replacing legitimate links with illegitimate ones
• Temporal patterns
• Measurements
  • Age, frequency of tweets, friend-to-follower ratio, clustering, location in the network structure
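The per-account measurements listed above (age, tweet frequency, friend-to-follower ratio) could be computed roughly as follows. This is only a sketch under stated assumptions: the `Account` fields are hypothetical names, not a real Twitter API schema, and "frequency" is assumed here to mean tweets per day, which the slides do not specify.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Account:
    # Hypothetical record layout, chosen for illustration only.
    created_at: datetime
    tweet_count: int
    friends: int      # accounts this user follows
    followers: int    # accounts following this user


def account_features(acct: Account, now: datetime) -> dict:
    """Return age, tweet frequency, and friend-to-follower ratio for one account."""
    # Floor the age at one day so brand-new accounts do not produce huge rates.
    age_days = max((now - acct.created_at).total_seconds() / 86400.0, 1.0)
    tweets_per_day = acct.tweet_count / age_days
    if acct.followers > 0:
        ratio = acct.friends / acct.followers
    else:
        # No followers: the ratio is unbounded unless the account follows nobody.
        ratio = float("inf") if acct.friends > 0 else 0.0
    return {
        "age_days": age_days,
        "tweets_per_day": tweets_per_day,
        "friend_follower_ratio": ratio,
    }
```

Returning a plain dict keeps the features easy to tabulate for the spam and legitimate groups separately.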

Experimental design
• Data set: tweets posted under #robotpickuplines over four days
• Hashtag: a way of tagging a tweet by writing #keyword inside it. Opening a hashtag shows every tweet tagged with it.

Experiment
• Hashtag lifecycle: 24 hours
• 17,803 tweets, 8,616 users
• External URLs
  • Hard to identify because of URL shorteners
• Spammer characteristics
  • Many unrelated hashtags
  • ID format is letter+number
  • Obscene keywords

Experiment
• Algorithm based..
  • Searches for URLs
  • Username pattern matches
  • Keyword detection
• Results on 300 randomly sampled tweets
  • Missed 27 spam tweets
  • Judged 12 legitimate tweets to be spam
  • Found 91% of the spam
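The three checks named above (URL search, username pattern, keyword detection) can be sketched as a simple rule-based filter. The slides do not say how the checks were combined, so the `min_signals` threshold, the exact regular expressions, and the placeholder keyword list below are all assumptions made for illustration, not the method used in the study.

```python
import re

# Any http(s) link counts; shortened links are matched like any other URL.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
# Assumed reading of "letter+number": letters followed by digits, nothing else.
USERNAME_RE = re.compile(r"^[A-Za-z]+[0-9]+$")
# Hypothetical placeholder terms; a real list would hold the flagged keywords.
SPAM_KEYWORDS = {"examplespamword", "anotherspamword"}


def spam_signals(username: str, text: str) -> dict:
    """Evaluate each of the three heuristics independently."""
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return {
        "has_url": bool(URL_RE.search(text)),
        "username_pattern": bool(USERNAME_RE.match(username)),
        "keyword": bool(words & SPAM_KEYWORDS),
    }


def is_spam(username: str, text: str, min_signals: int = 2) -> bool:
    """Flag a tweet when at least min_signals heuristics fire (assumed rule)."""
    return sum(spam_signals(username, text).values()) >= min_signals
```

Keeping the signals separate from the decision makes it easy to inspect which heuristic caused a missed spam tweet or a false alarm when checking a manually labeled sample.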

Findings during the experiment
• Number of tweets over time within the hashtag

Data analysis
• 14% of all tweets were spam
• The trend shows up slightly later in spam
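A comparison of tweet volume over time for spam and legitimate tweets, along with the overall spam share, might be computed as in the sketch below. The input format, a list of `(timestamp, is_spam)` pairs, and the one-hour bucket size are assumptions; the slides do not state how the data was binned.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(tweets):
    """Count tweets per hour, separately for legitimate and spam tweets.

    tweets: iterable of (timestamp, is_spam) pairs (assumed input format).
    """
    legit, spam = Counter(), Counter()
    for ts, flagged in tweets:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        (spam if flagged else legit)[hour] += 1
    return legit, spam


def spam_share(tweets) -> float:
    """Fraction of all tweets that are flagged as spam."""
    tweets = list(tweets)
    if not tweets:
        return 0.0
    return sum(1 for _, flagged in tweets if flagged) / len(tweets)
```

Plotting the two counters on the same time axis would show whether the spam curve peaks later than the legitimate one.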

Results
• Age: little difference
• Frequency of tweets
  • Spam: 8.66, legitimate: 6.7
• Retweets and @replies: negligible difference
  • A chi-squared test showed the difference is not significant
• Ratio
  • Little difference
  • However, spammers have more friends and followers on average
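For reference, a chi-squared test of independence on a 2x2 table (for example spam versus legitimate against retweet versus non-retweet) can be sketched in plain Python as below. The table layout is an assumption, since the slides do not give the counts or the exact test setup, and the sketch uses the uncorrected Pearson statistic with the standard 5% critical value for one degree of freedom.

```python
# Critical value of the chi-squared distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_005_DF1 = 3.841


def chi_squared_2x2(table) -> float:
    """Pearson chi-squared statistic for a 2x2 table of observed counts.

    table: [[a, b], [c, d]], e.g. rows = spam/legitimate and
    columns = retweet/non-retweet (assumed layout).
    Assumes every row and column total is nonzero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def is_significant(table) -> bool:
    """True when the statistic exceeds the 5% critical value (df = 1)."""
    return chi_squared_2x2(table) > CRITICAL_005_DF1
```

A result of "not significant", as reported for retweets and @replies, corresponds to a statistic below the critical value.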

Results
• Clustering
  • Little difference
  • However, spam accounts are clustered in relatively small numbers
• Location in the network structure
  • Spammers tend to sit at the edge of the network as followers: five hops out, 63 of 100 edges are spammers
  • Legitimate: high indegree
  • Spam: high outdegree
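The indegree and outdegree contrast above can be checked from a list of follow relationships. The sketch below assumes each edge is a `(follower, followee)` pair pointing from the follower to the account being followed and that edges are not duplicated; both are assumptions about the data layout, not details given in the slides.

```python
from collections import defaultdict


def degree_counts(follow_edges):
    """Compute indegree and outdegree per account from directed follow edges.

    follow_edges: iterable of (follower, followee) pairs (assumed format).
    indegree  = number of followers an account has
    outdegree = number of accounts it follows
    """
    indegree = defaultdict(int)
    outdegree = defaultdict(int)
    for follower, followee in follow_edges:
        outdegree[follower] += 1
        indegree[followee] += 1
    return dict(indegree), dict(outdegree)
```

Under this convention, an account that follows many others but is followed by few has high outdegree and low indegree, which is the pattern the slide attributes to spammers.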

Relevance to the project
