The Index Poisoning Attack in P2P File Sharing Systems. Keith W. Ross Polytechnic University. Jian Liang. Naoum Naoumov. Joint work with:. Internet Traffic. CF: CacheLogic. File Distribution Systems: 2005. Attacks on P2P: Decoying. Two types: File corruption: pollution
The Index Poisoning Attack in P2P File Sharing Systems Keith W. Ross Polytechnic University
Joint work with: Jian Liang, Naoum Naoumov
Internet Traffic (cf. CacheLogic)
Attacks on P2P: Decoying
Two types:
• File corruption: pollution
• Index poisoning
Investigated in two networks:
• FastTrack/Kazaa
  • Unstructured P2P network
• Overnet
  • Structured (DHT) P2P network
  • Part of eDonkey
File Pollution [figure labels: original content, polluted content, pollution company]
File Pollution [figure labels: pollution company, pollution servers, file sharing network]
File Pollution: unsuspecting users spread pollution!
Index Poisoning
index:
title      location
bigparty   123.12.7.98
smallfun   23.123.78.6
heyhey     234.8.89.20
[figure: file sharing network with hosts 23.123.78.6, 123.12.7.98, 234.8.89.20]
Index Poisoning
index before:
title      location
bigparty   123.12.7.98
smallfun   23.123.78.6
heyhey     234.8.89.20
index after:
title      location
bigparty   123.12.7.98
smallfun   23.123.78.6
heyhey     234.8.89.20
bighit     111.22.22.22
[figure: hosts 23.123.78.6, 123.12.7.98, 234.8.89.20, 111.22.22.22]
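The effect of the extra index record can be illustrated with a minimal sketch. It assumes a toy index mapping each title to a set of advertised locations, plus a hypothetical `shared` table recording what each host really holds; the names `advertise`, `working_sources`, and `shared` are illustrative and not part of any real client. The sketch also assumes that the advertised host 111.22.22.22 does not actually hold the title.

```python
# Toy index taken from the slide: title -> set of advertised locations.
index = {
    "bigparty": {"123.12.7.98"},
    "smallfun": {"23.123.78.6"},
    "heyhey": {"234.8.89.20"},
}

# Hypothetical ground truth for this sketch: what each host really shares.
shared = {
    "123.12.7.98": {"bigparty"},
    "23.123.78.6": {"smallfun"},
    "234.8.89.20": {"heyhey"},
}


def advertise(index, title, location):
    """Add a (title, location) record; the index does not verify it."""
    index.setdefault(title, set()).add(location)


def working_sources(index, shared, title):
    """Return the advertised locations that really hold the title."""
    return sorted(
        loc for loc in index.get(title, ()) if title in shared.get(loc, ())
    )


# The poisoner inserts a record for a title the advertised host does not hold.
advertise(index, "bighit", "111.22.22.22")

print(index["bighit"])                            # the record is visible
print(working_sources(index, shared, "bighit"))   # but no download can succeed
```

The search succeeds because the index accepts records without checking them, while every download attempt for the poisoned title fails.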
Overnet: DHT
• (version_id, location) stored in nodes with ids close to version_id
• (hash_title, version_id) stored in nodes with ids close to hash_title
• First search hash_title, get version_id and metadata
• Then search version_id, get location
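The two-step lookup can be sketched as follows. This is a toy model under stated assumptions: 4-bit node ids, "close" measured by XOR distance (as in Kademlia, on which Overnet is based), each record stored on the single closest node, and metadata omitted. The class `ToyDHT`, the helper `find_locations`, and the example keys are hypothetical.

```python
NODE_IDS = [0b0001, 0b0011, 0b0100, 0b0101, 0b1000, 0b1010, 0b1100, 0b1111]


def closest_node(key, node_ids=NODE_IDS):
    """Return the node id with the smallest XOR distance to key."""
    return min(node_ids, key=lambda node_id: node_id ^ key)


class ToyDHT:
    """Each node keeps a small key -> list-of-values store."""

    def __init__(self, node_ids):
        self.store = {node_id: {} for node_id in node_ids}

    def publish(self, key, value):
        node = closest_node(key, list(self.store))
        self.store[node].setdefault(key, []).append(value)

    def lookup(self, key):
        node = closest_node(key, list(self.store))
        return self.store[node].get(key, [])


def find_locations(dht, hash_title):
    """Step 1: hash_title -> version_ids. Step 2: version_id -> locations."""
    locations = []
    for version_id in dht.lookup(hash_title):
        locations.extend(dht.lookup(version_id))
    return locations


dht = ToyDHT(NODE_IDS)
hash_title = 0b0110   # hypothetical hash of a title keyword
version_id = 0b1011   # hypothetical hash of one version of that title
dht.publish(hash_title, version_id)       # (hash_title, version_id) record
dht.publish(version_id, "123.12.7.98")    # (version_id, location) record

print(find_locations(dht, hash_title))
```

Because either record type can be published by any node, both steps are candidate targets for bogus records.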
Overnet [figure: nodes 0001, 0011, 1111, 0100, 1100, 0101, 1010, 1000; steps: Publish, Query, Download]
FastTrack Overlay
• ON = ordinary node, SN = super node
• Each SN maintains a local index
[figure: one SN and three ONs]
FastTrack Query [figure: one SN and three ONs]
FastTrack Download: HTTP request for hash value [figure: one SN and three ONs]
FastTrack Download: P2P file transfer [figure: one SN and three ONs]
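The overlay, query, and download slides can be tied together with a rough sketch. It assumes a single super node whose local index is a list of (title, hash, ordinary-node address) records and a simple substring match for queries; the class `SuperNode` and its methods are hypothetical and do not reproduce the FastTrack wire protocol. The download step (an HTTP request for the hash, sent to the returned address) is not implemented here.

```python
class SuperNode:
    """Toy super node (SN) holding a local index for its ordinary nodes (ONs)."""

    def __init__(self):
        self.index = []  # records of (title, file_hash, on_address)

    def register(self, on_address, title, file_hash):
        """An ON reports one file it shares to its SN."""
        self.index.append((title, file_hash, on_address))

    def query(self, keyword):
        """Return (file_hash, on_address) pairs whose title matches keyword."""
        kw = keyword.lower()
        return [
            (file_hash, addr)
            for title, file_hash, addr in self.index
            if kw in title.lower()
        ]


sn = SuperNode()
sn.register("123.12.7.98", "bigparty", "hash-a")   # hypothetical hash values
sn.register("23.123.78.6", "smallfun", "hash-b")

results = sn.query("party")
print(results)  # the querying ON would next request "hash-a" from 123.12.7.98
```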
Attacks: How Effective?
• For a given title, what fraction of the “copies” are
  • Clean?
  • Poisoned?
  • Polluted?
• Brute-force approach:
  • Attempt to download all versions
  • For those versions that download, listen to or watch each one
• How do we determine pollution levels without downloading?
Titles, versions, hashes & copies
• The title is the title of a song/movie/software
• A given title can have thousands of versions
• Each version has its own hash
• Each version can have thousands of copies
• A title can also have non-existent versions, each identified by a hash
Definition of Pollution and Poisoning Levels
• (t, t + Δ): investigation interval
• V: set of all versions of title T
• V1, V2, V3: sets of poisoned, polluted, clean versions
• Cv: number of advertised copies of version v
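The level formulas themselves are not in the text above. One natural reading, assumed here, is that each level is the share of advertised copies belonging to the corresponding set, for example poisoning level = (sum of Cv over V1) / (sum of Cv over V). The function name `levels` and the sample counts are illustrative only.

```python
def levels(copies, poisoned, polluted, clean):
    """Return the fraction of advertised copies in V1, V2 and V3.

    copies: dict mapping version hash v -> Cv (advertised copies of v)
    poisoned, polluted, clean: the sets V1, V2, V3 of version hashes
    """
    total = sum(copies.values())
    if total == 0:
        raise ValueError("no advertised copies for this title")

    def share(versions):
        return sum(copies[v] for v in versions) / total

    return {
        "poisoning": share(poisoned),
        "pollution": share(polluted),
        "clean": share(clean),
    }


# Made-up counts for one title observed during (t, t + Δ).
copies = {"v_a": 50, "v_b": 30, "v_c": 20}
result = levels(copies, poisoned={"v_a"}, polluted={"v_b"}, clean={"v_c"})
print(result)
```

Under this reading the three levels sum to 1 whenever V1, V2, and V3 partition V.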
How to Estimate?
• Need Cv, v ∈ V
• Need V1, V2, V3
• Don’t want to download and listen to files!
Solution:
• Harvest Cv, v ∈ V, and copy locations
  • Overnet: insert a node, receive publish messages
  • FastTrack: crawl
• Heuristic for V1, V2, V3
Copies at Users [figure: panels for FastTrack and Overnet]
Heuristic
• Identify heavy and light publishers
• Hh = set of hashes from heavy publishers
• Hl = set of hashes from light publishers
[figure: sets Hh and Hl, with regions labeled polluted versions, clean versions, poisoned versions]
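A minimal sketch of such a classification follows. The region assignment is an assumption, since the figure is not recoverable from the text: hashes advertised only by heavy publishers are treated as poisoned, hashes advertised by both heavy and light publishers as polluted, and hashes advertised only by light publishers as clean. The rule for calling a publisher heavy (advertising at least `heavy_threshold` distinct versions) is also hypothetical.

```python
from collections import defaultdict


def classify_versions(adverts, heavy_threshold):
    """Split version hashes into poisoned, polluted and clean sets.

    adverts: iterable of (publisher, version_hash) pairs for one title
    heavy_threshold: hypothetical cut-off; a publisher advertising at
        least this many distinct versions counts as heavy
    """
    by_publisher = defaultdict(set)
    for publisher, version_hash in adverts:
        by_publisher[publisher].add(version_hash)

    heavy_hashes, light_hashes = set(), set()   # Hh and Hl
    for hashes in by_publisher.values():
        if len(hashes) >= heavy_threshold:
            heavy_hashes |= hashes
        else:
            light_hashes |= hashes

    return {
        "poisoned": heavy_hashes - light_hashes,   # assumed: Hh only
        "polluted": heavy_hashes & light_hashes,   # assumed: both Hh and Hl
        "clean": light_hashes - heavy_hashes,      # assumed: Hl only
    }


adverts = [
    ("p1", "h1"), ("p1", "h2"), ("p1", "h3"),   # one heavy publisher
    ("u1", "h2"),                               # light user sharing h2
    ("u2", "h4"),                               # light user sharing h4
]
classes = classify_versions(adverts, heavy_threshold=3)
print(classes)
```

Nothing is downloaded: the classification uses only who advertises which hash.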
Heuristic: More
• Heuristic is accurate and does not involve any downloading!
Blacklisting
• Assign reputations to /n subnets
• Bad reputation to subnets with a large number of advertised copies of any title
• Obtain reputations locally; share with distributed algorithm
• Locally blacklist /n subnets with bad reputations
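The local part of this scheme can be sketched with Python's standard `ipaddress` module. The sketch assumes that a reputation is simply bad or not bad, decided by a hypothetical threshold `max_copies` on the number of advertised copies of any single title from one /n subnet. Sharing reputations through a distributed algorithm is not shown.

```python
import ipaddress
from collections import Counter


def subnet_of(ip, prefix_len):
    """Return the /prefix_len network that contains ip."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def bad_subnets(adverts, prefix_len, max_copies):
    """Return subnets advertising more than max_copies copies of one title.

    adverts: iterable of (ip, title) pairs, one per advertised copy
    max_copies: hypothetical threshold for a bad reputation
    """
    counts = Counter(
        (subnet_of(ip, prefix_len), title) for ip, title in adverts
    )
    return {subnet for (subnet, _), n in counts.items() if n > max_copies}


def is_blacklisted(ip, blacklist, prefix_len):
    """Check whether ip falls inside a locally blacklisted subnet."""
    return subnet_of(ip, prefix_len) in blacklist


# Made-up observations: five copies of one title from the same /24.
adverts = [(f"111.22.22.{i}", "bighit") for i in range(1, 6)]
adverts.append(("123.12.7.98", "bigparty"))

blacklist = bad_subnets(adverts, prefix_len=24, max_copies=3)
print(blacklist)
print(is_blacklisted("111.22.22.200", blacklist, 24))
```

Working on subnets rather than single addresses means an advertiser cannot escape the blacklist by moving to a neighboring address in the same /n block.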
The Inverse Attack
• Attacks on P2P systems:
• But can also exploit P2P systems for DDoS attacks against an innocent host: