
FreeNet: A Distributed Anonymous Information Storage and Retrieval System

Ian Clarke, Oskar Sandberg, Brandon Wiley and Theodore Hong. FreeNet is a decentralized P2P network for anonymous publishing and retrieval of data. Nodes collaborate in storage and routing, and routing is data-centric.

Presentation Transcript


  1. FreeNet: A Distributed Anonymous Information Storage and Retrieval System • Ian Clarke, Oskar Sandberg, Brandon Wiley and Theodore Hong

  2. FreeNet • P2P network for anonymous publishing and retrieval of data • Decentralized • Nodes collaborate in storage and routing • Data-centric routing • Adapts to demand • Addresses privacy & availability concerns

  3. Motivation • Problem - Querying the network • Source - Requestor • Destination – Provider • It’s a distributed search problem • Approximating global knowledge with local knowledge • Other systems – Chord, Tapestry, Pastry • Privacy and availability • Protect authorship, prevent denial attacks

  4. Goals of Freenet • Anonymity for producers and consumers • Deniability for information storers • Resistance to denial attacks • Efficient storing and routing • Does NOT provide • Permanent file storage • Load balancing • Anonymity for general network usage

  5. Architecture • Request: • key • Hops-to-live • ID • Depth • Each node – local data store + routing table • Request file through location-independent keys • Routing - chain of proxy requests - decision is local • Graph structure actively evolves over time

  6. Key Based Searching • Keyword-signed key (KSK) • Easy for retrieval – only need the descriptive string ‘D’ • Minimal protection against tampering • [Diagram: ‘D’ → key generation → key pair (Pb, Pr); SHA(Pb) → KSK; FILE signed with Pr → signature; E(FILE, D) → encrypted FILE]

  7. Keys and Searching….. • Problems with KSK – flat namespace (collisions), key squatting, dictionary attacks • Signed Subspace Key (SSK) • Randomly generated key pair → namespace ID • SSK = SHA(‘D’) ^ SHA(Pb) • (-) Advertisement – subspace Pb + ‘D’ • (+) Owner can construct hierarchical space of arbitrary depth - using indirect files • (+) Reduces collisions greatly
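
The SSK formula on this slide can be tried out with a few lines of Python. This is only a sketch under stated assumptions: SHA-1 stands in for "SHA", the public key Pb is represented by random bytes rather than a real key pair, and the helper names (`sha1`, `xor_bytes`, `ssk`) are invented for illustration. Some descriptions of the scheme hash the XOR result once more; that step is not shown on the slide and is omitted here.

```python
import hashlib
import os


def sha1(data: bytes) -> bytes:
    """Return the 20-byte SHA-1 digest of data."""
    return hashlib.sha1(data).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def ssk(description: str, namespace_public_key: bytes) -> bytes:
    """Slide formula: SSK = SHA('D') ^ SHA(Pb)."""
    return xor_bytes(sha1(description.encode("utf-8")),
                     sha1(namespace_public_key))


# Assumption: random bytes stand in for the subspace public key Pb.
public_key = os.urandom(32)
other_public_key = os.urandom(32)

key_mine = ssk("text/philosophy/sun-tzu/art-of-war", public_key)
key_theirs = ssk("text/philosophy/sun-tzu/art-of-war", other_public_key)
```

The same descriptive string under two different subspaces yields two different keys, which is why collisions become much less likely than with a flat KSK namespace.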

  8. Keys and Searching… • Problems with SSK - updating, versioning • Content Hash Keys (CHK) • Encrypted by a random encryption key • Publish CHK + decryption key • CHK + SSK → easily updateable files • Two-step process – publish file, publish pointer • Results in pointers to newer version • Older versions accessed through CHK • Can be used for splitting files
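
A rough sketch of the two-step CHK + pointer process follows. Everything here is illustrative: the in-memory dictionaries `store` and `pointers` stand in for the network, the pointer name is made up, the hash is taken over the stored (encrypted) bytes as an assumption, and the SHA-256 keystream XOR is a toy stand-in for a real cipher and should not be used for actual encryption.

```python
import hashlib
import os


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy cipher (stand-in only): XOR data with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(d ^ k for d, k in zip(data, stream))


store = {}     # CHK -> encrypted blob (stands in for the network)
pointers = {}  # pointer name (e.g. an SSK) -> (CHK, decryption key)


def insert_chk(content: bytes):
    """Step 1: encrypt with a random key and store under the content hash."""
    key = os.urandom(16)
    blob = keystream_xor(content, key)
    chk = hashlib.sha1(blob).hexdigest()
    store[chk] = blob
    return chk, key


def fetch_chk(chk: str, key: bytes) -> bytes:
    """Fetch a blob, check it still matches its hash, then decrypt it."""
    blob = store[chk]
    if hashlib.sha1(blob).hexdigest() != chk:
        raise ValueError("stored data does not match its content hash")
    return keystream_xor(blob, key)


# Step 2: publish a pointer to the file; updating only rewrites the pointer.
v1 = insert_chk(b"report, version 1")
pointers["my-subspace/report"] = v1
v2 = insert_chk(b"report, version 2")
pointers["my-subspace/report"] = v2

latest = fetch_chk(*pointers["my-subspace/report"])
older = fetch_chk(*v1)  # the old version stays reachable through its CHK
```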

  9. Retrieving Files • How do you locate the keys? • Hypertext spider • Indirect files – published with KSK of search words • Publish bookmarks • File retrieval • Request forwarded to node in routing table (RT) with closest lexicographic match for the binary key • Request routing follows steepest-ascent hill climbing: first choice → failure → backtrack → second choice
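
The routing rule on this slide (try the closest key first, backtrack on failure, then try the next closest) can be illustrated with a small simulation. This is a minimal sketch, not the Freenet protocol: keys are plain integers compared by absolute difference instead of lexicographic closeness, hops-to-live is decremented per recursion level, and the `Node` class, the `request` function and the four-node example network are all invented for illustration. The sketch also caches the file on the way back and records the answering neighbour in the routing table, a simplification of the caching described on the next slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    store: dict = field(default_factory=dict)    # key -> data held locally
    routing: dict = field(default_factory=dict)  # known key -> neighbour Node


def request(node, key, hops_to_live, visited=None, trace=None):
    """Depth-first search that always tries the closest known key first."""
    if visited is None:
        visited = set()
    if hops_to_live <= 0 or node.name in visited:
        return None  # out of hops, or a loop was detected
    visited.add(node.name)
    if trace is not None:
        trace.append(node.name)
    if key in node.store:
        return node.store[key]
    # Candidate next hops, best (closest key) first.
    for known_key in sorted(node.routing, key=lambda k: abs(k - key)):
        neighbour = node.routing[known_key]
        data = request(neighbour, key, hops_to_live - 1, visited, trace)
        if data is not None:
            node.store[key] = data         # cache along the return path
            node.routing[key] = neighbour  # remember who answered
            return data
        # Failure: fall through (backtrack) to the next-best choice.
    return None


a, b, c, d = Node("a"), Node("b"), Node("c"), Node("d")
a.routing = {10: b, 50: c}
b.routing = {12: d}
c.store = {11: b"file"}

path = []
result = request(a, 11, hops_to_live=10, trace=path)
```

In this example node a first follows its closest entry (key 10, node b), which dead-ends at d, and only then backtracks to its second choice c, so `path` records the visit order a, b, d, c.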

  10. Still Retrieving…. • Timers, hops - curtail request threads • Files cached all along the retrieval path • Self-reinforcing cycle – results in key expertise

  11. Ring Topology • 1000 nodes in ring topology • Datastore = 50 items • RT = 250 items • Keys associated with links are hashes of destination IPs

  12. Self Reinforced Routing • Snapshots using 300 requests with hops-to-live = 500 • As the network converges, the median path length drops to 6 - “six degrees of separation”

  13. Retrieval Discussion • No controlled replication → no persistence • No correlation between keys and content • (+) Documents related to a subject are scattered • Geographical fault resilience • (-) No spatial locality – search latencies can suffer • Building indexes by other means

  14. Publishing • Similar to retrieval, but a two-step process • Detect collisions – ‘all clear’ if no collision • Publish to node in RT with closest key match • Are the collision-detection (CD) and publish paths the same? • Can result in collision during publish step • Inserts allow new nodes to advertise themselves • (+) Key-squatting is not effective

  15. Data Management • Finite data stores - nodes resort to LRU • Routing table entries linger after data eviction • Outdated (or unpopular) docs disappear automatically • Bipartite eviction – short term policy • New files replace most recent files • Prevents established files being evicted by attacks
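
The LRU behaviour of a finite data store can be sketched with `collections.OrderedDict`. The `DataStore` class and its method names are hypothetical, and the sketch covers only plain LRU eviction; it does not model the routing table entries that linger after eviction or the bipartite short-term policy.

```python
from collections import OrderedDict


class DataStore:
    """Fixed-capacity store that evicts the least recently used item."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a hit makes the item "recent" again
        return self.items[key]

    def put(self, key, data):
        self.items[key] = data
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used


ds = DataStore(capacity=2)
ds.put("a", b"first")
ds.put("b", b"second")
ds.get("a")            # "a" is requested again, so "b" is now the oldest
ds.put("c", b"third")  # store is full: "b" is evicted
```

Documents that nobody requests drift to the old end of the ordering and fall out, which matches the point that unpopular documents disappear automatically.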

  16. Network Growth • New nodes have to know one or more existing nodes • Problem: How to consistently decide on what key the new node specializes in? • Needs to be a consensus decision – else denial attacks • Advertisement → IP + H(random seed s0) • Commitment - H(H(H(s0) ^ H(s1)) ^ H(s2))……. • Key for new node = XOR of all seeds • Each node adds an RT entry for the new node
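
The commitment chain and the final key on this slide can be written out as follows. This is a sketch under assumptions: H is taken to be SHA-1, each seed is 20 random bytes so it can be XORed with a digest-sized value, and the message exchange between nodes is left out entirely. The function names are invented for illustration.

```python
import hashlib
import os
from functools import reduce


def h(data: bytes) -> bytes:
    """H in the slide; assumed here to be SHA-1."""
    return hashlib.sha1(data).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def commitment_chain(seeds):
    """Fold the seeds as H(H(H(s0) ^ H(s1)) ^ H(s2)) ..."""
    commitment = h(seeds[0])
    for seed in seeds[1:]:
        commitment = h(xor_bytes(commitment, h(seed)))
    return commitment


def assigned_key(seeds):
    """Key for the new node = XOR of all seeds."""
    return reduce(xor_bytes, seeds)


# s0 comes from the new node, s1 and s2 from nodes along the announcement path.
seeds = [os.urandom(20) for _ in range(3)]
commitment = commitment_chain(seeds)
node_key = assigned_key(seeds)
```

Because each node commits to a hash before any seed is revealed, no single participant can choose its seed afterwards to steer the XOR toward a key of its liking.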

  17. Network Growth • Key assigned to new nodes = H(IP) • Scales as log(n) until n ~ 40000 • At 40000, RTs are full

  18. Protocol • Nodes with frequently changing IPs use address-resolution keys (ARKs) • Return address specified in requests – threat? • Messages do not always terminate when hops-to-live reaches 1 • Depth is initialized by original requestor to arbitrarily small value • Request state maintained at each node – timers - LRU

  19. Fault Resilience • Median path length < 20 at 30% node failures? • Network becomes ineffective at 40% failures ???

  20. Small World • Most nodes form local clusters • A few highly connected nodes link the clusters • Power law distribution provides high degree of fault tolerance

  21. Security Concerns • Pre-routing – message encrypted with public keys, which determine the path of pre-routing • Protecting data source – using random and probabilistic methods

  22. Security • File integrity - KSK vulnerable to dictionary attacks • DoS attacks – Hash Cash to slow down • Attempts to displace valid files are constrained by the insert procedure
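
To show how a Hash Cash style payment slows down bulk inserts, here is a generic proof-of-work sketch. It is not taken from the Freenet protocol: the challenge string, the SHA-1 choice, the 12-bit difficulty and the names `mint` and `verify` are all assumptions made for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def stamp_digest(challenge: bytes, nonce: int) -> bytes:
    return hashlib.sha1(challenge + str(nonce).encode("ascii")).digest()


def mint(challenge: bytes, difficulty_bits: int) -> int:
    """Search for a nonce whose digest has enough leading zero bits (costly)."""
    for nonce in count():
        if leading_zero_bits(stamp_digest(challenge, nonce)) >= difficulty_bits:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a stamp takes a single hash (cheap)."""
    return leading_zero_bits(stamp_digest(challenge, nonce)) >= difficulty_bits


challenge = b"insert-request:node-42"  # hypothetical challenge string
nonce = mint(challenge, difficulty_bits=12)
accepted = verify(challenge, nonce, difficulty_bits=12)
```

The asymmetry is the point: the sender pays thousands of hash evaluations per request on average, while the receiving node verifies with one.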

  23. Conclusion • Provides a network to anonymously store and request files • Adaptive routing whose efficiency increases with experience • Deals with privacy and data integrity in various scenarios • Applications? • Freedom of speech • Unaccountable, decentralized Napster
