.: DRAFT :. Collaborative Content Delivery A peer-to-peer solution for web-based publish/subscribe Werner Vogels, Robbert van Renesse, Ken Birman. Dept. of Computer Science, Cornell University
Presentation duality … • The case for Collaborative Content Delivery vs • The innovative technology used to build the system • Spectacularly scalable technology • Secure, reliable, robust & fast • A solution to many distributed management problems
Late night reading Epidemic Theory of Infectious Diseases and its Applications, N.T.J. Bailey, Hafner Press, Second Edition, 1975
The Problem • Access to real-time information at syndicated news sites is highly inefficient • An estimated 70%-80% of the bandwidth is wasted on redundant transport both at the consumer and at the publisher • Consumers frequently return to the website to receive timely updates
Isn’t this solved already? • RSS – channels provide summaries for processing by bots. • But the mechanism remains “pull” • HTTP delta encoding should reduce bandwidth cost • News feeds from major vendors • “Push” is the right model for frequently changing data with timely delivery • Proprietary formats and high fees • Email summary as cheap alternative • Still high bandwidth cost at the publisher • Hybrid “push/pull” by organizations exploiting distributed content delivery
Scale is a major obstacle • No coordinated action by syndication sites to provide a shared information push infrastructure • The one-to-many technologies currently in use are inherently not scalable • No technology is available that can deliver data from thousands of publishers to millions of subscribers in real time.
We can do better • Current push solutions fail to exploit the collaborative power of the Internet • Ideally the publisher injects one update into the world and all interested subscribers receive it. • In this model all consumers collaborate to route the information to the right subscribers • The information arrives at all desktops within tens of seconds after publishing
Peer-to-Peer Solution • P2P is the only approach to a cost effective, scalable solution • Subscribers weave an ad-hoc infrastructure for subscription based routing • Scalable, autonomous & decentralized management • High level of robustness and reliability in message delivery • Authentication of publishers
Emerging technologies • Astrolabe, CAN, Chord and Pastry are emerging research technologies. • Astrolabe is the furthest along in • Scalability • Security integration • Manageability • Firewall, proxy and NAT support • Complete technology that we are now using to develop applications
Astrolabe/Mariner • A system for ultra-scalable, distributed state management • Robust, through the use of epidemic techniques • Scalable, through the use of information aggregation and fusion • Secure, through certificates • Flexible, through secure mobile code • Simulated, Emulated, Tested and Deployed.
Astrolabe Robust and Scalable Technology for Distributed System Monitoring, Management and Data Mining
Distributed Systems Management • Is extremely important in the deployment of large systems • Scalable management of applications and systems is still a major quest • Management technology needs to be integrated into applications • The management subsystem is often more complex than the application itself
Astrolabe • Information/state management system • Monitors the dynamically changing state of sets of distributed resources • Reports summaries to its consumers • Uses information hierarchies to organize the data • Uses aggregation techniques to continuously compute the summary nodes in the system
Current use of Mariner • Monitor and control applications, systems and infrastructure • Resource discovery • Collaboration management • Coordination of distributed tasks • Edge-caching control • CDN dynamic management
Intuitively • You can see Mariner as a large database with information about the global system • None of this information resides on a single server • Each principal has a row in the virtual database, which it is allowed to update with <attribute, value> pairs. • A principal can directly access only the rows of other nodes in its zone and those of the intermediate nodes on its path through the hierarchy to the root.
Mariner in a single zone • The lowest level of the hierarchy can be a node, or something finer-grained if the application requires it • The zone's security key is needed to add a new column; a user key is needed to update a row
Scalability through Hierarchy • Leaves are organized into zones • Each leaf has a self-managed attribute list • The base zone is the collection of the individual attribute lists of its leaves • Each intermediate zone is a collection of attribute lists constructed by aggregating the information in its child zones • Each list has some basic attributes that Mariner uses to manage itself, such as contact lists, timestamps, etc.
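The way an intermediate zone is built from its children can be sketched roughly as below. This is only an illustration under assumptions: the attribute names (`load`, `subscribers`), the zone names and the `aggregate_zone` helper are hypothetical and not taken from Mariner.

```python
from typing import Callable, Dict, List

Attrs = Dict[str, float]

def aggregate_zone(children: List[Attrs],
                   funcs: Dict[str, Callable[[List[float]], float]]) -> Attrs:
    """Build a parent zone's attribute list from its children's attribute lists."""
    summary: Attrs = {}
    for name, fn in funcs.items():
        values = [child[name] for child in children if name in child]
        if values:
            summary[name] = fn(values)
    return summary

# Hypothetical leaf attribute lists, grouped into two base zones.
new_jersey = [{"load": 0.2, "subscribers": 3}, {"load": 0.8, "subscribers": 5}]
san_francisco = [{"load": 0.5, "subscribers": 2}]

# Assumed aggregation rules: lowest load wins, subscriber counts add up.
funcs = {"load": min, "subscribers": sum}

nj_zone = aggregate_zone(new_jersey, funcs)
sf_zone = aggregate_zone(san_francisco, funcs)
root = aggregate_zone([nj_zone, sf_zone], funcs)
```

Each level only sees the summaries of the level below it, so the amount of state per zone stays bounded as the number of leaves grows.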
Simple Hierarchy [Figure: zone hierarchy with nodes labeled New Jersey and San Francisco]
Information Aggregation • Aggregation functions are programmable • Subset of SQL • Code is embedded in aggregation function certificates (AFC) • Signed certificate is installed into an attribute list • Used to construct (new) attributes in zones of the hierarchy
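To give a feel for an SQL-style aggregation over one zone, here is a small sketch using Python's built-in `sqlite3` module as a stand-in. The table layout, column names and query are assumptions for illustration; the actual AFC syntax and the certificate signing step are not shown in the slides and are not modeled here.

```python
import sqlite3

# In-memory table standing in for the rows (attribute lists) of one zone.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zone (name TEXT, cpu_load REAL, subscribers INTEGER)")
conn.executemany(
    "INSERT INTO zone VALUES (?, ?, ?)",
    [("node-a", 0.2, 3), ("node-b", 0.8, 5)],
)

# Hypothetical aggregation query: one summary row for the parent zone.
summary = conn.execute(
    "SELECT MIN(cpu_load), SUM(subscribers) FROM zone"
).fetchone()
conn.close()
```

The single result row plays the role of the new attributes that the aggregation function constructs in the parent zone.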
Epidemic Dissemination • Each Astrolabe instance maintains all the zones on its path to the root • No centralized servers for intermediate zones • Consequently each instance has a copy of the root zone • Replication is achieved through gossip techniques. • Guarantees eventual consistency
AFC propagation • Output of the AFC includes a copy of itself – this places a copy of the AFC in the parent zone • Reaches the root and the leaves of other zones • Adoption – check the ancestors' lists to find new AFCs • Spreads through the system on the order of tens of seconds. • Certificates have an expiration date; unless refreshed, aggregation eventually halts
I’ll skip • Aggregation function details • Mobile code details • Eventual consistency • Certificates • Authentication • Firewalls & NATs
Robustness through Gossip • Use of epidemic techniques to disseminate data and AFCs • Pure peer-to-peer communication • Fully autonomous progress • Actions based on probability theory • Robustness improves with scale • Fixed low overhead, independent of scale • Control as well as data transport
Gossip • Conceptually: each zone periodically picks another zone at random and exchanges the state of those zones • Slightly more complex because there are virtual zones …
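The conceptual exchange can be simulated in a few lines. This is a simplified sketch under assumptions, not Astrolabe's protocol: it uses a flat set of peers (no virtual zones), timestamped entries with newest-wins merging, and hypothetical names such as `merge` and `gossip_round`.

```python
import random
from typing import Dict, List, Tuple

State = Dict[str, Tuple[int, str]]  # attribute -> (timestamp, value)

def merge(a: State, b: State) -> State:
    """Keep the entry with the newest timestamp for each attribute."""
    out = dict(a)
    for key, entry in b.items():
        if key not in out or entry[0] > out[key][0]:
            out[key] = entry
    return out

def gossip_round(states: List[State], rng: random.Random) -> None:
    """Every peer picks one other peer at random and both adopt the merged state."""
    n = len(states)
    for i in range(n):
        j = rng.randrange(n - 1)
        if j >= i:
            j += 1  # skip self
        merged = merge(states[i], states[j])
        states[i], states[j] = merged, dict(merged)

def rounds_until_consistent(states: List[State], rng: random.Random,
                            max_rounds: int = 1000) -> int:
    """Gossip until all peers hold identical state; return rounds used, or -1."""
    for r in range(1, max_rounds + 1):
        gossip_round(states, rng)
        if all(s == states[0] for s in states):
            return r
    return -1

rng = random.Random(42)
# Each of 16 peers starts out knowing only its own attribute.
states = [{f"node{i}": (1, f"value{i}")} for i in range(16)]
rounds = rounds_until_consistent(states, rng)
```

No peer coordinates the exchange, yet all copies end up identical, which is the eventual-consistency behavior the gossip slides rely on.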
Gossip target selection • Each instance updates the issued attribute and evaluates the dependent AFCs • An agent (instance) gossips on behalf of those zones for which it is a contact, at a rate that depends on configuration • At each level, pick a child at random from the contact list and exchange state
Membership • Failure detection • If no update is seen for an agent within time Tfail, remove it from the system • Integration • After partitions, crashes, etc., renegade trees can form • Use of broadcast, multicast and hints to discover other agents
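The timeout rule for failure detection amounts to something like the following sketch. The function name, the agent names and the use of seconds as the time unit are assumptions; only the Tfail rule itself comes from the slide.

```python
from typing import Dict

def prune_failed(last_update: Dict[str, float], now: float,
                 t_fail: float) -> Dict[str, float]:
    """Drop agents whose most recent update is older than t_fail."""
    return {agent: ts for agent, ts in last_update.items() if now - ts <= t_fail}

# Hypothetical table of the last time an update was seen from each agent.
last_update = {"agent-a": 100.0, "agent-b": 40.0, "agent-c": 95.0}
alive = prune_failed(last_update, now=105.0, t_fail=30.0)
```

Here `agent-b` has been silent for 65 time units and is removed, while the other two remain members.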
Subscription routing • At the leaves, the subscribers store subscription information • Aggregation functions combine the subscriptions of participants into subscriptions for the zone • Publishers use zone.send(subscription, data), which is forwarded if the zone has children that match the subscription
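A toy model of this routing scheme is sketched below. Only the call shape `zone.send(subscription, data)` comes from the slide; the `Zone` class, the set-union aggregation, the `inbox` field and the example names are assumptions. A real deployment would keep the aggregated subscriptions as a zone attribute maintained by an AFC rather than recomputing them on every send.

```python
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class Zone:
    name: str
    children: List["Zone"] = field(default_factory=list)
    local_subs: Set[str] = field(default_factory=set)  # used only at leaves
    inbox: List[str] = field(default_factory=list)     # items delivered to a leaf

    def subscriptions(self) -> Set[str]:
        """A zone's subscriptions are the union of those of its children."""
        subs = set(self.local_subs)
        for child in self.children:
            subs |= child.subscriptions()
        return subs

    def send(self, subscription: str, data: str) -> int:
        """Forward data only into matching children; return leaves reached."""
        if not self.children:
            if subscription in self.local_subs:
                self.inbox.append(data)
                return 1
            return 0
        return sum(
            child.send(subscription, data)
            for child in self.children
            if subscription in child.subscriptions()
        )

alice = Zone("alice", local_subs={"tech"})
bob = Zone("bob", local_subs={"sports", "tech"})
carol = Zone("carol", local_subs={"sports"})
nj = Zone("nj", children=[alice, bob])
sf = Zone("sf", children=[carol])
root = Zone("root", children=[nj, sf])

delivered = root.send("tech", "item-1")
```

The `sf` zone has no child subscribed to `tech`, so the item is never forwarded into it.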
Routing infrastructure • Each zone dynamically selects 2-3 routing nodes, using AFCs that take various load factors into account • These nodes receive news items for the children in their zone • Forwarding is based on the individual subscription information • Redundancy is used to achieve robustness and reliability
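One plausible way to pick routing nodes is to take the least-loaded members of a zone, as in this sketch. The single `load` figure, the tie-break by name and the `select_routing_nodes` helper are assumptions; the slides do not say which load factors the AFCs actually use.

```python
import heapq
from typing import Dict, List

def select_routing_nodes(loads: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k least-loaded nodes, breaking ties by node name."""
    smallest = heapq.nsmallest(k, ((load, name) for name, load in loads.items()))
    return [name for _, name in smallest]

# Hypothetical load figures for the members of one zone.
loads = {"node-a": 0.9, "node-b": 0.1, "node-c": 0.5, "node-d": 0.3}
routers = select_routing_nodes(loads, k=2)
```

Selecting more than one node per zone gives the redundancy mentioned above: if one router fails, another can still forward items into the zone.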