Cluster-Based Scalable Network Services
Shivaram Venkataraman
Cloud computing – Fall 2011
Paper Summary
• Requirements for an Internet service
  • Scalable
  • Available
  • Cost-effective
• Insight: use commodity building blocks
• Challenges
  • Fault tolerance
  • Maintaining shared state
  • Load balancing, scheduling
Cluster Architecture
• Pool of worker nodes for all jobs
• Shared frontends, cache servers, user database
• Load balanced by a centralized manager
• Automatic fault tolerance, transparent to applications
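The manager's role on this slide can be illustrated with a minimal sketch. It assumes a single in-process manager that sends each task to the worker with the shortest queue and silently drops workers that fail. The names `Worker`, `Manager`, `queue_length`, and `dispatch` are hypothetical and are not taken from the paper, which describes a more elaborate arrangement.

```python
class Worker:
    """A hypothetical worker node that can be marked unhealthy."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.queue_length = 0  # outstanding tasks, used as the load metric

    def run(self, task):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is down")
        return f"{self.name} handled {task}"


class Manager:
    """Centralized manager: least-loaded dispatch with failure masking."""

    def __init__(self, workers):
        self.workers = list(workers)

    def dispatch(self, task):
        while self.workers:
            # Pick the worker with the fewest outstanding tasks.
            worker = min(self.workers, key=lambda w: w.queue_length)
            try:
                result = worker.run(task)
            except RuntimeError:
                # Mask the fault: forget the dead worker, retry elsewhere.
                self.workers.remove(worker)
                continue
            worker.queue_length += 1
            return result
        raise RuntimeError("no workers available")


manager = Manager([Worker("a", healthy=False), Worker("b"), Worker("c")])
print(manager.dispatch("t1"))  # worker "a" fails, so "b" takes the task
print(manager.dispatch("t2"))  # "c" is now the least loaded
```

The caller never sees the failure of worker "a", which is the sense in which fault tolerance is transparent to applications in this sketch.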
Programming Model
• Layered service model
  • SNS: load balancing, monitoring, etc.
  • TACC: composition API, caching
  • Service: user interface, device customization
• Build a collection of reusable workers
• Compose workers together to form a service
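Composing reusable workers into a service can be sketched as plain function composition. This is an illustration only: `compose` and the three example workers are hypothetical names and do not reproduce the TACC API.

```python
from functools import reduce


def compose(*workers):
    """Chain workers left to right; each consumes the previous output."""
    def service(data):
        return reduce(lambda acc, worker: worker(acc), workers, data)
    return service


def strip_whitespace(text):
    """Worker: collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def truncate(limit):
    """Worker factory: keep only the first `limit` characters."""
    def worker(text):
        return text[:limit]
    return worker


def to_upper(text):
    """Worker: upper-case the text."""
    return text.upper()


# A "service" is just a particular arrangement of reusable workers.
summarize = compose(strip_whitespace, truncate(13), to_upper)
print(summarize("  hello   cluster   world "))  # prints "HELLO CLUSTER"
```

Because each worker takes one input and returns one output, the same workers can be rearranged into a different service without modification.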
BASE Semantics
• Basically Available, Soft State, Eventual Consistency
• Cache results to increase availability and performance
• Easier to scale and to handle failures
• BASE semantics used for the web index, caches, load-balancing information, etc.
• ACID semantics used for user profile information, ad clicks
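One way to picture the "cache results" point is a soft-state cache that prefers a stale answer over no answer when the backing store is unreachable. The sketch below rests on that assumption; `SoftStateCache`, `fetch`, and `ttl_seconds` are hypothetical names, and the injected `clock` exists only to make the behavior easy to test.

```python
import time


class SoftStateCache:
    """Cache whose entries may be stale and can always be rebuilt."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self.fetch = fetch        # function that reads the authoritative store
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}         # key -> (value, time stored)

    def get(self, key):
        now = self.clock()
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0], "fresh"
        try:
            value = self.fetch(key)
        except Exception:
            if cached is not None:
                # Basically available: serve the stale value over an error.
                return cached[0], "stale"
            raise
        self.entries[key] = (value, now)
        return value, "fetched"
```

Losing the cache costs only performance, since every entry can be regenerated from `fetch`. That property is what makes the state "soft".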
Opinions and Comments
• Influence on Internet services
• How relevant is this paper today?
• ACID vs BASE
• Datacenter OS?
• Future Work, Applications
Web Search for a Planet: The Google Cluster Architecture (2003)
[Figure labels: Frontend, Workers, Workers]
How relevant is this paper today?
• Still very relevant
  • Commodity building blocks
  • Cost/performance ratio
  • Automatic fault tolerance (MapReduce, Dryad, etc.)
• Not so relevant
  • Transformation proxies, aggregators, etc. (TACC)
  • Small user-generated content: grew with Web 2.0
  • Influence of batch jobs on design (GFS, locality)
ACID vs BASE
• BASE definition leads to the CAP theorem
• Services are not just ACID or BASE, but a combination of both
• Workload dependent
  • Read heavy (Web search, Maps): suited for BASE
  • Update heavy (Facebook, Gmail): ACID or BASE?
• Approximate answers to queries and provenance-based storage are also BASE
Datacenter OS
• Is this a Datacenter OS?
  • Unix-pipeline-like chaining of workers
  • Common caching and storage
  • Scheduling, load-balancing manager
  • Programming model for applications
• Yes, but only for a narrow set of applications
• Mesos-like design (in 1997!)
Future Work, Applications
• Personalized search
• Automatic event-aggregation service, e.g., Google News for events?
• Web access for PDAs
  • Opera Mini: rendering HTML on the server side
• Cloud-backed smartphone apps?