1 / 26

Automated, Elastic Resource Provisioning for NoSQL clusters Using TIRAMOLA

Automated, Elastic Resource Provisioning for NoSQL clusters Using TIRAMOLA. Ioannis Konstantinou CSLAB, National Technical University of Athens, Greece. Motivation – the story(1). ‘Big-data’ opts for highly scalable distributed solutions (Web) analytics, science, business

joshwa
Download Presentation

Automated, Elastic Resource Provisioning for NoSQL clusters Using TIRAMOLA

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Automated, Elastic Resource Provisioning for NoSQL clusters Using TIRAMOLA IoannisKonstantinou CSLAB, National Technical University of Athens, Greece

  2. Motivation – the story(1) • ‘Big-data’ opts for highly scalable distributed solutions • (Web) analytics, science, business • Store + analyze everything, no matter the size • Traditional databases not up to the task • Applications suffer from highly variable and unpredictable workloads • Social networks, internet gaming, web sites, etc • Over-provisioning is costly, under-provisioning leads to outages • Cloud computing can help!!!! • Elastic, pay-as-you go resource provisioning • Suitable for distributed scalable applications • Can we take advantage of elasticity in an automated, fine-tuned application agnostic manner? Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  3. Motivation – the story (2) • NoSQL • Non-relational • Horizontal scalable • Distributed • Open source • And often: • schema-free, easily replicated, simple API, eventually consistent /(not ACID), big-data-friendly, etc • Many, many, implementations… • Currently around 150 see http://nosql-database.org/ Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  4. NoSQLs and elasticity • Column family • Hbase, Cassandra, … • Document store • CouchDB, mongoDB, … • Key-Value store • Riak, Dynamo, Voldemort, … • Many offer elasticity+sharding: • Expand/contract resources according to demand • Shared-nothing architecture allows that • Pay-as-you-go, robustness, performance • Manual and simple threshold-based elastic actions are suboptimal • Important! See Apr 2011 Amazon outage (foursquare, reddit,…) Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  5. thus…(end of the story) • PaaS and NoSQLs are (or should be) inherently elastic • How efficiently do they implement elasticity? • NoSQLs over an IaaS platform • EC2, Eucalyptus, OpenStack,… • We need a study that registers qualitative + quantitative results • Can we automate the elasticity procedure? • For any NoSQL, any user-given policy, any metric of interest? • Can we bundle everything together? • The Tiramola system Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  6. Contributions • Tiramola, a generic modular system able to manage • anyNoSQL engine • User-defined policies • Automatic resource provisioning using IaaS clouds • Module for NoSQL cluster monitoring • Real time reporting of client, application and general purpose metrics • Decision making module for automatic elasticity • Implementation as a Markov Decision Process • Continuously identifies the best scaling action according to observed system state Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  7. Contribution side-effects • Coding + infrastructure • Open source python code (GFOSS + google code) • http://tiramola.googlecode.com • Using cloud-based client tools, platform-agnostic • Euca2ools guarantee execution in numerous cloud platforms • YCSB clients • Cassandra, Hbase implementation • almost Voldemort, Riak Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  8. Related work • Elastic Nimbus [13], AutoScale [14], SmartSla [15], iBalloon [17], Kingfisher [18] • Do not handle NoSQL systems • Do not address dynamic cluster resizes • [19] elastic HDFS • Specific HDFS implementation • Not fine-grained policy support, No support for auto-learning • Cloudy [20], Nefeli [21], Microeconomics [23] • [20] Provides only support for scaling • [21] requires software installation at the cloud vendor • [22] classic microeconomics for profit maximization, no support for fine-grained policy definitions • Systems like Autoscaling [24], RightScale [26], Scalr [27] • Commercial: vendor-lock in • Limited metric support, primitive decision making policies Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  9. Tiramola architecture overview Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  10. Tiramola monitoring module • Ganglia tool • Suitable for clusters • Easy to install/maintain • Low overhead- UDP • Metrics • General purpose like CPU, RAM • App-specific and user metrics: gmetric spoofing Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  11. Tiramola cloud management module • Translates resize actions • Contacts IaaS provider • Acquires or releases cloud resources • Euca2tools • EC2 compliant • Support for any cloud management • Precooked VM images • Ready to launch AMIs • Contain all necessary NoSQL libraries Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  12. Tiramola cluster coordinator module • Operates when cloud management finishes resource (de-)allocation • Orchestrates NoSQL • When resources change • Implements specific NoSQL resizing actions • Remote execution of shell scripts • Injection of new config files • Current support for: • HBase, Cassandra, Riak Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  13. Tiramola core: Decision-Making module • Formulation as an MDP: • {S, A, {Piαj}, γ,Riαj} • Identify • Exp. Sum of rew. • State = # running VMs • Actions = {add-n, rem-n, no-op} • P: Transition probabilities going from state i to j • Reward Function R – sets the policy for the resize • Immediate gain for going to state s • R(s) = f(gains, costs) • Thus optim. Value f: (Bellman’s equation) • System of equations, optimal policy greedy w.r.t. V Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  14. Large Degree of freedom • MDP allows for optimal solution without previous knowledge • The system learns from previous experience, by “exploring” permissible states • Learning is real-time and continues during execution • Decision making is auto-tuned, reacting to environment and reward changes. • Generic applicable • Arbitrary reward functions allow for virtually any optimization policy Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  15. Estimate R(s) for possible transitions • How to know exact R(s) for all s, without making the transition? • Assume “deterministic”, reliable cluster behavior • Similar input ⇨ similar performance metrics • Idea: cluster measurements around current load • r(s)=f(latency, VMs) • Add 2 extra dimensions: #VMs, load • Find, for all permissible transitions, what latency would be Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  16. Architectural considerations • Robustness • Daemon process that checkpoints and can be restarted • State is provided from the IaaS Cloud and the Monitoring module. • Applicable timeouts (not realtime systems!) • Modularity • Different interchangeable components • APIs that utilize primitives (NoSQL and Policies) • Expandability • Speed (irrelevant in most cases) • Written in Python Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  17. Platform Setup • Cloud Cluster • Private OpenStack cactus cluster • A total of 16 server VMs • Similar to an Amazon EC2 large instance • 4-core processor, 8GB RAM, 50GB disk space • A total of 20 client VMs • 2-core processor, 4GB RAM, 50GB disk space • Storage • QCOW image: 1.6GB compressed, 4.3GB uncompressed • VM root fs instead of EBS (Reddit outage) Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  18. Clients, Data and Workloads • Hbase (v. 0.20.6), Cassandra (v. 0.7.0 beta) • Hadoop 0.2.20 • 8 initial nodes (VMs) • Ganglia 3.1.2 • YCSB tool • Database: 20M objects – 20GB raw (Cass ~60GB, Hbase ~90GB) • Loads: UNI_R, UNI_U, UNI_RMW, ZIPF_R • Default: uniform read, 10%-50% range • Both client (YCSB) and cluster (Ganglia) metrics reported Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  19. 1. Metrics affected • Max throughput • HBaseμmax=80Kreqs/sec @ λ=80Kreqs/sec • Cassandra μmax=13Kreqs/sec @ λ=20Kreqs/sec • Max total cluster CPU usage • HBaseCPUmax=55% • Cassandra CPUmax=76% • Further λ increase has no effect • Systems are fully utilized and requests are queued • Stress systems by increasing client request λ • UNI_R workload • Identify critical operation points • 8-node cluster, no resize Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  20. The setup • λstart=180K reqs/sec (way over critical point) • At time t=370sec: • Double the cluster size (add extra 8 nodes) • Triple the cluster size (add extra 16 nodes) • 4 different experiments • READ+8, READ+16, UPDATE+8 and UPDATE+16 • Measure client, cluster-side metrics • query latency, throughput and total cluster usage • Vs time Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  21. 2. HBase cluster resize • For READ, throughput is (roughly) doubled and tripled • More servers handle more requests • Data is not transferred, but it is cached • Update is not affected • I/O bound operation • Updates will be handled by the initial 8 nodes • Only new regions due to compaction will be served by new nodes Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  22. Evaluating Tiramola – setup • Initial state: S4, 1 to 16 VMs cluster size allowed • Sinusoid-like READ loads – vary peak, periodicity • Alter YCSB clients • Reward functions (proof of concept) • r1(s) = -C∙|VMs| • r2(s) = B ∙ thr • r3(s) = B ∙ thr - C ∙ |VM|2 • r4(s) = B ∙ thr - C ∙ |VMs| - A ∙ lat. • Monitor every min, decide every 10 min, 5 min backoff • Training Period for initial data points • Not necessary, but allows quicker “correct” decisions Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  23. Evaluating Tiramola – 1 • r1, r2 • r3, r4 Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  24. Evaluating Tiramola – 2 • Different amplitude • Different periodicity Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  25. Evaluating Tiramola – 3 • Initial training set close to applied load • Initial training set very different from applied load Max training Min training Max training Min training Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

  26. Questions ? • “TIRAMOLA: Elastic NoSQL Provisioning through A Cloud Management Platform” – SIGMOD 2012(Demo Track) • “Automatic Scaling of Selective SPARQL Joins Using the TIRAMOLA System” – SWIM 2012 • “On the Elasticity of NoSQL Databases over Cloud Management Platforms” – CIKM 2011 • “Elastic NoSQL databases over the Cloud” – ΕΛ/ΛΑΚ 2011 • CELAR: “Automatic, multi-grained elasticity-provisioning for the Cloud” – FP7 • http://www.celarcloud.eu/ • http://tiramola.googlecode.com Automated, Elastic Resource Provisioning for NoSQL Clusters Using TIRAMOLA - I. Konstantinou

More Related