Host Load Prediction in a Google Compute Cloud with a Bayesian Model Sheng Di¹, Derrick Kondo¹, Walfredo Cirne² ¹INRIA ²Google
Outline • Motivation of Load Prediction • Google Load Measurements & Characterization • Pattern Prediction Formulation • Exponential Segmented Pattern (ESP) Prediction • Transformation of Pattern Prediction • Mean Load Prediction based on Bayes Model • Bayes Classifier • Features of Load Fluctuation • Evaluation of Prediction Effect • Conclusion
Motivation (Who Needs Load Prediction?) • From the perspective of on-demand allocation • Users' resources and QoS are sensitive to host load. • From the perspective of system performance • Stable vs. unstable load: the system runs best in a load-balanced state, where load bursts can be absorbed as soon as possible. • From the perspective of green computing • Resource consolidation: shutting down idle machines saves electricity costs.
Google Load Measurements & Characterization • Overview of the Google trace • Google released a one-month trace in Nov. 2011 (40 GB of disk space). • 10,000+ machines at Google • 670,000 jobs and 25 million tasks in total • Task: the basic resource-consumption unit • Job: a logical computation unit that contains one or more tasks
Google Load Measurements & Characterization • Load comparison between Google and Grid (GWA) traces • Google host load fluctuates with much higher noise (more than 20 times that of the Grid trace) • Noise (min / mean / max): • Google: 0.00024 / 0.028 / 0.081 • AuverGrid: 0.00008 / 0.0011 / 0.0026
Pattern Prediction Formulation • Exponentially Segmented Pattern (ESP) • The host-load fluctuation over a future period is split into a set of consecutive segments whose lengths increase exponentially. • We predict the mean load over each time segment l1, l2, …, based on the samples in the evidence window.
Pattern Prediction Formulation (Cont’d) • Reduction of the ESP prediction problem • Idea: derive the segment levels li from the cumulative mean load (denoted ηi) over [t0, ti] • li can be computed from t0, (ti-1, ηi-1), and (ti, ηi) • Two key steps in the pattern prediction algorithm • Predict the mean load over intervals of length b·2^k measured from the current time point t0 • Transform this set of mean-load predictions into the ESP
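A worked step for the transformation, as a sketch assuming ηi denotes the cumulative mean load over [t0, ti] and li the mean load over the segment [ti-1, ti]: the segment level follows directly from the two cumulative means.

```latex
% Mean load over [t_{i-1}, t_i], recovered from the cumulative means
% over [t_0, t_{i-1}] and [t_0, t_i]:
l_i = \frac{(t_i - t_0)\,\eta_i \;-\; (t_{i-1} - t_0)\,\eta_{i-1}}{t_i - t_{i-1}}
```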
Traditional Approaches to Mean Load Prediction • Can a feedback control model work? NO • Example: Kalman filter • Reason: one-step look-ahead prediction does not fit our long-interval prediction goal. • Can we use short-term prediction error to drive long-term prediction feedback? NO • Can traditional linear models fit Google host-load prediction? • For example, simple moving average, auto-regression (AR), etc.
Mean Load Prediction based on Bayes Model • Principle of the Bayes model (Why Bayes?) • Predictions are grounded in probabilities estimated from observed samples • Posterior probability rather than prior probability • Naïve Bayes Classifier (N-BC) • Predicted value: see the sketch below • Minimized-MSE Bayes Classifier (MMSE-BC) • Predicted value: see the sketch below
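A hedged reconstruction of the two predicted values, assuming a posterior P(li | e) over the r discrete load states li given the evidence-window features e (these are the standard forms of the two estimators):

```latex
% N-BC: pick the load state with the maximum posterior probability
\hat{l}_{\mathrm{N\text{-}BC}} = \arg\max_{l_i} P(l_i \mid e)

% MMSE-BC: the posterior expectation, which minimizes the mean squared error
\hat{l}_{\mathrm{MMSE\text{-}BC}} = \sum_{i=1}^{r} l_i \, P(l_i \mid e)
```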
Why do we use the Bayes Model? • Special advantages of the Bayes model • The Bayes method can • effectively retain important features of load fluctuation and noise, rather than ignoring them; • dynamically improve prediction accuracy, as the probabilities are updated with an increasing number of samples; • estimate the future with low computational complexity, thanks to quick probability calculation; • take only limited disk space, since it just needs to keep and update the corresponding probability values.
Mean Load Prediction based on Bayes Model • Implementation of the Bayes classifier • Key point: how to extract features in the evidence window? • Evidence window: the interval of samples up to the current moment • States of mean load: for the prediction interval • r states (e.g., r = 50 means there are 50 mean-load states to predict: [0, 0.02), [0.02, 0.04), …, [0.98, 1])
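As an illustration of the state discretization above, a minimal sketch (the function name and the equal-width binning are assumptions for illustration; the intervals match the slide's example for r = 50):

```python
def load_state(load: float, r: int = 50) -> int:
    """Map a mean load value in [0, 1] to one of r equal-width states.

    With r = 50 the states are [0, 0.02), [0.02, 0.04), ..., [0.98, 1].
    """
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must lie in [0, 1]")
    # The upper boundary 1.0 falls into the last state [0.98, 1].
    return min(int(load * r), r - 1)
```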
Mean Load Prediction based on Bayes Model • Features of host load in the evidence window • Mean load state (Fml(e)) • Weighted mean load state (Fwml(e)) • Fairness index (Ffi(e))
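A hedged note on the fairness index: assuming it is the standard Jain's fairness index over the samples l1, …, ln in the evidence window, it would take the form below.

```latex
% Jain's fairness index over the evidence-window samples (assumed form);
% it approaches 1 when the load is stable and drops as fluctuation grows.
F_{fi}(e) = \frac{\left(\sum_{i=1}^{n} l_i\right)^{2}}{n \sum_{i=1}^{n} l_i^{2}}
```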
Mean Load Prediction based on Bayes Model • Noise-decreased fairness index (Fndfi(e)) • Load outliers are excluded before the fairness index is computed • Type state (Fts(e)): captures the degree of jitter • Representation: (α, β) • α = number of types (i.e., distinct state levels) • β = number of state changes • (Figure: example load curve over the prediction interval, with α = 4 and β = 8)
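A minimal sketch of the type-state feature as defined above (the function name and the input representation, a sequence of discretized load states, are assumptions):

```python
def type_state(states):
    """Compute the type-state feature (alpha, beta) over a window of
    discretized load states.

    alpha: number of distinct state levels observed in the window.
    beta:  number of changes between consecutive states.
    """
    alpha = len(set(states))
    beta = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return alpha, beta

# Example: 4 distinct levels, 5 changes between consecutive samples.
print(type_state([1, 2, 2, 3, 1, 4, 4, 1]))  # -> (4, 5)
```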
Mean Load Prediction based on Bayes Model • First-Last Load (Ffll(e)) • = {first load level, last load level} • N-segment pattern (FN-sp(e)) • F2-sp(e): {0.01, 0.03} • F3-sp(e): {0.02, 0.04, 0.04} • F4-sp(e): {0.02, 0.02, 0.05, 0.05} • (Figure: example load curve over the prediction interval with the F2-sp, F3-sp and F4-sp values marked)
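A minimal sketch of how the N-segment pattern values could be computed, assuming the feature is the mean load of each of N equal-length segments of the sample window (the handling of leftover samples is simplified for illustration):

```python
def n_segment_pattern(loads, n):
    """Split the load samples into n equal-length segments and return
    the mean load of each segment (a sketch of the F_{N-sp} feature)."""
    seg = len(loads) // n
    if seg == 0:
        raise ValueError("need at least n samples")
    return [sum(loads[i * seg:(i + 1) * seg]) / seg for i in range(n)]

# Example with 8 samples and n = 4: four per-segment means.
samples = [0.02, 0.02, 0.02, 0.02, 0.05, 0.05, 0.05, 0.05]
print(n_segment_pattern(samples, 4))  # -> [0.02, 0.02, 0.05, 0.05]
```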
Mean Load Prediction based on Bayes Model • Correlation of Features • Linear Correlation Coefficient • Rank Correlation Coefficient
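Assuming these are the standard Pearson (linear) and Spearman (rank) correlation coefficients, the two measures between feature values X and Y would be:

```latex
% Pearson (linear) correlation coefficient
\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}

% Spearman rank correlation: Pearson's coefficient applied to the
% ranks r_X and r_Y of the samples
\rho_s = \frac{\operatorname{cov}(r_X, r_Y)}{\sigma_{r_X} \, \sigma_{r_Y}}
```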
Mean Load Prediction based on Bayes Model • Compatibility of Features • Features are split into four groups: {Fml, Fwml, F2-sp, F3-sp, F4-sp}, {Ffi, Fndfi}, {Fts}, {Ffll} • Total Number of Compatible Combinations:
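As an illustration of the counting only (not the slide's missing figure): if one assumes a combination may include at most one feature from each group and must include at least one feature overall, the count would be computed as below.

```latex
% Illustrative count under the stated assumption: choose one of the 5
% features or nothing from group 1, one of 2 or nothing from group 2,
% one or nothing from each singleton group, then exclude the empty choice.
(5+1)(2+1)(1+1)(1+1) - 1 = 71
```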
Evaluation of Prediction Effect (Cont’d) • List of well-known load prediction methods used for comparison (see the sketch of the simpler baselines below) • Simple moving average: the mean value in the evidence window (EW) • Linear weighted moving average: the linearly weighted moving-average value in the EW • Exponential moving average • Last-State: use the last state in the EW as the predicted value • Prior probability: the value with the highest prior probability • Auto-regression (AR): improved recursive AR • Hybrid model [27]: Kalman filter + SG filter + AR
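A minimal sketch of the simpler baselines named above (the function names and the smoothing factor alpha are assumptions; the improved recursive AR and the hybrid model [27] are omitted):

```python
def simple_moving_average(window):
    """Mean of the samples in the evidence window."""
    return sum(window) / len(window)

def linear_weighted_moving_average(window):
    """Linearly weighted mean: more recent samples get larger weights."""
    n = len(window)
    weights = range(1, n + 1)  # oldest -> newest
    return sum(w * x for w, x in zip(weights, window)) / sum(weights)

def exponential_moving_average(window, alpha=0.3):
    """Exponentially weighted mean (alpha is an assumed smoothing factor)."""
    ema = window[0]
    for x in window[1:]:
        ema = alpha * x + (1 - alpha) * ema
    return ema

def last_state(window):
    """Use the last observed state in the evidence window as the prediction."""
    return window[-1]
```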
Evaluation of Prediction Effect (Cont’d) • Training and evaluation • Evaluation Type A: the case with insufficient samples • Training period: [day 1, day 25] (only 18,000 load samples) • Test period: [day 26, day 29] • Evaluation Type B: the ideal case with sufficient samples • Training period: [day 1, day 29] (emulates a larger set of samples) • Test period: [day 26, day 29]
Evaluation of Prediction Effect (Cont’d) • Evaluation metrics for accuracy • Mean Squared Error (MSE): MSE = (1/n) Σi (l̂i − li)², where the li are the true mean load values and the l̂i the predicted ones • Success rate (with a delta of 10%) over the test period: success rate = (number of accurate predictions) / (total number of predictions), where a prediction counts as accurate if its error is within the 10% delta
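A minimal sketch of the two metrics (interpreting the 10% accuracy criterion as a relative-error threshold is an assumption):

```python
def mean_squared_error(true_vals, predicted):
    """MSE between true mean loads and predicted mean loads."""
    n = len(true_vals)
    return sum((p - t) ** 2 for t, p in zip(true_vals, predicted)) / n

def success_rate(true_vals, predicted, delta=0.10):
    """Fraction of predictions whose relative error is within delta."""
    hits = sum(1 for t, p in zip(true_vals, predicted)
               if abs(p - t) <= delta * abs(t))
    return hits / len(true_vals)
```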
Evaluation of Prediction Effect (Cont’d) • Exploration of the best feature combination (success rate), Evaluation Type A • Representation of feature combinations • 101000000 denotes the combination of the mean-load feature and the fairness-index feature • (Figure panels: (a) s = 3.2 hours, (b) s = 6.4 hours, (c) s = 12.8 hours)
Evaluation of Prediction Effect (Cont’d) • Exploration of the best feature combination (mean squared error) • (Figure panels: (a) s = 3.2 hours, (b) s = 6.4 hours, (c) s = 12.8 hours)
Evaluation of Prediction Effect (Cont’d) • Comparison of mean-load prediction methods (success rate of CPU load, Evaluation Type A) • (Figure panels: (a) s = 6.4 hours, (b) s = 12.8 hours)
Evaluation of Prediction Effect (Cont’d) • Comparison of mean-load prediction methods (MSE of CPU load, Evaluation Type A) • (Figure panels: (a) s = 6.4 hours, (b) s = 12.8 hours)
Evaluation of Prediction Effect (Cont’d) • Comparison of mean-load prediction methods (CPU load, Evaluation Type B) • Best feature combination: • mean load • fairness index • type state • first-last load
Evaluation of Prediction Effect (Cont’d) • Evaluation of Pattern Prediction Effect • Mean Error & Mean MSE • Mean Error:
Evaluation of Prediction Effect (Cont’d) • Evaluation of pattern prediction effect • Snapshot of pattern prediction (Evaluation Type A)
Conclusion • Objective: predict the ESP of host-load fluctuation • Two-step algorithm • Mean-load prediction over exponentially increasing intervals from the current moment • Transformation into the ESP • Bayes model (for mean-load prediction) • Exploration of the best-fit combination of features • Comparison with 7 other well-known methods • Experiments use the Google trace • Evaluation Type A: the Bayes model ({Fml}) outperforms the other methods by 5.6–50% • Evaluation Type B: {Fml, Ffi, Fts, Ffll} is the best combination • MSE of pattern predictions: the majority lie in [10⁻⁸, 10⁻⁵]