Dealer: Application-aware Request Splitting for Interactive Cloud Applications. Mohammad Hajjat, Purdue University. Joint work with: Shankar P N (Purdue), David Maltz (Microsoft), Sanjay Rao (Purdue), and Kunwadee Sripanidkulchai (NECTEC Thailand).
Performance of Interactive Applications • Interactive apps have stringent requirements on user response time • Amazon: every 100 ms of latency costs 1% in sales • Google: a 0.5 s increase in delay causes traffic and revenue to drop by 20% • Tail importance: SLAs are defined on the 90th percentile and higher of response time
Cloud Computing: Benefits and Challenges • Benefits: • Elasticity • Cost savings • Geo-distribution: service resilience, disaster recovery, better user experience, etc. • Challenges: • Performance is variable: [Ballani’11], [Wang’10], [Li’10], [Mangot’09], etc. • Even worse, data centers fail, at an average cost of $5,600 per minute!
Approaches for Handling Cloud Performance Variability • Autoscaling? • Can’t tackle storage problems, network congestion, etc. • Slow: tens of minutes in public clouds • DNS-based and Server-based Redirection? • Overloads the remote DC • Wastes local resources • DNS-based schemes may take hours to react
Contributions • Introduce Dealer to help interactive multi-tier applications respond to transient variability in cloud performance • Split requests at component granularity (rather than entire DC): pick the best combination of component replicas (potentially across multiple DCs) to serve each individual request • Benefits over naïve approaches: • Handles a wide range of variability in the cloud (performance problems, network congestion, workload spikes, failures, etc.) • Short time-scale adaptation (tens of seconds to a few minutes) • Performance tail (90th percentile and higher) improvement: • Under natural cloud dynamics: > 6x • Over redirection schemes (e.g., DNS-based load balancers): > 3x
Outline • Introduction • Measurement and Observations • System Design • Evaluation
Performance Variability in Multi-tier Interactive Applications • Multi-tier apps may consist of hundreds of components • Deploy each app on 2 DCs simultaneously [Figure: Thumbnail application architecture; labels: Load Balancer, Web Role (IIS), Worker Role, Queue, blob, FE, BL1, BL2, BE]
Performance Variability in Multi-tier Interactive Applications [Figure: per-component performance distributions for FE, BL1, BL2, BE; legend: 25th percentile, median, 75th percentile, outliers]
Observations • Replicas of a component are uncorrelated • Few components show poor performance at any time • Performance problems are short-lived; 90% last < 4 mins [Figure: panels for FE, BL1, BL2, BE]
Dealer Approach: Per-Component Re-routing • Split requests at each component dynamically • Serve each request using a combination of replicas across multiple DCs [Figure: components C1, C2, C3, C4, ..., Cn replicated in two DCs behind a GTM]
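One way to picture per-component splitting is a weighted random choice made at each hop. The sketch below is illustrative only and is not taken from the talk: the replica labels (`BL1@A`, `BL1@B`), the ratio values, and the helper `pick_replica` are all hypothetical.

```python
import random

def pick_replica(split_ratios, rng=random):
    """Pick a downstream replica with probability proportional to its split ratio."""
    replicas = list(split_ratios)
    weights = [split_ratios[r] for r in replicas]
    return rng.choices(replicas, weights=weights, k=1)[0]

# Hypothetical ratios for requests leaving FE in DC A and heading to BL1:
# 70% stay in DC A, 30% are sent to the BL1 replica in DC B.
ratios_fe_a_to_bl1 = {"BL1@A": 0.7, "BL1@B": 0.3}

rng = random.Random(0)  # seeded so the experiment is repeatable
counts = {"BL1@A": 0, "BL1@B": 0}
for _ in range(10_000):
    counts[pick_replica(ratios_fe_a_to_bl1, rng)] += 1
```

Because the choice is made per request and per hop, a single user request can be served by FE in one DC and BL1 in another, which is the distinction from redirecting the whole request to another DC.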
Dealer System Overview [Figure: components C1, C2, C3, ..., Cn replicated across DCs, with the GTM and Dealer]
Dealer High-Level Design [Figure: design blocks: Determine Delays, Compute Split-Ratios, Dynamic Capacity Estimation, Stability, Application]
Determining Delays • Monitoring: • Instrument apps to record: • Component processing time • Inter-component delay • Use X-Trace for instrumentation (uses a global ID) • Automate integration using Aspect-Oriented Programming (AOP) • Push logs asynchronously to reduce overhead • Active Probing: • Send requests along lightly used links and components • Use workload generators (e.g., Grinder) • Heuristics for faster recovery by biasing towards better paths
Determining Delays [Figure: Monitoring, Probing, Combine Estimates, Stability & Smoothing] • Delay matrix D[,]: component processing and inter-component communication delay • Transaction matrix T[,]: transaction rate between components
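The slides do not say how the monitoring and probing estimates are combined. As a minimal sketch, assuming each source reports a mean delay and a sample count per link, one plausible rule is a sample-count-weighted average. The function `combine_estimates`, the dictionary layout, and the numbers are assumptions made for illustration.

```python
def combine_estimates(monitored, probed):
    """Merge two delay estimates per (src, dst) entry of the delay matrix.

    Each argument maps (src, dst) -> (mean_delay_ms, sample_count).
    Assumption: estimates are weighted by how many samples back them.
    """
    combined = {}
    for key in set(monitored) | set(probed):
        m_delay, m_n = monitored.get(key, (0.0, 0))
        p_delay, p_n = probed.get(key, (0.0, 0))
        total = m_n + p_n
        if total == 0:
            continue  # no data from either source for this entry
        combined[key] = (m_delay * m_n + p_delay * p_n) / total
    return combined

# Hypothetical data: a busy link seen mostly by monitoring,
# and a lightly used link that only active probing covers.
monitored = {("FE@A", "BL1@A"): (100.0, 9)}
probed = {("FE@A", "BL1@A"): (200.0, 1), ("FE@A", "BL1@B"): (50.0, 2)}
D = combine_estimates(monitored, probed)
```

The second entry shows why probing matters: without it, a lightly used path would have no delay estimate at all.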
Calculating Split Ratios [Figure: application graph from the user through components C1 to C5 (labels FE, BL1, BL2, BE), each with two replicas Ci1 and Ci2]
Calculating Split Ratios • Given: • Delay matrix D[im, jn] • Transaction matrix T[i,j] • Capacity matrix C[i,m] (capacity of component i in data center m) • Goal: • Find split ratios TF[im, jn]: # of transactions between each pair of component replicas Cim and Cjn s.t. overall delay is minimized • Algorithm: greedy algorithm that assigns requests to the best-performing combination of replicas (across DCs) [Figure: replicas C11 to C51 and C12 to C52]
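A greedy assignment of this kind can be sketched as follows. This is a simplified illustration, not the algorithm from the talk: it assumes a linear chain of components, one replica per DC, a path delay equal to the sum of processing and inter-DC delays, and capacities in requests per second. All names and numbers are hypothetical.

```python
from itertools import product

def greedy_split(components, dcs, proc_delay, link_delay, capacity, demand):
    """Assign `demand` requests/s to replica combinations, lowest delay first.

    proc_delay[(comp, dc)]   -> processing delay in ms
    link_delay[(dc_a, dc_b)] -> delay between consecutive components in ms
    capacity[(comp, dc)]     -> requests/s that replica can serve
    Returns ([(path, rate), ...], unassigned_demand).
    """
    def path_delay(path):
        total = sum(proc_delay[(c, dc)] for c, dc in zip(components, path))
        total += sum(link_delay[(a, b)] for a, b in zip(path, path[1:]))
        return total

    residual = dict(capacity)
    plan, remaining = [], demand
    # Enumerate every combination of replicas (one DC per component).
    for path in sorted(product(dcs, repeat=len(components)), key=path_delay):
        if remaining <= 0:
            break
        # The bottleneck replica limits how much this combination can take.
        room = min(residual[(c, dc)] for c, dc in zip(components, path))
        rate = min(room, remaining)
        if rate <= 0:
            continue
        for c, dc in zip(components, path):
            residual[(c, dc)] -= rate
        plan.append((path, rate))
        remaining -= rate
    return plan, remaining

components, dcs = ["FE", "BL", "BE"], ["A", "B"]
proc_delay = {("FE", "A"): 10, ("FE", "B"): 10,
              ("BL", "A"): 100, ("BL", "B"): 20,   # BL is slow in DC A
              ("BE", "A"): 15, ("BE", "B"): 15}
link_delay = {("A", "A"): 0, ("B", "B"): 0, ("A", "B"): 30, ("B", "A"): 30}
capacity = {(c, d): 100 for c in components for d in dcs}
capacity[("BL", "B")] = 150
plan, unassigned = greedy_split(components, dcs, proc_delay,
                                link_delay, capacity, demand=120)
```

In this example the first 100 requests/s go entirely through DC B, and the remaining 20 requests/s use FE and BE in DC A but BL in DC B, avoiding the slow BL replica despite two inter-DC hops. Enumerating all combinations is exponential in the number of components, so this form only suits small examples.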
Other Design Aspects • Dynamic Capacity Estimation: • Develop an algorithm to dynamically capture capacities of components • Prevent components from getting overloaded by re-routed traffic • Stability at multiple levels: • Smooth matrices with a Weighted Moving Average (WMA) • Damp split ratios by a factor to avoid abrupt shifts • Integration with Apps: • Can be integrated with any app, including stateful ones (e.g., StockTrader) • Provide generic pull/push APIs
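The two stability mechanisms can be illustrated with a short sketch. The window weights and the damping factor below are assumed values chosen for the example; the slides do not give the actual parameters.

```python
def wma(history, weights):
    """Weighted moving average over the most recent len(weights) samples.

    `weights` is ordered oldest to newest; if fewer samples exist,
    only the newest weights are used.
    """
    recent = history[-len(weights):]
    w = weights[-len(recent):]
    return sum(x * wi for x, wi in zip(recent, w)) / sum(w)

def damp(old_ratio, target_ratio, factor):
    """Move only a fraction `factor` of the way toward the new target ratio."""
    return old_ratio + factor * (target_ratio - old_ratio)

# A delay spike (400 ms) is smoothed rather than taken at face value.
smoothed_delay = wma([100, 100, 400], weights=[1, 2, 3])

# The computed target says "send everything to this replica" (1.0),
# but damping shifts the ratio only halfway from its current value.
next_ratio = damp(old_ratio=0.5, target_ratio=1.0, factor=0.5)
```

Smoothing limits the reaction to a single noisy measurement, while damping limits how far traffic moves in one step, so the two act at different levels.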
Evaluation • Real multi-tier, interactive apps: • Thumbnails: photo processing, data-intensive • StockTrader: stock trading, delay-sensitive, stateful • 2 Azure data centers in the US • Workload: • Real workload trace from a big campus ERP app • DaCapo benchmark • Comparison with existing schemes: • DNS-based redirection • Server-based redirection • Performance variability scenarios (single fault domain failure, storage latency, transaction mix change, etc.)
Running in the Wild • Evaluate Dealer under natural cloud dynamics • Explore inherent performance variability in cloud environments • More than 6x difference
Running in the Wild [Figure: labels FE, BL, BE; A, B]
Dealer vs. GTM • Global Traffic Managers (GTMs) use DNS to route user IPs to the closest DC • Best-performing DC ≠ closest DC (measured by RTT) • Results: more than 3x improvement at the 90th percentile and higher
Dealer vs. Server-level Redirection • Re-route the entire request, at the granularity of DCs [Figure: HTTP 302 redirection between DC A and DC B]
Evaluating Against Server-level Redirection [Figure: two panels, each with labels BL1, FE, BE, BL2]
Conclusions • Dealer: novel technique to handle cloud variability in multi-tier interactive apps • Per-component re-routing: dynamically split user requests across replicas in multiple DCs at component granularity • Transient cloud variability: performance problems in cloud services, workload spikes, failures, etc. • Short time-scale adaptation: tens of seconds to a few minutes • Performance tail improvement: • Under natural cloud dynamics: > 6x • Over coarse-grained redirection (e.g., DNS-based GTM): > 3x