
DataGarage: Warehousing Massive Performance Data on Commodity Servers

DataGarage: Warehousing Massive Performance Data on Commodity Servers. Charles Loboz, Slawek Smyl, Suman Nath (Microsoft Corporation). Monitoring large data centers. Management tasks: monitoring, planning, historical analysis. Performance data: CPU, memory, disk utilization, …; response time, queue length, …


Presentation Transcript


  1. DataGarage: Warehousing Massive Performance Data on Commodity Servers
     Charles Loboz, Slawek Smyl, Suman Nath
     Microsoft Corporation

  2. Monitoring Large DataCenters
     • Management tasks: monitoring, planning, historical analysis
     • Performance data: CPU, memory, disk utilization, …; response time, queue length, …
     Context | Performance Data | Design Goals | DataGarage | Query Processing | Experiments

  3. Monitoring Data Management
     • 100K servers = 1TB data per day!
     • Storage challenge: store data over many months or years; petabytes of data
     • Query challenge: hours to run simple queries

  4. DataGarage
     • Performance data warehousing system
     • Storage, query processing
     • Efficient, scalable, cheap
     • Performance data: CPU, memory, disk utilization, …; response time, queue length, …

  5. Outline
     • Context
     • Performance data characteristics
     • Design goals
     • DataGarage design
     • Query Processing
     • Evaluation
     • Conclusion

  6. Performance Data Collection
     • Monitoring process: CPU utilization, memory usage, disk space, SQL queue length, app response time, cache hit rate, network bandwidth, …
     • Our deployment: sampling period 15 seconds; 100-1000 counters/server; 5-100 MB/server/day; 0.01% CPU time

  7. Performance Data Characteristics
     • Heterogeneous counter sets
     • 30K different counters, 100-1000 per server
     • Numeric, read-only, possibly dirty
     • Dirty data is retained, but may be ignored by queries
     • Hierarchical queries
     • Selection, projection, aggregation, data mining
     • Example: fraction of hotmail.com servers in a given rack with CPU utilization > 50%
     • Example: average memory utilization trend of hotmail servers

  8. DataGarage Design Goals
     • Small storage footprint
     • Reduces storage and communication cost
     • Small pay-as-you-go cost for Cloud systems
     • Cheap
     • Commodity hardware and off-the-shelf software
     • Fast and robust query processing
     • Allows fast decisions
     • Tolerates faulty and slow hardware
     • Simple and flexible query interface (SQL + UDF)
     • Fast query writing

  9. Outline
     • Context
     • Performance data characteristics
     • Design goals
     • DataGarage design
     • Query Processing
     • Evaluation
     • Conclusion

  10. Options
      • TableStore: Relational table
      • DB engine: single-node DBMS, parallel DBMS
      • MapReduce: HadoopDB [Abouzeid et al., VLDB’09]
      • FileStore: Files
      • MapReduce: Hadoop, Dryad [Isard et al., EuroSys’07]

  11. Trade-offs

  12. Storage Inefficiency: TableStore
      Key problem: heterogeneous counter sets (30,000 unique counters in total, <1000 per server)
      • Wide table (all possible counters): Machine id | Timestamps | Counter 1 | Counter 2 | Counter n
        • Too many columns
        • >95% sparse
      • Narrow table (key-value store): Machine id | Timestamps | Counter id | Value
        • Redundant keys (4x more expensive than raw data)
        • Expensive joins needed
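The sparsity and key-redundancy figures on this slide can be checked with a small sketch. The server names and counter names below are hypothetical, and the cell-count model (every column costs the same) is a simplifying assumption, not something the slides state.

```python
# Hypothetical counter sets for three servers (all names are illustrative).
counters_by_server = {
    "web01": {"cpu", "mem", "http_queue"},
    "sql01": {"cpu", "mem", "sql_queue", "cache_hit"},
    "fs01": {"cpu", "disk_free"},
}


def wide_table_sparsity(counter_sets):
    """Fraction of empty cells when every server shares one wide table
    with a column for each counter seen on any server."""
    all_counters = set().union(*counter_sets.values())
    total_cells = len(all_counters) * len(counter_sets)
    filled_cells = sum(len(c) for c in counter_sets.values())
    return 1 - filled_cells / total_cells


def narrow_table_cells(counter_sets):
    """Cells stored per sampling instant in a narrow table with columns
    (machine id, timestamp, counter id, value): 4 cells per raw value."""
    raw_values = sum(len(c) for c in counter_sets.values())
    return raw_values * 4


toy_sparsity = wide_table_sparsity(counters_by_server)
toy_narrow_cells = narrow_table_cells(counters_by_server)

# With the slide's numbers: at most 1000 of 30,000 columns are filled per row.
slide_sparsity_lower_bound = 1 - 1000 / 30000
```

With 30,000 columns and fewer than 1000 counters per server, at least about 96.7% of a shared wide table is empty, which agrees with the ">95% sparse" bullet. In the narrow layout each value carries three key cells, which is roughly in line with the "4x" bullet if all columns are of similar width.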

  13. Storage Inefficiency: FileStore
      • Heterogeneous counter sets
      • Files need to maintain schema for each server
      • No structure in data
      • Compression cannot exploit data correlation

  14. Our Solution
      • One wide-table per server
      • Benefits of TableStore, without sparseness/redundancy
      • Each wide-table in an embedded database file
      • Benefits of FileStore
      • Embedded databases: SQLite, MS SQL Server Compact Edition
      (Diagram labels: .sdf file; File system; Microsoft SQL Server Compact Edition library)

  15. DataGarage Architecture
      (Diagram labels: Data analysis tools; Controller (Query Dissemination); Query; Distributed file system; Summary Database; Embedded database; Data collector x3)

  16. Data Compression
      • Zipping files with PKZip is not effective
      • Compress one column at a time
      • Exploit strong correlation
      • RLE, delta encoding not very effective
      • Our idea: Bit-truncation + Byte-interleaving
      (Figure, hex bytes: 42 42 42 42 AE AE AE AE 91 83 2B 39 A0 E4 38 C4 … shown alongside 42 42 42 42 AE AE AE AE 91 83 2B 39 …, with the annotation "if lossy <1%")
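A minimal sketch of byte-interleaving followed by truncation, under stated assumptions: the four 32-bit sample values are a guess at how the slide's byte sequence groups into column values, the big-endian unsigned encoding is an assumption, and truncation is shown here as dropping whole low-order bytes rather than an arbitrary number of bits. The function names are hypothetical.

```python
def byte_interleave(values, width=4):
    """Store byte 0 of every value, then byte 1 of every value, and so on,
    so that the slowly changing high-order bytes sit next to each other."""
    packed = [v.to_bytes(width, "big") for v in values]
    return b"".join(bytes(p[i] for p in packed) for i in range(width))


def drop_low_planes(interleaved, count, planes=1, width=4):
    """Lossy step: discard the last `planes` byte planes (least significant bytes)."""
    return interleaved[: count * (width - planes)]


def restore(truncated, count, width=4):
    """Rebuild approximate values, filling the dropped bytes with zeros."""
    padded = truncated + bytes(count * width - len(truncated))
    return [
        int.from_bytes(bytes(padded[i * count + j] for i in range(width)), "big")
        for j in range(count)
    ]


# Assumed column values whose interleaved bytes match the slide's first sequence.
samples = [0x42AE91A0, 0x42AE83E4, 0x42AE2B38, 0x42AE39C4]
interleaved = byte_interleave(samples)
truncated = drop_low_planes(interleaved, len(samples))
approx = restore(truncated, len(samples))
worst_error = max(abs(a - s) / s for a, s in zip(approx, samples))
```

After interleaving, runs such as 42 42 42 42 and AE AE AE AE become adjacent and compress well with a generic compressor. For these sample values, dropping the lowest byte changes each value by far less than the 1% loss mentioned on the slide.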

  17. Storage Efficiency

  18. Outline
      • Context
      • Performance data characteristics
      • Design goals
      • DataGarage design
      • Query Processing
      • Evaluation
      • Conclusion

  19. DataGarage Query
      • DataGarage query: three components
      • On: filesystem path: /hotmail/dc1/*.10-.-2009.sdf
      • Apply: a SQL query run on individual database files
      • Combine: a SQL query to compute final result
      • Enables map-reduce style execution
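The On / Apply / Combine structure can be sketched as follows. This is an illustration under assumptions, not DataGarage's implementation: it uses Python's sqlite3 module in place of SQL Server Compact Edition, runs everything in one process, and the function name `run_query`, the table names `perf` and `partial`, and the `.db` file names are all hypothetical.

```python
import contextlib
import glob
import os
import sqlite3
import tempfile


def run_query(on_pattern, apply_sql, combine_sql, partial_schema):
    """Run apply_sql in every database file matching on_pattern, collect the
    partial rows in a temporary table, then run combine_sql over that table."""
    n_cols = len(partial_schema.split(","))
    placeholders = ", ".join("?" * n_cols)
    with contextlib.closing(sqlite3.connect(":memory:")) as combined:
        combined.execute(f"CREATE TABLE partial ({partial_schema})")
        for path in sorted(glob.glob(on_pattern)):          # On
            with contextlib.closing(sqlite3.connect(path)) as db:
                rows = db.execute(apply_sql).fetchall()     # Apply
            combined.executemany(
                f"INSERT INTO partial VALUES ({placeholders})", rows
            )
        return combined.execute(combine_sql).fetchall()     # Combine


def make_server_db(path, samples):
    """Create one per-server database file with (timestamp, cpu) samples."""
    with contextlib.closing(sqlite3.connect(path)) as db:
        db.execute("CREATE TABLE perf (ts INTEGER, cpu REAL)")
        db.executemany("INSERT INTO perf VALUES (?, ?)", samples)
        db.commit()


# Two illustrative per-server files under a hierarchical directory layout.
root = tempfile.mkdtemp()
dc_dir = os.path.join(root, "hotmail", "dc1")
os.makedirs(dc_dir)
make_server_db(os.path.join(dc_dir, "srv1.db"), [(0, 40.0), (15, 60.0)])
make_server_db(os.path.join(dc_dir, "srv2.db"), [(0, 80.0), (15, 90.0)])

result = run_query(
    on_pattern=os.path.join(dc_dir, "*.db"),
    apply_sql="SELECT SUM(cpu), COUNT(*) FROM perf",
    combine_sql="SELECT SUM(cpu_sum) / SUM(n) FROM partial",
    partial_schema="cpu_sum REAL, n INTEGER",
)
```

Apply returns a sum and a count per file rather than a per-file average, so that Combine can compute a correct overall average. This mirrors the map and reduce roles mentioned on the slide.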

  20. Query Execution
      (Diagram labels: Controller Node; Dissemination; On; Apply; Combine; Result; Execution Nodes; Temporary; Distributed File system)

  21. Query Execution Time

  22. Fault Tolerance
      • DataGarage key technology:
      • Decoupling of execution and storage
      • Fine-grained data partitioning
      • Data is replicated by the file system
      • Slow execution nodes
      • Assigned smaller jobs
      • Faster nodes take on additional load after finishing their own
      • Execution node failures
      • New nodes work on the remaining jobs of failed nodes
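The effect of fine-grained partitioning on slow and failed nodes can be illustrated with a toy simulation. The slides do not describe the scheduler, so the pull-based policy below (each node takes the next file when it becomes free) is an assumption, as are the function name `simulate`, the node names, and the timings.

```python
import heapq


def simulate(num_tasks, seconds_per_task, fail_at=None):
    """Simulate nodes pulling one task (one database file) at a time.

    seconds_per_task: dict of node name -> time to process one task.
    fail_at: optional dict of node name -> time after which the node is dead.
    Returns (time the last task finished, tasks completed per node)."""
    fail_at = fail_at or {}
    pending = num_tasks
    done = {node: 0 for node in seconds_per_task}
    free = [(0.0, node) for node in sorted(seconds_per_task)]
    heapq.heapify(free)
    makespan = 0.0
    while pending and free:
        now, node = heapq.heappop(free)
        finish = now + seconds_per_task[node]
        if node in fail_at and finish > fail_at[node]:
            continue  # node fails mid-task; the task stays pending for others
        pending -= 1
        done[node] += 1
        makespan = max(makespan, finish)
        heapq.heappush(free, (finish, node))
    return makespan, done


healthy = simulate(12, {"fast": 1.0, "slow": 3.0})
with_failure = simulate(12, {"fast": 1.0, "slow": 3.0}, fail_at={"slow": 4.0})
```

In the healthy run the slow node ends up with 3 of the 12 tasks and the fast node with 9, so neither waits on the other. When the slow node fails, its unfinished work is simply picked up by the remaining node because storage is decoupled from execution.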

  23. Goals Revisited
      • High performance: queries are pushed inside embedded database
      • Storage efficient: compression
      • Fault tolerant: fine partitioning of data and query processing, aggressive restarting, speculative execution
      • Hierarchical queries: file system paths
      • Simple interface: SQL queries
      • Cheap: off-the-shelf tools, commodity machines

  24. Outline
      • Context
      • Performance data characteristics
      • Design goals
      • DataGarage design
      • Query Processing
      • Experience
      • Conclusion

  25. Operational Experience
      • In operation for more than 1 year
      • Warehousing data from Microsoft data centers
      • Partitioning with fine granularity + compression is the key to storing massive data
      • Previous implementation with narrow table:
      • 30K server-days in a 1TB disk
      • Slow queries
      • Current implementation:
      • 1-3 million server-days/TB
      • Orders of magnitude faster queries
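The storage figures on this slide translate into per-server-day footprints as in the short calculation below. It assumes a decimal terabyte (1 TB = 1,000,000 MB); the variable names are illustrative.

```python
TB_IN_MB = 1_000_000  # assumption: decimal terabyte


def mb_per_server_day(server_days_per_tb):
    """Average storage consumed by one server-day of performance data."""
    return TB_IN_MB / server_days_per_tb


narrow_mb = mb_per_server_day(30_000)          # previous narrow-table design
current_mb_high = mb_per_server_day(1_000_000)  # current design, worst case
current_mb_low = mb_per_server_day(3_000_000)   # current design, best case

# Density improvement over the narrow-table implementation.
improvement_low = 1_000_000 / 30_000
improvement_high = 3_000_000 / 30_000
```

Under this assumption the narrow-table design used about 33 MB per server-day, while the current design uses between about 0.33 MB and 1 MB, a 33x to 100x improvement in storage density.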

  26. Operational Experience
      • Embedded database files give flexibility
      • Placement, backup simplified
      • Scavenge available storage on the fly
      • Simple design helps
      • Several thousand lines of C# code to glue together existing tools (FS, Embedded DB, R, …)
      • Defer features until necessary: Parallel Combine
      • Good fit with Cloud computing model
      • Data and/or computation can be on the Cloud
      • Cheap: only file storage needed, small footprint

  27. Conclusion
      • Existing solutions are not efficient for warehousing performance data
      • DataGarage: performance data warehouse
      • Cheap, scalable, fault tolerant
      • Combines benefits of DB, MapReduce, file systems
      • Operational experience shows the benefits
      Questions?

  28. Compression Overhead

  29. Related Work
      • HadoopDB
      • DataGarage has finer data partitioning
      • Improves fault tolerance and storage efficiency
      • DataGarage uses embedded databases
      • Cheap, enables using hierarchical file system
      • DataGarage uses data compression

  30. Query Processing
      (Diagram labels: <target>; <apply_script>; <combine_script>; Controller (Query Dissemination); Result; Temporary table; Embedded database; Distributed file system)
