
Instant Profiling: Instrumentation Sampling for Profiling Datacenter Applications

Hyoun Kyu Cho¹, Tipp Moseley², Richard Hank², Derek Bruening², Scott Mahlke¹. ¹University of Michigan, ²Google.


Presentation Transcript


  1. Instant Profiling: Instrumentation Sampling for Profiling Datacenter Applications • Hyoun Kyu Cho¹, Tipp Moseley², Richard Hank², Derek Bruening², Scott Mahlke¹ • ¹University of Michigan, ²Google

  2. Datacenter Applications http://googleblog.blogspot.com • In 2010, US datacenters consumed 70 to 90 billion kWh [Koomey'11] • Datacenter application performance is critical • Profiling can help

  3. Traditional Profiling • Challenges for datacenters • Need to run on live traffic • Difficult to isolate • Overheads • Value profiling: 3.8x slowdown [Calder'99] • Path profiling: 31%, edge profiling: 16% [Ball'96] • Binary management • Many programs, multiple versions [Diagram labels: Source Code, Instrumentation Build, Instrumented Binary, Input Data, Training Run, Profile Data]

  4. Google-Wide Profiling [Ren et al.'10] • Continuous profiling infrastructure for datacenters • Negligible overhead • Sampling based • Aggregated profiling overhead less than 0.01% • Limitations • Relies heavily on Performance Monitoring Units • Limited flexibility and portability

  5. Goals • Unified profiling infrastructure for datacenters • Flexible types of profile data • Portable across heterogeneous datacenters • While maintaining • Low overhead • No burden on binary management [Callouts: Dynamic Binary Instrumentation, Sampling]

  6. Instrumentation Sampling [Diagram labels: application, system call gateway, operating system, hardware]

  7. Instrumentation Sampling [Diagram labels: application, dispatch, instrumentation engine, client, context switch, operating system, code cache, hardware; DynamoRIO [Bruening'04]]

  8. Instrumentation Sampling [Diagram labels: application, shepherding thread, dispatch, instrumentation engine, client, start profiling, stop profiling, operating system, code cache, hardware]
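The start/stop scheme on this slide can be reasoned about with a simple duty-cycle model. The sketch below is not from the presentation: it assumes that a fraction `duty_cycle` of the application's work runs under instrumentation at a fixed `instrumented_slowdown`, and it ignores the cost of entering and leaving the instrumentation engine and of building the code cache. The function names and the numbers in the usage note are hypothetical.

```python
def sampled_overhead(duty_cycle, instrumented_slowdown):
    """Estimated fractional overhead of sampled instrumentation.

    Assumed model: total time = (1 - d) * 1 + d * s, where d is the
    fraction of work run instrumented and s is the slowdown while
    instrumented. Overhead is therefore d * (s - 1).
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in [0, 1]")
    if instrumented_slowdown < 1.0:
        raise ValueError("instrumented_slowdown must be >= 1")
    return duty_cycle * (instrumented_slowdown - 1.0)


def max_duty_cycle(overhead_budget, instrumented_slowdown):
    """Largest duty cycle that stays within overhead_budget (same model)."""
    if instrumented_slowdown <= 1.0:
        return 1.0  # instrumentation is free in this model
    return min(1.0, overhead_budget / (instrumented_slowdown - 1.0))
```

Under these assumptions, profiling 1% of the work at a 3x instrumented slowdown costs about 2% overall, and a 5% budget at the same slowdown allows a duty cycle of about 2.5%.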

  9. Problems with Basic Implementation • Unbounded profiling periods due to fragment linking • Latency degradation due to initial instrumentation • Multithreaded programs

  10. Temporal Unlinking/Relinking of Fragments [Diagram labels: context switch, code cache, dispatch, BB1, BB2, BB2->BB1]
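Why fragment linking leads to unbounded profiling periods can be illustrated with a toy model. This is only a sketch and does not use DynamoRIO's API: the `Fragment` and `CodeCache` classes are hypothetical. A linked fragment jumps straight to its successor, so control never returns to the dispatcher, which is the only place where profiling could be stopped. Unlinking forces every transition back through dispatch.

```python
class Fragment:
    """A cached basic block with one static successor (toy model)."""

    def __init__(self, name, successor_name):
        self.name = name
        self.successor_name = successor_name
        self.link = None  # direct jump to the successor, bypassing dispatch


class CodeCache:
    """Hypothetical code cache; counts how often the dispatcher runs."""

    def __init__(self):
        self.fragments = {}

    def add(self, name, successor_name):
        self.fragments[name] = Fragment(name, successor_name)

    def link_all(self):
        for frag in self.fragments.values():
            frag.link = self.fragments.get(frag.successor_name)

    def unlink_all(self):
        for frag in self.fragments.values():
            frag.link = None

    def run(self, start, steps):
        """Execute `steps` fragments; return (trace, dispatcher entries)."""
        trace, dispatches = [], 0
        current, next_name = None, start
        for _ in range(steps):
            if current is not None and current.link is not None:
                current = current.link  # stays inside the code cache
            else:
                dispatches += 1  # dispatcher regains control here
                current = self.fragments[next_name]
            trace.append(current.name)
            next_name = current.successor_name
        return trace, dispatches


cache = CodeCache()
cache.add("BB1", "BB2")
cache.add("BB2", "BB1")  # the BB2->BB1 back edge from the slide

cache.link_all()
linked_trace, linked_dispatches = cache.run("BB1", 6)

cache.unlink_all()
unlinked_trace, unlinked_dispatches = cache.run("BB1", 6)
```

In this model the linked loop enters the dispatcher once and then never again, while the unlinked loop enters it on every block, giving the engine a chance to end the profiling period. Relinking afterwards (`link_all`) restores the fast path.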

  11. S/W Code Cache Pre-population • Still have latency degradation for initial instrumentation phases [Diagram labels: application, shepherding thread, dispatch, instrumentation engine, client, operating system, code cache, hardware]

  12. Multithreaded Program Support • With sampling, thread operations can be missed • Forces Instant Profiling’s signal handler on every thread • Enumerates all threads and sends a profiling start signal to each thread

  13. Experimental Setup • 6-core Intel Xeon 2.67GHz w/ 12MB L3 • 12GB main memory • Linux kernel 2.6.32 • gcc 4.4.3 w/ -O3 • SPEC INT2006, BigTable, Web search • Edge profiling client
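For readers unfamiliar with the edge profiling client used in the experiments, the idea can be sketched as counting transitions between consecutively executed basic blocks. This is a minimal illustration over a recorded block trace, not the instrumentation client from the presentation; the `edge_profile` function and the sample trace are hypothetical.

```python
from collections import Counter


def edge_profile(block_trace):
    """Count control-flow edges (src, dst) between consecutive blocks."""
    return Counter(zip(block_trace, block_trace[1:]))


# Hypothetical trace of executed basic blocks.
trace = ["A", "B", "A", "B", "C"]
profile = edge_profile(trace)
```

For this trace the edge A->B is counted twice, while B->A and B->C are each counted once.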

  14. Naïve Edge Profiling

  15. Profiling Overhead

  16. S/W Code Cache Prepopulation

  17. Profiling Accuracy

  18. Asymptotic Accuracy

  19. Conclusion • Low-overhead, portable, flexible profiling needed • Instant Profiling • Combines sampling and DBI • Pre-populates S/W code cache • Tunable tradeoff between overhead and information • Provides eventual profiling accuracy • Less than 5% overhead, more than 80% accuracy for naïve edge profiling client

  20. Thank you!
