
Sameer Shende, Allen D. Malony, Robert Ansell-Bell {sameer,malony,bertie}@cs.uoregon

Instrumentation and Measurement Strategies for Flexible and Portable Empirical Performance Evaluation. Sameer Shende, Allen D. Malony, Robert Ansell-Bell {sameer,malony,bertie}@cs.uoregon.edu Computer & Information Science Department Computational Science Institute University of Oregon.





Presentation Transcript


  1. Instrumentation and Measurement Strategies for Flexible and Portable Empirical Performance Evaluation
     Sameer Shende, Allen D. Malony, Robert Ansell-Bell
     {sameer,malony,bertie}@cs.uoregon.edu
     Computer & Information Science Department, Computational Science Institute
     University of Oregon

  2. Empirical Performance Technology
     • Evolution of parallel systems challenges empirical performance evaluation
       • Shared- and distributed-memory parallelism
       • Layered, hierarchical software environments
       • Multi-level performance semantics
     • Dual performance technology goals
       • Robust performance observation
       • Semantic-based performance mapping
     • Strategies for instrumentation and measurement
       • Flexibility
       • Portability

  3. Talk Outline
     • Performance Observation Requirements
     • Flexibility and Portability
     • Strategies for Empirical Performance Evaluation
     • TAU Performance System
       • Computation model for performance technology
       • TAU performance system toolkit
     • Performance Case Study
     • Conclusions

  4. TAU Performance System
     • Tuning and Analysis Utilities
     • Performance system framework
       • scalable parallel and distributed HPC
     • Targets a general complex system computation model
       • nodes / contexts / threads
       • Multi-level: system / software / parallelism
       • Measurement and analysis abstraction
     • Integrated performance toolkit
       • instrumentation, measurement, analysis, visualization
     • Portable facility based on open software approach
     • Robust and widely applied

  5. General Complex System Computation Model
     • Node: physically distinct shared memory machine
       • Message passing node interconnection network
     • Context: distinct virtual memory space within node
     • Thread: execution threads (user/system) in context
     [Figure: physical view (SMP nodes, each with node memory, joined by an interconnection network carrying inter-node message communication) beside the model view (node, contexts as VM spaces, threads)]
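The node / context / thread hierarchy on this slide can be pictured as a small nested data structure. The following is only an illustrative sketch: the class names, the fields, and the helpers `build_system` and `enumerate_threads` are assumptions made for this example and are not taken from TAU.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Thread:
    thread_id: int  # execution thread inside one context


@dataclass
class Context:
    context_id: int  # distinct virtual memory space within a node
    threads: List[Thread] = field(default_factory=list)


@dataclass
class Node:
    node_id: int  # physically distinct shared-memory machine
    contexts: List[Context] = field(default_factory=list)


def build_system(num_nodes: int, contexts_per_node: int,
                 threads_per_context: int) -> List[Node]:
    """Build a regular system; real systems need not be this uniform."""
    return [
        Node(n, [
            Context(c, [Thread(t) for t in range(threads_per_context)])
            for c in range(contexts_per_node)
        ])
        for n in range(num_nodes)
    ]


def enumerate_threads(nodes: List[Node]) -> List[Tuple[int, int, int]]:
    """Return one (node, context, thread) triple per thread in the system."""
    return [
        (n.node_id, c.context_id, t.thread_id)
        for n in nodes
        for c in n.contexts
        for t in c.threads
    ]
```

Keying every measurement by such a (node, context, thread) triple is one way to keep the same bookkeeping for shared-memory, message-passing, and mixed programs.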

  6. TAU Performance System Framework

  7. Empirical Performance Experimentation Space
     • Where in the program are performance measurements made
     • When is performance instrumentation done
     • How are performance measurements defined and how are instrumentation alternatives chosen

  8. Instrumentation Alternatives
     • Source-to-source translation using preprocessor level instrumentation
       • PDT
     • MPI wrapper library level instrumentation
       • VampirTrace
     • Binary instrumentation using runtime code patching
       • DyninstAPI
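Wrapper-library instrumentation relies on interposition: a wrapper takes over the public name of a routine, records a measurement, and forwards the call to the real routine under a shifted name (MPI's profiling interface does this with `PMPI_` entry points). The sketch below shows the idea in Python rather than MPI; `p_send`, `send`, and the two counters are hypothetical names invented for this example.

```python
import time
from collections import defaultdict

# Hypothetical stand-in for the real library routine. It plays the role
# that a name-shifted entry point such as PMPI_Send plays in MPI.
def p_send(payload: bytes, dest: int) -> int:
    return len(payload)


call_count = defaultdict(int)  # number of calls per wrapped routine
total_ns = defaultdict(int)    # accumulated wall-clock time per routine


def send(payload: bytes, dest: int) -> int:
    """Wrapper under the public name: measure, then forward the call."""
    start = time.perf_counter_ns()
    try:
        return p_send(payload, dest)
    finally:
        # Recorded even if the real routine raises.
        call_count["send"] += 1
        total_ns["send"] += time.perf_counter_ns() - start
```

The application keeps calling `send` unchanged, so this level of instrumentation needs neither source edits nor binary patching, but it only sees events at the library boundary.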

  9. Measurement Strategies
     • Statistical profiles of software actions
       • timing or counting (sampled or direct methods)
     • Statistical profiles of hardware actions
       • hardware performance data
     • Program event tracing
       • temporal dynamic behavior
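The difference between profiling and tracing is easiest to see when the same enter/exit events feed two different back ends. This is a minimal sketch using direct (not sampled) measurement; the `Profiler` and `Tracer` classes and the `run` driver are assumptions made for illustration and do not reflect TAU's measurement API.

```python
import time
from collections import defaultdict


class Profiler:
    """Direct profiling: keep only per-routine summaries."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.inclusive_ns = defaultdict(int)
        self._stack = []  # (name, start time) of routines still running

    def enter(self, name):
        self._stack.append((name, time.perf_counter_ns()))

    def exit(self, name):
        top, start = self._stack.pop()
        if top != name:
            raise ValueError(f"exit({name!r}) does not match enter({top!r})")
        self.calls[name] += 1
        self.inclusive_ns[name] += time.perf_counter_ns() - start


class Tracer:
    """Event tracing: keep every timestamped event in order."""

    def __init__(self):
        self.events = []  # (timestamp_ns, kind, name)

    def enter(self, name):
        self.events.append((time.perf_counter_ns(), "enter", name))

    def exit(self, name):
        self.events.append((time.perf_counter_ns(), "exit", name))


def run(measure):
    """Hypothetical instrumented program: main calls compute three times."""
    measure.enter("main")
    for _ in range(3):
        measure.enter("compute")
        sum(range(1000))
        measure.exit("compute")
    measure.exit("main")
```

The profile stays the same size no matter how long the program runs, while the trace grows with every event but preserves the ordering needed to study temporal behavior.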

  10. Measurement Alternatives
     • Wallclock time
       • gettimeofday (default)
       • low-overhead nanosecond timers [PAPI]
     • CPU time (user+sys)
     • Process virtual time (user)
     • Hardware performance counters [PCL, PAPI]
       • floating point instructions
       • primary and secondary data and instruction cache misses ...
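The time-based alternatives on this slide have rough counterparts in the Python standard library, which makes it easy to compare them on the same piece of work. This is a sketch of the distinction only, not of how TAU reads its clocks; the dictionary keys and the helpers `sample_clocks` and `measure` are names chosen for this example. Hardware counters are left out because they need a library such as PAPI or PCL.

```python
import os
import time


def sample_clocks():
    """Read one value from each kind of clock."""
    cpu = os.times()
    return {
        # Wall-clock time of day, comparable to gettimeofday.
        "wallclock_s": time.time(),
        # High-resolution monotonic timer in nanoseconds.
        "hires_ns": time.perf_counter_ns(),
        # CPU time of this process, user plus system.
        "cpu_user_plus_sys_s": time.process_time(),
        # User-mode CPU time only (process virtual time).
        "virtual_user_s": cpu.user,
    }


def measure(work):
    """Run work() and return the change in every clock."""
    before = sample_clocks()
    work()
    after = sample_clocks()
    return {key: after[key] - before[key] for key in before}
```

For example, `measure(lambda: time.sleep(0.1))` should show elapsed wall-clock time with almost no CPU time, while a compute loop should advance all four clocks.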

  11. Runtime Instrumentation and Measurement: DyninstAPI+TAU (Wallclock, PAPI wallclock, FP)

  12. Dynamic Instrumentation
     • TAU uses DyninstAPI for runtime code patching
       • tau_run (mutator) loads measurement library
       • Instruments mutatee
     • MPI issues:
       • one mutator per executable image [TAU, DynaProf]
       • one mutator for several executables [Paradyn, DPCL]
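The mutator/mutatee split can be loosely imitated in Python by rebinding a function name at run time, so that already-loaded code is instrumented without touching its source. This is an analogy only: DyninstAPI patches machine code in a process image, whereas the sketch below just swaps an attribute. The `instrument` helper and the `mutatee` object are hypothetical.

```python
import functools
import time
import types


def instrument(namespace, name, log):
    """Mutator: replace namespace.<name> with a timing probe.

    Returns a function that removes the probe again.
    """
    original = getattr(namespace, name)

    @functools.wraps(original)
    def probe(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            return original(*args, **kwargs)
        finally:
            log.append((name, time.perf_counter_ns() - start))

    setattr(namespace, name, probe)

    def restore():
        setattr(namespace, name, original)

    return restore


# Hypothetical mutatee: code that was written without any instrumentation.
mutatee = types.SimpleNamespace(step=lambda n: n * 2)
```

A typical use is `restore = instrument(mutatee, "step", log)`, followed by the normal calls to `mutatee.step(...)`, and finally `restore()` to remove the probe.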

  13. Measurement Alternatives: DyninstAPI+TAU (Event Tracing, CPUTIME profile)

  14. SIMPLE Hydrodynamics Benchmark

  15. Profiling using Multi-Level Instrumentation: PDT (source) and MPI (library)

  16. Event Tracing using DyninstAPI

  17. Tracing with Source and Library Instrumentation

  18. Profiling with Library Level Instrumentation

  19. Performance Perturbation
     • Measurement alternatives
       • PAPI wallclock overhead 27% lower than gettimeofday system call under IA-32 Linux 2.x
     • Source vs. runtime instrumentation
       • source 23% lower than runtime for TAU profiling
     • Need to balance alternatives
       • abstractions
       • instrumentation levels
       • flexibility / simplicity
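Statements of the form "X% lower than" compare the overhead of one alternative against a baseline. The slide does not say how its figures were obtained, so the following is only one plausible way to compute such a ratio; `per_call_cost_ns` and `relative_reduction` are hypothetical helpers.

```python
import time


def per_call_cost_ns(clock, calls=100_000):
    """Average cost of one clock() call, in nanoseconds.

    The result includes loop overhead, so treat it as a rough estimate.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        clock()
    return (time.perf_counter_ns() - start) / calls


def relative_reduction(baseline, alternative):
    """Fraction by which `alternative` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - alternative) / baseline
```

Calling `relative_reduction(per_call_cost_ns(time.time), per_call_cost_ns(time.perf_counter_ns))` compares two Python clocks in the same way; the value depends on the platform and can be negative when the alternative is slower.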

  20. Conclusions
     Flexibility and portability of performance technology can be improved by integration of instrumentation and measurement strategies. This helps create robust and ubiquitous performance technology for the analysis and tuning of parallel and distributed software and systems in the presence of (evolving) complexity.

  21. More Information and Acknowledgments
     • URLs
       • TAU: www.cs.uoregon.edu/research/paracomp/tau
     • Grant support (TAU)
       • DOE 2000 ACTS
         • http://www-unix.mcs.anl.gov/DOE2000
         • http://www.nersc.gov/ACTS
       • DOE ASCI Level 3 (LANL, LLNL)
       • DARPA
