
Paradyn



Presentation Transcript


  1. Paradyn

  2. Paradyn Goals • Performance measurement tool that • scales to long-running programs on large parallel and distributed systems • automates much of the search for performance bottlenecks • avoids space and time overhead of trace-based tools

  3. Paradyn Approach • Dynamically instrument application • Automatically control instrumentation in search of performance problems • Look for high level problems (e.g., too much synchronization blocking, I/O blocking, or memory delays) using small amount of instrumentation • Once general problem is found, selectively insert more instrumentation to find specific causes

  4. Paradyn Components • Front end and user interface that allow user to • display performance visualization • use the Performance Consultant to find bottlenecks • start and stop the application • monitor status of the application • Paradyn daemons • monitor and instrument application processes • pvmd, mpid, winntd

  5. Using Paradyn • Program preparation: • Future releases will be able to instrument unmodified binary files • Current release 2.0 requires linking applications with Paradyn instrumentation libraries • Static linking is required on IBM AIX platforms • Application must be compiled with -g flag

  6. Paradyn Run-time Analysis • Paradyn is designed to either start up application processes and kill them upon exit, or to attach to and detach from running (or stopped) processes. • Attaching to a running process is currently implemented on Solaris. • Paradyn currently does not detach but only kills upon exit.

  7. Metric-Focus Pairs • Metric-focus grid based on two vectors • list of performance metrics (e.g., CPU time, blocking time, message rates, I/O rates) • list of program components (e.g., procedures, processes, message channels, barrier instances) • Cross product forms matrix from which user selects metric-focus pairs • Elements of matrix can be single-valued (e.g., current value, average, min, max) or time-histograms • Time-histogram is a fixed size data structure that records behavior of a metric over time
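The fixed-size time-histogram can be illustrated with a minimal C++ sketch. The slide says only that the structure has a fixed size. The folding policy below (merge adjacent bins and double the bin width when time runs past the last bin) and the class name `TimeHistogram` are assumptions made for illustration, not Paradyn's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size time histogram: the number of bins never changes.
// When a sample falls past the last bin, adjacent bins are merged pairwise
// and the bin width doubles, so the same storage covers twice the time.
class TimeHistogram {
public:
    TimeHistogram(std::size_t numBins, double binWidth)
        : bins_(numBins, 0.0), binWidth_(binWidth) {
        assert(numBins >= 2 && numBins % 2 == 0 && binWidth > 0.0);
    }

    // Add `value` to the bin that covers time `t` (seconds since start).
    void addSample(double t, double value) {
        assert(t >= 0.0);
        while (t >= binWidth_ * static_cast<double>(bins_.size())) fold();
        std::size_t idx = static_cast<std::size_t>(t / binWidth_);
        bins_[std::min(idx, bins_.size() - 1)] += value;
    }

    double binWidth() const { return binWidth_; }
    std::size_t size() const { return bins_.size(); }
    double bin(std::size_t i) const { return bins_.at(i); }

private:
    // Merge bins (0,1), (2,3), ... into the lower half and clear the rest.
    void fold() {
        const std::size_t half = bins_.size() / 2;
        for (std::size_t i = 0; i < half; ++i)
            bins_[i] = bins_[2 * i] + bins_[2 * i + 1];
        for (std::size_t i = half; i < bins_.size(); ++i) bins_[i] = 0.0;
        binWidth_ *= 2.0;
    }

    std::vector<double> bins_;
    double binWidth_;
};
```

Under this policy, memory use stays constant for arbitrarily long runs, and older data is kept at coarser resolution rather than discarded.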

  8. “Where” Axis • After loading program, Paradyn adds entries for program resources to Where Axis window • files • procedures • processes • machines

  9. Multiple foci selection on Where Axis

  10. Performance Visualizations • Before or while running a program, the user can define performance visualizations in terms of metric-focus pairs • select focus from Where Axis • select metrics from Metrics Dialog Box • select visualization from Start Visualization Menu

  11. Metrics Dialog Box

  12. Start Visualization Menu

  13. Paradyn Phases • Contiguous time intervals within an application’s execution • Two kinds • global phase starts at beginning of program execution and extends to current time • local phases are non-overlapping subintervals of the global phase

  14. Paradyn Phases (cont.) • Data collection for new phase occurs at finer granularity than for global phase. • Visualizations can show data for either local phase or global phase. • Performance Consultant can simultaneously search both local phase and global phase.
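One way to picture the relationship between the global phase and local phases is the C++ sketch below. The names `Phase` and `PhaseTable` are hypothetical. The rule that starting a new local phase closes the previous one is an assumption, chosen to keep local phases non-overlapping.

```cpp
#include <cassert>
#include <vector>

// A time interval; end < 0 marks a phase that is still open.
struct Phase {
    double start;
    double end;
};

// Hypothetical bookkeeping for one global phase plus a list of local phases.
class PhaseTable {
public:
    // Start a new local phase at time t. Assumption: this closes the
    // currently open local phase, so local phases never overlap.
    void startLocalPhase(double t) {
        assert(t >= lastStart_);
        if (!local_.empty() && local_.back().end < 0.0) local_.back().end = t;
        local_.push_back({t, -1.0});
        lastStart_ = t;
    }

    // The global phase always runs from program start (time 0) to `now`.
    Phase global(double now) const { return {0.0, now}; }

    const std::vector<Phase>& local() const { return local_; }

private:
    std::vector<Phase> local_;
    double lastStart_ = 0.0;
};
```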

  15. Performance Consultant • Based on W3 Search Model • “Why” - type of performance problems • “Where” - where in the program these problems occur • “When” - time during execution during which problems occur

  16. Performance Consultant (cont.) • Automatically locates potential bottlenecks in your application • Contains definitions of a set of performance problems in terms of hypotheses - e.g., PerfMetricX > Specified Threshold • Continually selects and refines which performance metrics are enabled and for which foci • Reports bottlenecks that exist for significant portion of phase being measured

  17. Why Axis TopLevelHypothesis ExcessiveSyncWaitingTime CPUBound ExcessiveIOBlockingTime TooManySmallIOOps

  18. Why Axis (cont.) • CPUBound: Compares CPU time to the tunable constant PC_CPUThreshold • ExcessiveSyncWaitingTime: Compares total synchronization waiting time to the tunable constant PC_SyncThreshold • ExcessiveIOBlockingTime: Compares total I/O waiting time to the tunable constant PC_IOThreshold • TooManySmallIOOps: Compares average number of bytes per I/O operation to PC_IOThreshold
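The time-based hypotheses above all have the form "metric > threshold", which the following C++ sketch models. The `Hypothesis` struct is hypothetical, and normalizing the metric by elapsed time is an assumption. The slides do not give the units or the default values of the tunable constants.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a time-based Why Axis hypothesis: the measured time
// (CPU, synchronization wait, or I/O wait) as a fraction of elapsed time is
// compared with a tunable threshold.
struct Hypothesis {
    std::string name;
    double threshold;  // illustrative value standing in for e.g. PC_CPUThreshold

    // True when the metric's share of elapsed time exceeds the threshold.
    bool testsTrue(double metricTime, double elapsedTime) const {
        assert(elapsedTime > 0.0);
        return metricTime / elapsedTime > threshold;
    }
};
```

With an illustrative threshold of 0.3, `Hypothesis{"CPUBound", 0.3}.testsTrue(4.0, 10.0)` holds, while 2.0 seconds of CPU time in the same 10.0-second interval does not.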

  19. Search History Graph • DAG with (hypothesis : focus) pairs as nodes • Top node represents (TopLevelHypothesis : WholeProgram) • Child nodes represent possible refinements • Search is expanded anytime a (hypothesis : focus) pair tests true
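The expansion rule can be sketched in C++ as follows. This is a simplified illustration rather than Paradyn's implementation: it builds a tree instead of a general DAG, it runs tests immediately rather than over an observation interval, and the names `Node`, `Test`, `Refine`, and `expand` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class Status { Unknown, TestedTrue, TestedFalse };

// One (hypothesis : focus) node in the search history.
struct Node {
    std::string hypothesis;
    std::string focus;
    Status status = Status::Unknown;
    std::vector<std::unique_ptr<Node>> children;
};

using Refinements = std::vector<std::pair<std::string, std::string>>;
// Decides whether a (hypothesis, focus) pair tests true.
using Test = std::function<bool(const std::string&, const std::string&)>;
// Lists the candidate refinements (along either axis) of a node.
using Refine = std::function<Refinements(const Node&)>;

// Test a node; only if it tests true, create and test its refinements.
// Returns the number of nodes in this subtree that tested true.
int expand(Node& node, const Test& test, const Refine& refine) {
    assert(node.status == Status::Unknown);
    node.status = test(node.hypothesis, node.focus) ? Status::TestedTrue
                                                    : Status::TestedFalse;
    if (node.status == Status::TestedFalse) return 0;
    int trueCount = 1;
    for (const auto& next : refine(node)) {
        std::unique_ptr<Node> child(new Node());
        child->hypothesis = next.first;
        child->focus = next.second;
        trueCount += expand(*child, test, refine);
        node.children.push_back(std::move(child));
    }
    return trueCount;
}
```

Because false nodes are never refined, instrumentation is only added along branches where a problem has already been observed.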

  20. Search History Graph (cont.) • Node status given by color • green background indicates Unknown status • white foreground indicates active test • pink background indicates hypothesis tested false • blue background indicates hypothesis tested true • yellow line represents Why Axis refinement • purple line represents Where Axis refinement

  21. Search History Graph as search begins

  22. Refinement to CPUbound hypothesis

  23. Further refinement of CPUbound hypothesis

  24. Two searches in progress

  25. Final refinement

  26. Tunable Constants • PC_CPUThreshold: used for hypothesis CPUBound • PC_SyncThreshold: used for hypothesis ExcessiveSyncWaitingTime • PC_IOThreshold: used for hypothesis ExcessiveIOBlockingTime • MinObservationTime: all tests will be continued for at least this interval of time before any conclusions are drawn. • costLimit: determines an upper bound on the total amount of instrumentation that can be active at a given time.

  27. Visualization Modules (visi’s) • External processes that use VisiLib RPC interface to access performance data in real time • Visi’s provided with Paradyn • time-histogram • bar chart • table • 3-d terrain

  28. Time Histogram with Actions and View menus expanded

  29. Barchart Visualization

  30. Table Visualization

  31. 3-d Histogram Visualization

  32. Dyninst API • http://www.cs.umd.edu/~hollings/dyninstAPI • Machine-independent interface for runtime program instrumentation • Insertion and removal of instrumentation code into and from running processes • Process and OS independent specification of instrumentation code • C++ library interface • Can be used to build debuggers, performance measurement tools, simulators, and computation steering systems

  33. Dyninst API (cont.) • Currently supported platforms • SPARC SunOS and Solaris • x86 Solaris and NT • IBM AIX/SP • DEC Alpha • Planned for near future • SGI Origin 2000

  34. Dyninst Terminology • point - location in a program where instrumentation can be inserted • snippet - representation of a bit of executable code to be inserted into a program at a point • e.g., To record number of times a procedure is invoked: • point - first instruction in the procedure • snippet - statement to increment a counter
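The point/snippet relationship can be modeled with a toy C++ sketch. It does not use or imitate the Dyninst API. `ToyProcess` and its methods are hypothetical stand-ins, with a point reduced to a procedure name and a snippet reduced to a callable.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an instrumented process; NOT the Dyninst API.
class ToyProcess {
public:
    using Snippet = std::function<void()>;

    // Attach a snippet to a point (here: the entry of a named procedure).
    void insertSnippet(const std::string& point, Snippet snippet) {
        assert(static_cast<bool>(snippet));
        snippets_[point].push_back(std::move(snippet));
    }

    // Remove every snippet previously attached to a point.
    void removeSnippets(const std::string& point) { snippets_.erase(point); }

    // Simulate entering a procedure: run the snippets at its entry point.
    void call(const std::string& procedure) {
        auto it = snippets_.find(procedure);
        if (it == snippets_.end()) return;
        for (const auto& snippet : it->second) snippet();
    }

private:
    std::map<std::string, std::vector<Snippet>> snippets_;
};
```

Inserting a snippet that increments a counter at the entry of a procedure and then calling that procedure three times leaves the counter at 3. After `removeSnippets`, further calls no longer change it.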

  35. Dyninst Terminology (cont.) • thread - thread of execution, which may be a normal process or a lightweight thread • image - static representation of a program on disk • application - process being modified • mutator - program that uses the API to modify the application

  36. Using the dyninst API • Declare single object of class BPatch • Identify application process to be modified • appThread = bpatch.createProcess(pathname, argv); • appThread = bpatch.attachProcess(pathname, processId); • Define snippet and points where it should be inserted

  37. Dyninst Example
      BPatch_image *appImage;
      BPatch_Vector<BPatch_point*> *points;
      // Open the program image associated with the thread and return a handle to it.
      appImage = appThread->getImage();
      // Find and return the entry point to "InterestingProcedure".
      points = appImage->findProcedurePoint("InterestingProcedure", BPatch_entry);
      // Create a counter variable (but first get a handle to the correct type).
      BPatch_variableExpr *intCounter = appThread->malloc(*appImage->findType("int"));
      // Create a code block to increment the integer by one:
      // intCounter = intCounter + 1
      BPatch_arithExpr addone(BPatch_assign, *intCounter,
          BPatch_arithExpr(BPatch_plus, *intCounter, BPatch_constExpr(1)));
      // Insert the snippet of code into the application.
      appThread->insertBlock(addone, *points);

  38. DAIS • Dynamic Application Instrumentation System • Proposed by Douglas Pase at IBM • Platform-independent client-server library for building debugging and performance tools • Based on dyninst

  39. DAIS (cont.) • Support proposed for • code patches • periodic instrumentation • inferior remote procedure calls (IRPCs) • remote memory reads and writes • dynamic subroutine placement • process control for debugging • Planned demo tools • dynamic printf • trace capture for MPI
