
OpenMP


Presentation Transcript


  1. OpenMP Open Specifications for Multi Processing

  2. OpenMP is an API used for multi-threaded, shared memory parallelism • Compiler Directives • Runtime Library Routines • Environment Variables • Portable • Standardized • Available on PSI and CITRIS

  3. OpenMP allows for a higher level of abstraction • Easier to finesse a serial code into a parallel version via OpenMP • OpenMP pragmas ignored in serial compilation • Scoping of thread-safe data is simplified OpenMP vs PThreads

  4. Start out executing the program with one master thread • Master thread forks worker threads • Worker threads die or suspend at the end of parallel code Fork/Join Parallelism (Image courtesy of http://www.llnl.gov/computing/tutorials/openMP/)

  5. for (i = 0; i < max; i++) zero[i] = 0; • For loop must have a canonical shape for OpenMP to parallelize it • Necessary for the run-time system to determine the loop iterations • No premature exits from the loop allowed • i.e. no break, return, exit, or goto statements Simple Parallelization

  6. #pragma omp parallel for for (i = 0; i < max; i++) zero[i] = 0; • Pragmas are hints that help the compiler optimize • Master thread creates additional threads, each with a separate execution context • All variables declared outside the parallel for pragma are shared by default, except for the loop index parallel for pragma

  7. How many threads will OpenMP create? • Defined by OMP_NUM_THREADS environment variable • Set this variable to the maximum number of threads you want OpenMP to use Thread Creation

  8. for (i = 0; i < height; i++) for (j = 0; j < width; j++) c[i][j] = 2; • Want to parallelize outer loop as well as inner • What’s the problem with placing a parallel for pragma above the outer loop? Private Variables

  9. Need to declare j a private variable • Use a private clause to create a private copy of j for the inner loop #pragma omp parallel for private(j) for (i = 0; i < height; i++) for (j = 0; j < width; j++) c[i][j] = 2; • Value of j is undefined at the start and exit of the loop • What if we need to initialize a private variable? private Clause

  10. firstprivate: private variables with initial values copied from the master thread’s copy • lastprivate: value from the last sequential iteration of the loop is copied into the master thread’s copy of the variable private Variants

  11. Can help OpenMP decide how to handle parallelism schedule(type [, chunk]) • Types • Static – iterations divided into chunks of size chunk, if specified, and statically assigned to threads • Dynamic – iterations divided into chunks of size chunk, if specified, and dynamically scheduled among threads schedule clause

  12. Can indicate an entire block of code to execute in parallel • Use #pragma omp parallel before a single line or a block of code enclosed by curly braces • All threads, including the master thread, will execute everything in the block • Reduces overhead by forking once for a set of parallel chunks General Parallelism

  13. When updating a shared variable, may need to do so atomically int i = 0; #pragma omp parallel { ... i++; ... } • A thread might be swapped out after reading the value of i but before storing the incremented value • Use the atomic directive to ensure that a thread cannot be swapped out before completing the update #pragma omp atomic i++; atomic directive

  14. Useful when entering sections of code that are not thread-safe (i.e. I/O) • Place a single directive around the block of code that only one thread should perform • Other threads wait at the end of the single block for the executing thread to finish single directive
