
Two allocations of a 16 × 16 array to 16 processes: (a) 2-dimensional blocks; (b) rows.

Overlap regions: if values from one processor must be communicated to another, then those values are “duplicated” on each processor. It is as if that region exists on both processors; hence, overlap.

ciara-bowen

Presentation Transcript


  1. Two allocations of a 16 × 16 array to 16 processes: (a) 2-dimensional blocks; (b) rows.

  2. Overlap regions
     • If values from one processor must be communicated to another, then those values are “duplicated” on each processor.
     • It is as if that region exists on both processors.
     • Hence, overlap.

  3. Overlap regions (gray) show the non-local values; once the overlap regions are filled, the stencil computation is local.

  4. Cyclic and Block Allocations
     • Some algorithms will cause some processors to finish before others when using trivial data mappings.
     • For example, Gaussian elimination: after the first pass, the first column and row are done, and so on.
     • Row, column, or block assignment will leave some processors idle while just a few are working at the end of the process.

  5. (a) LU decomposition algorithm; (b) 16 processes arranged in a grid; (c) the allocation of the array elements to processes.

  6. Illustration of a cyclic distribution of an 8 × 8 array onto five processes.

  7. Block-cyclic allocation of 3 × 2 blocks to a 14 × 14 array distributed to four processes (colors).

  8. The block-cyclic allocation midway through the computation; the blocks to the right summarize the active values for each process.

  9. Example of an unstructured grid representing the pressure distribution on two airfoils. Image from http://fun3d.larc.nasa.gov/example-24.html.

  10. Cap allocation for a binary tree on P = 8 processes. Each process is allocated one of the leaf subtrees, along with a copy of the cap (shaded).

  11. Logical tree representations: (a) a binary tree where P = 8; (b) a binary tree where P = 6. Useful solution when the tree is known at the beginning of the computation.

  12. Enumerating the Tic-Tac-Toe game tree; a process is assigned to search the games beginning with each of the four initial move sequences. Symmetric positions are redundant.
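
The overlap regions of slides 2 and 3 can be illustrated with a small simulation. This is a minimal sketch, not code from the presentation: it assumes a 1-dimensional array, a three-point averaging stencil, and a fixed boundary value of 0.0, and it simulates the processes with plain lists instead of real message passing. The function names are hypothetical.

```python
def split_with_overlap(values, nprocs):
    """Give each simulated process its block plus one overlap cell per side."""
    size = len(values) // nprocs  # assumes nprocs divides len(values) evenly
    blocks = []
    for p in range(nprocs):
        owned = values[p * size:(p + 1) * size]
        # Overlap cells start empty (None) until they are filled.
        blocks.append([None] + owned + [None])
    return blocks


def fill_overlap(blocks, boundary=0.0):
    """Copy neighbouring edge values into each block's overlap cells.

    In a real message-passing program this copy would be a send/receive.
    """
    for p, block in enumerate(blocks):
        block[0] = blocks[p - 1][-2] if p > 0 else boundary
        block[-1] = blocks[p + 1][1] if p < len(blocks) - 1 else boundary


def local_stencil(block):
    """Three-point average that reads only values stored in this block."""
    return [(block[i - 1] + block[i] + block[i + 1]) / 3.0
            for i in range(1, len(block) - 1)]


def serial_stencil(values, boundary=0.0):
    """Reference result computed on the undistributed array."""
    padded = [boundary] + values + [boundary]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


data = [float(i * i) for i in range(16)]
blocks = split_with_overlap(data, 4)
fill_overlap(blocks)
parallel = [x for block in blocks for x in local_stencil(block)]
print(parallel == serial_stencil(data))  # True
```

Once `fill_overlap` has run, `local_stencil` never reads outside its own block, which is the sense in which the stencil computation becomes local.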
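
The load imbalance described on slide 4 can be made concrete by counting how many rows each process still has to update late in Gaussian elimination. The sketch below is an illustration under stated assumptions: rows (not 2-dimensional blocks) are distributed, the matrix is 16 × 16 on 4 processes, and step k (0-based) leaves only rows k+1 through n-1 active. The function names are hypothetical.

```python
def block_owner(row, n, nprocs):
    """Contiguous blocks of rows: the first n/P rows go to process 0, and so on."""
    return row // (n // nprocs)  # assumes nprocs divides n evenly


def cyclic_owner(row, n, nprocs):
    """Round-robin rows: row r goes to process r mod P."""
    return row % nprocs


def active_rows_per_process(owner, n, nprocs, step):
    """Count the rows each process still updates at elimination step `step`.

    At step k (0-based) only rows k+1 .. n-1 are still being modified.
    """
    counts = [0] * nprocs
    for row in range(step + 1, n):
        counts[owner(row, n, nprocs)] += 1
    return counts


n, nprocs, step = 16, 4, 11
print(active_rows_per_process(block_owner, n, nprocs, step))   # [0, 0, 0, 4]
print(active_rows_per_process(cyclic_owner, n, nprocs, step))  # [1, 1, 1, 1]
```

With the block mapping, three of the four processes are idle by step 11, while the cyclic mapping keeps one active row on every process.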
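
The block-cyclic allocation of slides 7 and 8 can be sketched as an ownership function. The slides give the block size (3 × 2), the array size (14 × 14), and the number of processes (four); the 2 × 2 arrangement of those processes and the row-major process numbering used here are assumptions, as are the function names. The "midway" region is modeled, also as an assumption, as the trailing submatrix starting at row and column 7.

```python
def block_cyclic_owner(i, j, block_rows, block_cols, grid_rows, grid_cols):
    """Return the process that owns element (i, j).

    The array is tiled with block_rows x block_cols blocks, and the blocks
    are dealt out cyclically over a grid_rows x grid_cols process grid.
    """
    block_i = i // block_rows
    block_j = j // block_cols
    return (block_i % grid_rows) * grid_cols + (block_j % grid_cols)


def elements_per_process(n, start=0):
    """Count owned elements in the trailing submatrix beginning at `start`."""
    counts = [0] * 4
    for i in range(start, n):
        for j in range(start, n):
            counts[block_cyclic_owner(i, j, 3, 2, 2, 2)] += 1
    return counts


print(elements_per_process(14))           # [64, 48, 48, 36]
print(elements_per_process(14, start=7))  # [16, 12, 12, 9]
```

Under these assumptions every process still owns part of the active region halfway through, which is the property slide 8 illustrates. Setting both block dimensions to 1 reduces the same function to an element-wise cyclic mapping over the process grid.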
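
The cap allocation of slide 10 can also be written as an ownership rule. This sketch assumes a heap-style node numbering that the slides do not specify (root is node 1, and the children of node v are 2v and 2v+1) and a power-of-two process count. Under that numbering, nodes 1 through P-1 form the cap, and nodes P through 2P-1 are the roots of the leaf subtrees. The function name is hypothetical.

```python
def cap_owner(node, nprocs):
    """Return the process owning a heap-indexed tree node, or None for the cap.

    Cap nodes are replicated, so they have no single owner.
    Assumes nprocs is a power of two and node >= 1.
    """
    if node < nprocs:
        return None  # cap node: every process keeps its own copy
    while node >= 2 * nprocs:
        node //= 2  # climb to the parent until a subtree root is reached
    return node - nprocs


for node in (1, 7, 8, 15, 23, 31):
    print(node, cap_owner(node, 8))
```

For P = 8, nodes 1 to 7 are in the cap, node 8 is the root of process 0's subtree, and a deeper node such as 23 climbs to node 11 and therefore belongs to process 3.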
