A Cell-by-Cell AMR Method for the PPM Hydrodynamics Code • Dennis C. Dinge • dennis@lcse.umn.edu • University of Minnesota • http://www.lcse.umn.edu/~dennis/
Overview • 1) The AMR scheme: what we do that’s different from most. • Where we refine. • How the boundaries of refined regions are handled. • The ordering of sweeps at different refinement levels. • 2) Parallelization Method Overview: • How the problem is broken up and parallelized. • Storage on all scales. • 3) Some Results and Concluding Remarks
Where we refine. • 1) Walls, shocks, and contact discontinuities. • Refined regions are fronts of dimension one less than the problem dimension. We exploit AMR’s ability to capture and follow these fronts. • Only two levels of refinement are currently used. • 2) We don’t use AMR to refine entire regions of the problem for which standard techniques of non-uniform grids and simple grid motion are adequate. • 3) The decision to refine is made on a cell-by-cell basis. • 4) Cells that were marked for refinement by a previous transverse sweep are refined in the current sweep whether or not the current sweep’s own criterion would refine them. • A front must go undetected in all directional sweeps before it is dropped from refinement (sketched below).
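As a concrete illustration of points 3 and 4, here is a minimal sketch in C of how a per-cell refinement decision can persist across directional sweeps. The names (refine_flag_x, refine_flag_y, detect_front_x) and the density-jump criterion are hypothetical placeholders, not the actual PPM/AMR code.

```c
#include <stdbool.h>

enum { NX = 256, NY = 256 };

static double rho[NX][NY];           /* cell densities (placeholder data)       */
static bool refine_flag_x[NX][NY];   /* detections recorded by the last X sweep */
static bool refine_flag_y[NX][NY];   /* detections recorded by the last Y sweep */

/* Hypothetical front detector: a relative density jump above 10 percent.
 * The real criterion (walls, shocks, contact discontinuities) is richer.  */
static bool detect_front_x(int i, int j)
{
    if (i + 1 >= NX) return false;
    double jump = rho[i + 1][j] - rho[i][j];
    double mean = 0.5 * (rho[i + 1][j] + rho[i][j]);
    if (jump < 0.0) jump = -jump;
    return mean > 0.0 && jump > 0.1 * mean;
}

/* During an X sweep: refine this cell if the sweep detects a front here,
 * or if the previous transverse (Y) sweep had already marked it.          */
static bool refine_in_x_sweep(int i, int j)
{
    bool detected_now = detect_front_x(i, j);
    refine_flag_x[i][j] = detected_now;          /* remember for later sweeps */
    return detected_now || refine_flag_y[i][j];
}

/* A front is dropped from refinement only when no directional sweep sees it. */
static bool drop_from_refinement(int i, int j)
{
    return !refine_flag_x[i][j] && !refine_flag_y[i][j];
}
```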
How the boundaries of refined regions are handled. • 1) Since the refined regions in this scheme are so thin, it is highly desirable to reduce the number of ghost zones at the ends of the refined rows. • 2) The number of ghost zones is reduced by retaining information from the coarser calculation. • The PPM parabolic coefficients for pressure, velocity, and density are retained from the coarser sweep. These are used to construct their finer-grid counterparts, along with left and right edge values for density, velocity, and pressure in the grid end zones. • 3) The fluxes for mass, momentum, and energy are also retained and used to recalculate values for the coarse cells at the ends of refined regions (sketched below). • This recalculation is necessary because the fluxes entering the end cells from the refined region will, in general, differ from their coarser counterparts.
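The recalculation in point 3 amounts to redoing the conservative update of a coarse end cell with the refined-region flux substituted at the shared face. The sketch below assumes a plain 1D conservative update and illustrative names (Flux, State, recompute_end_cell); how the actual code accumulates the fine fluxes and retained parabolic coefficients is more involved than shown here.

```c
typedef struct { double mass, mom, ener; } Flux;   /* fluxes at a cell face   */
typedef struct { double rho, rhou, E;   } State;   /* conserved cell values   */

/* Redo the conservative update of a coarse end cell over the coarse time
 * step dt.  Here the refined region is assumed to lie on the cell's left:
 * f_fine is the flux delivered by the refined region at the shared face
 * (assumed already accumulated over the fine substeps), and f_outer is the
 * retained coarse flux at the cell's outer (right) face.  The mirror case
 * for the other end of the refined region is analogous.                    */
static State recompute_end_cell(State u_old, Flux f_fine, Flux f_outer,
                                double dt, double dx)
{
    State u = u_old;
    u.rho  -= dt / dx * (f_outer.mass - f_fine.mass);
    u.rhou -= dt / dx * (f_outer.mom  - f_fine.mom);
    u.E    -= dt / dx * (f_outer.ener - f_fine.ener);
    return u;
}
```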
The ordering of sweeps on different levels. 1) For a particular row, a coarse sweep is first done over all cells. 2) If coarse cells that must be refined are detected during the sweep, each is divided evenly into 4 parts and the resulting subcells are grouped into rows of fine-grid cells. [Diagram: a refined fine-grid row bracketed by coarse end cells.] 3) Sweeps are then done over the fine-grid cells.
[Diagram: finer-grid rows bracketed by end cells at both refinement levels.] 4) If any of the fine-grid sweeps detects the need to refine again, upper and lower finer-grid refinement and sweeps are done in like manner. 5) At the end of each sweep, the end values, be they coarse or fine, are updated to account for the new fluxes. 6) Interior values at the first level of refinement are updated in light of the new information from the second level of refinement, and these new values are in turn used to update the coarse grid.
[Diagrams: the upper and lower fine sweeps and the upper and lower finer sweeps, each bracketed by their end cells; a final diagram shows the coarse sweep, the upper and lower fine sweeps, and the upper and lower finer sweeps together.]
Doing the sweeps in this way, rather than doing the entire coarse sweep followed by an entire fine sweep and an entire finer sweep, should make better use of the cache. • But it does mean that the amount of work may differ greatly from row to row. • As explained below, this does not present a problem for the parallelization scheme we employ.
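To make the ordering concrete, here is a minimal sketch of the per-row recursion just described: one coarse sweep, then the fine and finer sweeps it spawns, then the interior and end-cell updates, all before the next row is touched. Every routine name below is a hypothetical placeholder (declared but left undefined), not the actual code's interface; the two-level limit matches the earlier slide.

```c
enum { MAX_LEVEL = 2 };   /* only two levels of refinement are currently used */

/* Hypothetical helpers standing in for the real PPM/AMR routines.            */
void sweep_row(int row, int level);                  /* 1D PPM sweep           */
int  row_needs_refinement(int row, int level);       /* any cells flagged?     */
void build_fine_rows(int row, int level,
                     int *upper, int *lower);        /* split flagged cells 2x2 */
void update_from_finer(int row, int level);          /* interior values updated
                                                        from the finer level   */
void update_end_cells(int row, int level);           /* redo end-cell update
                                                        with the new fluxes    */

/* Process one row and, recursively, any finer rows it spawns, before the
 * caller moves on to the next row -- the cache-friendly ordering above.      */
void process_row(int row, int level)
{
    sweep_row(row, level);

    if (level < MAX_LEVEL && row_needs_refinement(row, level)) {
        int upper, lower;
        build_fine_rows(row, level, &upper, &lower);
        process_row(upper, level + 1);
        process_row(lower, level + 1);
        update_from_finer(row, level);
    }

    update_end_cells(row, level);
}
```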
1) On the largest scale the problem is broken into patches, called tiles in 2D and bricks in 3D. • Globally, the tiles are stored as large 1D arrays with interior, edge, and corner values stored contiguously. • At the end of a sweep, the edge or corner data for one tile is updated with the proper interior data from another tile, or with the appropriate boundary condition. • Tiles may be solved in any order, with semaphores assuring that the tiles a given tile depends on are complete before it begins. • Parallelization is done with OpenMP. • 2) Each tile is solved using a standard X Y Y X sequence of sweeps (sketched below). • The Y sweeps are broken in half, so the sequence is really X, Y lower, Y upper, Y lower, Y upper, X. • This allows the code to proceed without waiting on whole X or Y sweeps.
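The sweep sequence in point 2, written out as a sketch. Tile, x_sweep, y_sweep_lower, and y_sweep_upper are placeholder names; the only point being made is the ordering, with each Y sweep split into a lower and an upper half.

```c
typedef struct Tile Tile;          /* opaque placeholder for one tile's data */

void x_sweep(Tile *t);             /* full X pass over the tile              */
void y_sweep_lower(Tile *t);       /* Y pass over the lower half of the tile */
void y_sweep_upper(Tile *t);       /* Y pass over the upper half of the tile */

/* One X Y Y X double step with the Y sweeps broken in half, so that work
 * depending on the lower half can start before the upper half is finished. */
void advance_tile(Tile *t)
{
    x_sweep(t);
    y_sweep_lower(t);
    y_sweep_upper(t);
    y_sweep_lower(t);
    y_sweep_upper(t);
    x_sweep(t);
}
```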
3) For each sweep, the half tile is broken into strips. • The size of the strips, in rows, is an adjustable parameter. • The strips may be solved in any order, with spin waits assuring that the necessary strips from previous sweeps are complete before a particular strip is allowed to proceed (sketched below). • 4) There are many more strips than CPUs. • The work within strips may differ greatly, because the refinement of rows within the strips may differ greatly. But as long as the number of strips is large compared to the number of CPUs, the overall process will be load balanced.
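A hedged sketch of the strip-level scheduling: many more strips than threads, dynamic scheduling for load balance, and a spin wait on completion flags set by the previous sweep. The names, the neighbor-strip dependency, and the flag layout are assumptions; production code would also need more careful memory-ordering treatment of the flags than shown here.

```c
#include <stdbool.h>
#include <omp.h>

enum { NSTRIPS = 64, NSWEEPS = 6 };  /* e.g. X, Y-lower, Y-upper, Y-lower, Y-upper, X */

static int strip_done[NSWEEPS][NSTRIPS];     /* completion flags per sweep      */

static void solve_strip(int s, int sweep)    /* stand-in for the real work      */
{
    (void)s; (void)sweep;                    /* row-by-row AMR sweeps go here   */
}

/* Assumed dependency: a strip needs itself and its two neighbors from the
 * previous sweep, since they supply its ghost rows.                            */
static bool strip_ready(int s, int prev_sweep)
{
    if (prev_sweep < 0) return true;         /* the first sweep has no deps     */
    int lo = (s > 0) ? s - 1 : s;
    int hi = (s < NSTRIPS - 1) ? s + 1 : s;
    for (int p = lo; p <= hi; ++p) {
        int f;
        #pragma omp atomic read
        f = strip_done[prev_sweep][p];
        if (!f) return false;
    }
    return true;
}

void sweep_strips(int sweep)
{
    /* Many more strips than CPUs plus dynamic scheduling keeps the threads
     * busy even though refinement makes the work per strip very uneven.        */
    #pragma omp parallel for schedule(dynamic, 1)
    for (int s = 0; s < NSTRIPS; ++s) {
        while (!strip_ready(s, sweep - 1)) {
            /* spin wait: the previous sweep may still be running elsewhere     */
        }
        solve_strip(s, sweep);
        #pragma omp atomic write
        strip_done[sweep][s] = 1;
    }
}
```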
Storage on all scales • 1) Coarse values are stored locally in an NVAR by I by J array, where NVAR is the number of variables per cell and I and J are the numbers of cells in X and Y, respectively, for the tile. • 2) Refined information is stored in two arrays: a 3 by I by J integer array points into a compressed 1D array holding the refined data for a cell. • A value of zero in the pointer array means the cell has no fine-grid structure; this is the case for most cells. • Three values are used to keep track of refinement detections in the last X pass, the last Y pass, and the last pass, be it X or Y. • The pointer array points to the location of the first level of refinement. That location holds pointers to the values of a second level of refinement, if one exists, within the same compressed array (sketched below). • 3) Information for the coarse grid and the pointer array is passed between tiles in chunks of standard size. The amount of compressed-array data passed depends on the amount of refinement in the “ghost zones” of adjacent tiles. • The compressed-array information passed to a tile is appended to the end of that tile’s fine-grid information for its interior.
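A minimal sketch of this storage layout. The array shapes follow the slide (an NVAR by I by J coarse array, a 3 by I by J integer array, and one compressed 1D array), but the concrete sizes, which plane of the integer array carries the pointer, and the record layout inside the compressed array are assumptions made for illustration only.

```c
#include <stddef.h>

enum { NVAR = 5, NI = 128, NJ = 128, MAXFINE = 1 << 20 };

/* Coarse-grid state for the tile: NVAR variables per (i, j) cell.            */
static double coarse[NVAR][NI][NJ];

/* Refinement bookkeeping, three integers per coarse cell, recording the
 * detections from the last X pass, the last Y pass, and the last pass of
 * either kind.  In this sketch a nonzero entry in the third plane doubles
 * as an index into `compressed`; 0 means no fine-grid structure (most
 * cells).  How the real code splits flags and pointers is an assumption.     */
static int fine_ptr[3][NI][NJ];

/* Packed fine-grid data for the whole tile; compressed-array data received
 * from neighboring tiles' ghost zones is appended at the end.                */
static double compressed[MAXFINE];
static size_t compressed_used;

/* Return the level-1 refined record of cell (i, j), or NULL if the cell is
 * unrefined.  By assumption, that record itself carries indices of any
 * level-2 records stored elsewhere in the same compressed array.             */
static double *level1_data(int i, int j)
{
    int k = fine_ptr[2][i][j];
    return (k == 0) ? NULL : &compressed[k];
}

/* Append ghost-zone fine-grid data received from a neighboring tile and
 * return the index where the appended block begins.                          */
static size_t append_ghost_data(const double *buf, size_t n)
{
    size_t start = compressed_used;
    for (size_t m = 0; m < n; ++m)
        compressed[start + m] = buf[m];      /* bounds checking omitted        */
    compressed_used += n;
    return start;
}
```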
Some Results AMR run showing the 2D shock tube problem at ten, thirty, and fifty thousand cycles. Top panels show density. Bottom panels show where the zones are being refined by the AMR. Red indicates one level of refinement. White indicates two levels of refinement.
Some Results Comparison of the AMR run (top) with a high-resolution run at ten, thirty, and fifty thousand cycles. The resolution of the high-resolution run was 16 times that of the coarse grid the AMR started with, equivalent to the AMR refining twice everywhere.
Some Results Comparison of the AMR run (top) with a low-resolution run at ten, thirty, and fifty thousand cycles. The resolution of the low-resolution run was the same as the coarse grid the AMR started with, equivalent to the AMR refining nowhere.
Concluding Remarks 1) A working 2D parallel version is being tested. 2) A 3D parallel version is in preparation.