
Integration of Reverse Monte-Carlo Ray Tracing within Uintah



Presentation Transcript


  1. Integration of Reverse Monte-Carlo Ray Tracing within Uintah Todd Harman, Department of Mechanical Engineering; Jeremy Thornock, Department of Chemical Engineering; Isaac Hunsaker, Graduate Student, Department of Chemical Engineering

  2. Deliverables • Year 2: Demonstration of a fully-coupled problem using RMCRT within ARCHES. Scalability demonstration.

  3. Approach • CFD: always on the finest level • RMCRT, 1 level: same level as the CFD • RMCRT, 2 levels: coarsest level • RMCRT, “Data Onion”: finest level • Research topic: Region of Interest (ROI)

  4. 2 Levels

  5. Data Onion: 3 Levels

  6. Data Onion: ROI implemented. Research topic: ROI location. Static: • User-defined region?

  7. Data Onion: ROI implemented. Research topic: ROI location. Dynamic: • ROI computed every timestep (from abskg and sigmaT4)? • ROI proportional to the size of the fine-level patches?
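The dynamic-ROI idea on the slide above can be sketched as a bounding box around the cells that matter most radiatively. This is a hypothetical illustration, not Uintah's implementation: the threshold rule on the product of the absorption coefficient (abskg) and the emission term (sigmaT4) is an assumption.

```python
import numpy as np

def compute_roi(abskg, sigmaT4, threshold):
    """Return (lo, hi) cell-index corners of the axis-aligned bounding box
    enclosing all cells with abskg * sigmaT4 > threshold.

    Hypothetical sketch of a per-timestep dynamic ROI; the threshold
    criterion is an assumption, not Uintah's actual rule."""
    mask = abskg * sigmaT4 > threshold
    if not mask.any():
        return None  # no cell qualifies; caller falls back to a static ROI
    idx = np.argwhere(mask)
    lo = idx.min(axis=0)          # inclusive lower corner
    hi = idx.max(axis=0) + 1      # exclusive upper corner
    return lo, hi
```

Recomputing this box every timestep lets the fine level track the hot, optically thick region as it moves.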

  8. Status: Completed • 80% complete: Data Onion, dynamic & static regions of interest. Testing phase; benchmarks needed. • 90% complete: Integration of RMCRT tasks within ARCHES (2 levels)

  9. Status: Work in Progress • Single-level verification: order of accuracy in # rays (old) and grid resolution; scalability studies with the new mixed scheduler. • 2-level verification: errors associated with coarsening
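The half-order convergence in ray count being verified above is the standard Monte-Carlo behaviour: the error of an N-sample estimate shrinks like 1/sqrt(N). A minimal, self-contained illustration (estimating the mean of U[0,1], exact value 0.5, rather than a radiative quantity):

```python
import random

def mc_error(n_samples, seed=0):
    """Absolute error of an n_samples Monte-Carlo estimate of the mean of
    U[0,1]; a stand-in for the per-cell flux estimate, used only to show
    the 1/sqrt(N) error decay."""
    rng = random.Random(seed)
    estimate = sum(rng.random() for _ in range(n_samples)) / n_samples
    return abs(estimate - 0.5)
```

On a log-log plot of error versus N, the slope is about -1/2, which is the "# rays" order of accuracy the verification study measures.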

  10. Benchmark Problem Initial conditions: uniform temperature field; analytical function for the absorption coefficient. S. P. Burns and M. A. Christon. Spatial domain-based parallelism in large-scale, participating-media, radiative transport applications. Numerical Heat Transfer, Part B, 31(4):401-421, 1997.
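The analytical absorption coefficient referred to above is commonly written, in RMCRT verification work based on this benchmark, as a trilinear hat function peaking at the centre of a unit cube. The exact expression below is my reconstruction and should be checked against Burns & Christon (1997):

```python
def burns_christon_abskg(x, y, z):
    """Assumed form of the benchmark's analytical absorption coefficient on a
    unit cube centred at the origin: 1.0 at the centre, 0.1 at the faces.
    A reconstruction, not a verbatim copy from the cited paper."""
    return 0.9 * (1 - 2 * abs(x)) * (1 - 2 * abs(y)) * (1 - 2 * abs(z)) + 0.1
```

Because the field is smooth and analytical, it exercises the ray tracer without any discretisation ambiguity in the input data.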

  11. Verification: 1L S. P. Burns and M. A. Christon. Spatial domain-based parallelism in large-scale, participating-media, radiative transport applications. Numerical Heat Transfer, Part B, 31(4):401-421, 1997.

  12. Verification: 1L

  13. Verification: 2L 4× error from coarsening abskg

  14. Verification: 2L Coarsening error: apply a smoothing filter to abskg
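The coarsening step behind the error discussed above can be sketched as a simple volume-weighted restriction: each coarse cell takes the average of the fine cells it covers. This is a generic sketch, not Uintah's refinement machinery; a smoothing filter, as the slide suggests, would pre-filter the fine field before this step.

```python
import numpy as np

def coarsen(fine, r=2):
    """Average r x r x r blocks of fine-level cells into one coarse cell
    (simple volume-weighted restriction; any remainder cells beyond a full
    block are dropped for brevity)."""
    nx, ny, nz = (s // r for s in fine.shape)
    blocks = fine[:nx * r, :ny * r, :nz * r].reshape(nx, r, ny, r, nz, r)
    return blocks.mean(axis=(1, 3, 5))
```

Averaging abskg directly is where the error comes from: radiation depends on the exponential of the optical path, so the mean of the coefficient is not the coefficient of the mean, and sharp gradients coarsen poorly.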

  15. Collaboration Leverage the work of Dr. Berzins' team: Hybrid MPI-threaded Task Scheduler (Qingyu Meng), GPU-RMCRT (Alan Humphrey)

  16. Hybrid MPI-threaded Task Scheduler • Hybrid MPI-threaded Task Scheduler*: memory reduction! • 13.5 GB -> 1 GB per node (12 cores/node)*. • (2-material CFD problem, 2048³ cells, on 110,592 cores of Jaguar) • Interconnect drivers and MPI software must be thread-safe. • RMCRT requires an MPI environment-variable expert! *Q. Meng, M. Berzins, and J. Schmidt. Using hybrid parallelism to improve memory use in Uintah. In Proceedings of TeraGrid 2011.

  17. MPI-threaded Task Scheduler Kraken, 100 rays per cell

  18. MPI-threaded Task Scheduler Difficult to run on Kraken: crashes in MVAPICH. Further testing needed on bigger machines.

  19. GPU-RMCRT • Nvidia Tesla M2070/M2090 GPUs • Keeneland Initial Delivery System: 360 GPUs • DoE Titan: 1000s of GPUs + multi-core CPUs • Motivation: utilize all available hardware • Uintah's asynchronous task-based approach is well suited to take advantage of GPUs • RMCRT is ideal for GPUs

  20. GPU-RMCRT • Offload ray tracing and RNG to GPU(s) • Available CPU cores can perform other computation. • Uintah infrastructure supports GPU task scheduling and execution: • Can access multiple GPUs on-node • Uses Nvidia CUDA C/C++ • Uses the NVIDIA cuRAND library • GPU-accelerated random number generation (RNG)
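The per-ray RNG work being offloaded to cuRAND above is dominated by direction sampling: each ray needs a uniformly distributed direction on the unit sphere. A plain-Python sketch of that sampling step (the GPU version would draw the two uniform variates from cuRAND instead):

```python
import math
import random

def sample_direction(rng):
    """Sample an isotropic unit direction vector from two uniform variates,
    the standard inversion for a uniform distribution on the sphere."""
    cos_theta = 1.0 - 2.0 * rng.random()                 # uniform in [-1, 1]
    sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
    phi = 2.0 * math.pi * rng.random()                   # uniform azimuth
    return (sin_theta * math.cos(phi),
            sin_theta * math.sin(phi),
            cos_theta)
```

Because every ray is independent and needs only these two random draws plus cell marching, the kernel maps naturally onto thousands of GPU threads, which is why the slides call RMCRT "ideal for GPUs".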

  21. Uintah Hybrid CPU/GPU Scheduler • Create & schedule CPU & GPU tasks • Enables Uintah to “pre-fetch” GPU data • Uintah infrastructure manages: • Queues of CUDA Stream and Event handles • Device memory allocation and transfers • Utilize all available: CPU cores and GPUs

  22. Uintah GPU Scheduler Abilities • Capability jobs run on: Keeneland Initial Delivery System (NICS): 1440 CPU cores & 360 GPUs simultaneously; Jaguar GPU partition (OLCF): 15360 CPU cores & 960 GPUs simultaneously • Development of GPU RMCRT prototype underway.

  23. Status: Pending • Head-to-head comparison of RMCRT with the Discrete Ordinates Method (single level): accuracy versus computational cost. • 2 levels: coarsening error for variable temperature and radiative properties. • Data Onion: serial performance; accuracy versus number of levels, refinement ratio, dynamic/static ROI; scalability studies

  24. Summary • Order of accuracy: (# rays)^0.5, (grid cells)^1 • Accuracy issues related to coarsening data. • Cost = f(# rays, (grid cells)^1.4-1.5, communication, ...) Doubling the grid resolution gives roughly a 20× increase in cost. • Good scalability characteristics • Year 2: Demonstration of a fully-coupled problem using RMCRT within ARCHES. Scalability demonstration.
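The "roughly 20×" figure follows directly from the stated cost scaling: doubling the resolution in 3-D multiplies the cell count by 2³ = 8, and a cost that grows like (cells)^1.4 to (cells)^1.5 then grows by 8^1.4 to 8^1.5. A one-line check of that arithmetic:

```python
# Doubling 3-D resolution: cells grow by 8x; cost grows by 8^exponent.
low = 8 ** 1.4    # ~18.4x at the low end of the stated scaling
high = 8 ** 1.5   # ~22.6x at the high end
```

Both bounds bracket the slide's "20ish X" estimate.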

  25. GPU RMCRT Acknowledgements: DoE for funding the CSAFE project from 1997-2012; DOE NETL; DOE NNSA; INCITE; NSF for funding via SDCI and PetaApps; Keeneland Computing Facility, supported by NSF under Contract OCI-0910735; Oak Ridge Leadership Computing Facility, DoE Jaguar XK6 system (GPU partition) http://www.uintah.utah.edu

  26. Physics • Isotropic scattering added to the model • Verification testing performed using an exact solution (Siegel, 1987) • Grid convergence analysis performed • Discrepancy diminishes with increased mesh refinement
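The isotropic-scattering addition described above is, in a standard Monte-Carlo treatment, two sampling steps per interaction: a Beer's-law free-path draw against the extinction coefficient, then an albedo test to decide scatter versus absorb. A sketch of that standard scheme (not Uintah's exact code); the coefficients default to the grid-convergence case on a later slide (scattering 8 m⁻¹, absorption 2 m⁻¹):

```python
import math
import random

def next_event(rng, kappa=2.0, sigma_s=8.0):
    """Sample the distance to the next interaction and its type.

    kappa   -- absorption coefficient [1/m]
    sigma_s -- scattering coefficient [1/m]
    Returns (distance, scattered): scattered is True for a scattering
    event, False for absorption."""
    beta = kappa + sigma_s                          # extinction coefficient
    u = 1.0 - rng.random()                          # uniform in (0, 1]
    distance = -math.log(u) / beta                  # Beer's-law free path
    scattered = rng.random() < sigma_s / beta       # single-scatter albedo
    return distance, scattered
```

On a scattering event the ray would continue with a fresh isotropic direction; on absorption it terminates and deposits its energy.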

  27. Isotropic Scattering: Verification • Benchmark case of Siegel (1987) • Cube (1 m³) • Uniform temperature 64.7 K • Mirror surfaces on the sides • Black top and bottom walls • Computed surface fluxes on top & bottom walls • 10 rays per cell (low) Siegel, R. "Transient Radiative Cooling of a Droplet-Filled Layer," ASME Journal of Heat Transfer, 109:159-164, 1987.

  28. Isotropic Scattering: Verification Radiative flux vs. optical thickness: RMCRT (dots), exact solution (lines). Siegel, R. "Transient Radiative Cooling of a Droplet-Filled Layer," ASME Journal of Heat Transfer, 109:159-164, 1987.

  29. Isotropic Scattering: Verification Grid convergence of the L1 error norms, where the scattering coefficient is 8 m⁻¹ and the absorption coefficient is 2 m⁻¹.

  30. DOM vs RMCRT • IFRF burner simulation (production-size run) • 1344 processors/cores • Initial conditions taken from a previous run with DOM • Domain: 1 m x 4.11 m x 1 m • Resolution: 4.4 mm x 8.8 mm x 4.4 mm (24 million cells)

  31. DOM vs RMCRT
