Algorithm and Scaling (Issues) for Aerospace (CFD) Codes. Sukumar Chakravarthy src@metacomptech.com www.metacomptech.com. Scope of Presentation. Range of aerospace CFD and related applications Hierarchy of simulation approaches Hierarchy of algorithmic approaches
Algorithm and Scaling (Issues) for Aerospace (CFD) Codes Sukumar Chakravarthy src@metacomptech.com www.metacomptech.com
Scope of Presentation • Range of aerospace CFD and related applications • Hierarchy of simulation approaches • Hierarchy of algorithmic approaches • Algorithm and scalability issues and considerations
Presentation Approach & Goals • A picture is worth a thousand words • We will use ten thousand words and 1 picture == eleven thousand word-equivalents • Catalog, serve as collective conscience • Discuss relationship between application needs, algorithms, modeling approaches and HPC issues and possibilities
CFD++ Aerospace Applications • External aerodynamics • Propulsion integration • Component integration • Systems • Cabin airflow • FADEC • Icing • Fuel tank purge • Thrust reverser • Propulsion • Nozzle design • Jet noise
CFD++ Aerospace Applications • Plumes • Trajectory • Aerodynamic coefficients • Drag polar • Dynamic derivatives • Store separation • Canopy separation • Sabot separation • Stage separation • Pilot seat ejection • Projectiles • Spinning projectiles
CFD++ Aerospace Applications • Synthetic jets • Turbomachinery • Blade design • Blade cooling • Pulsed detonation • Flapping wings • Flexible wings • Entomopters • Helicopters • Propellers, rotors • Parachutes • Parachutists, sky-diving
CFD++ Aerospace Applications • Spacecraft launch • Reentry vehicles • Rocket assisted landings (Earth, Mars, Venus) • X-Prize vehicles • Land speed record vehicles • Bullets, artillery rounds • Liquid fuel breakup • Liquid fuel sloshing, feed • Acceleration, deceleration effects • Aeroacoustics • Flow Structure Interaction (FSI)
What’s special about Aerospace CFD? • Extremes of scales, operating conditions, physics and chemistry, speeds, application-specific needs (extraction of useful information) • Nonlinearity is most often inherent • It is not just the simulation itself that counts • If there is no information output required, no need to do the simulation
Hierarchy of problem classes • Steady state/unsteady problems • Small, medium and large scale problems • Entire configurations as well as analysis of components • Engineering analysis, scientific analysis, troubleshooting • All speeds, atmospheric conditions, diverse fluids and their properties
Common Elements of Simulations • Physics (nature) • Math Model of Physics • Numerical Model of Math Model • Computational Model • Human(s) in the loop • Simulation Results
Common Underlying Physical Processes • Convection • Production • Dissipation • Redistribution • Diffusion • Evolution
Summary of some HPC issues • Loading the problem, saving final results • Checkpointing • Computational vs. communications performance (scalability) • Data extraction issues • Robustness (10000-way parallel should be as robust as the serial algorithm) • Data-center issues (throughput, storage) • Visualization, interaction with running case
Modeling Hierarchy • Potential flow assumption • Small-disturbance approaches • Inviscid flows taken separately, and hybridized with boundary layer theory • Reynolds/Favre-averaged N-S equations with phenomenological turbulence models • LES and hybrid RANS-LES approaches • Special equations and models
Mesh possibilities • Surface mesh only (panel methods) • Cartesian mesh, almost Cartesian mesh • Structured mesh – hex (3D) & quad (2D) • Unstructured – all cell types • Hybrid structured and unstructured meshes, hex-core meshes • Patched and overset meshes • Moving (dynamic) meshes • Flexible boundaries and meshes
“Extreme Grids” • Aspect ratios of 10000 to 1 or more (boundary layer resolution with Y+ < 1) • Mesh sizes of hundreds of millions of cells and more • Extreme grid spacings present in mesh
Numerical approaches • Explicit and implicit • Fractional steps and factored schemes • Finite volume, finite difference schemes • Finite element schemes • Spectral and spectral element schemes • “Local” schemes and “global” schemes
Some HPC algorithmic challenges • Challenges of making implicit schemes truly implicit in multi-CPU computations • Ensuring insensitivity of results to variations in the number of parallel processes used • How to make the 10000-way parallel computation as robust as the serial algorithm • How to make the 10000-way parallel computation converge as well, but in much less time
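One reason results can vary with the number of parallel processes is that floating-point addition is not associative, so a global sum depends on how the data are partitioned. The sketch below illustrates this on a toy reduction and shows one possible remedy: accumulating exactly and rounding once. It is only an illustration, not a description of what any particular CFD code does. The function names `split`, `naive_reduce` and `exact_reduce` are hypothetical, and exact rational arithmetic is far too slow for production use.

```python
from fractions import Fraction


def split(values, nparts):
    """Split a list into nparts contiguous chunks (sizes differ by at most one)."""
    base, extra = divmod(len(values), nparts)
    chunks, start = [], 0
    for rank in range(nparts):
        size = base + (1 if rank < extra else 0)
        chunks.append(values[start:start + size])
        start += size
    return chunks


def naive_reduce(values, nparts):
    """Each 'process' adds its chunk in floating point; partial sums are
    then added in rank order. The rounding depends on nparts."""
    total = 0.0
    for chunk in split(values, nparts):
        partial = 0.0
        for v in chunk:
            partial += v
        total += partial
    return total


def exact_reduce(values, nparts):
    """Partial sums are kept as exact rationals and rounded to a float only
    once at the end, so the result does not depend on nparts."""
    total = Fraction(0)
    for chunk in split(values, nparts):
        partial = Fraction(0)
        for v in chunk:
            partial += Fraction(v)  # Fraction(float) is an exact conversion
        total += partial
    return float(total)


if __name__ == "__main__":
    data = [0.1 * i for i in range(1, 1001)]
    for n in (1, 3, 16):
        print(n, repr(naive_reduce(data, n)), repr(exact_reduce(data, n)))
```

The naive sums may differ in the last digits as the partition count changes, while the exact version returns the same value for every partitioning.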
Adaptive meshes • Adaptive elements (cells) • Adaptive grids • H-adaptation, P-adaptation, H-P-adaptation
Classification of Algorithms • Low information density schemes – expand stencil to improve accuracy • High information density schemes – expand information content per cell (e.g. use values and derivatives, or values at multiple collocation points) • Homogeneity (or lack thereof) of discretization and solution methodology • Homogeneity (or lack thereof) of underlying physics models
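As a small illustration of the "low information density" idea, the sketch below compares two standard central-difference approximations of a first derivative: a three-point stencil (second-order accurate) and a five-point stencil (fourth-order accurate). Each point carries only one value, and accuracy is gained by reaching further out. The function names and the test function are chosen purely for illustration. A wider stencil also implies a wider halo to exchange in a parallel run.

```python
import math


def d1_three_point(f, x, h):
    """Second-order central difference using neighbours at x-h and x+h."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def d1_five_point(f, x, h):
    """Fourth-order central difference; the stencil is widened to x-2h .. x+2h."""
    return (-f(x + 2 * h) + 8 * f(x + h) - 8 * f(x - h) + f(x - 2 * h)) / (12.0 * h)


if __name__ == "__main__":
    x, h = 1.0, 0.1
    exact = math.cos(x)  # derivative of sin(x)
    print("3-point error:", abs(d1_three_point(math.sin, x, h) - exact))
    print("5-point error:", abs(d1_five_point(math.sin, x, h) - exact))
```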
The usual scalability considerations • Computation and communication • Computation versus communication • Overlap of computation and communication • Bulk of communication for local schemes can follow a one-to-a-few connectivity pattern • Global operations – global reductions often determine scalability
Recent Scalability Improvements • CFD++ now scales well to very large numbers of cores • The scalability improvements are universal – they apply to all modern HPC platforms from all vendors • Tests have shown effective performance all the way up to 4096 cores • Even relatively small grids (e.g. 16 million cells) scale well to 2048 or even 4096 cores, depending on computer and type of case run • Goal – to demonstrate similar performance on 10000 to 40000 cores • [Figures: Ex 1: 33M cells, Computer 1, Case 1; Ex 2: 16M cells, Computer 2, Case 2]
Some Influences on Scalability • Effect of physics – increased sophistication means more computation, and often better scalability • Effect of numerics – increased accuracy means more computation and more communication, and often better scalability • Effect of grid – a larger grid means more computation and relatively less communication for “local” algorithms
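The grid effect can be made concrete with a back-of-the-envelope surface-to-volume estimate. The sketch below assumes an idealised decomposition into equal cubic subdomains, with computation proportional to the cells a process owns and communication proportional to the faces on its six sides. Real unstructured partitions are not cubes, so the numbers are only indicative. The function name is hypothetical.

```python
def halo_to_interior_ratio(total_cells, nprocs):
    """Ratio of boundary (halo) faces to owned cells for an idealised
    cubic subdomain holding total_cells / nprocs cells."""
    cells_per_proc = total_cells / nprocs
    side = cells_per_proc ** (1.0 / 3.0)   # cells along one edge of the cube
    halo_faces = 6.0 * side ** 2           # six faces of the cube
    return halo_faces / cells_per_proc     # equals 6 / side


if __name__ == "__main__":
    for cells in (16_000_000, 33_000_000):
        for procs in (256, 2048, 4096):
            ratio = halo_to_interior_ratio(cells, procs)
            print(f"{cells:>11,d} cells on {procs:>5d} processes: ratio = {ratio:.3f}")
```

Under these assumptions the ratio falls as the grid grows and rises as more processes are used, which is why small grids are the harder test of scalability.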
Additional thoughts on Parallel Processing • Two ways of using multiple compute engines • Parallel computations • Pipelined computations • Pipelined algorithms have not been exploited too much at the HPC level • Process level and thread level parallelism beginning to be combined (e.g. to exploit GPGPUs)
Load balancing issues • Structured vs. unstructured grids (usually solved by weighted domain decomposition) • Adaptive algorithms and adaptive meshes • Different physics in different regions • Moving meshes and overset meshes
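A minimal sketch of the weighting idea follows, assuming each cell has a known relative cost (for example, by cell type or by the physics active in its region). It uses a greedy heuristic that always hands the next-heaviest cell to the least-loaded part. It balances load only and ignores mesh connectivity, so unlike a real graph-based decomposer it does nothing to limit communication. The function name and the weights are hypothetical.

```python
import heapq


def greedy_weighted_partition(weights, nparts):
    """Assign each item to a part so that summed weights are roughly even.

    Items are taken heaviest first and given to the currently lightest part.
    Returns a list 'owner' with owner[i] = part index of item i.
    """
    heap = [(0.0, part) for part in range(nparts)]  # (current load, part id)
    heapq.heapify(heap)
    owner = [None] * len(weights)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for item in order:
        load, part = heapq.heappop(heap)            # lightest part so far
        owner[item] = part
        heapq.heappush(heap, (load + weights[item], part))
    return owner


if __name__ == "__main__":
    # Hypothetical per-cell costs: two expensive cells and four cheap ones.
    cell_cost = [4, 4, 1, 1, 1, 1]
    owner = greedy_weighted_partition(cell_cost, 2)
    loads = [sum(c for c, o in zip(cell_cost, owner) if o == p) for p in range(2)]
    print(owner, loads)
```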
Optimization considerations • Parallel algorithms for optimization • How to use large numbers of processors • E.g. do many cases in parallel • Pre-compute the case matrix, sensitivities, etc. and then train neural networks or tabulate sensitivities before applying the optimization procedure
Multi-physics considerations • Communications between non-homogeneous simulation tools • Communications between diverse hardware platforms • Tight coupling vs. loose coupling considerations
Need for Parallel I/O and File systems • Very large scale problems • Very large number of processors • Initial load and final save + intermediate data output • Asymmetric data extraction needs
Typical “post-processing” needs • Global information (forces and moments, lift, drag, torque) • Semi-global information (forces and moments along wing span, along fuselage) • Reduced subsets – iso-surfaces, surface data, cut-planes • Time-averages versus instantaneous values • In-situ “post”-processing can be very useful
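One simple form of in-situ processing is to update a time average while the solver runs, instead of saving every instantaneous field and averaging afterwards. The sketch below keeps a running mean of a scalar such as a drag coefficient. It assumes uniform time steps, so every sample carries equal weight. The class name is hypothetical.

```python
class RunningMean:
    """Incrementally updated time average; no history of samples is stored."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, sample):
        """Fold one instantaneous value into the average and return the new mean."""
        self.count += 1
        self.mean += (sample - self.mean) / self.count
        return self.mean


if __name__ == "__main__":
    avg = RunningMean()
    for step, instantaneous in enumerate([1.0, 2.0, 3.0, 6.0], start=1):
        print(step, instantaneous, avg.update(instantaneous))
```

The same update can be applied cell by cell to whole fields, at the cost of one extra array per averaged quantity.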
Single and Distributed File Parallel I/O • Parallel I/O (PIO) can be accomplished in two ways • In Single-File mode, PIO reads from and writes to the current full-mesh/full-solution files • In Distributed-File mode, PIO reads from and writes to a set of files (e.g. placed in subdirectories) associated with each parallel process
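The two modes can be sketched with ordinary file operations. This is a serial illustration under assumed conventions, not the CFD++ file format: the record layout (one little-endian double per cell), the directory names `rank_00000`, ... and the file name `solution.bin` are all hypothetical, and the loop over ranks stands in for processes that would write concurrently.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<d")  # assumed layout: one little-endian double per cell


def write_single_file(path, chunks):
    """Single-file mode: every rank writes into one shared file at its own offset."""
    total = sum(len(c) for c in chunks)
    with open(path, "wb") as f:
        f.truncate(total * RECORD.size)       # pre-size the shared file
    offset = 0                                # cells owned by lower ranks
    for chunk in chunks:                      # stand-in for concurrent ranks
        with open(path, "r+b") as f:
            f.seek(offset * RECORD.size)
            f.write(b"".join(RECORD.pack(v) for v in chunk))
        offset += len(chunk)


def write_distributed_files(root, chunks):
    """Distributed-file mode: every rank writes its own file in its own subdirectory."""
    for rank, chunk in enumerate(chunks):
        rank_dir = os.path.join(root, f"rank_{rank:05d}")
        os.makedirs(rank_dir, exist_ok=True)
        with open(os.path.join(rank_dir, "solution.bin"), "wb") as f:
            f.write(b"".join(RECORD.pack(v) for v in chunk))


def read_doubles(path):
    """Read back a file of packed doubles as a list of floats."""
    with open(path, "rb") as f:
        return [v for (v,) in RECORD.iter_unpack(f.read())]


if __name__ == "__main__":
    chunks = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]  # data owned by three ranks
    with tempfile.TemporaryDirectory() as tmp:
        write_single_file(os.path.join(tmp, "full.bin"), chunks)
        write_distributed_files(tmp, chunks)
        print(read_doubles(os.path.join(tmp, "full.bin")))
        print(read_doubles(os.path.join(tmp, "rank_00001", "solution.bin")))
```

Single-file mode needs each rank to know its offset, but the result can be read on any number of processes. Distributed-file mode avoids that coordination but ties the files to the decomposition that wrote them.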
Interactive massively parallel computing • Steady state versus Transient (unsteady) computations • Links with front-end and graphical processing • Even post processing of large scale problems may require substantial parallel computing resources • One should not just focus on the “batch” computing model
Some elements of the balancing act • Computation • Communication • Memory requirements • I/O requirements • Accuracy requirements • Robustness requirements • In-situ solution processing requirements
Bandwidths to consider • Number of cores vs. number of I/O channels • Memory bandwidth from core to memory • Memory access conflicts
Some old ideas revisited • Paying more attention to connectivity architecture • Minimization of hops • Domain decomposition that minimizes traffic between switches • How many switches or hops (groups of nodes), how many nodes, how many processors in a node, how many cores per processor
Final thoughts • The challenge of producing codes that work in the user’s hands and computing facilities • Ease of use • Scalability and effectiveness vs. just scalability • Resource maximization versus minimization • What can be done with less • What can be done with more • What more can be done with less
Thank you