
VLab: A Collaborative Cyberinfrastructure for Computations of Materials Properties at High Pressures and Temperatures

Cesar R. S. da Silva (1), Pedro R. C. da Silveira (1), Renata M. Wentzcovitch (1,2)
(1) Minnesota Supercomputing Institute, University of Minnesota

Presentation Transcript


  1. VLab: A Collaborative Cyberinfrastructure for Computations of Materials Properties at High Pressures and Temperatures
     Cesar R. S. da Silva (1), Pedro R. C. da Silveira (1), Renata M. Wentzcovitch (1,2)
     (1) Minnesota Supercomputing Institute, University of Minnesota
     (2) Department of Chemical Engineering and Materials Science, University of Minnesota
     Work sponsored by NSF grant ITR-0426757

  2. “VLab is a cyberinfrastructure aimed to facilitate execution of complex calculations - mostly parameter sampling workflows - of materials at high pressures and temperatures.”
     Outline
     • Parameter Sampling Workflows
       - High P,T Cij as example
     • Basic Problem:
       - Job deluge
     • Proposed Solution:
       - Features
       - Performance
     • Overall Requirements
     • Workflow Support Specific Requirements
     • Service Oriented Architecture

  3. Thermodynamic Method
     • Vibrational density of states (VDoS) and free energy F(T,V) within the quasiharmonic approximation (QHA)
     • F(T,V) is fitted at several temperatures by either
       - the Vinet EoS, or
       - an N-th order (N = 3, 4, 5, …) isothermal (Eulerian) finite-strain EoS
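
The two ingredients named on this slide can be sketched in a few lines. This is a minimal illustration, not the VLab implementation: it uses the standard textbook forms of the Vinet equation of state and of the harmonic vibrational free energy, and the function names, argument names, and units are assumptions made for the example.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def vinet_pressure(v, v0, k0, k0_prime):
    """Vinet EoS pressure at volume v, in the same units as the bulk modulus k0.

    v0 is the zero-pressure volume and k0_prime the pressure derivative of k0.
    """
    x = (v / v0) ** (1.0 / 3.0)
    eta = 1.5 * (k0_prime - 1.0)
    return 3.0 * k0 * (1.0 - x) / x**2 * math.exp(eta * (1.0 - x))


def qha_vibrational_free_energy(mode_energies_ev, weights, temperature):
    """Harmonic vibrational free energy in eV at one volume.

    mode_energies_ev holds hbar*omega for each sampled mode (eV, all positive);
    weights holds the matching q-point weights. Within the QHA this is
    evaluated separately at every volume and added to the static energy.
    """
    total = 0.0
    for energy, weight in zip(mode_energies_ev, weights):
        total += weight * 0.5 * energy  # zero-point contribution
        if temperature > 0.0:
            thermal = math.log1p(-math.exp(-energy / (K_B * temperature)))
            total += weight * K_B * temperature * thermal
    return total
```

In a QHA study, F(T,V) would be evaluated on the volume grid at each temperature and then fitted with an EoS such as the one above to obtain P(V,T).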

  4. Thermoelastic constant tensor C^S_ij(T,P)
     [Diagram residue: index label "kl"; workflow labels "equilibrium structure (Pn)" and "re-optimize"]

  5. Basic Problem
     Demand for Extensive Parameter Sampling
     • {Pn}x{qi} => ~10^2 jobs: typical high (P,T) study (e.g. thermal properties)
     • {Pn}x{εi}x{qj} => ~10^3-10^4 jobs: huge high (P,T) study (Cij(P,T))
     • 10^2-10^4 jobs to prepare, submit, and monitor
     • Manual work is prone to human error
     • First principles
       => Sheer number of operations (10^15-10^20 today)
       => Well over 10^22 in 3-5 years
     How can high (P,T) materials computations be improved?
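
The job counts on this slide are Cartesian products of parameter sets. The sketch below shows how quickly such a product grows; the helper name, the parameter names, and the grid sizes are hypothetical, and treating the middle set of the larger study as strains is an assumption.

```python
from itertools import product


def enumerate_jobs(parameter_sets):
    """Return one job description (a dict) per combination of parameter values.

    parameter_sets maps a parameter name to the list of values to sample.
    """
    names = list(parameter_sets)
    return [dict(zip(names, combo)) for combo in product(*parameter_sets.values())]


# Illustrative grid sizes only.
thermal_study = enumerate_jobs({
    "pressure": list(range(10)),
    "q_point": list(range(10)),
})
elastic_study = enumerate_jobs({
    "pressure": list(range(10)),
    "strain": list(range(6)),
    "q_point": list(range(20)),
})
print(len(thermal_study), len(elastic_study))
```

With these sizes the first study yields 100 jobs and the second 1200, which is the order-of-magnitude jump the slide describes.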

  6. The VLab
     • Consolidated Web Interface (Portal) to a set of tools:
       - Quantum ESPRESSO package tools
       - Input preparation for pwscf, phonon, workflows, etc.
       - Data analysis tools
       - Visualization tools (VTK/OpenGL)
       - Workflow management
       - Task distribution and data recollection
     • Leverages computing capabilities of distributed resources (TeraGrid, compute farms, scattered resources, other grids)
     • Collaboration through shared access to resources

  7. The Big Challenge of Performance
     • Scale-up approach is difficult
       - Limited number of processors in a single system
       - Even using the fastest vector processors is not enough
       - Trend is towards denser processing, not faster single-thread execution
     • MPP systems are not cost effective for this class of problems
       - FFT and matrix transposition: limited scalability, or
       - Low performance per processor
     Proposed Solution: leveraging concurrent computing for features and performance
     • High Performance Parallel Computing
     • High Throughput Distributed Processing

  8. VLab - Not Just a Client/Server
     The Client/Server Approach:
     - The portal and the supporting modules have access to a large central multi-processor system.
     - It can work as a facilitator but lacks other important features found in VLab.
     - No flexibility in scheduling
     - No redundancy => poor availability
     - No choice of cost (usually high)

  9. VLab - Not Just a Client/Server
     The VLab Distributed System Approach:
     - No central system to fail and bring everything down!
     - Distributed resources are replicated for:
       • Redundancy
       • Performance
       • Flexibility
     - More flexible scheduling for:
       • Cost
       • Turnaround time
       • Job throughput
       • Workload balance
       • System throughput

  10. VLab Requirements
      • Workflow management => facilitator
      • Support for distributed computations
      • Ease of use
      • Support for collaboration
      • Flexibility (update/add tools, new features)
      • Fault tolerance
      • Diversity of tools
        - analysis, visualization, data reduction, storage, etc.

  11. VLab Workflows
      Typical VLab workflows, like the High-T Cij calculation, involve iterations through the following steps:
      1) Prepare inputs for tasks, and generate execution packages containing the required files.
      2) Dispatch the execution packages to compute nodes for execution.
      3) Gather results for analysis and, if needed, iterate steps 1-3.
      • Results always return to the input sources
        => Tree-like service architecture
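
The prepare/dispatch/gather cycle above can be sketched as a short loop. This is a minimal sketch under stated assumptions, not VLab code: the three callables are hypothetical placeholders supplied by the caller, and a local thread pool stands in for remote compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(initial_params, prepare, execute, analyze,
                 max_iterations=10, workers=4):
    """Iterate prepare -> dispatch -> gather until analyze() returns None.

    prepare(params)  -> list of execution packages         (step 1)
    execute(package) -> result of one task                 (step 2)
    analyze(results) -> parameters for the next iteration,
                        or None to stop                    (step 3)
    """
    params = initial_params
    history = []
    for _ in range(max_iterations):
        packages = prepare(params)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Results come back to the caller that produced the inputs,
            # in the same order as the packages.
            results = list(pool.map(execute, packages))
        history.append(results)
        params = analyze(results)
        if params is None:
            break
    return history


# Toy usage: each "task" squares a number; stop once only two tasks remain.
history = run_workflow(
    4,
    prepare=lambda n: list(range(n)),
    execute=lambda x: x * x,
    analyze=lambda results: len(results) - 1 if len(results) > 2 else None,
)
```

Because every result returns to the component that prepared its input, nesting such loops naturally gives the tree-like structure the slide mentions.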

  12. VLab Service Oriented Architecture
      On the Web: http://dasilveira.msi.umn.edu:8080/vlab/
      Usage-oriented view of the VLab SOA => tree-like structure in 4 layers:
      1) User interface (Portal)
      2) Workflow control and monitoring (Project Executor / Interaction)
      3) Task dispatching / interaction, task data retrieval, auxiliary services
      4) Heavy computations and visualization resources layer

  13. Fault Tolerance
      • Only Project Executor sessions and a few user and project interaction sessions are required to be persistent. Therefore, a simple approach to Fault Tolerance (FT) is possible:
        - Reactive: we have not identified any need for proactive FT.
        - Registry based: persistent sessions are registered and must periodically inform the registry of their "alive" state.
        - Redundant registry and metadata DB for data persistence.
        - Full journaling (data and metadata) of critical transactions for data and metadata integrity. This guarantees that the state of any persistent session can be restored in case of failure.
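
The registry-based, reactive scheme can be illustrated with a small heartbeat registry. This is a sketch only: the class name, method names, and timeout policy are assumptions, and a real deployment would add the redundant registry and journaling that the slide lists.

```python
import time


class SessionRegistry:
    """Tracks the last "alive" report of each persistent session."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock      # injectable so the logic can be tested
        self._last_seen = {}     # session id -> time of last heartbeat

    def heartbeat(self, session_id):
        """Called periodically by a session to report that it is alive."""
        self._last_seen[session_id] = self._clock()

    def expired_sessions(self):
        """Return the sessions whose last heartbeat is older than the timeout.

        A reactive recovery step would restore these from the journal.
        """
        now = self._clock()
        return sorted(
            session for session, seen in self._last_seen.items()
            if now - seen > self._timeout
        )


# Usage with a fake clock, so the example does not need to sleep.
fake_now = [0.0]
registry = SessionRegistry(timeout_seconds=30.0, clock=lambda: fake_now[0])
registry.heartbeat("executor-a")
registry.heartbeat("executor-b")
fake_now[0] = 20.0
registry.heartbeat("executor-b")
fake_now[0] = 40.0
stale = registry.expired_sessions()
```

The registry never probes sessions itself; it only reacts to missing reports, which matches the reactive approach described above.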

  14. Scheduling
      The usual approach:
      - Use agents that interact with the broker
      Problem: agents are not stateless!
      - More complicated to develop
      - Persistence must be guaranteed
      The VLab approach:
      - Use an independent WS to monitor workload.
      - Persistence of data is provided by a local DB.
      - Compute WS and Workload Monitor are stateless!
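
The idea of keeping the monitor stateless by pushing all persistence into a local database can be sketched as follows. The table layout and function names are hypothetical, and SQLite is used only because it ships with Python; the slides do not say which database VLab uses.

```python
import sqlite3


def open_workload_db(path=":memory:"):
    """Open (and, if needed, initialize) the local workload database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS workload ("
        "node TEXT PRIMARY KEY, running_jobs INTEGER NOT NULL)"
    )
    return conn


def report_load(conn, node, running_jobs):
    """Record the latest load of a compute node; no state is kept in memory."""
    conn.execute(
        "INSERT OR REPLACE INTO workload (node, running_jobs) VALUES (?, ?)",
        (node, running_jobs),
    )
    conn.commit()


def least_loaded_node(conn):
    """Return the node with the fewest running jobs, or None if none reported."""
    row = conn.execute(
        "SELECT node FROM workload ORDER BY running_jobs ASC, node ASC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


conn = open_workload_db()
report_load(conn, "node-a", 5)
report_load(conn, "node-b", 2)
first_choice = least_loaded_node(conn)
report_load(conn, "node-b", 9)
second_choice = least_loaded_node(conn)
```

Since each function reads or writes only the database, the monitor process itself can be restarted or replicated without losing scheduling information.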

  15. VLab in Action Watch a demonstration movie at vlab.msi.umn.edu -> Follow the links “portal” -> “movie” • Calculation of High P,T Thermodynamic Properties • Cubic MgO • 2 atom cell • Static + Lattice Dynamics calculation {Pn}x{i} sampling • Show distributed computing capabilities • Ability to integrate visualization and data analysis tools

  16. VLab Workflows
      [Figure: left, extensive High-T Cij workflow; right, detailed view of the Cij and phonon workflows]
