
High Performance Computing Cluster OSCAR

High Performance Computing Cluster OSCAR. CPSC 424/624 Project (Spring 2011). Instructor: Dr. Grossman. Team members: Jin Wei, Pengfei Xuan. Outline: 1. Background, 2. Installation, 3. Management, 4. Security, 5. Administration. DIY Supercomputer?


Presentation Transcript


  1. High Performance Computing Cluster OSCAR. CPSC 424/624 Project (Spring 2011). Instructor: Dr. Grossman. Team members: Jin Wei, Pengfei Xuan.

  2. Outline: 1. Background, 2. Installation, 3. Management, 4. Security, 5. Administration

  3. DIY Supercomputer? HPC = Computer + Network + OS + Management Software

  4. Background Introduction. Clemson Palmetto: 12,392 cores, 92.48 TeraFlops. TOP1, Tianhe-1A (China): 186,368 cores, 4,701 TeraFlops.

  5. HPC Network Topology. Three sets of networks: management, parallel computing, and storage. Centralized storage.

  6. Installation. Easy management: batch OS install, batch software install.

  7. Management
     Cluster Management:
     • Partitions a cluster into multiple logical computers
     • Maps logical computers (clusters) onto servers (nodes)
     • Supports multiple independent OS configurations
     • Manages and monitors logical computer (cluster) status
     • Reports cluster status to the management system
     • Handles job scheduling and management
     • Manages and monitors operating system instances (nodes)
     • Reports node status to the management system
     System Management:
     • Management of the overall system configuration
     • Redundant management servers with automatic failover
     • Designed to anticipate and tolerate failures
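
As a rough illustration of the partitioning idea on this slide, the sketch below maps named logical clusters onto disjoint sets of physical nodes. The node names, the `partition` helper, and the cluster sizes are hypothetical and are not part of OSCAR.

```python
from typing import Dict, List


def partition(nodes: List[str], sizes: Dict[str, int]) -> Dict[str, List[str]]:
    """Assign each logical cluster a disjoint slice of the physical nodes."""
    needed = sum(sizes.values())
    if needed > len(nodes):
        raise ValueError(f"need {needed} nodes, only {len(nodes)} available")
    mapping: Dict[str, List[str]] = {}
    start = 0
    for name, count in sizes.items():
        # Each logical cluster takes the next `count` unused nodes.
        mapping[name] = nodes[start:start + count]
        start += count
    return mapping


# Hypothetical node names and logical-cluster sizes.
nodes = [f"node{i:02d}" for i in range(1, 9)]
layout = partition(nodes, {"teaching": 2, "research": 5})
for name, members in layout.items():
    print(name, members)
```

Nodes left over after the last slice (here one of the eight) simply stay unassigned; a real management system would also track per-node status and OS configuration.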

  8. Management
     Server Management:
     • Automatic discovery of server hardware
     • Remote server control (power on/off, cycle)
     • Scalable, fast diskless or data-less booting for systems with large node counts
     • Server redundancy and failover
     • Provides server status to the management system
     Network Management:
     • Automatic discovery of interconnect hardware
     • Multiple interconnect fabric topologies
     • Redundant paths and networks
     • Load balancing and failover
     • Provides network status to the management system
     Storage Management:
     • Scalable root file systems for diskless or data-less nodes
     • Multiple global storage configurations
     • High bandwidth to secondary storage for data and checkpointing
     • Provides server status to the management system

  9. Security Control Model

  10. Administration (C3 Tool Suite, 'Cluster Command & Control')
      • cexec: executes any standard command on all cluster nodes, e.g. cexec mkdir /tmp
      • ckill: terminates a user-specified process on all cluster nodes, e.g. ckill my_program_abc
      • cget: retrieves files or directories from all cluster nodes
      • cpush: distributes files or directories to all cluster nodes
      • cpushimage: updates the system image on all cluster nodes using an image captured by the SystemImager tool
      • crm: removes files or directories from all cluster nodes
      • cshutdown: shuts down or restarts all cluster nodes
      • cnum: returns a node range number based on a node name
      • cname: returns node names based on node ranges
      • clist: returns all clusters and their type in a configuration file
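
To make the idea behind `cexec` concrete, the following sketch runs one command on every node in parallel and collects the output per node. It is not how C3 is implemented: the node names are hypothetical, and the `run_on_node` helper assumes passwordless `ssh` to each node. The `runner` parameter is an illustrative hook so that the fan-out logic can be tried without a real cluster.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_node(node: str, command: str) -> str:
    """Run a command on one node over ssh (assumes passwordless ssh is set up)."""
    result = subprocess.run(
        ["ssh", node, command],
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout


def run_everywhere(nodes: List[str], command: str,
                   runner: Callable[[str, str], str] = run_on_node) -> Dict[str, str]:
    """Run the same command on all nodes concurrently; return output keyed by node."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        # pool.map preserves input order, so outputs line up with nodes.
        outputs = pool.map(lambda node: runner(node, command), nodes)
        return dict(zip(nodes, outputs))


# Dry run with a stand-in runner, so no cluster is needed.
def fake_runner(node: str, command: str) -> str:
    return f"{node}: would run '{command}'"


results = run_everywhere(["node01", "node02"], "mkdir /tmp/demo", runner=fake_runner)
for node, output in results.items():
    print(output)
```

Dropping the `runner=fake_runner` argument would send the command over `ssh` instead, which is only sensible on a cluster where the listed hosts actually exist.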

  11. Other Administration Tools
      • System Installation Suite (SIS): installs the client nodes. SIS also provides the database from which OSCAR obtains its cluster configuration information. The main concept to understand about SIS is that it is an image-based install tool. An image is basically a copy of all the files that get installed on a client. This image is stored on the server and can be accessed for customizations or updates. You can even chroot into the image and perform builds.
      • Switcher Environment Manager: provides a simple mechanism that allows users to manipulate their environment.
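
The image-based idea can be sketched as a comparison between the files held in the server-side image and the files present on a client. This is only an illustration of the concept, not how SIS works internally: the file paths, contents, and the `files_to_update` helper are hypothetical, and in-memory dictionaries stand in for real file systems.

```python
import hashlib
from typing import Dict, List


def digest(content: bytes) -> str:
    """Content fingerprint used to detect files that differ."""
    return hashlib.sha256(content).hexdigest()


def files_to_update(image: Dict[str, bytes], client: Dict[str, bytes]) -> List[str]:
    """Paths whose content on the client is missing or differs from the image."""
    return sorted(
        path for path, content in image.items()
        if path not in client or digest(client[path]) != digest(content)
    )


# Hypothetical file contents standing in for an image and one client's disk.
image = {"/etc/motd": b"welcome\n", "/etc/hosts": b"10.0.0.1 head\n"}
client = {"/etc/motd": b"old banner\n"}
print(files_to_update(image, client))
```

Because the image is the single source of truth, customizing it once on the server is enough to bring every client up to date on the next synchronization.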

  12. References
      [1] http://svn.oscar.openclustergroup.org/trac/oscar/wiki/InstallGuideIntroduction
      [2] M.J. Brim, T.G. Mattson, "OSCAR: Open Source Cluster Application Resources".
      [3] B. Luethke, S. Scott and T. Naughton, "OSCAR Cluster Administration With C3".
      [4] C3, http://www.csm.ornl.gov/torc/C3

  13. Questions? Thank you!
