
Cluster Computing



Presentation Transcript


  1. Cluster Computing Barry Wilkinson, Department of Computer Science, University of North Carolina at Charlotte, abw@uncc.edu

  2. Overview of Talk • Outline of cluster computing • IEEE cluster computing initiative • Distributed shared memory on clusters • UNCC NSF projects on teaching DSM

  3. Cluster Computing • Using a collection of interconnected computers to work together on a single problem/application • Originally became attractive with the availability of high-performance commodity computers

  4. Early Key Enabling Cluster Computing Projects • PVM (Parallel Virtual Machine) message-passing software • Developed at Oak Ridge National Laboratory • Very widely used (free) • Berkeley NOW (network of workstations) project • Beowulf clusters – commodity PCs with Linux OS

  5. More Recent Message Passing Work • MPI (Message-passing Interface) • Standard for message passing libraries • Defines routines but not implementation • Very comprehensive • Version 1 released in 1994 with 120+ routines defined • Version 2 now available
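The point-to-point send/receive pattern that MPI standardizes can be sketched with Python's standard multiprocessing module. This is only a stand-in to show the shape of the pattern: real MPI code would use MPI_Send/MPI_Recv from a C MPI implementation (or mpi4py in Python), neither of which is assumed here.

```python
# Sketch of the MPI-style send/receive pattern using Python's standard
# multiprocessing module (a stand-in; not an actual MPI implementation).
from multiprocessing import Process, Pipe

def worker(conn, rank):
    # Each "rank" blocks on a receive, then replies, as in MPI
    # point-to-point communication (MPI_Recv followed by MPI_Send).
    msg = conn.recv()
    conn.send(f"rank {rank} got: {msg}")
    conn.close()

if __name__ == "__main__":
    parent, child = Pipe()
    p = Process(target=worker, args=(child, 1))
    p.start()
    parent.send("hello")   # analogous to MPI_Send to rank 1
    print(parent.recv())   # analogous to MPI_Recv from rank 1
    p.join()
```

MPI defines the routines but not the implementation, which is why the same pattern appears across very different transports (Ethernet, Myrinet, shared memory).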

  6. Cluster Interconnects • Original Ethernet (10 Mb/s) • Fast Ethernet (100 Mb/s) • Gigabit Ethernet (1 Gb/s) • Myrinet (1.28 Gb/s) • proprietary – Myricom • more expensive but lower latency than Ethernet • DEC Memory Channel Interface – for single system image (SSI)

  7. Nodes • PCs (usually Intel) • SMPs (symmetric multiprocessors) • Multiple processors within one box, usually interconnected via a single bus • Heterogeneous clusters • Number of nodes • Small 2–20 • Medium 20–100 • Large 100–1000s • Metacomputing 10,000 …

  8. Cluster Uses • Academic • Cost-effective parallel programming teaching • Research into distributed systems • Scientific • Usually for speed, grand challenge problems • Commercial • Databases, Internet servers … for load balancing

  9. Classifications • High performance • Scalability/expandability • High throughput • High availability • Fault tolerance

  10. IEEE Task Force on Cluster Computing • Aim to foster the use and development of clusters • Obtained IEEE approval early 1999 • Main home page: http://www.dgs.monash.edu.au/~rajkumar/tfcc • Education home page: http://www.cs.uncc.edu/~abw/parallel/links.html

  11. Distributed Shared Memory • Making the main memories of a cluster of computers appear as a single memory with a single address space [Figure: processors accessing a single shared memory]

  12. Advantages of DSM • Hides the message passing – the programmer does not explicitly send messages between processes • Simple programming model • Can handle complex and large databases without replication or sending the data to processes
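The DSM programming model can be illustrated with Python's stdlib shared-memory primitives: processes read and write a shared array directly, with no explicit sends or receives. This is a single-machine analogue only; a cluster DSM system provides the same model across separate machines.

```python
# Sketch of the DSM programming model: a child process writes directly
# into a shared array and the parent sees the result, with no explicit
# message passing (single-machine stdlib analogue of cluster DSM).
from multiprocessing import Process, Array

def scale(shared, factor):
    # Writes go straight into the shared address space.
    for i in range(len(shared)):
        shared[i] *= factor

if __name__ == "__main__":
    data = Array('d', [1.0, 2.0, 3.0])   # shared between processes
    p = Process(target=scale, args=(data, 10.0))
    p.start(); p.join()
    print(list(data))   # the parent sees the child's writes: [10.0, 20.0, 30.0]
```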

  13. Disadvantages of DSM • May incur a performance penalty • Must provide protection against simultaneous access to shared data (locks, etc.) • Little programmer control over the actual messages being generated • Good performance may be difficult to achieve, particularly for irregular problems
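The need to protect simultaneous access can be shown with a shared counter: without a lock, concurrent read-modify-write sequences can lose updates; with the lock, the result is deterministic. Again this uses Python's stdlib as a single-machine stand-in for DSM synchronization.

```python
# Sketch of protecting shared data with a lock. The counter is a raw
# (unsynchronized) shared integer; the explicit lock makes each
# read-modify-write atomic. Without it, updates could be lost.
from multiprocessing import Process, Value, Lock

def add(counter, lock, n):
    for _ in range(n):
        with lock:               # acquire/release around the critical section
            counter.value += 1

if __name__ == "__main__":
    counter = Value('i', 0, lock=False)  # raw shared int, no built-in lock
    lock = Lock()
    ps = [Process(target=add, args=(counter, lock, 1000)) for _ in range(4)]
    for p in ps: p.start()
    for p in ps: p.join()
    print(counter.value)   # 4000 with the lock; possibly less without it
```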

  14. Methods of Achieving DSM • Hardware • Special network interfaces and cache coherence circuits • Software • Modifying the OS kernel • Adding a software layer between the operating system and the application • Combined hardware/software

  15. Software DSM Implementation • Page based • Using the system’s virtual memory • Object based • Shared data within a collection of objects • Access to shared data through an object-oriented discipline (ideally)

  16. Software Page-Based DSM Implementation [Figure: a page fault on a shared-memory access is resolved through each processor’s virtual memory page table]
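The page-based idea — letting the virtual memory system make one region visible through several mappings — can be sketched with mmap. Here two mappings of the same page see each other's writes, much as page-based DSM maps remote memory into the local address space (on a real cluster, the page-fault handler would fetch the page over the network).

```python
# Sketch of the page-based mechanism: two mmap views of the same
# file-backed page share storage through the virtual memory system,
# so a write through one view is visible through the other.
import mmap, os, tempfile

fd, path = tempfile.mkstemp()
os.ftruncate(fd, mmap.PAGESIZE)          # one page of shared backing store
view_a = mmap.mmap(fd, mmap.PAGESIZE)
view_b = mmap.mmap(fd, mmap.PAGESIZE)    # second mapping of the same page

view_a[:5] = b"hello"                    # write through one mapping
print(bytes(view_b[:5]))                 # the other mapping sees b"hello"

view_a.close(); view_b.close(); os.close(fd); os.unlink(path)
```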

  17. Performance Issues • Achieving Consistent Memory • Data altered by one processor must be made visible to other processors • Similar problem to multiprocessor cache coherence • Multiple writer protocols • False Sharing • Different data within the same page accessed by different processors

  18. Consistency Models • Strict Consistency • A processor sees the most recent update, i.e., a read returns the most recent write to the location • Sequential Consistency • Result of any execution is the same as some interleaving of the individual programs • Relaxed Consistency • Delay making writes visible to reduce messages • Release Consistency – programmer must use synchronization operators, acquire and release • Lazy Release Consistency – updates only propagated at the time of acquire
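The essence of release consistency — writes buffered locally and only made visible at the release synchronization point — can be shown with a small toy class. All names here are illustrative; this is not the protocol of any real DSM system, just the visibility rule in miniature.

```python
# Toy sketch of release consistency: writes are buffered locally and
# only merged into the globally visible copy when the writer releases.
# (Illustrative only; real DSM systems track dirty pages, not dicts.)
import threading

class ReleaseConsistentCell:
    def __init__(self):
        self._visible = {}            # globally visible copy
        self._buffer = {}             # writer's pending (buffered) updates
        self._lock = threading.Lock()

    def acquire(self):
        self._lock.acquire()          # synchronization point: acquire

    def write(self, key, value):
        self._buffer[key] = value     # buffered, not yet visible to readers

    def read(self, key):
        return self._visible.get(key)

    def release(self):
        self._visible.update(self._buffer)   # flush updates at release
        self._buffer.clear()
        self._lock.release()

cell = ReleaseConsistentCell()
cell.acquire()
cell.write("x", 42)
print(cell.read("x"))   # None: the write is not visible before release
cell.release()
print(cell.read("x"))   # 42: visible after release
```

Lazy release consistency goes one step further: the flush is deferred until the next processor performs an acquire, saving messages when no one is looking at the data.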

  19. Some Software DSM Systems • Ivy • “Integrated Shared Virtual Memory at Yale” • Probably the first page-based DSM system • TreadMarks • Page-based DSM system • JIAJIA • C based • Adsmith • C++ based

  20. Programming on a DSM Cluster • Regular C/C++ with library routines • Threads (e.g. IEEE Pthreads) • if implementation available • OpenMP • Standard being developed for shared memory programming • Provides compiler directives for shared data, synchronization, etc. • Recently implemented on clusters
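The thread-based style listed above can be sketched with Python's threading module standing in for Pthreads: each thread handles one iteration of a loop over shared data. In C with OpenMP, the same loop would be a single "#pragma omp parallel for" directive.

```python
# Sketch of shared-memory thread programming in the Pthreads style,
# using Python's threading module as a stand-in. Each thread writes
# its own slot of a shared list, so no lock is needed here.
import threading

result = [0] * 8

def body(i):
    result[i] = i * i      # each thread handles one loop iteration

threads = [threading.Thread(target=body, args=(i,)) for i in range(8)]
for t in threads: t.start()
for t in threads: t.join()
print(result)   # [0, 1, 4, 9, 16, 25, 36, 49]
```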

  21. Hot New Topic • Using Java on clusters • Several projects under way

  22. Cluster Computing Projects at UNCC • Two projects funded by National Science Foundation • 1995-1998 - project to introduce parallel programming into the freshman year • 1999-2001 - project to develop distributed shared memory materials

  23. Results/deliverables of 1995-98 NSF Project • Textbook published for undergraduate parallel programming using clusters - first such textbook • Conference papers, paper in IEEE Transactions on Education, 1998 • Extensive home page (http://www.cs.uncc.edu/par_prog) for educators, including a complete parallel programming course (RealAudio).

  24. Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, by Barry Wilkinson and Michael Allen, Prentice Hall, 1999, ISBN 0-13-671710-1. 2nd printing 1999. Japanese translation to appear.

  25. Main Parallel Programming Home Page

  26. Undergraduate Parallel Programming Teleclass • Broadcast on NC-REN network • NC Research and Education Network • Perhaps the first broadcast-quality state-wide network (~1985) • Connects 19 sites • Received simultaneously by • NC State University • UNC-Asheville • UNC-Wilmington • UNC-Greensboro

  27. New UNCC Cluster Computing Project • Current Status • Graduate student reviewing available DSM tools. Preparing a comparative study • Downloading Adsmith as first DSM system to evaluate

  28. New UNCC Cluster Computing Project • Plans • By December 1999 - obtain selected tools • 2000 onwards - evaluate tools, write applications • Set up home page for educators • Produce second edition of textbook (or possibly a new textbook)

  29. Problems in Teaching DSM • Being able to disseminate ideas to students • Reduction of programming complexity • How to clarify performance issues • Programming for performance without low-level programming or detailed knowledge of hardware • No known educational work on DSM

  30. DSM Research • All current research DSM projects concentrate upon obtaining the highest “performance” • Concentrate around DSM implementation (memory consistency algorithms, etc.) not programming the systems • No systematic work on application programming issues

  31. Conclusions • Cluster computing offers a very attractive cost effective method of achieving high performance • Promising future

  32. Quote: Gill wrote in 1958(quoting papers back to 1953): “ … There is therefore nothing new in the basic idea of parallel programming, but only its application to computers. The author cannot believe that there will be any insuperable difficulty in extending it to computers. It is not to be expected that the necessary programming techniques will be worked out overnight. Much experimenting remains to be done. After all, the techniques that are commonly used in programming today were only won at the cost of considerable toil several years ago. In fact the advent of parallel programming may do something to revive the pioneering spirit in programming which seems at the present to be degenerating into a rather dull and routine occupation.” Gill, S. (1958), “Parallel Programming,” The Computer Journal, Vol. 1, pp. 2-10.
