Cluster Computing Barry Wilkinson, Department of Computer Science, University of North Carolina at Charlotte, abw@uncc.edu
Overview of Talk • Outline of cluster computing • IEEE cluster computing initiative • Distributed shared memory on clusters • UNCC NSF projects on teaching DSM
Cluster Computing • Using a collection of interconnected computers together to work on a single problem/application • Originally became attractive with the availability of high-performance commodity computers
Early Key Projects Enabling Cluster Computing • PVM (Parallel Virtual Machine) message-passing software • Developed at Oak Ridge National Laboratory • Very widely used (free) • Berkeley NOW (network of workstations) project • Beowulf clusters – commodity PCs with the Linux OS
More Recent Message-Passing Work • MPI (Message-Passing Interface) • Standard for message-passing libraries • Defines routines but not their implementation • Very comprehensive • Version 1 released in 1994 with 120+ routines defined • Version 2 now available
Cluster Interconnects • Original Ethernet (10 Mb/s) • Fast Ethernet (100 Mb/s) • Gigabit Ethernet (1 Gb/s) • Myrinet (1.28 Gb/s) • proprietary – Myricom • more expensive but lower latency than Ethernet • DEC Memory Channel interface – for single system image (SSI)
Nodes • PCs (usually Intel) • SMPs (symmetric multiprocessors) • Multiple processors within one box, usually interconnected via a single bus • Heterogeneous clusters • Number of nodes • Small: 2–20 • Medium: 20–100 • Large: 100–1000s • Metacomputing: 10,000 …
Cluster Uses • Academic • Cost-effective parallel programming teaching • Research into distributed systems • Scientific • Usually for speed; grand challenge problems • Commercial • Databases, Internet servers … for load balancing
Classifications • High performance • Scalability/expandability • High throughput • High availability • Fault tolerance
IEEE Task Force on Cluster Computing • Aims to foster the use and development of clusters • Obtained IEEE approval in early 1999 • Main home page: http://www.dgs.monash.edu.au/~rajkumar/tfcc • Education home page: http://www.cs.uncc.edu/~abw/parallel/links.html
Distributed Shared Memory • Making the main memories of a cluster of computers look like a single memory with a single address space • [Diagram: processors accessing a single shared memory]
Advantages of DSM • Hides the message passing – the programmer does not explicitly specify message sends between processes • Simple programming model • Can handle complex and large databases without replication or sending the data to processes
Disadvantages of DSM • May incur a performance penalty • Must provide protection against simultaneous access to shared data (locks, etc.) • Little programmer control over the actual messages being generated • Good performance on irregular problems in particular may be difficult to achieve
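The need for protection against simultaneous access is the same one that arises with ordinary threads sharing memory. A minimal sketch using Python's threading module: without the lock, the read-modify-write on the shared counter could interleave between threads and lose updates.

```python
import threading

counter = 0
lock = threading.Lock()

def deposit(n):
    # The lock serializes the read-modify-write critical section;
    # in a DSM program the same discipline guards shared pages/objects.
    global counter
    for _ in range(n):
        with lock:          # acquire ... release around the update
            counter += 1

threads = [threading.Thread(target=deposit, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 40000 with the lock; without it, updates may be lost
```

On a DSM system the acquire/release pair also interacts with the consistency model, as the later slides on release consistency describe.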
Methods of Achieving DSM • Hardware • Special network interfaces and cache coherence circuits • Software • Modifying the OS kernel • Adding a software layer between the operating system and the application • Combined hardware/software
Software DSM Implementation • Page based • Using the system's virtual memory • Object based • Shared data within a collection of objects • Access to shared data through an object-oriented discipline (ideally)
Software Page-Based DSM Implementation • [Diagram: processors with virtual memory tables; a page fault on shared memory triggers a page fetch]
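The page-based approach can be sketched with a toy model. Real systems trap hardware page faults through the virtual-memory machinery (e.g. protecting pages with mprotect on Unix) and fetch the missing page from its owner over the network; this purely illustrative sketch replaces all of that with a dictionary lookup, but shows the idea of whole-page fetch on demand.

```python
# Toy model of page-based DSM: a node caches whole pages on demand.
# Real implementations use the MMU and OS page protection to detect
# faults; here a missing dictionary entry plays the role of the fault.
PAGE_SIZE = 4

class Node:
    def __init__(self, home):
        self.home = home      # "home" store holding the authoritative pages
        self.cached = {}      # locally cached copies, filled on "page fault"
        self.faults = 0

    def read(self, addr):
        page = addr // PAGE_SIZE
        if page not in self.cached:               # page fault:
            self.faults += 1
            self.cached[page] = list(self.home[page])  # fetch whole page
        return self.cached[page][addr % PAGE_SIZE]

home = {0: [10, 11, 12, 13], 1: [20, 21, 22, 23]}
n = Node(home)
values = [n.read(a) for a in (0, 1, 2, 5)]
print(values, n.faults)  # four reads cause only two page fetches
```

Fetching whole pages amortizes communication when access has spatial locality, which is exactly what the false-sharing slide shows going wrong when locality assumptions fail.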
Performance Issues • Achieving consistent memory • Data altered by one processor must become visible to other processors • Similar problem to multiprocessor cache coherence • Multiple-writer protocols • False sharing • Different data items within the same page accessed by different processors
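False sharing can be illustrated with a toy model: two nodes repeatedly update different variables that happen to lie in the same page, so under a single-writer protocol the page migrates on every alternating write even though no datum is actually shared. The protocol here is deliberately simplified for illustration.

```python
# Toy illustration of false sharing under a single-writer protocol:
# variables x and y live in the same page, so each write by one node
# forces the page to migrate away from the other node.
transfers = 0
owner = None   # which node currently holds the writable copy of the page

def write(node, var):
    global owner, transfers
    if owner != node:     # page must migrate to the writing node first
        transfers += 1
        owner = node
    # ... the write to var within the shared page happens here ...

for _ in range(5):
    write("A", "x")   # node A updates x
    write("B", "y")   # node B updates y - same page, unrelated data!

print(transfers)  # 10: the page ping-pongs on every alternating write
```

Multiple-writer protocols (as on the slide above) attack exactly this: they let both nodes write concurrently and merge the diffs later instead of migrating the page.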
Consistency Models • Strict consistency • A processor sees the most recent update, i.e., a read returns the most recent write to the location • Sequential consistency • The result of any execution is the same as some interleaving of the individual programs • Relaxed consistency • Delay making writes visible to reduce messages • Release consistency – programmer must use synchronization operations, acquire and release • Lazy release consistency – updates propagated only at the time of an acquire
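The relaxed models can be sketched as write buffering. Under (eager) release consistency a node may hold its writes locally and only push them to the shared store when it executes a release; lazy release consistency would delay further, until the next acquire by another node. This toy sketch shows only the eager variant and omits acquire-side bookkeeping.

```python
# Toy model of eager release consistency: writes are buffered locally
# and only made visible to other nodes at a release, so several writes
# to the same location cost a single propagation.
shared = {}   # the "real" shared memory other nodes read

class Node:
    def __init__(self):
        self.buffer = {}            # writes not yet visible to others

    def write(self, key, value):
        self.buffer[key] = value    # delayed: reduces message traffic

    def release(self):
        shared.update(self.buffer)  # make all buffered writes visible
        self.buffer.clear()

a = Node()
a.write("x", 1)
a.write("x", 2)          # two writes, but only one update at release time
before = dict(shared)    # {} - nothing visible before the release
a.release()
print(before, shared)    # {} then {'x': 2}
```

The programmer's obligation on the slide follows directly: reads of shared data are only guaranteed meaningful when bracketed by the acquire/release operations that trigger propagation.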
Some Software DSM Systems • Ivy • "Integrated shared Virtual memory at Yale" • Probably the first page-based DSM system • TreadMarks • page-based DSM system • JIAJIA • C based • Adsmith • C++ based
Programming on a DSM Cluster • Regular C/C++ with library routines • Threads (e.g. IEEE Pthreads) • if an implementation is available • OpenMP • Standard being developed for shared-memory programming • Provides compiler directives for shared data, synchronization, etc. • Recently implemented on clusters
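OpenMP itself is expressed as compiler directives on ordinary loops in C or Fortran (e.g. `#pragma omp parallel for`). As an analogue of that parallel-loop-with-reduction idea, here is a sketch using a Python thread pool; the pool plays the role the directive's generated runtime would play, and none of this is real OpenMP.

```python
# Analogue of an OpenMP-style parallel loop with a reduction.
# In C this would be:
#   #pragma omp parallel for reduction(+:total)
#   for (i = 0; i < 100; i++) total += i * i;
# Here a thread pool divides the iterations among workers instead.
from concurrent.futures import ThreadPoolExecutor

def body(i):
    return i * i          # loop body, free of cross-iteration dependences

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(body, range(100)))

total = sum(results)      # the "reduction" combines partial results
print(total)              # sum of squares 0..99 = 328350
```

The attraction for DSM clusters is visible here: the programmer marks a loop as parallel and names the shared/reduced data, and the system decides how iterations map to nodes.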
Hot New Topic • Using Java on clusters • Several projects under way
Cluster Computing Projects at UNCC • Two projects funded by National Science Foundation • 1995-1998 - project to introduce parallel programming into the freshman year • 1999-2001 - project to develop distributed shared memory materials
Results/deliverables of 1995-98 NSF Project • Textbook published for undergraduate parallel programming using clusters – first such textbook • Conference papers; paper in IEEE Transactions on Education, 1998 • Extensive home page (http://www.cs.uncc.edu/par_prog) for educators, including a complete parallel programming course (RealAudio)
Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers by Barry Wilkinson and Michael Allen, Prentice Hall, 1999, ISBN 0-13-671710-1 • 2nd printing 1999 • Japanese translation to appear
Undergraduate Parallel Programming Teleclass • Broadcast on NC-REN network • NC Research and Education Network • Perhaps the first broadcast-quality state-wide network (~1985) • Connects to 19 sites • Received simultaneously by • NC State University • UNC-Asheville • UNC-Wilmington • UNC-Greensboro
New UNCC Cluster Computing Project • Current Status • Graduate student reviewing available DSM tools. Preparing a comparative study • Downloading Adsmith as first DSM system to evaluate
New UNCC Cluster Computing Project • Plans • By December 1999 – obtain selected tools • 2000 onwards – Evaluate tools, write applications • Set up home page for educators • Produce second edition of textbook (or possibly a new textbook)
Problems in Teaching DSM • Being able to disseminate ideas to students • Reduction of programming complexity • How to clarify performance issues • Programming for performance without low-level programming or detailed knowledge of hardware • No known educational work on DSM
DSM Research • All current research DSM projects concentrate upon obtaining the highest “performance” • Concentrate around DSM implementation (memory consistency algorithms, etc.) not programming the systems • No systematic work on application programming issues
Conclusions • Cluster computing offers a very attractive, cost-effective method of achieving high performance • Promising future
Quote: Gill wrote in 1958 (quoting papers back to 1953): "… There is therefore nothing new in the basic idea of parallel programming, but only its application to computers. The author cannot believe that there will be any insuperable difficulty in extending it to computers. It is not to be expected that the necessary programming techniques will be worked out overnight. Much experimenting remains to be done. After all, the techniques that are commonly used in programming today were only won at the cost of considerable toil several years ago. In fact the advent of parallel programming may do something to revive the pioneering spirit in programming which seems at the present to be degenerating into a rather dull and routine occupation." Gill, S. (1958), "Parallel Programming," The Computer Journal, Vol. 1, pp. 2-10.