Software infrastructure for the I-WAY high-performance distributed computing experiment. Ian Foster, Jonathan Geisler, Bill Nickless, Warren Smith, and Steven Tuecke. Grid Computing - Making the Global Infrastructure a Reality, chapter 4, pp. 101-106. Wiley and Sons.
Outline
• Introduction
• The I-WAY network
• I-WAY infrastructure
    Point of presence machines
    Scheduler
    Security
    Parallel programming tools
    File systems
• Conclusions
I-WAY
• In brief, the I-WAY was an ATM network connecting supercomputers, mass storage systems, and advanced visualization devices at 17 different sites within North America.
• I-Soft, I-POP, I-WAY
Novel concepts and techniques
• Point of presence machines
• Scheduler proxies
• Authorization proxies
• Network-aware parallel programming tools
The I-WAY network
• The I-WAY network connected
    display devices (CAVE, ImmersaDesk)
    mass storage systems
    specialized instruments
    supercomputers of different architectures
    …
• Why ATM? ATM was chosen over traditional Internet connectivity because it offered higher bandwidth and could handle audio, video, and data more efficiently.
Point of presence machines
• I-POP: the I-POP machines provided uniform authentication, resource reservation, process creation, and communication functions across I-WAY resources.
• I-Soft: a software environment deployed on the I-POP machines. It provided a variety of services:
    1. scheduling
    2. security
    3. parallel programming support
    4. a distributed file system
I-POP discussion
• Having all I-POPs share a single AFS cell proved extremely useful, both as a means of maintaining a single, shared copy of the I-Soft code and as a mechanism for distributing I-WAY scheduling information.
• The I-POPs were never used to monitor or control the ATM network.
Scheduler design
• Computational Resource Broker (CRB): requests are handled by an independent entity, the CRB, which then negotiates with the site schedulers that manage individual resources. In the I-WAY, a single CRB was sufficient.
• Virtual machines: predefined disjoint subsets of I-WAY computers.
• User-to-CRB and CRB-to-resource protocols
Scheduler design (cont.)
• Functions of the scheduler
    1. management functions
    2. user functions
• Central scheduler and local scheduler
• Two-part strategy
    1. A central scheduler daemon managed and allocated time on the different virtual machines on a first-come, first-served basis.
    2. A local scheduler daemon at each site communicated directly with the local site scheduler. Local schedulers performed site-dependent actions in response to requests from the central scheduler to allocate resources, create processes, and deallocate resources.
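The two-part strategy can be illustrated with a minimal Python sketch. It is not the I-Soft implementation: the class names, method names, and the simplification of "allocating time" to exclusive ownership of a virtual machine are all assumptions made for illustration. The local scheduler here only records the requests it receives instead of talking to a real site scheduler.

```python
from collections import deque


class LocalScheduler:
    """Hypothetical stand-in for a per-site local scheduler daemon."""

    def __init__(self, site):
        self.site = site
        self.log = []  # records site-dependent actions instead of performing them

    def allocate(self, user):
        self.log.append(("allocate", user))

    def create_process(self, user, program):
        self.log.append(("create", user, program))

    def deallocate(self, user):
        self.log.append(("deallocate", user))


class CentralScheduler:
    """Hypothetical central daemon: first-come, first-served per virtual machine."""

    def __init__(self, virtual_machines):
        # virtual_machines maps a VM name to the local schedulers of its sites.
        self.vms = virtual_machines
        self.owner = {name: None for name in virtual_machines}
        self.waiting = {name: deque() for name in virtual_machines}

    def request(self, user, vm):
        """Grant the VM if it is free, otherwise queue the user in arrival order."""
        if self.owner[vm] is None:
            self._grant(user, vm)
            return True
        self.waiting[vm].append(user)
        return False

    def run(self, user, vm, program):
        """Ask every local scheduler in the VM to create a process."""
        if self.owner[vm] != user:
            raise PermissionError(f"{user} does not hold {vm}")
        for local in self.vms[vm]:
            local.create_process(user, program)

    def release(self, user, vm):
        """Deallocate the VM and hand it to the next waiting user, if any."""
        if self.owner[vm] != user:
            raise PermissionError(f"{user} does not hold {vm}")
        for local in self.vms[vm]:
            local.deallocate(user)
        self.owner[vm] = None
        if self.waiting[vm]:
            self._grant(self.waiting[vm].popleft(), vm)

    def _grant(self, user, vm):
        for local in self.vms[vm]:
            local.allocate(user)
        self.owner[vm] = user
```

In this sketch a second request for a busy virtual machine simply waits in a queue, which mirrors the first-come, first-served policy named on the slide.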
Scheduler discussion
• Limitations
    1. Too-restrictive interfaces between user and scheduler and between scheduler and local resources.
    2. The concept of using fixed virtual machines as schedulable units was only moderately successful.
    3. The long-term solution is probably to develop more sophisticated schedulers for resources that are to be incorporated into I-WAY-like systems.
Security design
• Two parts
    authentication to the I-POP environment
    authentication to the local sites
• Authentication to I-POPs was handled by using a telnet client modified to use Kerberos authentication and encryption.
• The scheduler software served as an "authentication proxy."
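The idea of an authentication proxy can be sketched in Python as follows. This is only an illustration under stated assumptions: the class, its methods, and the account-map format are hypothetical, and the `verified` flag stands in for the outcome of the Kerberos-authenticated login to the I-POP, which is not modeled here.

```python
class AuthenticationProxy:
    """Hypothetical proxy: one I-POP login, then per-site identities on the user's behalf."""

    def __init__(self, account_map):
        # Maps (iway_user, site) to the user's local account name at that site.
        self.account_map = account_map
        self.sessions = set()

    def login(self, iway_user, verified):
        # 'verified' stands in for a successful Kerberos authentication to the I-POP.
        if not verified:
            raise PermissionError("I-POP authentication failed")
        self.sessions.add(iway_user)

    def local_identity(self, iway_user, site):
        """Return the local account to use at a site, without asking the user again."""
        if iway_user not in self.sessions:
            raise PermissionError(f"{iway_user} is not authenticated to the I-POP")
        try:
            return self.account_map[(iway_user, site)]
        except KeyError:
            # No mapping means the user has no account at that site.
            raise PermissionError(f"{iway_user} has no account at {site}") from None
```

A lookup for a site where the user has no entry fails, which reflects the need for a separate account at every site the user wants to reach.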
Security discussion
• Authenticate once
• A more fundamental limitation of the I-WAY authentication scheme as implemented was that each user had to have an account at each site to which access was required.
Parallel tools design
• I-WAY must support the creation of processes on different processors and the communication of data between these processes.
• These tools should ideally relieve the programmer of the need to consider low-level details relating to network structure.
Parallel tools design (cont.)
• Irsh and ixterm
• Nexus multithreaded communication library
    Nexus supports automatic configuration mechanisms that allow it to use information contained in resource databases to determine which startup mechanisms, network interfaces, and protocols to use in different situations.
• CAVEcomm and MPICH
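Configuration-driven protocol selection of this kind can be sketched in Python. This is not the Nexus API or its database format: the host names, the record layout, the protocol labels, and the preference order below are all assumptions chosen for illustration.

```python
# Hypothetical preference order, most desirable protocol first.
PREFERENCE = ["mpl", "atm", "tcp"]

# Hypothetical set of protocols that only work between nodes of the same machine.
SAME_MACHINE_ONLY = {"mpl"}

# Hypothetical resource database: one record per host.
RESOURCE_DB = {
    "sp2-node1": {"machine": "sp2", "protocols": {"mpl", "atm", "tcp"}},
    "sp2-node2": {"machine": "sp2", "protocols": {"mpl", "atm", "tcp"}},
    "cave-host": {"machine": "cave", "protocols": {"atm", "tcp"}},
    "workstation": {"machine": "ws", "protocols": {"tcp"}},
}


def choose_protocol(src, dst, db=RESOURCE_DB, preference=PREFERENCE):
    """Pick the most preferred protocol that both endpoints support."""
    a, b = db[src], db[dst]
    common = a["protocols"] & b["protocols"]
    for proto in preference:
        if proto not in common:
            continue
        # Skip machine-internal protocols when the hosts are on different machines.
        if proto in SAME_MACHINE_ONLY and a["machine"] != b["machine"]:
            continue
        return proto
    raise LookupError(f"no common protocol between {src} and {dst}")
```

The application code calls `choose_protocol` with two host names and never names a network interface itself, so the quality of the result depends entirely on how accurate the database entries are.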
Parallel tools discussion
• A significant difficulty revealed by the I-WAY experiment related to the mechanisms used to generate and maintain the configuration information used by Nexus.
• Automatic discovery techniques.
File systems
• I-WAY-like systems introduce three related requirements with a file-system flavor.
    1. Many users require access to various status data and utility programs at many different sites.
    2. Users running programs on remote computers must be able to access executables and configuration data at many different sites.
    3. Application programs must be able to read and write potentially large data sets.
• The I-Soft system supported only the first of these requirements.
File systems (cont.)
• An AFS cell (with three servers for reliability) was deployed and used as a shared repository for I-WAY software, and also to maintain scheduler status information.
Conclusions
• SC'95
• Further I-WAY-like systems.