
Disk Directed I/O for MIMD Multiprocessors



Presentation Transcript


  1. Disk Directed I/O for MIMD Multiprocessors David Kotz Department of Computer Science Dartmouth College

  2. Overview • Introduction • Collective I/O • Implementation • Experiments • Results • Example • Conclusions • Questions

  3. Introduction • Scientific applications • Traditional I/O • Disk directed I/O

  4. MIMD Architecture with CPs and IOPs (Message Passing)

  5. Collective I/O • Drawbacks of the traditional file-system interface • Independent requests from each CP • A separate file-system call per request • Data declustering across disks • Collective-I/O interface • CPs cooperate on a single, larger request • Provides a high-level interface • Enhances performance (see the sketch below)
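A minimal sketch of what a collective-read interface might look like, assuming a POSIX file and a simple row-block decomposition. The names (collective_read, cp_portion_t, matrix.dat) are hypothetical, not the interface from the paper; the point is that each CP issues one high-level call describing its piece of a shared matrix instead of many small independent requests.

```c
/* Hypothetical collective-read sketch (illustrative names, not the paper's
 * API).  Each CP describes the rows it owns; a real collective implementation
 * would coordinate CPs and IOPs, but this fallback just issues one large
 * contiguous read per CP. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct {
    int first_row;   /* first matrix row owned by this CP  */
    int num_rows;    /* number of contiguous rows it owns  */
} cp_portion_t;

/* All CPs call this with the same file and matrix shape. */
ssize_t collective_read(int fd, double *local_buf,
                        int row_len, cp_portion_t p)
{
    off_t  offset = (off_t)p.first_row * row_len * sizeof(double);
    size_t length = (size_t)p.num_rows * row_len * sizeof(double);
    return pread(fd, local_buf, length, offset);   /* one request, not many */
}

int main(void)
{
    int fd = open("matrix.dat", O_RDONLY);          /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    cp_portion_t mine = { .first_row = 0, .num_rows = 128 };
    int row_len = 512;
    double *buf = malloc((size_t)mine.num_rows * row_len * sizeof(double));

    ssize_t n = collective_read(fd, buf, row_len, mine);
    printf("read %zd bytes for rows %d..%d\n",
           n, mine.first_row, mine.first_row + mine.num_rows - 1);

    free(buf);
    close(fd);
    return 0;
}
```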

  6. Collective I/O (Contd..) • Implementation alternatives • Traditional caching • Two-phase I/O • Disk-directed I/O • Traditional caching: no explicit collective interface; each CP sends its own requests to the IOPs • Two-phase I/O: the CPs collectively determine and carry out an optimized access plan; data is first transferred in an order conforming to the disk layout, then permuted among the CPs into the layout the application wants (see the sketch below)
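To make the two-phase idea concrete, here is a single-process sketch (my simplification for illustration, not the paper's code): in phase one each "CP" reads a contiguous slice of the file in the order the data lies on disk; in phase two the data is permuted among the CPs into the CYCLIC distribution the application actually wants.

```c
/* Single-process sketch of two-phase I/O (a simplification; real two-phase
 * I/O exchanges data between CPs with message passing). */
#include <stdio.h>

#define N 16   /* elements in the "file"        */
#define P 4    /* number of compute processors  */

int main(void)
{
    int file[N], phase1[P][N / P], phase2[P][N / P];

    for (int i = 0; i < N; i++) file[i] = i;     /* stand-in for the disk file */

    /* Phase 1: each CP reads a contiguous chunk, matching the disk layout. */
    for (int cp = 0; cp < P; cp++)
        for (int j = 0; j < N / P; j++)
            phase1[cp][j] = file[cp * (N / P) + j];

    /* Phase 2: permute among CPs into the CYCLIC distribution the
     * application wants (element i belongs to CP i % P). */
    for (int i = 0; i < N; i++) {
        int value = phase1[i / (N / P)][i % (N / P)];
        phase2[i % P][i / P] = value;
    }

    for (int cp = 0; cp < P; cp++) {
        printf("CP %d holds:", cp);
        for (int j = 0; j < N / P; j++) printf(" %d", phase2[cp][j]);
        printf("\n");
    }
    return 0;
}
```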

  7. Collective I/O (Contd..) • Traditional Caching

  8. Collective I/O (Contd..) • Two-Phase I/O

  9. Collective I/O (Contd..) • Disk-Directed I/O

  10. Collective I/O (Contd..) • Disk-directed I/O • The I/O can conform not only to the logical layout of the file but also to its physical layout on disk. • If the disks are RAIDs, the I/O can be organized to perform full-stripe writes for maximum performance. • Only one I/O request goes to each IOP. • There is no communication among the IOPs. • Disk scheduling is improved by sorting the block list for each disk (see the sketch below). • Two buffers per CP per disk per file.
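A sketch of the block-list sorting step, assuming requests are simple (disk, block) pairs (the data structures here are illustrative, not the simulator's): sorting by disk and then by block number groups each disk's requests and puts them in ascending order, so the head sweeps largely in one direction instead of seeking back and forth.

```c
/* Presorting a block list per disk (illustrative structures). */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int disk;    /* which disk holds the block     */
    int block;   /* physical block number on disk  */
} req_t;

static int cmp_req(const void *a, const void *b)
{
    const req_t *x = a, *y = b;
    if (x->disk != y->disk) return x->disk - y->disk;   /* group by disk      */
    return x->block - y->block;                          /* ascending on disk  */
}

int main(void)
{
    req_t reqs[] = { {1, 90}, {0, 12}, {1, 3}, {0, 7}, {1, 45}, {0, 50} };
    size_t n = sizeof reqs / sizeof reqs[0];

    qsort(reqs, n, sizeof reqs[0], cmp_req);

    for (size_t i = 0; i < n; i++)
        printf("disk %d block %d\n", reqs[i].disk, reqs[i].block);
    return 0;
}
```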

  11. Implementation • Files were striped across all disks, block by block (see the sketch below) • Each IOP served more than one disk • Message passing and DMA • Each message was encoded • Each request contained a reply action • Memget and Memput messages • Proteus simulator on a DEC-5000 workstation
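A small sketch of block-by-block striping under the usual round-robin assumption (the simulator's exact mapping may differ): logical block b of a file striped over D disks lands on disk b mod D at local block b / D.

```c
/* Round-robin, block-by-block striping sketch (assumed mapping). */
#include <stdio.h>

#define NUM_DISKS 4

int main(void)
{
    for (int logical_block = 0; logical_block < 10; logical_block++) {
        int disk        = logical_block % NUM_DISKS;  /* which disk       */
        int local_block = logical_block / NUM_DISKS;  /* position on disk */
        printf("logical block %2d -> disk %d, local block %d\n",
               logical_block, disk, local_block);
    }
    return 0;
}
```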

  12. Implementation (Contd..) • Simulation parameters

  13. Implementation (Contd..) • Implementing disk-directed I/O • The IOP creates a new thread for each request • The thread computes the disk blocks involved, sorts them by location, and informs the disk threads • It allocates two one-block buffers for each local disk (see the sketch below) • It creates a thread to manage each buffer • Implementing traditional caching • CPs did not cache or prefetch data • CPs sent concurrent requests to the relevant IOPs • Each IOP maintained a double buffer to satisfy requests from each CP to each disk
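A minimal pthreads sketch of the two-buffer scheme for one disk (my own simplification of the idea, not the simulator's code): one thread stands in for the disk, filling the two one-block buffers, while the main thread stands in for delivery to a CP, draining them, so disk reads overlap with data transfer.

```c
/* Double-buffering sketch: a "disk" thread fills two one-block buffers
 * while the main thread drains them to a "CP". */
#include <pthread.h>
#include <stdio.h>

#define BLOCK_SIZE 8
#define NUM_BLOCKS 6
#define NUM_BUFS   2            /* two buffers per disk, as on the slide */

static char buffers[NUM_BUFS][BLOCK_SIZE];
static int  filled[NUM_BUFS];   /* 1 = buffer holds a block awaiting delivery */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

static void *disk_thread(void *arg)
{
    (void)arg;
    for (int b = 0; b < NUM_BLOCKS; b++) {
        int slot = b % NUM_BUFS;
        pthread_mutex_lock(&lock);
        while (filled[slot])                     /* wait until the CP drained it */
            pthread_cond_wait(&cond, &lock);
        snprintf(buffers[slot], BLOCK_SIZE, "blk%d", b);   /* "read" from disk */
        filled[slot] = 1;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, disk_thread, NULL);

    for (int b = 0; b < NUM_BLOCKS; b++) {       /* deliver blocks to the CP */
        int slot = b % NUM_BUFS;
        pthread_mutex_lock(&lock);
        while (!filled[slot])                    /* wait until the disk filled it */
            pthread_cond_wait(&cond, &lock);
        printf("delivered %s to CP\n", buffers[slot]);
        filled[slot] = 0;
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
    }

    pthread_join(tid, NULL);
    return 0;
}
```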

  14. Experiments • Different configurations • File access patterns • Disk layout • Number of CPs • Number of IOPs • Number of disks • File and disk layout • A 10 MB file was striped across the disks, block by block • Both contiguous and random layouts were used

  15. Experiments (Contd..) • 1D and 2D matrices were used • Access patterns • NONE: the dimension is not distributed, so it maps entirely to one CP • BLOCK: distributed among the CPs in contiguous blocks • CYCLIC: distributed round-robin among the CPs (see the sketch below) • Record sizes of 8 bytes and 8192 bytes were used • HPF array distributions
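To make the BLOCK and CYCLIC patterns concrete, here is a small sketch of which CP owns element i of a dimension of N elements distributed over P CPs, using the standard HPF-style formulas (an added illustration, not from the slides; N divisible by P is assumed for simplicity).

```c
/* HPF-style owner computation for BLOCK and CYCLIC distributions. */
#include <stdio.h>

#define N 12   /* elements in the distributed dimension */
#define P 4    /* compute processors                    */

int main(void)
{
    int block_size = N / P;          /* contiguous chunk per CP under BLOCK */

    for (int i = 0; i < N; i++) {
        int block_owner  = i / block_size;   /* BLOCK: contiguous chunks */
        int cyclic_owner = i % P;            /* CYCLIC: round-robin      */
        printf("element %2d -> BLOCK CP %d, CYCLIC CP %d\n",
               i, block_owner, cyclic_owner);
    }
    return 0;
}
```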

  16. Experiments (Contd..) • Contiguous disk layout

  17. Example • LU decomposition • Used in solving linear systems of equations • An N x N matrix M is decomposed into a lower-triangular L and an upper-triangular U such that LU = M • Columns are stored in the processors' memories • Each processor's subset of columns is called a "slab" (see the sketch below)
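A small sketch of the slab bookkeeping implied above (the sizes here are illustrative, not the experiment's): with an N x N matrix of doubles stored by columns and a slab width of W columns, slab s covers columns sW through (s+1)W - 1, and each slab read or write moves W * N * sizeof(double) bytes.

```c
/* Slab bookkeeping sketch for the out-of-core LU example
 * (illustrative sizes only). */
#include <stdio.h>

#define N          1024   /* matrix is N x N doubles            */
#define SLAB_WIDTH 32     /* columns per slab (one CP's subset)  */

int main(void)
{
    int  num_slabs  = N / SLAB_WIDTH;
    long slab_bytes = (long)SLAB_WIDTH * N * sizeof(double);

    for (int s = 0; s < num_slabs; s++) {
        int first_col = s * SLAB_WIDTH;
        int last_col  = first_col + SLAB_WIDTH - 1;
        printf("slab %2d: columns %4d..%4d, %ld bytes per read/write\n",
               s, first_col, last_col, slab_bytes);
    }
    return 0;
}
```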

  18. Example (Contd..) • Performance measurement • 8 CPs, 8 IOPs, and 8 disks, one disk per IOP • 4 MB of matrix data • Slab sizes of 16, 32, or 128 columns • Random or contiguous layout • Block sizes of 1 KB, 4 KB, or 8 KB • The traditional file system used 128 blocks of total cache • The disk-directed file system used 16 blocks of total buffer space • Results • Disk-directed I/O always improved the performance of the LU decomposition, with both contiguous and random layouts

  19. Related work • PIFS • Data flow is controlled by the file system rather than by the application. • Jovian collective-I/O library • Combines fragmented requests from many CPs into larger requests that can be passed to the IOPs. • Transparent Informed Prefetching (TIP) • Enables applications to submit detailed hints about their future file activity to the file system, which can use the hints for accurate, aggressive prefetching. • TickerTAIP RAID controller • Uses collaborative execution similar to that of the disk-directed file system.

  20. Conclusions • Disk-directed I/O avoids many of the pitfalls inherent in the traditional caching method, such as cache thrashing and extraneous disk-head movement. • It presorts disk requests to optimize head movement and has smaller buffer requirements. • It is most valuable when making large, collective transfers of data between multiple disks and multiple memories.

  21. Questions • What is collective I/O? • What are the advantages of disk-directed I/O?
