
The TickerTAIP Parallel RAID Architecture


  1. The TickerTAIP Parallel RAID Architecture. P. Cao, S. B. Lim, S. Venkatraman, J. Wilkes, HP Labs

  2. RAID Architectures • Traditional RAID architectures have • A central RAID controller interfacing to the host and processing all I/O requests • Disk drives organized in strings • One disk controller per disk string (mostly SCSI)

  3. Limitations • The capabilities of the RAID controller are crucial to the performance of the array • Can become memory-bound • Presents a single point of failure • Can become a bottleneck • Having a spare controller is an expensive proposition

  4. Our Solution • Have a cooperating set of array controller nodes • Major benefits are: • Fault-tolerance • Scalability • Smooth incremental growth • Flexibility: can mix and match components

  5. TickerTAIP [architecture diagram: hosts attach through host interconnects to a set of cooperating controller nodes]

  6. TickerTAIP (I) A TickerTAIP array consists of: • Worker nodes, each connected to one or more local disks through a bus • Originator nodes interfacing with host computer clients • A high-performance small-area network: • Mesh-based switching network (Datamesh) • PCI backplanes for small networks

  7. TickerTAIP (II) • Worker and originator roles can be combined or kept on separate nodes • Parity calculations are done in a decentralized fashion: • The bottleneck is memory bandwidth, not CPU speed • Cheaper than providing faster paths to a dedicated parity engine
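The split between worker and originator roles can be pictured with a small configuration sketch. This is illustrative Python only, with hypothetical class and field names; it is not TickerTAIP code, just a way to show nodes that combine both roles alongside pure workers.

```python
# Illustrative sketch of a TickerTAIP-style array layout.
# Class and field names are assumptions made for this example.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    is_worker: bool = False          # has local disks, takes part in parity work
    is_originator: bool = False      # has a host interface, accepts client requests
    disks: list = field(default_factory=list)

# A small array: every node is a worker, and two of them also act as
# originators on behalf of host clients (roles can be combined or separated).
array = [
    Node("n0", is_worker=True, is_originator=True, disks=["d0"]),
    Node("n1", is_worker=True, disks=["d1"]),
    Node("n2", is_worker=True, disks=["d2"]),
    Node("n3", is_worker=True, is_originator=True, disks=["d3"]),
]
```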

  8. Design Issues (I) • Normal-mode reads are trivial to implement • Normal-mode writes: • Three ways to calculate the new parity: • full stripe: calculate parity from the new data alone • small stripe: requires at least four I/Os (read old data and old parity, write new data and new parity) • large stripe: if we rewrite more than half a stripe, compute the parity by reading the unmodified data blocks
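As a rough illustration of the three parity-update rules above, the sketch below assumes byte-wise XOR parity over equal-sized blocks; the function names are hypothetical and this is a minimal sketch, not the paper's implementation.

```python
# Minimal sketch of the three RAID-5 parity-update choices (illustrative).
# Blocks are equal-length byte strings; parity is the byte-wise XOR of all
# data blocks in a stripe.

def xor(*blocks):
    """Byte-wise XOR of one or more equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def full_stripe_parity(new_blocks):
    # Full stripe: every data block is rewritten, so the new parity is
    # simply the XOR of the new data.
    return xor(*new_blocks)

def small_stripe_parity(old_parity, old_block, new_block):
    # Small stripe (read-modify-write): new parity = old parity XOR
    # old data XOR new data. Hence the four I/Os: read old data, read
    # old parity, write new data, write new parity.
    return xor(old_parity, old_block, new_block)

def large_stripe_parity(new_blocks, unmodified_blocks):
    # Large stripe (reconstruct-write): more than half the stripe changes,
    # so read the unmodified blocks and XOR them with the new data.
    return xor(*new_blocks, *unmodified_blocks)
```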

  9. Design Issues (II) • Parity can be calculated: • At the originator node • Solely parity: at the parity node for the stripe • Must ship all involved blocks to the parity node • At parity: same as solely parity, but partial results for small-stripe writes are computed at the worker nodes and shipped to the parity node • Generates less traffic than solely parity
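A sketch of the "at parity" variant for a small-stripe write: each worker XORs its old and new data locally and ships only that partial result, so one block crosses the network instead of two. Function names and message flow are assumptions for illustration, not the paper's API.

```python
# Illustrative sketch of the "at parity" scheme (not the paper's API).

def xor2(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def worker_partial(old_block: bytes, new_block: bytes) -> bytes:
    # The worker combines its old and new data locally and ships only
    # this partial result to the parity node.
    return xor2(old_block, new_block)

def parity_node_update(old_parity: bytes, partials: list) -> bytes:
    # The parity node folds every partial result into the old parity
    # to obtain the new parity block.
    parity = old_parity
    for partial in partials:
        parity = xor2(parity, partial)
    return parity
```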

  10. Handling single failures (I) • TickerTAIP must provide request atomicity • Disk failures are treated as in standard RAID • Worker failures: • Treated like disk failures • Detected by time-outs (assuming fail-silent nodes) • A distributed consensus algorithm reaches consensus among the remaining nodes

  11. Handling single failures (II) • Originator failures: • Worst case is the failure of an originator/worker node during a write • TickerTAIP uses a two-phase commit protocol: • Two options: • Late commit • Early commit

  12. Late commit/Early commit • Late commit only commits after parity has been computed • Only the writes remain to be performed • Early commit commits as soon as new data and old data have been replicated • Somewhat faster • Harder to implement
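The difference between the two commit points can be sketched as the ordering of steps in a stripe write. The helpers below only log the phase ordering; they are stand-ins chosen for this example, not TickerTAIP's protocol code.

```python
# Hedged sketch of where the commit point falls under late vs. early commit.

def log(step: str) -> None:
    print(step)

def late_commit_write() -> None:
    log("read old data and old parity")
    log("replicate new data across nodes")
    log("compute new parity")
    log("COMMIT (late): only the disk writes remain to be performed")
    log("write new data and new parity")

def early_commit_write() -> None:
    log("read old data and old parity")
    log("replicate new and old data across nodes")
    log("COMMIT (early): enough state survives a failure to redo the rest")
    log("compute new parity")
    log("write new data and new parity")

if __name__ == "__main__":
    late_commit_write()
    print("---")
    early_commit_write()
```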

  13. Handling multiple failures • Power failures during writes can corrupt the stripe being written: • Use a UPS to eliminate them • Must guarantee that some specific requests are always executed in a given order: • For example, cannot write data blocks before updating the i-nodes containing their block addresses • Uses request sequencing to achieve partial write ordering

  14. Request sequencing (I) • Each request • Is given a unique identifier • Can specify one or more requests on whose previous completion it depends (explicit dependencies) • TickerTAIP adds enough implicit dependencies to prevent concurrent execution of overlapping requests
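The sequencing idea can be sketched as follows: each request gets a unique id, callers may name explicit dependencies, and implicit dependencies are added between requests whose block ranges overlap so that they never execute concurrently. This is a minimal sketch under those assumptions, not the TickerTAIP sequencer itself.

```python
# Minimal request-sequencing sketch (illustrative names and structure).
from itertools import count

class Sequencer:
    def __init__(self):
        self._ids = count(1)
        self._pending = {}            # id -> (block_range, set of dependency ids)

    def submit(self, block_range, explicit_deps=()):
        req_id = next(self._ids)
        deps = set(explicit_deps)
        # Implicit dependencies: any earlier, still-pending request whose
        # half-open block range [start, end) overlaps this one must finish first.
        for other_id, (other_range, _) in self._pending.items():
            if block_range[0] < other_range[1] and other_range[0] < block_range[1]:
                deps.add(other_id)
        self._pending[req_id] = (block_range, deps)
        return req_id

    def completed(self, req_id):
        # Drop the finished request and release anything that depended on it.
        self._pending.pop(req_id, None)
        for _, deps in self._pending.values():
            deps.discard(req_id)

    def runnable(self):
        return [rid for rid, (_, deps) in self._pending.items() if not deps]

# Two overlapping writes are serialized automatically.
seq = Sequencer()
a = seq.submit((0, 8))        # blocks 0..7
b = seq.submit((4, 12))       # overlaps the first request, so it must wait
print(seq.runnable())         # only the first request is runnable
seq.completed(a)
print(seq.runnable())         # now the second one is
```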

  15. Request sequencing (II) • Sequencing is performed by a centralized sequencer • Several distributed solutions were considered but not selected because of the complexity of the recovery protocols they would require

  16. Disk Scheduling (not discussed in class in Fall 2005) • Policies considered: • First come first served (FCFS): implemented in the working prototype • Shortest seek time first (SSTF): serves the pending request closest to the current head position • Shortest access time first (SATF): considers both seek time and rotation time • Batched nearest neighbor (BNN): runs SATF on all requests in the queue
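The batched nearest neighbour idea can be shown with a toy cost model: pick the cheapest pending request, move the head there, and repeat over the whole batch. The access-time formula below is an illustrative assumption, not the simulator's model.

```python
# Toy sketch of SATF and Batched Nearest Neighbor scheduling.

def access_time(head_pos, request_pos, rotation_cost=0.3):
    # Seek cost proportional to distance plus a fixed stand-in for
    # rotational latency (purely illustrative numbers).
    return abs(request_pos - head_pos) + rotation_cost

def satf_pick(head_pos, queue):
    # Shortest access time first: serve the cheapest pending request next.
    return min(queue, key=lambda r: access_time(head_pos, r))

def bnn_schedule(head_pos, queue):
    # Batched nearest neighbour: repeatedly apply SATF to the whole batch
    # currently in the queue.
    order, remaining = [], list(queue)
    while remaining:
        nxt = satf_pick(head_pos, remaining)
        remaining.remove(nxt)
        order.append(nxt)
        head_pos = nxt
    return order

print(bnn_schedule(50, [10, 55, 90, 48]))   # [48, 55, 90, 10]
```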

  17. Evaluation (I) • Based upon: • Working prototype • Used seven relatively slow Parsytec cards, each with its own disk drive • An event-driven simulator was used to test other configurations: • Results were always within 6% of prototype measurements

  18. Evaluation (II) • Read performance: • 1MB/s links are enough unless the request sizes exceed 1MB

  19. Evaluation (III) • Write performance: • Large stripe policy always results in a slight improvement • At-parity is significantly better than at-originator, especially for link speeds below 10MB/s • Late commit protocol reduces throughput by at most 2% but can increase response time by up to 20% • Early commit protocol is not much better

  20. Evaluation (IV) • TickerTAIP always outperforms a comparable centralized RAID architecture • Best disk scheduling policy is Batched Nearest Neighbor (BNN)

  21. Conclusion • Can use physical redundancy to eliminate single points of failure • Can use eleven 5 MIPS processors instead of a single 50 MIPS processor • Can use off-the-shelf processors for parity computations • Disk drives remain the bottleneck for small request sizes
