
Anthony J. Yu August 15, 2005

Defect Tolerance for Yield Enhancement of FPGA Interconnect Using Fine-grain and Coarse-grain Redundancy


Presentation Transcript


  1. Defect Tolerance for Yield Enhancement of FPGA Interconnect Using Fine-grain and Coarse-grain Redundancy Anthony J. Yu August 15, 2005

  2. Outline • Introduction and motivation • Previous works • New architectures • Coarse-grain redundancy (CGR) • Fine-grain redundancy (FGR) • Comparisons • Conclusions

  3. Introduction and Motivation • Scaling introduces new types of defects • Number of defects expected to increase as chip density increases • As a result, chip yield is on the decline • FPGAs are mostly interconnect • To improve yield (and revenue), we must tolerate multiple interconnect defects

  4. General Defect-Tolerant Techniques • Defect-tolerant techniques minimize the impact (cost) of manufacturing defects • FPGA defect tolerance can be loosely categorized into three classes: • Software Redundancy – use CAD tools to map around the defects • Hardware Redundancy – incorporate spare resources to assist in defect correction (e.g., spare row/column) • Run-time Redundancy – protection against transient faults such as SEUs (e.g., TMR)

  5. Previous work – 1 – Xilinx • Xilinx’s Defect-Tolerant Approach • Customer (knowingly) purchases “less than perfect” parts • Customer gives Xilinx the configuration bitstream • Xilinx tests FPGA devices against the bitstream • Sells FPGA parts that “appear” perfect • Defects lie only in resources the bitstream does not use • Limitation: • Chips work only with the given bitstream – no changes!
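The screening step on slide 5 can be pictured as a set-disjointness check. The Python sketch below is only an illustrative model: treating a bitstream as the set of resource identifiers it uses, the function name `part_passes`, and the resource names are all assumptions, since the slides do not describe the actual test flow.

```python
def part_passes(used_resources, defective_resources):
    """Return True if no resource used by the bitstream is defective.

    Both arguments are iterables of hypothetical resource identifiers
    (wires, switches, logic blocks). This is a simplified model, not the
    vendor's real test procedure.
    """
    return set(used_resources).isdisjoint(defective_resources)


# Hypothetical design and two hypothetical defect maps.
design = {"wire_3", "wire_7", "clb_12"}
print(part_passes(design, {"wire_5"}))  # defect sits in an unused wire
print(part_passes(design, {"wire_7"}))  # defect hits a wire the design uses
```

Under this model, any change to the design changes `used_resources`, so a part that passed before may no longer pass, which matches the limitation stated on the slide.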

  6. Previous work – 2 – Altera • Altera’s Defect-Tolerant Approach • Customer purchases “seemingly perfect” parts • Make defective resources inaccessible to user • Coarse-grain architecture • Spare row and column in array (like memories) • Defective row/column must be bypassed • Use the spare row/column instead • Limitation: • Does not scale well (multiple defects)

  7. Objectives • Problem • FPGA yield is on decline because of aggressive technology scaling • Important objectives to improve yield: • Tolerate interconnect defects (dominates area) • Tolerate multiple defects (future trend) • Preserve timing (no timing re-verification) • Fast correction time (production use)

  8. Contributions • New fine-grain redundancy architecture • Coarse-grain architecture with multiple spare rows and columns • Detailed evaluation of fine-grain and coarse-grain redundancy • Area, delay, yield estimates • Publications: • Non-redundant architecture paper, at FPT’04 • Fine-grain architecture paper, to appear in FPL’05 • Yield comparison paper, to appear in FPT’05

  9. Non-redundant Interconnect Switch [Figure: high-level model; old (bidirectional) and modern (directional) switch]

  10. Coarse-grain Redundancy (CGR)

  11. So…what’s wrong with it?

  12. Improving yield for CGR – Adding Multiple Global Spares • Add multiple global spares to traditional CGR • Global spares can be used to repair any defective row/column in the array • Wire extensions are now longer
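A minimal Python sketch of how global spare rows might be assigned, assuming defective rows are simply bypassed and the remaining rows are used in order. The function names `remap_rows` and `max_shift` and the row-index model are assumptions for illustration, not the circuit described in the talk.

```python
def remap_rows(num_logical, num_spares, defective_rows):
    """Map each logical row to a defect-free physical row.

    The array has num_logical + num_spares physical rows. Returns a list
    where entry i is the physical row used for logical row i, or None if
    there are more defective rows than spares.
    """
    defective = set(defective_rows)
    good = [r for r in range(num_logical + num_spares) if r not in defective]
    if len(good) < num_logical:
        return None
    return good[:num_logical]


def max_shift(mapping):
    """Largest distance (in rows) any logical row is displaced."""
    return max((phys - logical for logical, phys in enumerate(mapping)), default=0)


mapping = remap_rows(num_logical=4, num_spares=2, defective_rows=[1, 2])
print(mapping)             # [0, 3, 4, 5]
print(max_shift(mapping))  # 2
```

In this model the maximum displacement grows with the number of spares in use, which is one way to see why the wire extensions get longer as global spares are added.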

  13. Yield Impact of Multiple Global Spares

  14. Increasing Area + Delay Overhead • More spares → more mux overhead in every switch element [Figure panels: no spares; 1 global spare; 2 global spares; 4 global spares (may be impractical)]

  15. Fine-grain Redundancy (FGR) – Avoidance by Shifting
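The idea of avoidance by shifting can be sketched in Python as follows. The model is an assumption made for illustration: each wire segment of a channel has one spare track, a defect in a segment pushes every signal at or above the defective track up by one track for that segment only, and the signal returns to its original track in the next segment. The names `physical_track` and `route` are hypothetical.

```python
def physical_track(logical_track, defective_track):
    """Track used in one segment, given that segment's defective track.

    defective_track is None when the segment has no defect.
    """
    if defective_track is not None and logical_track >= defective_track:
        return logical_track + 1  # shift toward the spare track
    return logical_track


def route(logical_track, defects_by_segment):
    """Physical track used by one signal in each successive segment."""
    return [physical_track(logical_track, d) for d in defects_by_segment]


# Three segments; only the middle one has a defect, on track 1.
print(route(2, [None, 1, None]))  # [2, 3, 2]
print(route(0, [None, 1, None]))  # [0, 0, 0]
```

With a single spare per segment, this model repairs at most one defect per segment, so it only covers defects that are spread out along the channel.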

  16. Implementation Overview

  17. FGR Switch Element Details [Figure labels: upstream switch block; defect; downstream switch block]

  18. FGR Implementation Comparison

  19. FGR Architectural Summary • Several implementations of FGR evaluated: • Implementation with best yield improvement (EM22) • Area +50%, delay +20% • Implementation with lowest yield improvement (EN11) • Area +35%, delay +25% • Perfect chips can be sold as interconnect-enhanced FPGAs • Allow router to use spare routing resources (muxes, tracks) • Gives more routing flexibility • True area and delay overheads are 10-20% and 5-25%, respectively

  20. Comparison between FGR and CGR – FGR Tolerates Tens of Defects

  21. Estimated Area overhead at equal yield (80%) * CGR-G1 can only tolerate 1-2 defects

  22. Limitations of Study & Architectures • FGR • Does not tolerate defects in the logic • Cannot tolerate clustered defects • Requires a detailed fault map • CGR • Assumes that all defects can be corrected with a single row/column • Bypass circuitry is approximated

  23. Conclusions • CGR is effective for 1 or 2 defects • FGR meets the desired objectives: • Tolerates multiple randomly distributed defects • Defect correction does not perturb timing • Tolerates an increasing number of defects as array size increases • Correction can be applied quickly • FGR is potentially capable of correcting crosstalk faults, but this is not explored in the thesis

  24. Contributions • New fine-grain redundancy architecture • Coarse-grain architecture with multiple spare rows and columns • Detailed evaluation of fine-grain and coarse-grain redundancy • Detailed circuit-level design → improved area, delay estimates • Yield comparison • Publications: • Non-redundant architecture paper, at FPT’04 • Fine-grain architecture paper, to appear in FPL’05 • Yield comparison paper, to appear in FPT’05

  25. Thank you! anthonyy@ece.ubc.ca

  26. Improving yield for CGR – Adding Multiple Local Spares • Divide FPGA into subdivisions • Each subdivision has local spare(s) • Distributes spares across chip • Reduces mux area overhead (of Global scheme) • Limitation: • Spare(s) can only repair defects within the subdivision

  27. Yield Impact of Multiple Local Spares (not as good as Global with same # spares)
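The comparison between global and local spares can be illustrated with a simple binomial model in Python. This is a sketch under stated assumptions, not the yield model used in the thesis: every row is assumed to be defective independently with probability `p`, a chip works if the number of defective rows does not exceed the spares available to repair them, and the function names and parameter values are hypothetical.

```python
from math import comb


def yield_global(rows, spares, p):
    """Probability that at most `spares` of the rows + spares rows are defective."""
    n = rows + spares
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(spares + 1))


def yield_local(rows, spares, p, subdivisions):
    """Yield when rows and spares are split evenly across subdivisions.

    Each subdivision must repair its own defects with its own spares.
    """
    assert rows % subdivisions == 0 and spares % subdivisions == 0
    per_subdivision = yield_global(rows // subdivisions, spares // subdivisions, p)
    return per_subdivision**subdivisions


# Hypothetical array: 32 rows, 4 spares, 2% chance that a row is defective.
print(yield_global(32, 4, 0.02))
print(yield_local(32, 4, 0.02, subdivisions=4))
```

In this model the local scheme can never beat the global one with the same number of spares, because every defect pattern the local scheme repairs is also repairable globally, while the reverse does not hold when defects fall in the same subdivision.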

  28. Summary • As the density of FPGAs increases, they become increasingly susceptible to manufacturing defects • Defect-tolerant techniques alleviate this growing problem • Depending on the desired level of protection, we can apply different techniques • At low defect rates, the coarse-grain spare row and column approach has lower overhead than the fine-grain approach • At the same area overhead, the fine-grain approach can tolerate more defects than the spare row and column approach
