
An Error Correction Code to Address Neutron Single Event Upsets in Semiconductor Memory

David W. Jensen, Ph.D., Advanced Computing Systems, Rockwell Collins, 400 Collins Road NE, MS 108-206, Cedar Rapids, Iowa 52498. dwjensen@rockwellcollins.com


  1. An Error Correction Code to Address Neutron Single Event Upsets in Semiconductor Memory • David W. Jensen, Ph.D. • Advanced Computing Systems, Rockwell Collins • 400 Collins Road NE, MS 108-206, Cedar Rapids, Iowa 52498 • dwjensen@rockwellcollins.com • April 23, 2002

  2. Introduction • Why be concerned about Neutron Single Event Upsets (NSEUs)? • Mitigation techniques for microprocessor technology • Error correction codes • Block code to address Single Event Upsets (SEUs) and Multiple Bit Upsets (MBUs) • Summary

  3. Avionics Platforms • Avionics electronics and communication • Upgrades to existing systems • High altitude and latitude • 55,000 feet • Polar route • Life-critical and national-security-critical operation “No single fault, however improbable, shall jeopardize the continued safe flight and landing …”

  4. Only Avionics? • “The trend of current design practice suggests that device density will continue to halve every two years and that memory size will continue to quadruple every two years as well. These factors, along with the ever-decreasing power levels, will cause further reduced energy thresholds in microelectronics semiconductor circuits in the years to come. This suggests that SEU effects are likely to increase 10-fold every five years. For these reasons, it is conceivable that all computer devices - not just those at altitude - will need to be protected from SEU effects within the next 10 - 15 years” • John H. Sohn • Rockwell Collins, Air Transport Systems

  5. Mitigation Goals • 20-year history of microprocessor development • World’s first 16-bit CMOS microprocessor • World’s first direct execution Java Virtual Machine (JVM) microprocessor • Avionics quality • Identifying design mitigation techniques for commercial fabrication of NSEU-tolerant microprocessors • Goal: • Address SEUs and MBUs in microprocessor elements through design techniques instead of special fabrication techniques • Initial focus - Device-level approaches for soft error upsets • Future focus - System-level approaches for hard errors, latchup, burnout, and ruptures • Total dose issues will continue to require special fabrication techniques

  6. Proc Technology Mitigation Techniques

  7. Error Detection and Correction Code • Error Detection and Correction (EDAC) • Hamming created the error correction concept in the 1950s • Provides correction of errors rather than detection alone • Still used today
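As an illustration of the Hamming concept named on this slide, here is a minimal sketch of the textbook Hamming(7,4) single error correcting code. The slides do not show any particular code, so the function names and the 7-bit layout (parity bits at positions 1, 2 and 4) are assumptions chosen for the example.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword laid out as positions 1..7.

    Assumed layout: [p1, p2, d1, p4, d2, d3, d4].
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_syndrome(codeword):
    """XOR together the (1-based) positions of all set bits.

    The result is 0 for a valid codeword and the position of the
    flipped bit after a single-bit error.
    """
    syndrome = 0
    for position, bit in enumerate(codeword, start=1):
        if bit:
            syndrome ^= position
    return syndrome


def hamming74_correct(codeword):
    """Return (corrected codeword, syndrome), assuming at most one bit flipped."""
    syndrome = hamming74_syndrome(codeword)
    fixed = list(codeword)
    if syndrome:
        fixed[syndrome - 1] ^= 1
    return fixed, syndrome
```

The decoder fixes the error rather than just flagging it, which is the distinction the slide draws between correction and detection.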

  8. Multiple Bit Upsets Must Be Addressed • Scaling of semiconductor device geometries is causing MBUs • Single Error Correction / Double Error Detection (SEC/DED) codes are ineffective for these MBUs • Created a block code to efficiently address these physically adjacent errors
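To see why a single error correcting code struggles with adjacent upsets, the sketch below uses the textbook Hamming(7,4) parity-check columns (column i is the binary value i) as a stand-in; this is an assumption for illustration, not the code discussed in the presentation. For every adjacent double error it reports which single bit a plain SEC decoder would flip by mistake. A SEC/DED code adds an overall parity bit that turns these miscorrections into detections, but it still cannot repair the word.

```python
# Assumed example code: Hamming(7,4), where column i of the
# parity-check matrix is simply the integer i.
HAMMING_COLUMNS = [1, 2, 3, 4, 5, 6, 7]


def syndrome(error_positions, columns=HAMMING_COLUMNS):
    """Syndrome of an error pattern: XOR of the columns at the bad positions (1-based)."""
    s = 0
    for pos in error_positions:
        s ^= columns[pos - 1]
    return s


def miscorrections(columns=HAMMING_COLUMNS):
    """Map each adjacent double error to the bit a SEC decoder would flip.

    A value of None would mean the syndrome matches no single-bit error.
    """
    single = {col: pos for pos, col in enumerate(columns, start=1)}
    report = {}
    for pos in range(1, len(columns)):
        pair = (pos, pos + 1)
        report[pair] = single.get(syndrome(pair, columns))
    return report
```

With these columns every adjacent pair aliases to some single-bit syndrome, so the decoder would add a third wrong bit instead of repairing the two that were upset.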

  9. Block Code Generation Technique • Adjacent errors always produce a syndrome that is the exclusive-or (xor) of the block code columns in error • Simple set of guidelines to develop block code matrices that can correct double and triple adjacent errors: • Identify a unique set of syndromes to identify the single columns in error, the double adjacent columns in error, and the triple adjacent columns in error • Compute the double and triple error syndrome values by exclusive-or’ing the syndrome values of the corresponding single-bit columns • Ensure that no duplicate syndromes exist among the single, double adjacent, and triple adjacent errors • Syndrome generation and correction logic comparable to conventional EDAC designs
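The guidelines on this slide can be sketched as a small rule checker. The representation is an assumption made for the example: each parity-check column is stored as an integer, and the function names and the 5-column sample code are hypothetical, not taken from the presentation.

```python
from functools import reduce
from operator import xor


def adjacent_syndromes(columns, max_burst=3):
    """Map each run of 1..max_burst adjacent column indices to its syndrome.

    The syndrome of a run is the XOR of the columns in that run.
    """
    table = {}
    n = len(columns)
    for width in range(1, max_burst + 1):
        for start in range(n - width + 1):
            run = tuple(range(start, start + width))
            table[run] = reduce(xor, (columns[i] for i in run))
    return table


def is_valid_block_code(columns, max_burst=3):
    """True if all single and adjacent-run syndromes are nonzero and distinct."""
    syndromes = list(adjacent_syndromes(columns, max_burst).values())
    return 0 not in syndromes and len(set(syndromes)) == len(syndromes)


def correction_table(columns, max_burst=3):
    """Invert the syndrome map so a decoder can look up which bits to flip."""
    if not is_valid_block_code(columns, max_burst):
        raise ValueError("duplicate or zero syndrome; columns break the rules")
    return {s: run for run, s in adjacent_syndromes(columns, max_burst).items()}


# Hypothetical 5-column, 4-check-bit example that satisfies the rules.
SAMPLE_COLUMNS = [1, 2, 4, 8, 5]
```

Looking up a syndrome in `correction_table(SAMPLE_COLUMNS)` returns the 0-based indices of the bits to flip. The plain Hamming ordering `[1, 2, 3, 4, 5, 6, 7]` fails the check because the pair of columns 1 and 2 has the same syndrome as column 3 alone.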

  10. Block Code Generation • Implement software to perform generation of code and checking of rules • Genetic algorithm approach used to generate block codes • Search technique used to generate block codes • Search technique illustrated
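The slide does not preserve the details of either generation method, so the following is only a sketch of one possible search: a plain depth-first search with backtracking, swapped in here for illustration and not the presentation's genetic algorithm or its illustrated search. It adds one column at a time and rejects any candidate whose new single, double adjacent, or triple adjacent syndrome is zero or already in use. The function name and parameters are hypothetical.

```python
def search_columns(n_cols, n_check_bits):
    """Depth-first search for parity-check columns (as integers).

    Returns a list of n_cols columns whose single, double adjacent and
    triple adjacent syndromes are all nonzero and distinct, or None if
    no such list exists for the given number of check bits.
    """
    limit = 1 << n_check_bits

    def new_syndromes(cols, cand):
        # Appending cand creates one new single, and (if enough columns
        # already exist) one new adjacent pair and one new adjacent triple.
        created = [cand]
        if len(cols) >= 1:
            created.append(cols[-1] ^ cand)
        if len(cols) >= 2:
            created.append(cols[-2] ^ cols[-1] ^ cand)
        return created

    def extend(cols, used):
        if len(cols) == n_cols:
            return cols
        for cand in range(1, limit):
            created = new_syndromes(cols, cand)
            if 0 in created or len(set(created)) != len(created):
                continue
            if any(s in used for s in created):
                continue
            result = extend(cols + [cand], used | set(created))
            if result is not None:
                return result
        return None

    return extend([], frozenset())
```

For small sizes such as `search_columns(5, 4)` this returns quickly. Exhaustive backtracking can become slow at realistic word widths, which may be one reason to reach for a heuristic such as a genetic algorithm.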

  11. Adjacent Error Correction Efficiency • Acronyms: • SEC – Single Error Correction • DEC – Double Error Correction • DAEC – Double Adjacent Error Correction • TAEC – Triple Adjacent Error Correction • Adjacent error correction is nearly as efficient as SEC for 32-bit and 64-bit data
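A simple counting argument shows why adjacent error correction can stay close to SEC in overhead. Every correctable error pattern needs its own nonzero syndrome, so r check bits can cover at most 2^r - 1 patterns. The sketch below computes the resulting lower bound on check bits. It is a necessary condition only and does not prove that a code meeting the bound exists; the slide's own efficiency figures are not preserved in the transcript, and the helper names are hypothetical.

```python
def min_check_bits(data_bits, pattern_count):
    """Smallest r with 2**r - 1 >= pattern_count(data_bits + r).

    pattern_count(n) gives the number of correctable error patterns
    in an n-bit codeword.
    """
    r = 1
    while (1 << r) - 1 < pattern_count(data_bits + r):
        r += 1
    return r


def sec_patterns(n):
    """Single-bit errors only: one pattern per codeword bit."""
    return n


def sec_daec_taec_patterns(n):
    """Singles plus adjacent pairs plus adjacent triples."""
    return n + (n - 1) + (n - 2)


def overhead_bounds(data_bits):
    """Return (SEC bound, SEC-DAEC-TAEC bound) in check bits."""
    return (
        min_check_bits(data_bits, sec_patterns),
        min_check_bits(data_bits, sec_daec_taec_patterns),
    )
```

Under this bound, 32-bit data needs at least 6 check bits for SEC and at least 7 for single, double adjacent, and triple adjacent correction; for 64-bit data the figures are 7 and 8.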

  12. Summary • Rockwell Collins has an ongoing interest in Single Event Effects (SEE) • Immediate concern for future avionics products • Long-term concern for land-based products • Several possible research threads in this area • Concerned over the issue of SEE in current designs and expect the problem to grow worse in the future • Combining multiple mitigation techniques could enable an NSEU-tolerant, commercially fabricated microprocessor • Presented an efficient error correction block code to address SEUs and MBUs in semiconductor memory
