Expediting GA-Based Evolution Using Group Testing Techniques for Reconfigurable Hardware¹
ReConFig’06, San Luis Potosi, Mexico
Rashad S. Oreifej, Carthik A. Sharma, and Ronald F. DeMara
University of Central Florida
¹ Research support in part by NSF grant CRCD: 0203446
Evolvable Hardware
Evolutionary Design:
• Start with available CLBs and IOBs
• Implement a design using genetic operators, etc. [Fogarty97]
• Limited or no ability to redesign to account for suspected faulty resources
Evolutionary Regeneration:
• Start with an existing pool of designs
• Some existing configurations may use faulty resources
• Eliminate use of suspected faulty resources
• Genetic operators can be applied to refurbish designs [Vigander01]
Previous Work
• Pre-compiled Column-Based Dual FPGA architecture [Mitra04]
  • Autonomous detection, with repair by shifting pre-compiled columns
  • Isolation using distributed CED checkers and “blind” reconfiguration attempts
• Overview of Combinatorial Group Testing and Applications [Du00]
  • Provides a taxonomy and general algorithms for applying CGT
  • Examples of CGT applications: DNA clone library filtering, vaccine screening, computer fault diagnosis, etc.
• CGT-Enhanced Circuit Diagnosis [Kahng04]
  • Presents doubling, halving, and related strategies for circuit fault diagnosis using BIST and CGT
  • Requires the ability to test resources individually
• Chinese Remainder Sieve technique [Eppstein05]
  • Efficient non-adaptive and two-stage CGT based on prime-number-driven test formation
  • Improved algorithms for practical problem sizes (n < 10^80) with a small number of defectives (d < 4)
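The idea of prime-number-driven test formation can be illustrated with a small sketch. This is a simplified, unoptimized rendering of the basic Chinese-remainder idea, not the algorithm as tuned in [Eppstein05]: for each prime p and residue r, one test pools every item i with i mod p = r, and the primes are chosen so that their product is at least n^d. The function names and the simulated test oracle are assumptions made for illustration.

```python
def primes_with_product_at_least(bound):
    """Collect the smallest primes whose product reaches `bound`."""
    primes, product, candidate = [], 1, 2
    while product < bound:
        # `primes` holds every prime below `candidate`, so trial division is exact.
        if all(candidate % p for p in primes):
            primes.append(candidate)
            product *= candidate
        candidate += 1
    return primes


def build_tests(n, d):
    """One test per (prime, residue) pair; the test pools items i with i % p == r."""
    primes = primes_with_product_at_least(n ** d)
    return [(p, r) for p in primes for r in range(p)]


def run_tests(tests, defectives):
    """Simulated oracle (assumption): a pooled test is positive if it contains a defective."""
    return {(p, r) for (p, r) in tests if any(i % p == r for i in defectives)}


def decode(n, tests, positives):
    """Declare item i defective only if every pool containing i tested positive."""
    primes = sorted({p for p, _ in tests})
    return [i for i in range(n) if all((p, i % p) in positives for p in primes)]


tests = build_tests(n=100, d=2)
positives = run_tests(tests, defectives={17, 63})
print(len(tests), decode(100, tests, positives))
```

All tests are fixed in advance (non-adaptive), and for at most d defectives the decoding yields no false positives, because a healthy item would need to agree with some defective modulo every chosen prime.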
Genetic Algorithms & Evolvable Hardware
[Figure: an individual (chromosome) composed of genes]
GAs are strong candidates for implementing system refurbishment:
• They implement guided trial-and-error search using principles of Darwinian evolution
• Iterative selection enforces “survival of the fittest”
• Genetic operators (mutation, crossover, …) can be used to refurbish designs
• Hypothesis: Information regarding resource performance can expedite GA-based refurbishment
GAs frequently use strings of 1s and 0s to represent candidate solutions
• An FPGA configuration file is a string of 1s and 0s
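As a minimal sketch of the operators named above, the following toy GA works on chromosomes stored as lists of 0s and 1s. The fitness function (matching a target bit string), the truncation selection, and all parameter values are assumptions chosen for illustration; they are not taken from the authors' simulator, where fitness would come from evaluating the configured circuit.

```python
import random


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def crossover(a, b, rng):
    """Single-point crossover: head of `a` joined to the tail of `b`."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def fitness(bits, target):
    """Hypothetical fitness: number of positions that match `target`."""
    return sum(x == y for x, y in zip(bits, target))


def evolve(target, pop_size=20, rate=0.05, max_gens=500, seed=0):
    """Return (best chromosome, generations used)."""
    rng = random.Random(seed)
    n = len(target)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for gen in range(max_gens):
        pop.sort(key=lambda c: fitness(c, target), reverse=True)
        if fitness(pop[0], target) == n:
            return pop[0], gen
        parents = pop[: pop_size // 2]  # "survival of the fittest"
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    pop.sort(key=lambda c: fitness(c, target), reverse=True)
    return pop[0], max_gens


best, generations = evolve([1, 0, 1, 1, 0, 0, 1, 0])
print(best, generations)
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.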
Conventional vs. CGT-Pruned GA
• Conventional GA: searches the whole space to evolve a working design or repair
  • Information about resource suitability may accelerate the search
• CGT-Pruned GA: prefers resources of higher fitness when evolving a working design or repair
Q. How can resource fitness information be obtained?
A. Using group testing techniques.
• Combinatorial Group Testing identifies a shrinking group of “defectives” by iterative refinement
• Tests are run on subsets of suspects
• Expected to take less time: “Faster Design and Faster Repair”
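To make "iterative refinement on subsets of suspects" concrete, here is a sketch of generic adaptive halving. It is a textbook group testing strategy used only as an illustration, not the authors' CGT procedure. The oracle `group_fails` is a hypothetical stand-in for running a test that exercises a group of resources and reports whether any of them is faulty.

```python
def find_defectives(items, group_fails):
    """Locate all defective items by repeatedly splitting failing groups in half.

    Returns (sorted defectives, number of group tests performed).
    """
    tests = 0
    defectives = []
    stack = [list(items)]
    while stack:
        group = stack.pop()
        if not group:
            continue
        tests += 1
        if not group_fails(group):
            continue  # the whole group is cleared by a single test
        if len(group) == 1:
            defectives.append(group[0])  # a failing singleton is a defective
            continue
        mid = len(group) // 2
        stack.append(group[mid:])
        stack.append(group[:mid])
    return sorted(defectives), tests


faulty = {5, 40}  # hidden ground truth, known only to the simulated oracle
found, tests = find_defectives(range(64), lambda g: any(x in faulty for x in g))
print(found, tests)
```

When defectives are rare, most groups are cleared in one test, so the suspect set shrinks much faster than it would with one test per resource.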
CGT-Pruned Refurbishment
• Isolate suspect resources and avoid using them
• Hypothesis:
  • CGT-Pruned GA Repair evolves a full-fitness circuit faster than Conventional GA Repair
• Results show a performance improvement with CGT-Pruned Repair
Achieving Refurbishment with Cell Swapping
• Isolate and swap suspect resources
• Cell Swapping Operator
  • Copy the configuration of a suspect resource “cell” to another, unused cell
  • The GA searches for a routing strategy to re-route interconnect to the previously unused cell
• Refurbishment with Cell Swapping
  • Swap suspect cells one by one and evaluate fitness until full fitness is evolved
  • If swapping all suspect cells does not achieve complete refurbishment, employ other GA operators
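The swap-and-evaluate loop described above might be sketched as follows. The data model is an assumption for illustration: a placement is a dictionary from cell coordinates to the logic function configured there, and `evaluate` is a hypothetical fitness callback. The GA-driven search for new routing is omitted entirely.

```python
def swap_cell(placement, suspect, spare):
    """Return a new placement with the logic on `suspect` moved to `spare`."""
    new = dict(placement)
    new[spare] = new.pop(suspect)
    return new


def refurbish_by_swapping(placement, suspects, spares, evaluate, full_fitness):
    """Swap suspect cells one at a time until full fitness is reached.

    Returns (placement, repaired). If `repaired` is False, the caller would
    fall back to other GA operators.
    """
    spares = list(spares)
    for cell in suspects:
        if evaluate(placement) >= full_fitness:
            break
        if cell not in placement or not spares:
            continue
        placement = swap_cell(placement, cell, spares.pop(0))
    return placement, evaluate(placement) >= full_fitness


faulty = {(0, 1)}  # ground truth hidden inside the toy fitness function
placement = {(0, 0): "AND", (0, 1): "XOR"}
evaluate = lambda p: sum(cell not in faulty for cell in p)  # toy fitness
result, repaired = refurbish_by_swapping(
    placement, suspects=[(0, 0), (0, 1)], spares=[(1, 0), (1, 1)],
    evaluate=evaluate, full_fitness=2,
)
print(result, repaired)
```

Note that suspects are only suspected: the healthy cell (0, 0) is swapped as well, which costs a spare but does not reduce fitness.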
CGT-Pruned GA Design
• Evolve the entire circuit design from scratch
• Avoid suspect resources and take advantage of resource redundancy within the FPGA
• CGT-Pruning outperforms Conventional GA-based techniques
Comparison of Performance: Number of Generations for Repair
More than 70% of the experiments benefited substantially from resource information generated using CGT
Results Summary
Compared with Conventional GAs, CGT-Pruned GAs:
• Completely refurbish configurations in 38% fewer generations
• Design fully functional configurations in 16% fewer generations
Faulty resources are eliminated from:
• The pool of unused resources in the case of repair, as opposed to the pool of all resources in the case of design
Repair complexity vs. Design complexity:
• Repair complexity << Design complexity
• Repairs were realized in one-fifth of the time required for Design
Backup Slides • On following pages …
Motivation
• Mission-critical embedded systems require high reliability and availability
• Characteristics of the operating environment may induce hardware failures:
  • Aging, manufacturing defects, etc.
• System Reliability:
  • Fault Avoidance. “Always Possible?” … No
  • Design Margin. “Always Adequate?” … No
  • Modular Redundancy. “Always Recoverable?” … No
  • Fault Refurbishment. “Highly Flexible?” … Yes, but technically challenging to achieve
Group Testing Techniques
• Competitive Group Testing
• Algorithm based on group testing methods
• Uses competition between configurations
• Temporal information stored in the H matrix, with one entry H[i,j] per resource (i,j)
• Successive intersection
• Monitors the health history of resources, which represents resource fitness
• Simulated using the C programming language and GSL functions [Sharma-06]
Relative fitness of resource (i,j) ∝ 1/H[i,j]
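The inverse relationship between H[i,j] and relative fitness can be sketched in a few lines. The update rule is an assumption: the slide states only that temporal information is stored in H and that fitness is proportional to 1/H[i,j]. Here every entry starts at 1, so the division is always defined, and H[i,j] is incremented whenever a configuration that uses resource (i,j) fails. All function names are hypothetical.

```python
import random


def record_outcome(H, used_cells, failed):
    """Assumed update rule: a failing configuration raises H for each cell it uses."""
    if failed:
        for i, j in used_cells:
            H[i][j] += 1


def relative_fitness(H):
    """Weights proportional to 1/H[i][j], normalized to sum to 1 (entries must be >= 1)."""
    inv = [[1.0 / h for h in row] for row in H]
    total = sum(sum(row) for row in inv)
    return [[v / total for v in row] for row in inv]


def pick_cell(H, rng):
    """Prefer cells of higher relative fitness when choosing a resource."""
    weights = relative_fitness(H)
    cells = [(i, j) for i in range(len(H)) for j in range(len(H[i]))]
    flat = [weights[i][j] for i, j in cells]
    return rng.choices(cells, weights=flat, k=1)[0]


H = [[1, 1], [1, 1]]
record_outcome(H, [(0, 0), (0, 1)], failed=True)
record_outcome(H, [(0, 0), (1, 1)], failed=True)
print(H, relative_fitness(H), pick_cell(H, random.Random(0)))
```

In this toy run, cell (0,0) appears in both failing configurations, so it ends up with the highest H value and the lowest selection weight. This loosely mirrors the effect of intersecting the resource sets of failing configurations.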
Three Fast Runs of the CGT-Pruned GA Repair
The GA evolves to a relatively high fitness within the first few hundred generations, but takes significantly more generations to reach the maximum fitness
References
[1] T. C. Fogarty, J. F. Miller, and P. Thomson, “Evolving Digital Logic Circuits on Xilinx 6000 Family FPGAs,” in Proceedings of the 2nd Online Conference on Soft Computing, 23-27 June 1997.
[2] S. Vigander, “Evolutionary Fault Repair in Space Applications,” Master’s Thesis, Dept. of Computer & Information Science, Norwegian University of Science and Technology (NTNU), Trondheim, 2001.
[3] C. A. Sharma and R. F. DeMara, “A Combinatorial Group Testing Method for FPGA Fault Location,” accepted to International Conference on Advances in Computer Science and Technology (ACST 2006), Puerto Vallarta, Mexico, January 23-25, 2006.
[4] S. Mitra and E. J. McCluskey, “Which Concurrent Error Detection Scheme to Choose?,” in Proceedings of the International Test Conference 2000, p. 985, October 2000.
[5] D. Du and F. K. Hwang, Combinatorial Group Testing and its Applications, volume 12 of Series on Applied Mathematics, World Scientific, 2000.
[6] A. B. Kahng and S. Reda, “Combinatorial Group Testing Methods for the BIST Diagnosis Problem,” in Proceedings of the Asia and South Pacific Design Automation Conference, January 2004.
[7] D. Keymeulen, R. S. Zebulum, Y. Jin, and A. Stoica, “Fault-Tolerant Evolvable Hardware Using Field-Programmable Transistor Arrays,” IEEE Transactions on Reliability, Vol. 49, No. 3, September 2000.
[8] J. Lohn, G. Larchev, and R. DeMara, “Evolutionary Fault Recovery in a Virtex FPGA Using a Representation that Incorporates Routing,” in Proceedings of the International Parallel and Distributed Processing Symposium, 22-26 April 2003.
[9] J. Lach, W. H. Mangione-Smith, and M. Potkonjak, “Low Overhead Fault-Tolerant FPGA Systems,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 6, No. 2, June 1998.
[10] M. Abramovici, J. M. Emmert, and C. E. Stroud, “Roving STARs: An Integrated Approach to On-Line Testing, Diagnosis, and Fault Tolerance for FPGAs in Adaptive Computing Systems,” in The Third NASA/DoD Workshop on Evolvable Hardware, Long Beach, California, 2001.
[11] R. F. DeMara and K. Zhang, “Autonomous FPGA Fault Handling through Competitive Runtime Reconfiguration,” in Proceedings of the 2005 NASA/DoD Conference on Evolvable Hardware, 29-01 June 2005.
[12] D. Eppstein, M. T. Goodrich, and D. S. Hirschberg, “Improved Combinatorial Group Testing for Real-World Problem Sizes,” in Workshop on Algorithms and Data Structures (WADS), Lecture Notes in Computer Science, Springer, 2005.
[13] J. F. Miller, P. Thomson, and T. Fogarty, “Designing Electronic Circuits Using Evolutionary Algorithms. Arithmetic Circuits: A Case Study,” in D. Quagliarella, J. Periaux, C. Poloni, and G. Winter, editors, Genetic Algorithms and Evolution Strategy in Engineering and Computer Science, pages 105-131, Morgan Kaufmann, Chichester, England, 1998.
Previous Work: Fault-Tolerant Design and Detection Characteristics
[Comparison table; table note: *** Incorporates resource performance information]
Previous Work: Fault Recovery Characteristics
[Comparison table]
Our Goal: Autonomous FPGA Refurbishment
Increase availability without carrying pre-configured spares …
Redundancy vs. Refurbishment:
• Everyday example
  • Redundancy: spare tires
  • Refurbishment: can of fix-a-flat
• Overhead from Unutilized Spares (weight, size, power)
  • Redundancy: increases with amount of spare capacity
  • Refurbishment: weakly-related to number recovery capacity
• Granularity of Fault Coverage (resolution where fault handled)
  • Redundancy: restricted at design-time
  • Refurbishment: variable at recovery-time
• Fault-Resolution Latency (availability via downtime required to handle fault)
  • Redundancy: based on time required to select spare resource
  • Refurbishment: based on time required to find suitable recovery
• Quality of Repair (likelihood and completeness)
  • Redundancy: determined by adequacy of spares available (?)
  • Refurbishment: affected by multiple characteristics (+ or -)
• Autonomous Operation (fix without outside intervention)
  • Redundancy: yes
  • Refurbishment: yes
GA Success Stories
Commercial Applications:
• Nextel: frequency allocation for cellular phone networks, with $15M predicted savings in the NY market
• Pratt & Whitney: turbine engine design (engineer: 8 weeks; GA: 2 days with 3x improvement)
• International Truck: production scheduling improved by 90% in 5 plants
NASA: superior Jupiter trajectory optimization, antennas, FPGAs
Koza: 25 instances showing human-competitive performance, such as analog circuit design, amplifiers, filters
Adaptive GA Design
[Results table; table notes: * Arithmetic mean for twenty experiments; ** Standard deviation for twenty experiments]
CGT-Pruned GA Simulator
• C++ based console application
• Consists of:
  • Combinatorial Group Testing component
    • Uses the GNU Scientific Library (GSL)
  • Genetic Algorithm component
    • Object-oriented architecture that models FPGA resources
• Modes of Operation:
  • CGT-Pruned GA Repair
    • Use CGT to isolate suspect resources
    • Avoid use of suspect-faulty resources in the design refurbishment process
  • CGT-Pruned GA Repair with Cell Swapping
    • Swap suspect-faulty resources with previously unused resources to evolve a recovery
  • CGT-Pruned GA Design
    • Evolve a new working design while avoiding suspect resources