
FPGA Coprocessor for Simulation of Neural Networks Using Compressed Matrix Storage



Presentation Transcript


  1. Chapter 11 in Systems and Circuit Design for Biologically-Inspired Intelligent Learning FPGA Coprocessor for Simulation of Neural Networks Using Compressed Matrix Storage Richard Dorrance Literature Review: 1/11/13

  2. Overview • Binary neuron model for unsupervised learning • arranged into minicolumns and macrocolumns • Sparse matrix-vector multiplication (SpMxV) • compressed row format • software optimizations • FPGA coprocessor • exploiting matrix characteristics for a “novel” architecture • benchmarks and scalability (or lack thereof)

  3. Neural Network Example

  4. Binary Neuron Model • Modeled as McCulloch-Pitts neurons (i.e. binary): • only 2 states: firing (1) and not firing (0) • fixed time step t • refractory period of one time step
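The binary model above can be sketched in a few lines. This is a minimal illustration, not the chapter's implementation: the function name, threshold parameter `theta`, and the convention of using the previous state as the refractory mask are assumptions for the sketch.

```python
# Sketch of a McCulloch-Pitts binary neuron update (assumed parameters):
# each neuron fires (1) iff its weighted input reaches a threshold, and a
# neuron that fired in the previous step is refractory for one time step.
def update_neurons(W, x, theta, refractory):
    """One fixed time step.
    W: list of weight rows, x: binary state vector (0/1),
    theta: firing threshold, refractory: firing mask from the previous step."""
    new_x = []
    for i, row in enumerate(W):
        if refractory[i]:  # refractory period of one time step
            new_x.append(0)
            continue
        s = sum(w * xj for w, xj in zip(row, x))
        new_x.append(1 if s >= theta else 0)
    # old state doubles as the refractory mask for the next step
    return new_x, x

# usage: two mutually connected neurons, neuron 0 currently firing
W = [[0, 1], [1, 0]]
state, refr = update_neurons(W, [1, 0], theta=1, refractory=[0, 0])
# state == [0, 1], refr == [1, 0]
```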

  5. Macrocolumn Dynamics

  6. Cortical Macrocolumn Connection Matrix

  7. Feature Extraction: Bars Test

  8. Compressed Row Format • Sparse matrix representation (w/ 3 vectors): • value • column index • row pointer
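The three vectors above can be sketched as follows. This is a generic compressed-row (CSR) construction and multiply, assumed here for illustration; function names are invented.

```python
# Sketch of compressed row storage: value, column-index, and row-pointer
# vectors, where row_ptr[i]..row_ptr[i+1] spans the nonzeros of row i.
def to_csr(dense):
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def spmxv(values, col_idx, row_ptr, x):
    # y[i] = sum over row i's nonzeros of value * x[column]
    return [sum(values[k] * x[col_idx[k]]
                for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(len(row_ptr) - 1)]

A = [[0, 2, 0],
     [1, 0, 3],
     [0, 0, 0]]
vals, cols, ptrs = to_csr(A)  # vals=[2,1,3], cols=[1,0,2], ptrs=[0,1,3,3]
y = spmxv(vals, cols, ptrs, [1, 1, 1])  # y == [2, 4, 0]
```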

  9. SpMxV is the Bottleneck • Theoretically SpMxV should be memory bound • Reality: lots of stalling for data due to irregular memory access patterns • Coding Strategies: • cache prefetching • matrix reordering • register blocking (i.e. N smaller, dense matrices) • CPU: 100 GFLOPS (theoretical), 300 MFLOPS (reality) • GPU: 1 TFLOPS (theoretical), 10 GFLOPS (reality)

  10. Simplifications and Optimizations • Matrix elements are binary: value vector is dropped • Strong block-like structure to matrices: compress column index vector
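The first simplification can be sketched directly: with every nonzero equal to 1, the value vector disappears and SpMxV reduces to gathering and summing the input entries selected by each row's column indices. (The further compression of the column-index vector itself is not shown here.)

```python
# Sketch of binary-matrix SpMxV with the value vector dropped:
# y[i] = sum of x at the column indices stored for row i.
def binary_spmxv(col_idx, row_ptr, x):
    return [sum(x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(len(row_ptr) - 1)]

# Binary matrix [[0,1,1],[1,0,0]] stored without values:
col_idx = [1, 2, 0]
row_ptr = [0, 2, 3]
y = binary_spmxv(col_idx, row_ptr, [3, 4, 5])  # y == [9, 3]
```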

  11. FPGA Coprocessor Architecture

  12. Resource Usage

  13. Scalability

  14. Conclusions • SpMxV is the major bottleneck to simulating neural networks • Architectures of CPUs and GPUs limit performance • FPGAs can increase performance efficiency • Specialized formulation of SpMxV → limited scalability
