1 / 35

CprE / ComS 583 Reconfigurable Computing

CprE / ComS 583 Reconfigurable Computing. Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #15 – Midterm Review. Project Proposals. Group 1 – FPGA Implementation of Frequency-Domain Audio Effects Processor Five-band equalizer

raja
Download Presentation

CprE / ComS 583 Reconfigurable Computing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CprE / ComS 583Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #15 – Midterm Review

  2. Project Proposals • Group 1 – FPGA Implementation of Frequency-Domain Audio Effects Processor • Five-band equalizer • Frequency shifter CprE 583 – Reconfigurable Computing

  3. Project Proposals (cont.) • Group 2 – Transparent FPGA-Based Network Analyzer • Layer I pass-through • Layer II passive analyzer CprE 583 – Reconfigurable Computing

  4. Project Proposals (cont.) • Group 3 – FPGA-Based Library Design for Linear Algebra Applications • Floating-point sparse matrix-vector multiplication • Floating-point banded matrix-vector multiplication • Floating-point lower-upper matrix decomposition CprE 583 – Reconfigurable Computing

  5. Project Proposals (cont.) • Group 4 – An Improved Approach of Configuration Compression for FPGA-Based Embedded Systems • Improved compression algorithms • LUT-reordering techniques CprE 583 – Reconfigurable Computing

  6. Project Proposals (cont.) • Others Projects: • Group 5 – FPGA Ternary Data Conversion • Group 6 – Analysis of Sobel Edge Detection Implementations • Group 7 – Design and Analysis of Artificial Neural Networks on FPGAs • Reminders: • 11/16 – Project Updates (10 minutes) • 12/5-12/7 – Final Presentations (25 minutes) • 12/15 – Final Reports CprE 583 – Reconfigurable Computing

  7. Midterm Review • Using the Silicon PE PE PE $ PE PE MMX PE PE PE SSE PE PE PE FFT AES $ $ MPP More Cache CISC PE Reconfigurable Fabric PE PE $ $ Superscalar Vector Reconfigurable Processor CprE 583 – Reconfigurable Computing

  8. Computational Density (Qualitative) Actel ProASIC Intel Pentium 4 • FPGAs can complete more work per unit time than a processor or DSP: • Less instruction overhead • More active computation onto the same silicon area (allows for more parallelism) • Can control operations at the bit level (as opposed to word level) CprE 583 – Reconfigurable Computing

  9. Workstation Standalone Processing Unit Coprocessor Attached Processing Unit CPU Memory Caches I/O Interface FU Coupling in a Reconfigurable System • Many places to put reconfigurable computing components • Most implementations involve multiple discrete devices • How should these devices be connected together? CprE 583 – Reconfigurable Computing

  10. IOB IOB IOB IOB IOB CLB CLB CLB CLB CLB IOB IOB CLB CLB CLB CLB CLB IOB IOB CLB CLB CLB CLB CLB IOB IOB CLB CLB CLB CLB CLB IOB IOB CLB CLB CLB CLB CLB IOB IOB IOB IOB IOB IOB IOB Generic FPGA Architecture • FPGA = Field-Programmable Gate Array • Input/Output Buffers (IOBs) • Configurable Logic Blocks (CLBs) • Programmable interconnect mesh Island-style FPGA architecture CprE 583 – Reconfigurable Computing

  11. FPGA Technology • Various FPGA programming technologies (Anti-fuse, (E)EPROM, Flash, SRAM): • SRAM most popular CprE 583 – Reconfigurable Computing

  12. ... A0A1 A2 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 255 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 0 1 0 0 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 0 1 0 ... 1 0 0 1 0 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 0 1 1 0 0 1 1 1 1 0 1 0 1 0 1 1 0 0 1 0 0 0 1 1 1 1 0 0 1 1 0 1 0 1 0 0 0 1 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 ... LUTs and Digital Logic • k inputs  2k possible input values • k-LUT corresponds to 2k x 1 bit memory • Truth table is stored • 22k possible functions – O(22k / k!) unique F = A0A1A2 + Ā0A1Ā2 + Ā0 Ā1 Ā2 CprE 583 – Reconfigurable Computing

  13. Architectural Issues [AhmRos04A] • What values of N, I, and K minimize the following parameters? • Area • Delay • Area-delay product • Assumptions • All routing wires length 4 • Fully populated IMUX • Wiring is half pass transistor, half tri-state CprE 583 – Reconfigurable Computing

  14. (1, 2) (1) A B A B Op ALU 2-LUT Out Out FPGA Arithmetic • Traditional microprocessors, DSPs, etc. don’t use LUTs • Instead use a w-bit Arithmetic and Logic Unit (ALU) • Carry connections are hard-wired • No switches, no stubs, short wires (2) (1) AND2 OR2 XOR2 A B Cin 3-LUT 3-LUT Sum Cout / Cin A B (2) ADD SUB CMP 3-LUT 3-LUT Sum Cout CprE 583 – Reconfigurable Computing

  15. FPGA Arithmetic (cont.) • Hard-wired carry logic support Altera FLEX 8000 Xilinx XCV4000 CprE 583 – Reconfigurable Computing

  16. X3 X2 X1 X0 Arithmetic (cont.) Y0 • Carry save multiplication Y1 X2 X1 X3 X0 + + + + Y2 + + + + Y3 + + + + Z2 Z1 Z0 CprE 583 – Reconfigurable Computing

  17. LUT-Based Constant Multipliers 10101011 x NNNNNNNN AAAAAAAAAAAA (N * 1011 (LSN)) + BBBBBBBBBBBB (N * 1010 (MSN)) SSSSSSSSSSSSSSSS Product • Constants can be changed in the LUTs to program new multipliers N0–N7 A0–A11 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT + N0–N7 S0–S15 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT 4-LUT B4–B15 CprE 583 – Reconfigurable Computing

  18. Capacity Trends Virtex-5 550 MHz 24M gates* Virtex-II Pro 450 MHz 8M gates* Virtex-4 500 MHz 16M gates* Virtex-II 450 MHz 8M gates Spartan-3 326 MHz 5M gates Virtex-E 240 MHz 4M gates Xilinx Device Complexity Virtex 200 MHz 1M gates XC4000 100 MHz 250K gates Spartan-II 200 MHz 200K gates Spartan 80 MHz 40K gates XC3000 85 MHz 7.5K gates XC5200 50 MHz 23K gates XC2000 50 MHz 1K gates 1985 1987 1991 1995 1998 1999 2000 2002 2003 2004 2006 Year CprE 583 – Reconfigurable Computing

  19. Splash 1 Architecture VME Bus VSB Bus Interface Interface FIFO IN Control FIFO OUT F3 F2 F1 F0 F31 F30 F29 F28 M3 M2 M1 M0 M31 M30 M29 M28 M4 M5 M6 M7 M24 M25 M26 M27 F4 F5 F6 F7 F24 F25 F26 F27 F11 F10 F9 F8 F23 F22 F21 F20 M11 M10 M9 M8 M23 M22 M21 M20 M12 M13 M14 M15 M16 M17 M18 M19 F12 F13 F14 F15 F16 F17 F18 F19 CprE 583 – Reconfigurable Computing

  20. FPGA-based Router • FPX module contains two FPGAs • NID – network interface device • Performs data queuing • RAD – reprogrammable application device • Specialized control sequences CprE 583 – Reconfigurable Computing

  21. A B C D E F G H I Mesh Topology • Chips are connected in a nearest-neighbor pattern • Simplicity is key • Linear array is essentially a 1-dimensional mesh CprE 583 – Reconfigurable Computing

  22. A B C D W X Y Z Other Topologies • Crossbar topology: • Devices A-D are routing only • Gives predictable performance • Potential waste of resources for near-neighbor connections CprE 583 – Reconfigurable Computing

  23. Logic Emulation • Emulation takes a sizable amount of resources • Compilation time can be large due to FPGA compiles CprE 583 – Reconfigurable Computing

  24. Systolic Architectures • Goal – general methodology for mapping computations into hardware (spatial computing) structures • Composition: • Simple compute cells (e.g. add, sub, max, min) • Regular interconnect pattern • Pipelined communication between cells • I/O at boundaries x + x min x x c CprE 583 – Reconfigurable Computing

  25. Finite Impulse Response • Systolic • Memory bandwidth per output – 2 • O(1) cycles per output • O(k) hardware • Sequential • Memory bandwidth per output – 2k+1 • O(k) cycles per output • O(1) hardware xi x x x x w1 w2 w3 w4 + + + + yi CprE 583 – Reconfigurable Computing

  26. Matrix-Vector Product t = 4 a41 a23 a23 a14 – t = 3 a31 a22 a13 – – t = 2 a21 a12 – – – t = 1 a11 – – – – x1 x2 x3 x4 xn … y1 t = n y2 t = n+1 y3 t = n+2 y4 t = n+3 CprE 583 – Reconfigurable Computing

  27. LUT0 LUT4 LUT1 FF1 LUT5 LUT2 FF2 LUT3 Circuit Netlist and Mapping CprE 583 – Reconfigurable Computing

  28. Placing and Routing FPGA Programmable Connections CprE 583 – Reconfigurable Computing

  29. A AeqB B Next Steps • VHDL / VHDL for Synthesis LIBRARY ieee ; USE ieee.std_logic_1164.all ; ENTITY implied IS PORT ( A, B : IN STD_LOGIC ; AeqB : OUT STD_LOGIC ) ; END implied ; ARCHITECTURE Behavior OF implied IS BEGIN PROCESS ( A, B ) BEGIN IF A = B THEN AeqB <= '1' ; END IF ; END PROCESS ; END Behavior ; CprE 583 – Reconfigurable Computing

  30. HW/SW Co-Design ARMulator Modelsim ARM core simulator ARMulator API Modelsim FLI HDL simulator AHB Slave I/F ARM Core Comm. Buffer Socket Handler Comm. Buffer AHB Slave I/F SOCKET #1 AMBA AHB Master I/F Cache ASIC / FPGA Mem. Access Socket Handler AHB Master I/F SOCKET #2 ARM Local Memory Shared Memory AHB Slave I/F CprE 583 – Reconfigurable Computing

  31. Multi-Context FPGAs CprE 583 – Reconfigurable Computing

  32. Function Unit Architectures • RaPiD: Reconfigurable Pipelined Datapath • Linear array of function units • Function type determined by application • Function units are connected together as needed using segmented buses • Data enters the pipeline via input streams and exits via output streams CprE 583 – Reconfigurable Computing

  33. High-Level Compilation C Program C Libraries on various Targets SUIF frontend Directives and Automation HW / SW Partitioner C to RTL VHDL/Verilog C to RTL VHDL/Verilog SUIF to GCC VHDL to FPGA Synthesis VHDL to ASIC Synthesis GCC compiler for embedded Object code for Embedded (SA) Binaries for FPGAs (Xilinx) Chip layouts (0.18u TSMC) CprE 583 – Reconfigurable Computing

  34. Other Topics? • Second course survey next week • Provide general feedback, suggest additional topics CprE 583 – Reconfigurable Computing

  35. Midterm Exam • Three questions • Review • Analysis • Extension • Any paper mentioned in class is fair game • Due in 48 hours (10/12 – 2:00pm) • No class on Thursday! • Some restrictions: • Work alone • Can ask if something is unclear (“what does this mean?” questions, not “how do I do this?” questions) • No late submissions – strict WebCT deadline CprE 583 – Reconfigurable Computing

More Related