Introduction to FPGA Devices

wayde

Presentation Transcript


  1. Introduction to FPGA Devices ECE 645 – Computer Arithmetic

  2. World of Integrated Circuits
     • Integrated Circuits: Full-Custom ASICs, Semi-Custom ASICs, User Programmable
     • User Programmable: PLD, FPGA
     • PLD: PAL, PLA, PML
     • FPGA: LUT (Look-Up Table), MUX, Gates

  3. Two competing implementation approaches
     • FPGA (Field Programmable Gate Array)
        • bought off the shelf and reconfigured by designers themselves
        • no physical layout design; design ends with a bitstream used to configure a device
     • ASIC (Application Specific Integrated Circuit)
        • designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
        • designed all the way from behavioral description to physical layout

  4. What is an FPGA?
     • Configurable Logic Blocks
     • I/O Blocks
     • Block RAMs

  5. Which Way to Go?
     • ASICs: high performance, low power, low cost in high volumes
     • FPGAs: off-the-shelf, low development cost, short time to market, reconfigurability

  6. Other FPGA Advantages
     • The manufacturing cycle for an ASIC is very costly and lengthy, and engages a lot of manpower
        • Mistakes not detected at design time have a large impact on development time and cost
        • FPGAs are perfect for rapid prototyping of digital circuits
     • Easy upgrades, as in the case of software
     • Unique applications, such as reconfigurable computing

  7. Major FPGA Vendors
     • SRAM-based FPGAs
        • Xilinx, Inc.
        • Altera Corp.
        • Atmel
        • Lattice Semiconductor
     • Flash & antifuse FPGAs
        • Actel Corp.
        • Quick Logic Corp.
     • Xilinx and Altera together share over 60% of the market

  8. Xilinx Programmable Logic Devices
     • Primary products: FPGAs and the associated CAD software
     • Main headquarters in San Jose, CA
     • Fabless* semiconductor and software company; foundry partners:
        • UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}
        • Seiko Epson (Japan)
        • TSMC (Taiwan)
     • ISE Alliance and Foundation Series Design Software

  9. Xilinx FPGA Families
     • Old families
        • XC3000, XC4000, XC5200
        • Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.
     • High-performance families
        • Virtex (0.22µm)
        • Virtex-E, Virtex-EM (0.18µm)
        • Virtex-II, Virtex-II PRO (0.13µm)
        • Virtex-4 (0.09µm)
     • Low-cost family
        • Spartan/XL: derived from XC4000
        • Spartan-II: derived from Virtex
        • Spartan-IIE: derived from Virtex-E
        • Spartan-3

  11. Xilinx FPGA Block Diagram

  12. CLB Structure

  13. CLB Slice Structure
     • Each slice contains two sets of the following:
        • Four-input LUT: any 4-input logic function, or a 16-bit x 1 synchronous RAM, or a 16-bit shift register
        • Carry & control: fast arithmetic logic, multiplier logic, multiplexer logic
        • Storage element: latch or flip-flop, set and reset, true or inverted inputs, synchronous or asynchronous control

  14. LUT (Look-Up Table) Functionality
     • Look-up tables are the primary elements for logic implementation
     • Each LUT can implement any function of 4 inputs
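A 4-input LUT can be pictured as a 16-entry truth table indexed by its inputs. The following is a minimal behavioral sketch of that idea, not a vendor primitive: the entity name `LUT4_MODEL`, the generic `INIT`, and the port names `I` and `O` are assumptions made for illustration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Behavioral model of a 4-input LUT (hypothetical names, for illustration).
-- The 16-bit constant INIT is the truth table; the inputs select one bit of it.
entity LUT4_MODEL is
    generic (
        -- Only bit 15 is set, so the output is '1' only for I = "1111":
        -- this particular table implements a 4-input AND.
        INIT : STD_LOGIC_VECTOR(15 downto 0) := X"8000"
    );
    port (
        I : in  STD_LOGIC_VECTOR(3 downto 0);
        O : out STD_LOGIC
    );
end LUT4_MODEL;

architecture BEHAVIORAL of LUT4_MODEL is
begin
    -- The input vector, read as an unsigned number, indexes the table.
    O <= INIT(to_integer(unsigned(I)));
end BEHAVIORAL;
```

Changing `INIT` changes the function without changing the structure, which is why any function of four inputs fits in one LUT.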

  15. 5-Input Functions implemented using two LUTs
     • One CLB slice can implement any function of 5 inputs
     • The logic function is partitioned between two LUTs
     • The F5 multiplexer selects which LUT drives the output

  16. 5-Input Functions implemented using two LUTs
     [Figure: LUT blocks combined to produce the output OUT]
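The partitioning described on these slides can be sketched behaviorally: two 16-entry tables share the four low inputs, and the fifth input plays the role of the F5 multiplexer select. This is an illustrative model only; the entity name `FUNC5_MODEL` and the generics `INIT_LOW` and `INIT_HIGH` are assumptions, and the default tables are chosen so that the result is a 5-input XOR.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Behavioral sketch of a 5-input function built from two 4-input tables
-- and a 2:1 multiplexer (hypothetical names, for illustration).
entity FUNC5_MODEL is
    generic (
        INIT_LOW  : STD_LOGIC_VECTOR(15 downto 0) := X"6996"; -- 4-input XOR,  used when I(4) = '0'
        INIT_HIGH : STD_LOGIC_VECTOR(15 downto 0) := X"9669"  -- 4-input XNOR, used when I(4) = '1'
    );
    port (
        I : in  STD_LOGIC_VECTOR(4 downto 0);
        O : out STD_LOGIC
    );
end FUNC5_MODEL;

architecture BEHAVIORAL of FUNC5_MODEL is
    signal LUT_F, LUT_G : STD_LOGIC;
begin
    -- Both tables are indexed by the same four low-order inputs.
    LUT_F <= INIT_LOW (to_integer(unsigned(I(3 downto 0))));
    LUT_G <= INIT_HIGH(to_integer(unsigned(I(3 downto 0))));

    -- The fifth input selects between the two tables, like the F5 multiplexer.
    O <= LUT_G when I(4) = '1' else LUT_F;
end BEHAVIORAL;
```

With these defaults the output equals the XOR of all five inputs: when `I(4)` is '1' the XNOR table inverts the 4-input parity.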

  17. Distributed RAM
     • CLB LUT configurable as distributed RAM
        • A LUT equals a 16x1 RAM
        • Implements single and dual ports
        • Cascade LUTs to increase RAM size
     • Synchronous write
     • Synchronous/asynchronous read
        • Accompanying flip-flops used for synchronous read
     [Figure: LUT-based RAM primitives and their ports]
        • RAM16X1S: D, WE, WCLK, A0 to A3, O
        • RAM32X1S: D, WE, WCLK, A0 to A4, O
        • RAM16X2S: D0, D1, WE, WCLK, A0 to A3, O0, O1
        • RAM16X1D: D, WE, WCLK, A0 to A3, SPO, DPRA0 to DPRA3, DPO

  18. Shift Register
     • Each LUT can be configured as a shift register
        • Serial in, serial out
     • Dynamically addressable delay up to 16 cycles
     • For programmable pipeline
     • Cascade for greater cycle delays
     • Use CLB flip-flops to add depth
     [Figure: LUT as a chain of D flip-flops with input IN, clock enable CE, clock CLK, tap select DEPTH[3:0], and output OUT]
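The shift-register mode can be described in plain RTL as a 16-bit shift register with an addressable tap. The sketch below is an assumption-laden illustration rather than a vendor primitive: the entity name `SHIFT_LUT_MODEL` and its port names are hypothetical, and whether a given synthesis tool maps it onto a single LUT depends on the tool and device.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Behavioral sketch of a LUT used as a 16-stage shift register with a
-- dynamically selectable tap (hypothetical names, for illustration).
entity SHIFT_LUT_MODEL is
    port (
        CLK   : in  STD_LOGIC;
        CE    : in  STD_LOGIC;
        D     : in  STD_LOGIC;
        DEPTH : in  STD_LOGIC_VECTOR(3 downto 0);
        Q     : out STD_LOGIC
    );
end SHIFT_LUT_MODEL;

architecture BEHAVIORAL of SHIFT_LUT_MODEL is
    signal SR : STD_LOGIC_VECTOR(15 downto 0) := (others => '0');
begin
    -- Serial in: shift one position per enabled clock edge.
    -- There is deliberately no reset, only a clock enable.
    process (CLK)
    begin
        if rising_edge(CLK) then
            if CE = '1' then
                SR <= SR(14 downto 0) & D;
            end if;
        end if;
    end process;

    -- Serial out: DEPTH selects the tap, giving a delay of DEPTH + 1
    -- enabled clock cycles (1 to 16).
    Q <= SR(to_integer(unsigned(DEPTH)));
end BEHAVIORAL;
```

Setting `DEPTH` to "0000" gives a one-cycle delay and "1111" gives the full 16 cycles, which is the "dynamically addressable delay" the slide refers to.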

  19. Shift Register
     • Register-rich FPGA
        • Allows for addition of pipeline stages to increase throughput
     • Data paths must be balanced to keep desired functionality
     [Figure: two 64-bit data paths. Operation A (4 cycles) followed by Operation B (8 cycles) takes 12 cycles; Operation C takes 3 cycles, leaving a 9-cycle imbalance.]

  20. Carry & Control Logic
     [Figure: slice with two look-up tables (inputs G4 to G1 and F4 to F1), each followed by carry & control logic and a storage element (D, Q, CK, EC, S, R). Slice-level signals: CIN, COUT, CLK, CE, SR, BY, F5IN, and outputs X, XB, Y, YB.]

  21. Fast Carry Logic
     • Each CLB contains separate logic and routing for the fast generation of sum & carry signals
        • Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
     • Carry logic is independent of normal logic and routing resources
     [Figure: carry logic routing running from LSB to MSB]

  22. Accessing Carry Logic
     • All major synthesis tools can infer carry logic for arithmetic functions
        • Addition (SUM <= A + B)
        • Subtraction (DIFF <= A - B)
        • Comparators (if A < B then…)
        • Counters (count <= count + 1)
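The operators listed on this slide can be written with the standard `numeric_std` package, leaving the mapping onto carry chains to the synthesis tool. The following is a small sketch under stated assumptions: the entity name `CARRY_EXAMPLES`, the generic `WIDTH`, and all port names are hypothetical, and whether carry logic is actually inferred depends on the tool and target device.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Arithmetic written with plain operators so that a synthesis tool can
-- infer carry logic (hypothetical names, for illustration).
entity CARRY_EXAMPLES is
    generic (WIDTH : positive := 8);
    port (
        CLK    : in  STD_LOGIC;
        RST    : in  STD_LOGIC;
        A, B   : in  STD_LOGIC_VECTOR(WIDTH-1 downto 0);
        SUM    : out STD_LOGIC_VECTOR(WIDTH-1 downto 0);
        DIFF   : out STD_LOGIC_VECTOR(WIDTH-1 downto 0);
        A_LT_B : out STD_LOGIC;
        COUNT  : out STD_LOGIC_VECTOR(WIDTH-1 downto 0)
    );
end CARRY_EXAMPLES;

architecture RTL of CARRY_EXAMPLES is
    signal CNT : unsigned(WIDTH-1 downto 0) := (others => '0');
begin
    -- Addition and subtraction (results wrap modulo 2**WIDTH).
    SUM  <= std_logic_vector(unsigned(A) + unsigned(B));
    DIFF <= std_logic_vector(unsigned(A) - unsigned(B));

    -- Comparator.
    A_LT_B <= '1' when unsigned(A) < unsigned(B) else '0';

    -- Counter with synchronous reset.
    process (CLK)
    begin
        if rising_edge(CLK) then
            if RST = '1' then
                CNT <= (others => '0');
            else
                CNT <= CNT + 1;
            end if;
        end if;
    end process;

    COUNT <= std_logic_vector(CNT);
end RTL;
```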

  23. Block RAM
     • Most efficient memory implementation
        • Dedicated blocks of memory
     • Ideal for most memory requirements
        • 4 to 104 memory blocks
        • 18 kbits = 18,432 bits per block
        • Use multiple blocks for larger memories
     • Builds both single and true dual-port RAMs
     [Figure: Spartan-II true dual-port block RAM with Port A and Port B]

  24. Spartan-3 Block RAM Amounts

  25. Block RAM Port Aspect Ratios

  26. Block RAM Port Aspect Ratios
     • 16k x 1 (addresses 0 to 16,383)
     • 8k x 2 (addresses 0 to 8,191)
     • 4k x 4 (addresses 0 to 4,095)
     • 2k x (8+1) (addresses 0 to 2,047)
     • 1024 x (16+2) (addresses 0 to 1,023)

  27. Dual Port Block RAM

  28. Dual-Port Bus Flexibility
     • Each port can be configured with a different data bus width
     • Provides easy data width conversion without any additional logic
     [Figure: RAMB4_S4_S16. Port A: 1K depth, 18-bit width; signals WEA, ENA, RSTA, CLKA, ADDRA[9:0], DIA[17:0], DOA[17:0]. Port B: 2K depth, 9-bit width; signals WEB, ENB, RSTB, CLKB, ADDRB[8:0], DIB[15:0], DOB[8:0].]

  29. Added advantage of True Dual-Port
     • No wasted RAM bits
     • Can split a dual-port 16K RAM into two single-port 8K RAMs
     • Simultaneous independent access to each RAM
     • To access the lower RAM, tie the MSB address bit to logic low
     • To access the upper RAM, tie the MSB address bit to logic high
     [Figure: RAMB4_S1_S1 used as two independent single-port RAMs. Port A: 8K depth, 1-bit width, address VCC, ADDR[12:0]; signals WEA, ENA, RSTA, CLKA, ADDRA[12:0], DIA[0], DOA[0]. Port B: 8K depth, 1-bit width, address GND, ADDR[12:0]; signals WEB, ENB, RSTB, CLKB, ADDRB[12:0], DIB[0], DOB[0].]

  30. New 18 x 18 Embedded Multiplier
     • Fast arithmetic functions
     • Optimized to implement multiply / accumulate modules

  31. 18 x 18 Multiplier
     • Embedded 18-bit x 18-bit multiplier
     • 2’s complement signed operation
     • Multipliers are organized in columns
     • Inputs Data_A (18 bits) and Data_B (18 bits); Output (36 bits)
     Note: See Virtex-II Data Sheet for updated performances
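A signed 18 x 18 multiplication with a 36-bit result can be expressed with the `numeric_std` `*` operator. The sketch below only illustrates the arithmetic and bit widths from the slide; the entity name `MULT18X18_MODEL` and its port names are assumptions, and whether a tool places it in an embedded multiplier block depends on the tool and device.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Behavioral sketch of an 18 x 18 two's complement multiplier
-- (hypothetical names, for illustration).
entity MULT18X18_MODEL is
    port (
        DATA_A : in  STD_LOGIC_VECTOR(17 downto 0);
        DATA_B : in  STD_LOGIC_VECTOR(17 downto 0);
        OUTPUT : out STD_LOGIC_VECTOR(35 downto 0)
    );
end MULT18X18_MODEL;

architecture BEHAVIORAL of MULT18X18_MODEL is
begin
    -- For signed operands, numeric_std "*" returns a result whose length is
    -- the sum of the operand lengths: 18 + 18 = 36 bits, so nothing is lost.
    OUTPUT <= std_logic_vector(signed(DATA_A) * signed(DATA_B));
end BEHAVIORAL;
```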

  32. Basic I/O Block Structure
     [Figure: IOB with three registered paths sharing a common clock, each using a flip-flop with D, Q, EC (enable), and SR (set/reset): the three-state control path, the output path, and the input path, which provides both a direct input and a registered input.]

  33. IOB Functionality
     • IOB provides the interface between the package pins and CLBs
     • Each IOB can work as uni- or bi-directional I/O
     • Outputs can be forced into high impedance
     • Inputs and outputs can be registered
        • advised for high-performance I/O
     • Inputs can be delayed

  34. Routing Resources
     [Figure: array of CLBs interconnected through PSMs (Programmable Switch Matrices)]

  35. Clock Distribution

  36. Spartan-3 FPGA Family Members

  37. FPGA Nomenclature

  38. Device Part Marking
     We’re using: XC3S100-4FG256

  40. Virtex-II 1.5V Architecture
     • Configurable Logic Blocks
     • I/O Blocks
     • Block RAMs
     • 18 x 18 Multipliers

  41. Virtex-II 1.5V

  42. Virtex-II Block SelectRAM
     • Virtex-II BRAM is 18 kbits
     • Additional “parity” bits available in selected configurations

  43. Using Library Components in VHDL Code

  44. RAM 16x1 (1)
     library IEEE;
     use IEEE.STD_LOGIC_1164.all;
     library UNISIM;
     use UNISIM.all;
     -- 16x1 distributed RAM: one write/read address, one data bit
     entity RAM_16X1_DISTRIBUTED is
       port(
         CLK      : in  STD_LOGIC;
         WE       : in  STD_LOGIC;
         ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
         DATA_IN  : in  STD_LOGIC;
         DATA_OUT : out STD_LOGIC
       );
     end RAM_16X1_DISTRIBUTED;

  45. RAM 16x1 (2)
     architecture RAM_16X1_DISTRIBUTED_STRUCTURAL of RAM_16X1_DISTRIBUTED is
       -- Initial contents as a label attribute; the name must match the
       -- instance label RAM_16X1_S_1 used in the architecture body
       attribute INIT : string;
       attribute INIT of RAM_16X1_S_1 : label is "F0C1";
       -- Component declaration of the "ram16x1s(ram16x1s_v)" unit
       -- File name contains "ram16x1s" entity: ./src/unisim_vital.vhd
       component ram16x1s
         generic(
           INIT : BIT_VECTOR(15 downto 0) := X"0000");
         port(
           O    : out std_ulogic;
           A0   : in  std_ulogic;
           A1   : in  std_ulogic;
           A2   : in  std_ulogic;
           A3   : in  std_ulogic;
           D    : in  std_ulogic;
           WCLK : in  std_ulogic;
           WE   : in  std_ulogic);
       end component;

  46. RAM 16x1 (3)
     begin
       -- Single RAM16X1S instance, initialized to X"F0C1"
       RAM_16X1_S_1: ram16x1s
         generic map (INIT => X"F0C1")
         port map (
           O    => DATA_OUT,
           A0   => ADDR(0),
           A1   => ADDR(1),
           A2   => ADDR(2),
           A3   => ADDR(3),
           D    => DATA_IN,
           WCLK => CLK,
           WE   => WE
         );
     end RAM_16X1_DISTRIBUTED_STRUCTURAL;

  47. RAM 16x8 (1)
     library IEEE;
     use IEEE.STD_LOGIC_1164.all;
     library UNISIM;
     use UNISIM.all;
     -- 16x8 distributed RAM: one shared address, eight data bits
     entity RAM_16X8_DISTRIBUTED is
       port(
         CLK      : in  STD_LOGIC;
         WE       : in  STD_LOGIC;
         ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
         DATA_IN  : in  STD_LOGIC_VECTOR(7 downto 0);
         DATA_OUT : out STD_LOGIC_VECTOR(7 downto 0)
       );
     end RAM_16X8_DISTRIBUTED;

  48. RAM 16x8 (2)
     architecture RAM_16X8_DISTRIBUTED_STRUCTURAL of RAM_16X8_DISTRIBUTED is
       attribute INIT : string;
       -- No label attribute is specified here: the instances are labeled
       -- inside a generate statement, where an architecture-level attribute
       -- specification cannot reach them. INIT is supplied through the
       -- generic map of each instance instead.
       -- Component declaration of the "ram16x1s(ram16x1s_v)" unit
       -- File name contains "ram16x1s" entity: ./src/unisim_vital.vhd
       component ram16x1s
         generic(
           INIT : BIT_VECTOR(15 downto 0) := X"0000");
         port(
           O    : out std_ulogic;
           A0   : in  std_ulogic;
           A1   : in  std_ulogic;
           A2   : in  std_ulogic;
           A3   : in  std_ulogic;
           D    : in  std_ulogic;
           WCLK : in  std_ulogic;
           WE   : in  std_ulogic);
       end component;

  49. RAM 16x8 (3)
     begin
       -- One RAM16X1S per data bit; all eight share address, clock and write enable
       GENERATE_MEMORY:
       for I in 0 to 7 generate
         RAM_16X1_S_1: ram16x1s
           generic map (INIT => X"0000")
           port map (
             O    => DATA_OUT(I),
             A0   => ADDR(0),
             A1   => ADDR(1),
             A2   => ADDR(2),
             A3   => ADDR(3),
             D    => DATA_IN(I),
             WCLK => CLK,
             WE   => WE
           );
       end generate;
     end RAM_16X8_DISTRIBUTED_STRUCTURAL;

  50. ROM 16x1 (1)
     library IEEE;
     use IEEE.STD_LOGIC_1164.all;
     library UNISIM;
     use UNISIM.all;
     -- 16x1 distributed ROM: address in, one data bit out
     entity ROM_16X1_DISTRIBUTED is
       port(
         ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
         DATA_OUT : out STD_LOGIC
       );
     end ROM_16X1_DISTRIBUTED;
