
Advanced Topics on FPGA Applications Screen A

Wu, Jinyuan, Fermilab. IEEE NSS 2007 Refresher Course, Supplemental Materials, Oct. 2007. Topics include doublet matching, hit matching and the hash sorter.


Presentation Transcript


  1. Advanced Topics on FPGA Applications, Screen A. Wu, Jinyuan, Fermilab. IEEE NSS 2007 Refresher Course, Supplemental Materials, Oct. 2007.

  2. Doublet Matching, Hash Sorter. IEEE NSS Refresher Course, Supplemental Materials.

  3. Hit Matching

  4. Hash Sorter • Pass 1: Data in Group 1 are stored in the hash sorter bins based on key number K. • Pass 2: Data in Group 2 are fetched through and paired up with the corresponding Group 1 data that have the same key number K. (Figure labels: Group 1 with K and D, Group 2 with K and D.)
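The two passes above can be sketched in software as follows. This is only an illustrative model, not the FPGA implementation from the slides: the function name `hash_sort_match`, the list-of-lists bins and the `(key, data)` tuple layout are assumptions made for the example.

```python
def hash_sort_match(group1, group2, num_bins):
    """Pair Group 2 items with Group 1 items that share the same key K.

    group1, group2: iterables of (key, data) with 0 <= key < num_bins.
    Returns a list of (key, data1, data2) matches.
    """
    # Pass 1: store Group 1 data in the bin addressed by its key.
    bins = [[] for _ in range(num_bins)]
    for key, data in group1:
        bins[key].append(data)

    # Pass 2: fetch each Group 2 item and pair it with the bin contents.
    pairs = []
    for key, data in group2:
        for data1 in bins[key]:
            pairs.append((key, data1, data))
    return pairs


matches = hash_sort_match(
    group1=[(3, "a"), (1, "b"), (3, "c")],
    group2=[(3, "x"), (2, "y")],
    num_bins=4,
)
```

Each item is touched once per pass, so the work grows with the number of hits plus the number of matches rather than with the product of the two group sizes.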

  5. Hash Sorter (figure label: K)

  6. Hash Sorter Implementation • Single-clock-cycle fast reset. • Pipelined structure: single-clock-cycle push or pop.

  7. An Example of Track Recognition: Event • We explain the track recognition process using this 20-track example.

  8. Tangent Angle Measurements • There are various techniques to measure the tangent angle α of the track segment (or “doublet”, or “cluster”). • Sometimes extra “ghost” segments may exist. • The ghost segments may be resolved later in the track recognition process.

  9. A Large Curvature Track • A soft track hits a large φ region, so a global algorithm is better suited. • The “high-pT” approximation is not valid globally; the exact track equation is needed. • Measure the tangent angle α. • Parameters: radius of curvature and initial angle α0. (Other figure labels: R, r, φ.)

  10. An Example of Track Recognition: Clustering • The 9-bin scheme: for doublets on the seeding super layer in a given bin, search for coincidence in the 9 bins indicated. • The 4-bin scheme: for doublets on the seeding super layer in a given bin, search for coincidence in the 4 bins indicated. • The doublets in clusters are grouped together. • After clustering, the “ghost” doublets are gone. (Axis labels: c0, α0.)

  11. FPGA Block Diagram • Hash sorters for α0. • Hash sorters for c0.

  12. Without Full Track Recognition • Two track parameters can be calculated for each doublet. • Useful trigger primitives can be found without full track recognition. • For example…

  13. Triplet Finding, Tiny Triplet Finder

  14. Hit Matching

  15. Hits, Hit Data & Triplets • Hit data come out of the detector planes in random order. • Hit data from 3 planes generated by the same particle track are organized together to form triplets.

  16. TTF Operations, Phase I: Filling Bit Arrays • For any hit on the physical planes, fill the corresponding logic cell in the bit array/shifters. • Note the flipped bit order. • xA + xC = 2 xB, i.e. xA = -xC + constant for a given xB.

  17. TTF Operations, Phase II: Making Match • For any center plane hit, logically shift the bit array. • Perform a bit-wise AND in the overlapping range of the bit arrays/shifters. • Where both bits are set, a triplet is found.
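The two TTF phases for straight tracks can be modelled in software with Python integers standing in for the bit arrays. This is a sketch under stated assumptions, not the author's firmware: the function name `find_triplets`, the integer hit coordinates in `0..n-1` and the equally spaced planes A, B, C are all choices made for the illustration.

```python
def find_triplets(hits_a, hits_b, hits_c, n):
    """Find (xa, xb, xc) with xa + xc == 2 * xb using shift and AND.

    hits_*: integer hit positions in range(n) on planes A, B (center), C.
    """
    # Phase I: fill the bit arrays. Plane C is stored in flipped bit order,
    # so that xa = -xc + constant becomes a plain shift.
    array_a = 0
    array_c_flipped = 0
    for x in hits_a:
        array_a |= 1 << x
    for x in hits_c:
        array_c_flipped |= 1 << (n - 1 - x)

    mask = (1 << n) - 1
    triplets = []
    # Phase II: for each center hit, shift the flipped array and AND.
    for xb in hits_b:
        shift = 2 * xb - (n - 1)
        if shift >= 0:
            shifted = array_c_flipped << shift
        else:
            shifted = array_c_flipped >> -shift
        match = array_a & shifted & mask
        xa = 0
        while match:
            if match & 1:
                triplets.append((xa, xb, 2 * xb - xa))
            match >>= 1
            xa += 1
    return triplets


found = find_triplets(hits_a=[2, 5], hits_b=[4, 6], hits_c=[6, 7, 3], n=8)
```

The same AND logic is reused for every center hit; only the shift distance changes, which is the point of the shifting scheme.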

  18. Tiny Triplet Finder: Reuse Coincident Logic via Shifting Hit Patterns • One set of coincident logic is implemented. • For an arbitrary hit on C3, rotate, i.e., shift, the hit patterns for C1 and C2 to search for coincidence. (Figure labels: C3, C2, C1.)

  19. Tiny Triplet Finder for Circular Tracks • Fill the C1 and C2 bit arrays (n1 clock cycles). • Loop over C3 hits, shift the bit arrays and check for coincidence (n3 clock cycles). • Also works with more than 3 layers. (Block diagram: two bit arrays, each feeding a shifter scaled by *R1/R3 and *R2/R3 respectively, then bit-wise coincident logic, then the triplet map output to the decoder.)

  20. Tiny? Yes, Tiny! Logic cell usage: • AM, CAM, Hough transform, etc.: O(N^2). • Tiny Triplet Finder: O(N*logN).

  21. Complex Triplet Finding Problems

  22. Options of Sequence Control

  23. Micro-computing vs. Reconfigurable Computing. Example: (100+3-4)*5+7 = ? Data: 100, 3, 4, 5, 7. Control: LD, (+), (-), (*), (+). (Figure labels: CPU with Data and Program; FPGA with Data, Program and Configuration.) • In a microprocessor, the users specify the program on fixed logic circuits. • In an FPGA, the users specify the logic circuits (as well as the program). • FPGA computing need not follow microprocessor architectures (but useful experience can be borrowed). • The usefulness of FPGA reconfigurable computing is still to be fully appreciated.
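The slide's arithmetic example separates a data stream from a control stream. A minimal software sketch of that idea is below; the accumulator model, the `OPS` table and the function name `run` are assumptions made for illustration, not something defined in the presentation.

```python
import operator

# Each control word selects what the accumulator does with the next datum.
OPS = {
    "LD": lambda acc, x: x,   # load the datum, ignoring the old accumulator
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}


def run(control, data):
    """Apply one control word per datum to a single accumulator."""
    acc = 0
    for op, x in zip(control, data):
        acc = OPS[op](acc, x)
    return acc


result = run(["LD", "+", "-", "*", "+"], [100, 3, 4, 5, 7])
print(result)  # (100 + 3 - 4) * 5 + 7 = 502
```

In a microprocessor the `OPS` table is fixed hardware and only `control` changes; in an FPGA the table itself is also under the user's control.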

  24. ELMS: Enclosed Loop Micro-Sequencer. (Block diagram: Reset, A, CLK, Program Counter, ROM of 128 x 36 bits, Control Signals, Conditional Branch Logic, Loop & Return Logic + Stack.) • PC + ROM is a good sequencer in FPGA. • Adding Conditional Branch Logic allows the program to loop back, as in microprocessors. • Loop & Return Logic + Stack is a special feature of ELMS that supports FOR loops at the machine code level. Example program:

      PC  Control Signals   Label  Operation
      00  000000000000000
      01  001000100011010          LD R1, #n
      02  000010001000000          LD R2, #addr_a
      03  000000000000100          LD R3, #addr_X
      04  000000010001000          LD R7, #0
      05  000000000100001   BckA1  LD R4, (R2)
      06  000100000010000          INC R2
      07  000001000100000          LD R5, (R3)
      08  000100010000001          INC R3
      09  001001000100000          MUL R6, R4, R5
      0a  000000010001000   EndA1  ADD R7, R7, R6
      0b  000010000010000          DEC R1
      0c  000000100000100          BRNZ BckA1
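To make the listing on this slide easier to follow, here is a small software model that steps through the same operation sequence. It is a sketch only: the tuple encoding of instructions, the register file, the assumption that `BRNZ` tests R1, and the helper name `run_dot_product` are all illustrative choices; the real ELMS drives control signals from ROM and does not contain this interpreter.

```python
def run_dot_product(a, x):
    """Execute the slide's program: R7 = sum(a[i] * x[i]). Needs len(a) >= 1."""
    n = len(a)
    mem = list(a) + list(x)          # assumed memory layout: a first, then x
    addr_a, addr_x = 0, n
    program = [
        ("NOP",),                    # 00
        ("LDI", 1, n),               # 01  LD R1, #n
        ("LDI", 2, addr_a),          # 02  LD R2, #addr_a
        ("LDI", 3, addr_x),          # 03  LD R3, #addr_X
        ("LDI", 7, 0),               # 04  LD R7, #0
        ("LDM", 4, 2),               # 05  BckA1: LD R4, (R2)
        ("INC", 2),                  # 06
        ("LDM", 5, 3),               # 07  LD R5, (R3)
        ("INC", 3),                  # 08
        ("MUL", 6, 4, 5),            # 09
        ("ADD", 7, 7, 6),            # 0a  EndA1
        ("DEC", 1),                  # 0b
        ("BRNZ", 1, 5),              # 0c  branch to BckA1 while R1 != 0
    ]
    reg = [0] * 8
    pc = 0                           # program counter addressing the "ROM"
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "LDI":
            reg[args[0]] = args[1]
        elif op == "LDM":
            reg[args[0]] = mem[reg[args[1]]]
        elif op in ("INC", "DEC"):
            reg[args[0]] += 1 if op == "INC" else -1
        elif op == "MUL":
            reg[args[0]] = reg[args[1]] * reg[args[2]]
        elif op == "ADD":
            reg[args[0]] = reg[args[1]] + reg[args[2]]
        elif op == "BRNZ" and reg[args[0]] != 0:
            pc = args[1]             # conditional branch: loop back
    return reg[7]


total = run_dot_product([1, 2, 3], [4, 5, 6])
```

The model shows the loop body at addresses 05 to 0c being repeated n times, which is the part that the conditional branch logic makes possible.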

  25. Software: Using Spread Sheet as Compiler

  26. What’s Good about ELMS: No ALU => Small Resource Usage. (Figure: Princeton Architecture, Harvard Architecture and “Fermilab Architecture(?)”, drawn from program control, program memory, data memory, ALU, sequencer (ELMS) and data processor blocks.) • The Princeton Architecture is more suitable at the system level, while the Harvard Architecture is better suited at the micro-structure level. • Regular microprocessors cannot run looped programs without an ALU. • The ALU takes a large amount of resources and may not be efficiently utilized for data processing tasks in an FPGA. • The ELMS can run nested-loop programs without an ALU. • Further separation of program and data is therefore possible. • The ELMS is kept small.

  27. Recursive Structure

  28. The Digitizer Card for the Fermilab Beam Loss Monitor System. (Block diagram: ion chamber input, ADC at 21 µs/sample, RAM, CIC sums, de-ripple process; Immediate, Fast, Slow and Very Slow sliding sums, each compared (A>B) with its threshold I, F, S or V; abort logic; Seq128.) • Beam loss input signals from ion chambers are integrated and digitized. • Sliding sums are accumulated and compared with pre-loaded thresholds. • Exceeding thresholds in several places causes a beam abort based on pre-defined settings. • Beam loss signals are filtered and “de-rippled” for display purposes. • The sequence is controlled by the “Seq128” block.

  29. Filter Functions. (Plot of frequency response: sliding sum over 124 samples at 21 µs/sample, and Cascaded Integrator Comb (CIC) sum of 2nd order; first zero at 360 Hz.) • The CIC sum is a sliding sum of sliding sums. • The frequency response of the CIC sum is a sinc^2(x) function, which has 2nd-order zeros and better stop-band suppression.
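The statement that the CIC sum is a sliding sum of sliding sums can be checked numerically. The sketch below uses the direct (non-recursive) definition for clarity; the function names and the short window length K = 4 are assumptions chosen to keep the example small, whereas the slide uses 124 samples.

```python
def sliding_sum(x, k):
    """s[n] = x[n] + x[n-1] + ... + x[n-k+1], with zeros before the start."""
    return [sum(x[max(0, n - k + 1): n + 1]) for n in range(len(x))]


def cic_sum(x, k):
    """2nd-order CIC sum: a sliding sum applied to a sliding sum."""
    return sliding_sum(sliding_sum(x, k), k)


impulse = [1, 0, 0, 0, 0, 0, 0, 0]
box = sliding_sum(impulse, 4)       # rectangular impulse response
triangle = cic_sum(impulse, 4)      # triangular impulse response
print(box)       # [1, 1, 1, 1, 0, 0, 0, 0]
print(triangle)  # [1, 2, 3, 4, 3, 2, 1, 0]
```

The triangular impulse response is the convolution of two rectangles, so its frequency response is the square of the single sliding-sum response, which is where the 2nd-order zeros come from.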

  30. Filter Implementation: Recursive != IIR. • A Finite Impulse Response (FIR) filter can be implemented non-recursively, and a recursive implementation is also possible and resource friendly. • An Infinite Impulse Response (IIR) filter cannot be implemented non-recursively; it requires a recursive implementation. • Sliding sum, non-recursive implementation (s[n] is the sum of the last K samples of x): needs 124 memory fetches, 124 additions, and more operations for longer sum lengths. • Sliding sum, recursive implementation (s[n] = s[n-1] + x[n] - x[n-K]): needs 1 memory fetch and 2 add/subtract operations, regardless of sum length.
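The contrast between the two sliding-sum implementations can be sketched as follows. The function names and the use of `collections.deque` as the sample memory are assumptions for illustration; on the FPGA the delay line would be a RAM block.

```python
from collections import deque


def sliding_sum_direct(x, k):
    """Non-recursive: re-add the last k samples for every output."""
    return [sum(x[max(0, n - k + 1): n + 1]) for n in range(len(x))]


def sliding_sum_recursive(x, k):
    """Recursive: s[n] = s[n-1] + x[n] - x[n-k], zeros before the start."""
    delay = deque([0] * k)           # holds x[n-k] .. x[n-1]
    s = 0
    out = []
    for sample in x:
        oldest = delay.popleft()     # the single memory fetch: x[n-k]
        delay.append(sample)
        s += sample - oldest         # one addition and one subtraction
        out.append(s)
    return out


samples = [1, 2, 3, 4, 5, 6]
direct = sliding_sum_direct(samples, 3)
recursive = sliding_sum_recursive(samples, 3)
```

Both functions produce the same finite impulse response; only the amount of work per output sample differs, and for the recursive form it does not depend on `k`.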

  31. BLM DC Process Sequencing. (Block diagram labels: Sliding Sum 1 to 4; taps x[n], -2x[n-K], x[n-2K] and x[n-L], -2x[n-L-K], x[n-L-2K]; accumulators u[n], u[n-L], y[n], y[n-L]; MaxDY; Decimation Counter; WF, WM, PG; DR = y[n] - (WF - WM); if |y[n] - y[n-L]| > MaxDY for the entire period, then PG++; regions marked “Fully Sequencing” and “Partially Flat”.) • The processes of calculating sliding sums and CIC sums are fully sequenced. • The de-ripple processor is flat for the process path, but it operates sequentially for 4 channels.
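The taps x[n], -2x[n-K] and x[n-2K] feeding the accumulators u[n] and y[n] are consistent with a recursive 2nd-order CIC sum. The sketch below is one reading of that part of the diagram, not a transcription of the firmware: the function name `cic2_recursive` is hypothetical, samples before the start are assumed to be zero, and the L-delayed branch and de-ripple logic are not modelled.

```python
def cic2_recursive(x, k):
    """2nd-order CIC sum via two accumulators.

    u[n] = u[n-1] + x[n] - 2*x[n-k] + x[n-2k]
    y[n] = y[n-1] + u[n]
    """
    def tap(n):
        # Samples before the start of the record are taken as zero.
        return x[n] if n >= 0 else 0

    u = 0
    y = 0
    out = []
    for n in range(len(x)):
        u += tap(n) - 2 * tap(n - k) + tap(n - 2 * k)
        y += u
        out.append(y)
    return out


impulse = [1] + [0] * 9
response = cic2_recursive(impulse, 4)
print(response)  # [1, 2, 3, 4, 3, 2, 1, 0, 0, 0]
```

The triangular response matches a sliding sum of sliding sums with window K, while each output costs three taps and a few add/subtract operations regardless of K.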

  32. The End. Thanks.

  33. Resource Saving Tricks • Loop reduction tricks: the number of computations in a given task is reduced by (1) using fewer iterations in loops and/or (2) using fewer operations in each iteration. • Non-loop-reduction tricks: the number of computations in a given task is unchanged. The FPGA resource is saved by (1) reusing the resources multiple times via sequencing and/or (2) using transistor-saving resources such as RAM.

  34. Resource Saving Tricks: Loop-Reduction • Tiny Triplet Finder: O(n)*O(N*log(N)). • Recursive implementation of FIR filter. • Multiplier-less (ML) approaches. • FFT: O(n)*O(log(N)). (Diagrams: bit arrays, shifters with *R1/R3 and *R2/R3, and bit-wise coincident logic; recursive sum with x[n], -x[n-K] and s[n]; FIR taps *h1, *h2, ..., *h[K] producing y[n]; shift-and-add X << S.)

  35. Resource Saving Tricks: Non-Loop-Reduction • Sequencing. • Using RAM: hash sorter/histogram. (Figure: initialization steps followed by repeated sequences of operations OP1 to OP4.)
