
ECE 667 Spring 2013 Synthesis and Verification of Digital Systems

Technology Mapping for FPGAs. D. Chen, J. Cong, "DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs," ICCAD 2004.



Presentation Transcript


  1. ECE 667, Spring 2013: Synthesis and Verification of Digital Systems
     Technology Mapping for FPGAs
     D. Chen, J. Cong, "DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs," ICCAD 2004

  2. FPGA Mapping (LUT-based)
     • How is it different from ASIC (standard-cell) mapping?
     • Structural in nature, simpler
     • Any function with k inputs can be mapped into a k-LUT
     • Typically implemented by cut mapping
     • FPGA architecture: k-LUT
     • Example: a 2-input LUT whose four programming bits P (each 0/1) implement F = x1'x2' + x1x2

       x1  x2 | F
       0   0  | 1
       0   1  | 0
       1   0  | 0
       1   1  | 1

     ECE 667 Synthesis & Verification - FPGA Mapping
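The LUT example on this slide can be read as a table lookup: the inputs form an address into the programming bits. The following is a minimal sketch under that reading; the helper names `program_lut` and `lut_eval` are hypothetical, and treating x1 as the most significant address bit is an assumption chosen to match the row order of the truth table.

```python
from itertools import product

def program_lut(func, k):
    """Return the 2**k programming bits of a k-LUT that implements func."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]

def lut_eval(bits, inputs):
    """Read the LUT output; the inputs form the address (x1 is the MSB, by assumption)."""
    address = 0
    for x in inputs:
        address = (address << 1) | x
    return bits[address]

# F = x1'x2' + x1x2, the function shown on the slide
xnor = lambda x1, x2: (not x1 and not x2) or (x1 and x2)
bits = program_lut(xnor, 2)   # one bit per truth-table row: 00, 01, 10, 11
```

Because any k-input function has exactly 2**k truth-table rows, the same two helpers cover every function a k-LUT can hold.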

  3. FPGA Mapping - example
     • A possible mapping onto 3-LUTs: each block has ≤ 3 inputs
     [Figure: example network with nodes a, b, c, d, e, f, g, h grouped into 3-LUT blocks]

  4. Definitions
     • DAG: Boolean network
     • Cone C_v: sub-network rooted at node v
     • K-feasible cone: $|input(C_v)| \le K$
     • Fanin cone F_v: the largest C_v
     • k-feasible cut: the input set of a k-feasible C_v
     • Unit delay model: each LUT contributes one unit of delay
     • Cut rooted on a node: a cut whose output is that node
     [Figure: fanin cone F_v of node v, spanning nodes a, b, c, d, e down to the PIs; a 3-feasible cone C_v is highlighted, and the mapping shown has a delay of 2]
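To make the cone and feasibility definitions concrete, here is a small sketch. The network, the node names, and the helpers `cone_inputs` and `is_k_feasible` are all hypothetical; the network is stored as a dictionary from each node to its fanin list, which is an assumed representation rather than anything stated on the slides.

```python
def cone_inputs(fanins, cone):
    """Nodes outside the cone that drive at least one node inside it."""
    return {u for n in cone for u in fanins.get(n, ()) if u not in cone}

def is_k_feasible(fanins, cone, k):
    """A cone is k-feasible when it has at most k inputs."""
    return len(cone_inputs(fanins, cone)) <= k

# Hypothetical network: PIs p1..p4; a = f(p1, p2), b = f(p2, p3), v = f(a, b, p4)
fanins = {"a": ["p1", "p2"], "b": ["p2", "p3"], "v": ["a", "b", "p4"]}

small_cone = {"v"}             # inputs: a, b, p4
large_cone = {"v", "a", "b"}   # inputs: p1, p2, p3, p4
```

In this example the cone containing only v fits a 3-LUT, while the cone that also absorbs a and b needs a 4-LUT.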

  5. Problem Formulation
     • Delay-optimal Area Optimization problem
       • Given: a Boolean network; an integer k (LUT size)
       • Goal: cover the network with k-feasible cones (k-LUTs), such that
         • Mapping depth (delay) is minimum
         • Area (number of LUTs) is minimized
     • Area minimization is an NP-hard problem
     • A two-step process
       • Cut enumeration + evaluation (delay, area)
       • Cut selection to minimize delay
     • Possible iteration to remap nodes on non-critical paths (area recovery)
     • Takes node duplication into consideration

  6. Cut Enumeration
     • Process nodes in topological order from PIs to POs
     • Combine sub-cuts of the fanin nodes to create a new cut
     • If the size of the cut exceeds k (LUT size), discard the cut
     [Figure: network with nodes a, b, c, d, w, x, y, z; sub-cuts of the fanin nodes are merged into a new cut]
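The three enumeration steps on this slide can be sketched as follows. This is an illustrative version only: the function name `enumerate_cuts`, the fanin-dictionary representation, and the example network are assumptions, and no pruning beyond the size limit k is attempted.

```python
from itertools import product

def enumerate_cuts(fanins, order, k):
    """Return {node: set of cuts}; `order` lists all nodes topologically, PIs first."""
    cuts = {}
    for v in order:
        result = {frozenset([v])}            # trivial cut: the node itself
        ins = fanins.get(v, [])
        if ins:
            # Pick one sub-cut per fanin node and merge them into a candidate cut.
            for combo in product(*(cuts[u] for u in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:         # discard cuts larger than the LUT size
                    result.add(merged)
        cuts[v] = result
    return cuts

# Hypothetical network: a = f(p1, p2), b = f(p2, p3), v = f(a, b, p4)
fanins = {"a": ["p1", "p2"], "b": ["p2", "p3"], "v": ["a", "b", "p4"]}
order = ["p1", "p2", "p3", "p4", "a", "b", "v"]
cuts = enumerate_cuts(fanins, order, 3)
```

With k = 3 the only non-trivial cut of v is {a, b, p4}; raising k to 4 also admits the cuts that reach through a or b to the primary inputs.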

  7. Delay Propagation
     • Delay is computed by dynamic programming
     • The largest optimal delay among the POs is the optimal mapping delay
     [Figure: network with nodes a, b, c, d, e, f, g, w, x, y, z; candidate cuts are annotated with delays from 1 to 3, and nodes with optimal delays of 1 or 2]
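Under the unit delay model, the dynamic program on this slide can be sketched as below. The sketch assumes that primary inputs arrive at time 0 and that each node's cut list holds only non-trivial cuts; the function name `optimal_delays` and the example data are hypothetical.

```python
def optimal_delays(cuts, order, primary_inputs):
    """cuts: {node: list of cuts (sets of input nodes)}. Returns {node: best depth}."""
    best = {pi: 0 for pi in primary_inputs}   # assumption: PIs arrive at time 0
    for v in order:                           # internal nodes in topological order
        # A LUT adds one unit of delay on top of its latest-arriving input.
        best[v] = min(1 + max(best[i] for i in cut) for cut in cuts[v])
    return best

# Hypothetical 4-feasible cuts for a = f(p1, p2), b = f(p2, p3), v = f(a, b, p4)
cuts = {
    "a": [{"p1", "p2"}],
    "b": [{"p2", "p3"}],
    "v": [{"a", "b", "p4"}, {"p1", "p2", "p3", "p4"}],
}
best = optimal_delays(cuts, ["a", "b", "v"], ["p1", "p2", "p3", "p4"])
mapping_delay = max(best[po] for po in ["v"])  # largest optimal delay among the POs
```

Here the cut {a, b, p4} gives v a delay of 2, while the wider cut that reaches the primary inputs directly gives a delay of 1, so the optimal delay of v is 1.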

  8. Area Estimation
     • Tries to estimate area considering the fanout effect:
       $A_C = \sum_{i \in input(C)} \frac{A_i}{f(i)} + U_C$
     • A_i: estimated area of the fanin cone of signal i
     • f(i): fanout number of input i
     • U_C: area of the cut itself
     • Can underestimate area due to node duplication
     [Figure: network with nodes m, n, o, p, q, r, s, t, u showing cuts C, C_t, and C_u; f(p) = 2, with area labels A_p and A_s / 2]
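The area estimate on this slide is a direct sum, sketched below. The function name `cut_area` and the numeric values are hypothetical, and taking U_C = 1 (one LUT per cut) is an assumption; the slide only calls U_C the area of the cut itself.

```python
def cut_area(cut, area, fanout, unit_area=1.0):
    """A_C = sum over inputs i of A_i / f(i), plus U_C (assumed to be one LUT)."""
    return sum(area[i] / fanout[i] for i in cut) + unit_area

# Hypothetical values: input p has a fanin-cone area of 2 shared by two fanouts,
# input q has a fanin-cone area of 1 and a single fanout.
area = {"p": 2.0, "q": 1.0}
fanout = {"p": 2, "q": 1}
estimate = cut_area({"p", "q"}, area, fanout)   # 2/2 + 1/1 + 1
```

Dividing by the fanout charges each consumer of a shared signal only its share of that signal's cone, which is why the estimate can fall short when nodes end up duplicated.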

  9. Duplication Cost Adjustment
     • Considers potential node duplications
     • Check the sub-cuts for multiple fanouts
     • Area is adjusted by adding a duplication cost
     • Duplication cost:
       • N_Cf: number of nodes contained in sub-cut C_f
       • I_C: cut size of C
       • f_i: fanout number of the sub-cut
     [Figure: nodes m, n, o, p, q, r, s; sub-cuts C_f1 and C_f2 (N_Cf2 = 1), one with multiple fanouts, are merged into a new cut C with I_C = 4]

  10. Cost (Area) Function of a Cut
      Some key parameters:
      • I_C: cut size of C
      • N_C: number of nodes covered by C
      • f(v): fanout number of the root node v
      • P_f: duplication cost
      [Figure: node v with fanin1 and fanin2, nodes a, b, c, d, e, and cuts C1, C2, C3]

  11. Cut Selection
      • Once cuts are generated, traverse the network from POs to PIs and select the cuts that map into LUTs
      • Select cuts such that timing is met and area is minimized
      • Iterative cut selection procedure
      • Local cost adjustment
      • Input sharing
      • Slack distribution
      • Cut probing
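The PO-to-PI traversal in the first bullet can be sketched as below. This covers only the covering step: it assumes one cut per node has already been chosen (the `best_cut` dictionary is hypothetical) and does not model the iterative procedure, input sharing, slack distribution, or cut probing listed on the slide.

```python
def select_cover(best_cut, primary_outputs, primary_inputs):
    """Walk from POs toward PIs; every visited internal node becomes the root of one LUT."""
    luts = {}
    stack = list(primary_outputs)
    while stack:
        v = stack.pop()
        if v in primary_inputs or v in luts:
            continue                      # PIs need no LUT; mapped nodes are not repeated
        luts[v] = best_cut[v]             # implement v with its chosen cut
        stack.extend(best_cut[v])         # the cut's inputs must be implemented next
    return luts

pis = {"p1", "p2", "p3", "p4"}
# Hypothetical choice A: v uses the narrow cut, so a and b need LUTs of their own.
choice_a = {"a": {"p1", "p2"}, "b": {"p2", "p3"}, "v": {"a", "b", "p4"}}
# Hypothetical choice B: v uses a 4-input cut that absorbs a and b.
choice_b = {"a": {"p1", "p2"}, "b": {"p2", "p3"}, "v": {"p1", "p2", "p3", "p4"}}
cover_a = select_cover(choice_a, ["v"], pis)
cover_b = select_cover(choice_b, ["v"], pis)
```

Only nodes reached from a primary output receive a LUT, so the cut chosen at the root decides which internal nodes are kept and which are absorbed.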

  12. Local Cost Adjustment – Slack Distribution
      • $Slack_C = Req_v - 1 - \max_{i \in input(C)} Arr_i$
      • If Slack_C < 0, C is not a timing-feasible cut
      • The larger Slack_C is, the better C is in terms of the slack distribution effect
      [Figure: network with nodes a, b, c, d, w, x, y, z and a cut C; labels mark the largest arrival time among the inputs and Req_d, the required time of the root]
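The slack formula on this slide translates directly into code. In the sketch below the function name `cut_slack` and the arrival and required times are hypothetical; the constant 1 is the unit delay of the LUT that implements the cut.

```python
def cut_slack(required_root, arrival, cut):
    """Slack_C = Req_v - 1 - max(Arr_i) over the inputs i of cut C."""
    return required_root - 1 - max(arrival[i] for i in cut)

# Hypothetical arrival times at three candidate cut inputs
arrival = {"a": 1, "b": 2, "c": 1}

tight = cut_slack(3, arrival, {"a", "b"})    # 3 - 1 - 2
loose = cut_slack(3, arrival, {"a", "c"})    # 3 - 1 - 1
late = cut_slack(2, arrival, {"a", "b"})     # 2 - 1 - 2
timing_feasible = late >= 0
```

A cut with zero slack still meets timing; only a negative slack rules the cut out.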

  13. Algorithm Recap
      • Generation of k-feasible cuts
      • Area propagation under timing constraints
        • The optimal area at a node is the minimum area among the cuts that give minimum delay
      • More accurate representation of the cost function for a cut
      • Global duplication cost adjustment
      • Cut selection involving local cost adjustment
