The Power of Communication: Energy-Efficient NoCs for FPGAs

Presentation Transcript


  1. The Power of Communication: Energy-Efficient NoCs for FPGAs. Mohamed ABDELFATTAH, Vaughn BETZ

  2. Outline: 1. Why NoCs on FPGAs? 2. Embedded NoCs 3. Power Analysis

  3. 1. Why NoCs on FPGAs? Motivation: • Logic Blocks • Switch Blocks • Wires • Interconnect

  4. 1. Why NoCs on FPGAs? Motivation: • Logic Blocks • Switch Blocks • Wires • Hard Blocks: Memory, Multiplier, Processor

  5. 1. Why NoCs on FPGAs? Motivation: • Logic Blocks • Switch Blocks • Wires • Hard Blocks: Memory, Multiplier, Processor • Hard Interfaces: DDR/PCIe .. • Interconnect still the same. Figure labels: 1600 MHz, 800 MHz, 200 MHz

  6. 1. Why NoCs on FPGAs? Motivation. Problems: • Bandwidth requirements for hard logic/interfaces • Timing closure. Figure labels: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet; 1600 MHz, 800 MHz, 200 MHz

  7. 1. Why NoCs on FPGAs? Motivation. Problems: • Bandwidth requirements for hard logic/interfaces • Timing closure • High interconnect utilization: huge CAD problem, slow compilation, power/area utilization • Wire speed not scaling: delay is interconnect-dominated. Figure labels: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet

  8. Keep the “roads”, but add “freeways”. Images: Los Angeles, Barcelona (source: Google Earth). Figure labels: Logic Cluster, Hard Blocks

  9. 1. Why NoCs on FPGAs? FPGA with NoC. Problems: • Bandwidth requirements for hard logic/interfaces • Timing closure • High interconnect utilization: huge CAD problem, slow compilation, power/area utilization • Wire speed not scaling: delay is interconnect-dominated. Figure labels: NoC, Routers, Links; Router forwards data packet; Router moves data to local interconnect; DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet

  10. 1. Why NoCs on FPGAs? FPGA with NoC. Problems: • Bandwidth requirements for hard logic/interfaces • Timing closure • High interconnect utilization: huge CAD problem, slow compilation, power/area utilization • Wire speed not scaling: delay is interconnect-dominated • Abstraction favours modularity: parallel compilation, partial reconfiguration, multi-chip interconnect. Callouts: • High bandwidth endpoints known • Pre-design NoC to requirements • NoC links are “re-usable” • NoC is heavily “pipelined” • NoC abstraction favours modularity. Figure labels: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet

  11. 1. Why NoCs on FPGAs? FPGA with NoC. Problems: • Bandwidth requirements for hard logic/interfaces • Timing closure • High interconnect utilization: huge CAD problem, slow compilation, power/area utilization • Wire speed not scaling: delay is interconnect-dominated • Abstraction favours modularity: parallel compilation, partial reconfiguration, multi-chip interconnect. Callouts: • Latency-tolerant communication • NoC abstraction favours modularity. NoCs can simplify FPGA design. Previous work: compelling area efficiency and performance. Does the NoC abstraction come at a high power cost? Figure labels: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet

  12. Outline: 1. Why NoCs on FPGAs? 2. Embedded NoCs (Mixed NoCs, Hard NoCs) 3. Power Analysis

  13. 2. Embedded NoCs: • “Soft” NoC = Soft Routers + Soft Links • “Mixed” NoC = Hard Routers + Soft Links • “Hard” NoC = Hard Routers + Hard Links

  14. Methodology. Figure labels: Soft, Mixed, Hard; FPGA CAD Tools; ASIC CAD Tools; Area; Speed; Design Compiler; Power?; Power; HSPICE; Gate-level simulation; Toggle rates

  15. 2. Embedded NoCs: Mixed NoCs. “Mixed” NoC = Hard Routers + Soft Links. Figure labels: FPGA, Logic blocks, Programmable “soft” interconnect, Router, Baseline Router

  16. 2. Embedded NoCs: Mixed NoCs. “Mixed” NoC = Hard Routers + Soft Links. Figure labels: FPGA, Router

  17. 2. Embedded NoCs: Mixed NoCs. Special feature: configurable topology. Assumed a mesh; can form any topology. Figure labels: FPGA, Router

  18. 2. Embedded NoCs: Hard NoCs. “Hard” NoC = Hard Routers + Hard Links. Figure labels: FPGA, Logic blocks, Programmable “soft” interconnect, Dedicated “hard” interconnect, Router

  19. 2. Embedded NoCs: Hard NoCs. “Hard” NoC = Hard Routers + Hard Links. Figure labels: FPGA, Router

  20. 2. Embedded NoCs: Hard NoCs. “Hard” NoC = Hard Routers + Hard Links. Special feature: Low-V mode (0.9 V instead of 1.1 V): • Save 33% dynamic power • ~15% slower. Figure labels: FPGA, Router
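The 33% saving quoted for Low-V mode on slide 20 is consistent with the standard CMOS relation that dynamic power scales with the square of the supply voltage. The sketch below is only a sanity check under that assumption (same switching activity and clock frequency); the function name is illustrative and does not come from the slides.

```python
def dynamic_power_ratio(v_low: float, v_nominal: float) -> float:
    """Dynamic power at v_low relative to v_nominal.

    Assumes P_dyn is proportional to V^2 with activity and frequency
    held constant (an assumption, not stated on the slide).
    """
    return (v_low / v_nominal) ** 2


# Voltages taken from slide 20: 1.1 V nominal, 0.9 V in Low-V mode.
ratio = dynamic_power_ratio(0.9, 1.1)
saving_percent = (1.0 - ratio) * 100.0
print(f"Dynamic power saving from voltage scaling alone: {saving_percent:.1f}%")
```

Under this assumption the saving comes out at roughly one third, in line with the slide. The ~15% slowdown is a separate effect of the lower voltage and is not modelled here.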

  21. Outline: 1. Why NoCs on FPGAs? 2. Embedded NoCs 3. Power Analysis (Components Analysis, System Analysis)

  22. Soft, Mixed and Hard. • Area gap: Mixed and Hard 20X – 23X smaller; Soft 1X • Average speed gap: Mixed and Hard 5X – 6X faster • Power gap: Mixed 9X; Hard 11X (15X in Low-V mode) • 64 – NoC: area ~1.5% of FPGA vs. 33% of FPGA for Soft; speed 730 – 940 MHz vs. 166 MHz for Soft; bisection BW ~50 GB/s vs. ~10 GB/s for Soft • Investigate BW and power together: 1. Power-aware design 2. NoC power budget 3. Comparison
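The bisection bandwidth figures on slide 22 (~50 GB/s embedded, ~10 GB/s soft) can be roughly reproduced from the quoted clock speeds. The sketch below rests on assumptions that the transcript does not state: that "64 – NoC" means 64 nodes arranged as an 8x8 mesh, that links are 32 bits wide, and that both directions across the cut are counted. It is a plausibility check, not the authors' calculation.

```python
def mesh_bisection_bw_gbps(side: int, link_width_bits: int, freq_mhz: float) -> float:
    """Bisection bandwidth (GB/s) of a side x side mesh.

    Cutting the mesh in half severs `side` links in each direction.
    Each link moves link_width_bits every clock cycle.
    """
    links_across_cut = 2 * side            # both directions (assumption)
    bytes_per_cycle = link_width_bits / 8
    return links_across_cut * bytes_per_cycle * freq_mhz * 1e6 / 1e9


# Hypothetical parameters: 8x8 mesh, 32-bit links. Clock speeds are from slide 22.
hard_low = mesh_bisection_bw_gbps(8, 32, 730)
hard_high = mesh_bisection_bw_gbps(8, 32, 940)
soft = mesh_bisection_bw_gbps(8, 32, 166)
print(f"embedded: {hard_low:.1f} to {hard_high:.1f} GB/s, soft: {soft:.1f} GB/s")
```

With these assumed parameters the embedded NoC lands between about 47 and 60 GB/s and the soft NoC near 10.6 GB/s, which is in the neighbourhood of the slide's rounded values.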

  23. 3. Power Analysis: Power-Aware NoC Design • Total BW = 250 GB/s • Most efficient NoC? Wider links, fewer routers. Figure labels: Links Power, Routers Power

  24. 3. Power Analysis: Power-Aware NoC Design • Total BW = 250 GB/s • Most efficient NoC?

  25. 3. Power Analysis: Power-Aware NoC Design • Total BW = 250 GB/s • Most efficient NoC?

  26. 3. Power Analysis: NoC Power Budget • 250 GB/s total bandwidth • How much is used for system-level communication? • Typical FPGA dynamic power: 17.4 W. Chart labels: 123%

  27. 3. Power Analysis: NoC Power Budget • 250 GB/s total bandwidth • Typical FPGA dynamic power: 17.4 W. Chart labels: 123%, 15% NoC

  28. 3. Power Analysis: NoC Power Budget • 250 GB/s total bandwidth • Typical FPGA dynamic power: 17.4 W. Chart labels: 11%, 123%, 15% NoC

  29. 3. Power Analysis: NoC Power Budget • 250 GB/s total bandwidth • Typical FPGA dynamic power: 17.4 W. Chart labels: 7%, 11%, 123%, 15% NoC
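The power budget percentages on slides 26 to 29 can be related to the energy-per-gigabyte figures quoted in the closing summary. The sketch below assumes that each percentage is a share of the 17.4 W typical FPGA dynamic power while the NoC carries the full 250 GB/s; the transcript does not say which NoC variant each percentage belongs to, so none is assigned here. The function name is illustrative.

```python
FPGA_DYNAMIC_POWER_W = 17.4    # typical FPGA dynamic power, from the slides
TOTAL_BANDWIDTH_GBPS = 250.0   # total NoC bandwidth in GB/s, from the slides


def energy_per_gb_mj(budget_percent: float) -> float:
    """Convert a NoC power budget (percent of FPGA dynamic power) to mJ per GB.

    Assumes the budget is consumed while moving TOTAL_BANDWIDTH_GBPS.
    """
    noc_power_w = FPGA_DYNAMIC_POWER_W * budget_percent / 100.0
    joules_per_gb = noc_power_w / TOTAL_BANDWIDTH_GBPS
    return joules_per_gb * 1000.0


for percent in (15, 11, 7):
    print(f"{percent}% of {FPGA_DYNAMIC_POWER_W} W -> {energy_per_gb_mj(percent):.2f} mJ/GB")
```

Under this reading, the 15% figure works out to about 10.4 mJ/GB, which agrees with the upper end of the 4.5 – 10.4 mJ/GB range given for embedded NoCs in the summary slide.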

  30. 3. Power Analysis: Bandwidth in Perspective • Full theoretical BW • Cross whole chip! • Aggregate bandwidth: 126 GB/s • NoC power budget: 3.5%. Figure labels: DDR3, Module 1, PCIe, Module 2; four links at 14.6 GB/s and four links at 17 GB/s
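The aggregate bandwidth on slide 30 is the sum of the per-link figures shown there (four at 14.6 GB/s and four at 17 GB/s). The short check below also converts the 3.5% budget to watts, assuming it is a share of the 17.4 W typical FPGA dynamic power from the preceding slides; variable names are illustrative.

```python
# Per-link bandwidths as labelled on slide 30, in GB/s.
link_bandwidths_gbps = [14.6] * 4 + [17.0] * 4
aggregate_gbps = sum(link_bandwidths_gbps)

# Assumption: the 3.5% budget is relative to 17.4 W of FPGA dynamic power.
FPGA_DYNAMIC_POWER_W = 17.4
NOC_BUDGET_PERCENT = 3.5
noc_power_w = FPGA_DYNAMIC_POWER_W * NOC_BUDGET_PERCENT / 100.0

print(f"aggregate bandwidth: {aggregate_gbps:.1f} GB/s")
print(f"NoC power at {NOC_BUDGET_PERCENT}% budget: {noc_power_w:.3f} W")
```

The sum is 126.4 GB/s, which the slide rounds to 126 GB/s, and the budget corresponds to roughly 0.6 W under the stated assumption.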

  31. 3. Power Analysis: FPGA Interconnect • Interconnect = Just wires: Point-to-point Links; Broadcast • Interconnect = Wires + Logic: Multiple Masters (Mux + Arbiter); Multiple Masters, Multiple Slaves (Mux + Arbiter) • Interconnect = NoC • Compare “wires” interconnect to NoCs

  32. 3. Power Analysis: NoC Power vs. FPGA Interconnect • High performance / packet switched • 1% area overhead on Stratix 5 • Runs at 730-943 MHz • Power on par with simplest FPGA interconnect • Hard and Mixed NoCs very compelling. Figure labels: Length of 1 NoC Link, 200 MHz

  33. 1. Why NoCs on FPGAs? Big city needs freeways to handle traffic. 2. Embedded NoCs: Mixed & Hard. Power: 9-15X, Area: 20-23X, Speed: 5-6X. 3. Power Analysis: • Power-aware design of embedded NoCs • Power budget for 100 GB/s: 3-7% • Point-to-point soft links: 4.7 mJ/GB • Embedded NoCs: 4.5 – 10.4 mJ/GB

  34. eecg.utoronto.ca/~mohamed/noc_designer.html

  35. Thank You! eecg.utoronto.ca/~mohamed/noc_designer.html
