
IXP Based Router for ONL: Architecture



  1. IXP Based Router for ONL: Architecture
  John DeHart, jdd@arl.wustl.edu, http://www.arl.wustl.edu/arl

  2. Overview
  • These slides are a starting point for the informational and design material for the ONL IXP-based router.

  3. Hardware
  • Promentum™ ATCA-7010 (NP Blade): two Intel IXP2850 NPs
    • 1.4 GHz core, 700 MHz XScale
  • Each NPU has:
    • 3 x 256 MB RDRAM at 533 MHz (3 channels; the address space is striped across all three)
    • 4 QDR II SRAM channels; channels 1, 2 and 3 are populated with 8 MB each, running at 200 MHz
    • 16 KB of scratch memory
    • 16 microengines
      • Instruction store: 8K 40-bit-wide instructions
      • Local memory: 640 32-bit words
  • TCAM: Network Search Engine (NSE) on SRAM channel 0
    • Each NPU has a separate LA-1 interface
    • Part number: IDT75K72234 (18 Mb TCAM)
  • Rear Transition Module (RTM)
    • Connects via ATCA Zone 3
    • 10 1 GE physical interfaces
    • Supports fiber or copper interfaces using SFP modules

  4. Hardware [photos: ATCA chassis, NP blade, RTM]

  5. NP Blades

  6. Router Block Diagram
  [figure: on each of NPU A and NPU B, a pipeline of Rx → Parse → Lookup → Header Format → QM → Tx, with an XScale on each NPU for control and a shared TCAM]

  7. NP-Based Router Design
  [figure: pipeline of Rx (1-2 MEs) → Parse & Key Extract (1 ME) → Lookup (1 ME) → Hdr Fmt (1 ME) → QM enq/deq (1 ME) → Tx (1-2 MEs), plus plugin MEs; NN rings along the main path, scratch or SRAM rings to and from the plugins]
  • My re-drawing of JST's design:
  • Add a Parse and Key Extract block (Parse and Key Ext)
  • Add a Header Format block (Hdr Fmt)
  • Change the plugin return path to go to Parse and Key Extract instead of Lookup
  • Add ME estimates
    • Rx and Tx may be 1 or 2 MEs each; if we are only targeting 5 ports, they may be 1 each
  • Add designations of Next Neighbor (NN), scratch and SRAM rings
    • Use NN rings along all of the non-plugin path
    • My understanding is that multiple MEs can read from or write to the same scratch or SRAM ring (see the ring sketch below)
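The ring handoff between stages can be modeled in plain C. The sketch below is a minimal host-C model of the producer/consumer pattern the rings provide, assuming single-producer/single-consumer use; on the IXP2850 the scratch and SRAM ring get/put operations are atomic hardware primitives with their own microengine interface, so the names, sizes, and retry policy here are illustrative only.

```c
/* Host-C model of a ring between pipeline stages (e.g. Parse -> Lookup).
 * On the IXP2850 these are hardware scratch/SRAM rings; this sketch only
 * illustrates the handoff pattern, not the microengine API. */
#include <stdint.h>
#include <stdbool.h>

#define RING_SLOTS 128                  /* must be a power of two */

struct ring {
    uint32_t slot[RING_SLOTS];          /* each slot carries a buffer handle */
    uint32_t head;                      /* next slot to consume */
    uint32_t tail;                      /* next slot to fill    */
};

/* Producer side: a stage hands a buffer handle downstream. */
static bool ring_put(struct ring *r, uint32_t buf_handle)
{
    if (r->tail - r->head == RING_SLOTS)
        return false;                   /* ring full: caller must retry */
    r->slot[r->tail & (RING_SLOTS - 1)] = buf_handle;
    r->tail++;
    return true;
}

/* Consumer side: the next stage pulls the next packet to process. */
static bool ring_get(struct ring *r, uint32_t *buf_handle)
{
    if (r->head == r->tail)
        return false;                   /* ring empty */
    *buf_handle = r->slot[r->head & (RING_SLOTS - 1)];
    r->head++;
    return true;
}
```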

  8. ONL Router Functional Blocks
  [figure: pipeline Rx → Parse → Lookup → Hdr Format → QM → Tx, with the buffer descriptor fields: Buffer_Next, Buffer_Size, Offset, Packet_Size, Free_List, MR_ID, TxMI, VLAN, Packet_Next]
  • Let's look at:
    • What data passes from block to block
    • Which blocks touch the buffer descriptor (a struct sketch follows below)
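To make the descriptor concrete, here is a hedged C sketch: the field list comes from this slide and the 32-byte total from the memory-usage slide, but the individual field widths, their order, and the padding are assumptions for illustration, not the actual ONL layout.

```c
/* Sketch of the 32-byte buffer descriptor.  Field names are from the
 * slide; widths and ordering are assumed for illustration. */
#include <stdint.h>

struct buf_desc {
    uint32_t buffer_next;   /* SRAM address of the next buffer in a chain */
    uint32_t packet_next;   /* next packet, e.g. on a queue or free list  */
    uint16_t buffer_size;   /* bytes of data in this buffer               */
    uint16_t offset;        /* start of packet data within the buffer     */
    uint16_t packet_size;   /* total packet length                        */
    uint16_t free_list;     /* free list this buffer belongs to           */
    uint16_t mr_id;         /* MR ID (16b per the Parse slide)            */
    uint16_t tx_mi;         /* transmit meta-interface                    */
    uint16_t vlan;          /* VLAN tag                                   */
    uint16_t reserved[5];   /* pad to the 32-byte descriptor size         */
};
_Static_assert(sizeof(struct buf_desc) == 32, "descriptor must be 32 B");
```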

  9. ONL Router Functional Blocks: Rx
  [figure: RBUF feeding Rx; Rx passes Buf Handle (32b) and InPort (8b) down the pipeline toward Parse; buffer descriptor as on slide 8]
  • Rx
    • Function: coordinate transfer of packets from RBUF to DRAM
    • Notes:
      • This should be almost, if not exactly, the same version as in the techX implementation.
      • We'll pass the buffer handle, which contains the SRAM address of the buffer descriptor.
      • From the SRAM address of the descriptor we can calculate the DRAM address of the buffer data (sketched below).
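The note above says the DRAM buffer address is derived from the descriptor's SRAM address. A minimal sketch of that mapping, assuming the 32 B descriptor and 2 KB buffer sizes from the memory-usage slides and hypothetical region base addresses:

```c
/* Hypothetical descriptor-address to buffer-address mapping.  The base
 * addresses are placeholders; only the 32 B and 2 KB sizes come from
 * the memory-usage slides. */
#include <stdint.h>

#define DESC_BASE_SRAM 0x00000000u  /* assumed descriptor region base    */
#define BUF_BASE_DRAM  0x00000000u  /* assumed packet-buffer region base */
#define DESC_SIZE      32u          /* 32 B per buffer descriptor        */
#define BUF_SIZE       2048u        /* 2 KB per packet buffer            */

static inline uint32_t buf_dram_addr(uint32_t desc_sram_addr)
{
    uint32_t index = (desc_sram_addr - DESC_BASE_SRAM) / DESC_SIZE;
    return BUF_BASE_DRAM + index * BUF_SIZE;  /* descriptor i <-> buffer i */
}
```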

  10. ONL Router Functional Blocks: Parse
  [figure: Parse receives Buf Handle (32b) and InPort (8b) from Rx and emits Buf Handle (32b), Buffer Offset (16b) and a 148-bit lookup key built from DAddr (32b), SAddr (32b), Protocol (8b), DPort (16b), SPort (16b), TCP_Flags (12b), MI/Port (16b) and MR ID (16b)]
  • Parse
    • Function:
      • IPv4 header processing
      • Generate the IPv4 lookup key from the packet (see the packing sketch below)
    • Notes:
      • This should be almost, if not exactly, the version we use in the IPv4 MR.
      • Can Parse adjust the buffer/packet size and offset?
      • Can Parse do something like terminate a tunnel and strip off an outer header?
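The listed field widths sum to exactly 148 bits (32 + 32 + 8 + 16 + 16 + 12 + 16 + 16), so the key fits in five 32-bit words with 12 bits to spare. The sketch below shows one possible packing; the field order within the key is an assumption, only the fields and widths come from the slide.

```c
/* Sketch of packing the 148-bit TCAM lookup key into 5 x 32-bit words.
 * Field order is assumed; widths are from the slide. */
#include <stdint.h>

struct lookup_key {
    uint32_t w[5];                       /* 148 bits used, 12 bits pad */
};

static struct lookup_key make_key(uint32_t daddr, uint32_t saddr,
                                  uint8_t proto, uint16_t dport,
                                  uint16_t sport,
                                  uint16_t tcp_flags,  /* low 12 bits used */
                                  uint16_t mi_port, uint16_t mr_id)
{
    struct lookup_key k;
    k.w[0] = daddr;                                    /* DAddr (32b)    */
    k.w[1] = saddr;                                    /* SAddr (32b)    */
    k.w[2] = ((uint32_t)proto << 24)                   /* Protocol (8b)  */
           | ((uint32_t)dport << 8)                    /* DPort (16b)    */
           | (sport >> 8);                             /* SPort, high 8  */
    k.w[3] = ((uint32_t)(sport & 0xff) << 24)          /* SPort, low 8   */
           | ((uint32_t)(tcp_flags & 0xfff) << 12)     /* TCP_Flags (12b)*/
           | (mi_port >> 4);                           /* MI/Port, hi 12 */
    k.w[4] = ((uint32_t)(mi_port & 0xf) << 28)         /* MI/Port, low 4 */
           | ((uint32_t)mr_id << 12);                  /* MR ID (16b)    */
    return k;                     /* low 12 bits of w[4] are padding */
}
```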

  11. ONL Router Functional Blocks: Lookup
  [figure: Lookup receives Buf Handle (32b), Buffer Offset (16b), Lookup Key (148b) and InPort (8b), and emits Buf Handle, Buffer Offset and a 53-bit lookup result: OutPort (8b), Output MI (16b), QID (20b), Priority (8b), Drop (1b)]
  • Lookup
    • Function: perform a lookup in the TCAM based on the lookup key; result fields as above (an unpacking sketch follows below)
    • Notes:
      • This should be almost, if not exactly, the version we use in the IPv4 MR.
      • Needs to handle primary/secondary filters (primary == exclusive, secondary == non-exclusive).
      • How do sample apps that do multicast handle multiple copies?
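The result fields likewise sum to exactly 53 bits (8 + 16 + 20 + 8 + 1). The sketch below unpacks such a result from a raw value; the widths are from the slide, but the bit positions are assumptions.

```c
/* Sketch of unpacking the 53-bit lookup result; bit layout assumed. */
#include <stdint.h>

struct lookup_result {
    uint8_t  out_port;    /* OutPort (8b)    */
    uint16_t output_mi;   /* Output MI (16b) */
    uint32_t qid;         /* QID (20b)       */
    uint8_t  priority;    /* Priority (8b)   */
    uint8_t  drop;        /* Drop (1b)       */
};

static struct lookup_result unpack_result(uint64_t raw) /* low 53 bits valid */
{
    struct lookup_result r;
    r.drop      =  raw         & 0x1;
    r.priority  = (raw >> 1)   & 0xff;
    r.qid       = (raw >> 9)   & 0xfffff;
    r.output_mi = (raw >> 29)  & 0xffff;
    r.out_port  = (raw >> 45)  & 0xff;
    return r;
}
```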

  12. ONL Router Functional Blocks: Header Format
  [figure: Header Format receives Buf Handle (32b), Buffer Offset (16b) and Lookup Result (53b), and passes Buffer Handle (32b), QID (16b), Size (16b) and OutPort (8b) to the QM; buffer descriptor as on slide 8]
  • Header Format
    • Function:
      • IPv4 packet header formatting (see the sketch below)
      • IPv4 lookup result processing: Drop and Miss bits; extract QID and port
    • Notes:
      • This should be almost, if not exactly, the version we use in the IPv4 MR.
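As one concrete example of the "IPv4 packet header formatting" step, the sketch below decrements the TTL and updates the header checksum incrementally, RFC 1141 style. This is generic illustrative C, not the ONL microengine code, and it assumes the checksum word is handled consistently in network byte order.

```c
/* Illustrative forwarding-path header rewrite: TTL decrement plus
 * incremental one's-complement checksum update (RFC 1141). */
#include <stdint.h>

struct ipv4_hdr {
    uint8_t  ver_ihl, tos;
    uint16_t total_len, id, frag_off;
    uint8_t  ttl, proto;
    uint16_t checksum;         /* kept in network byte order here */
    uint32_t saddr, daddr;
};

static void decrement_ttl(struct ipv4_hdr *ip)
{
    uint32_t sum;
    ip->ttl--;
    /* TTL is the high byte of its 16-bit checksum word, so that word
     * drops by 0x0100; add 0x0100 back into the stored complement and
     * fold the end-around carry. */
    sum = ip->checksum + 0x0100;
    ip->checksum = (uint16_t)(sum + (sum >> 16));
}
```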

  13. ONL Router Functional Blocks: QM
  [figure: QM receives Buffer Handle (32b), QID (16b), Size (16b) and OutPort (8b) from Header Format and passes Buf Handle (32b) to Tx; buffer descriptor as on slide 8]
  • QM
    • Function: queue management (an enqueue sketch follows below)
    • Notes:
      • This should be almost, if not exactly, the same version as in the techX implementation.
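A minimal sketch of the enqueue half of queue management follows. The QLen/Weight/Threshold fields mirror the "QM Queue Data" records on the memory-usage slide; the tail-drop policy and the link() callback for chaining buffer descriptors are assumptions.

```c
/* Sketch of QM enqueue: append a buffer handle to the queue chosen by
 * the QID from Header Format.  Drop policy and chaining are assumed. */
#include <stdint.h>
#include <stdbool.h>

struct qm_queue {
    uint32_t qlen;         /* current length in packets          */
    uint32_t weight;       /* scheduling weight                  */
    uint32_t threshold;    /* drop threshold in packets          */
    uint32_t head, tail;   /* buffer handles at the chain's ends */
};

static bool qm_enqueue(struct qm_queue *q, uint32_t buf_handle,
                       void (*link)(uint32_t tail, uint32_t next))
{
    if (q->qlen >= q->threshold)
        return false;               /* tail drop when over threshold */
    if (q->qlen == 0)
        q->head = buf_handle;       /* first packet becomes the head */
    else
        link(q->tail, buf_handle);  /* write Packet_Next in tail desc */
    q->tail = buf_handle;
    q->qlen++;
    return true;
}
```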

  14. ONL Router Functional Blocks: Tx
  [figure: Tx receives Buffer Handle (32b) from the QM and writes to the TBUF; buffer descriptor as on slide 8]
  • Tx
    • Function: coordinate transfer of packets from DRAM to TBUF
    • Notes:
      • This should be almost, if not exactly, the same version as in the techX implementation.

  15. JST: Network Configuration Switches
  [figure: five 48-port GE configuration switches connecting hosts, NPUs (x10) and WUGS-20s (x4)]
  • No blocking possible (I think).
  • 20 spare ports per configuration switch: enough capacity for twice as many NPUs plus hosts.
  • $1500/switch, so we can buy 6 for $9K, giving us a spare.

  16. JDD: Network Configuration Switches
  [figure: three 48-port edge GE switches (connecting hosts, NPUs and WUGS-20s) trunked via bundles of 2 to 16 links to two 48-port level-1 interconnect switches]

  17. JDD: Network Configuration Switches
  [figure: six 48-port edge GE switches, each trunked by 8 links to two 48-port interconnect switches; NP blades, WUGS NSPs and GE hosts hang off the edge switches]
  • 8 NP blades, 4 WUGS NSPs, 8 48-port GE switches
  • 128 hosts

  18. Network Configuration Switches
  [figure: five 48-port edge GE switches trunked to two 48-port interconnect switches; NP blades, WUGS NSPs and GE hosts hang off the edge switches]
  • 5 NP blades, 4 WUGS NSPs, 7 48-port GE switches
  • 94 hosts

  19. Notes on JDD: Configuration Switches
  • Routers would be assigned hosts only from those connected to their edge switch.
    • This limits the availability of the 3-host GE clusters to the WUGS-20 routers only.
    • If we have extra hosts, we might want to put some 3-host GE clusters on the NPU side.
  • 1 NPU blade can replace 2 WUGS-20 NSPs.
  • If we need to grow beyond this configuration:
    • We can add level-1 interconnection switches 3 and 4, and then connect all 4 level-1 interconnection switches through a 96-port level-2 interconnection switch, with 24 ports to each of the level-1 switches.
    • We can add a chassis-based switch which can be expanded as needed.

  20. Extra
  • The slides that follow are templates and extra information for use as needed.

  21. Text Slide Template

  22. Image Slide Template

  23. OLD • The rest of these are old slides that should be deleted at some point.

  24. Notes on Memory Usage
  [figure: the scheduling data structure, a linked list of segments; each segment holds BatchSize slots of QID (20b) + Credit (32b), plus an SRAM address (32b) and a next pointer (32b)]
  • 3 SRAM channels of 8 MB each per NP
  • 3 RDRAM channels of 256 MB each, i.e. 768 MB per NP
    • The XScale uses the RDRAM; there is no separate DRAM for the XScale!
  • 640 32-bit words of local memory per microengine
  • Parameters:
    • N: max number of packets in the system at any given time
    • M: max number of queues that need to be supported
    • BatchSize: number of slots in scheduling data structure segments (8 for now)
  • Data structures stored in SRAM:
    • Buffer descriptors: 32 bytes each; number needed: N
    • IXP queue descriptors: 16 bytes each; number needed: M
    • QM queue data (QLen, Weight, Threshold): 12 bytes each; number needed: M
    • Scheduling data structure segments: BatchSize x 8 B per slot + 4 B (address) + 4 B (pointer to next), i.e. 72 bytes each; number needed: (M/8) + x, where x is the number of extra/spare segments needed to operate the algorithm (x <= 10 is probably sufficient); a struct sketch follows below
  • Data stored in DRAM:
    • Packet buffers: 2 KB each
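As a sketch, the scheduling segment described above can be written as a C struct whose size comes out to the 72 bytes used in the next slide's arithmetic; packing each QID+Credit slot into 8 bytes is an assumption consistent with that total.

```c
/* Sketch of one scheduling data-structure segment: BatchSize (8) slots
 * plus a 4 B SRAM address and a 4 B next pointer = 72 bytes. */
#include <stdint.h>

struct sched_slot {
    uint32_t qid;                  /* QID, 20 bits used    */
    uint32_t credit;               /* Credit (32b)         */
};

struct sched_segment {
    struct sched_slot slot[8];     /* BatchSize = 8        */
    uint32_t sram_addr;            /* 4 B address          */
    uint32_t next;                 /* 4 B next-segment ptr */
};
_Static_assert(sizeof(struct sched_segment) == 72, "segment must be 72 B");
```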

  25. Notes on Memory Usage
  • 1 SRAM channel for IXP queue descriptors, QM queue data and the scheduling data structure:
    • 16*M + 12*M + ((M/8)+10)*72 <= 8 MB (0x800000 = 8388608)
    • 28*M + 9*M + 720 <= 8388608
    • 37*M <= 8387888
    • M <= 226699
    • So let's say we will support 128K queues (131072); 17 bits of QID gives us a range of 0 - 131071 (the arithmetic is checked in the sketch below).
  • 1 SRAM channel for buffer descriptors:
    • 32*N <= 8 MB (0x800000 = 8388608)
    • N <= 262144 (0x40000)
    • Max of 256K packets in the system
  • 1 SRAM channel still free
    • On the NPE this would be used for the MR-specific data region, MR configuration data, etc.
  • DRAM usage for 256K packets:
    • 256K * 2 KB per buffer = 512 MB, out of 768 MB available
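The sizing arithmetic above is easy to check mechanically; this short C program reproduces the M, N and DRAM figures from the slide.

```c
/* Recompute the SRAM/DRAM budget numbers from this slide. */
#include <stdio.h>

int main(void)
{
    const long channel = 8L * 1024 * 1024;      /* one 8 MB SRAM channel */
    long m = (channel - 10 * 72) / 37;          /* 37*M + 720 <= 8388608 */
    long n = channel / 32;                      /* 32 B per descriptor   */

    printf("max queues  M = %ld\n", m);         /* 226699                */
    printf("max packets N = %ld\n", n);         /* 262144 (256K)         */
    printf("DRAM for 256K buffers = %ld MB\n",
           (256L * 1024 * 2048) / (1024 * 1024));  /* 512 of 768 MB      */
    return 0;
}
```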

  26. Core Components (Sample App)
  [figure: XScale and microengine core components]
