
EaseCAM : An Energy And Storage Efficient TCAM-based IP-Lookup Architecture



  1. EaseCAM: An Energy And Storage Efficient TCAM-based IP-Lookup Architecture V.C. Ravikumar, Rabi Mahapatra Texas A&M University; Laxmi Bhuyan University of California, Riverside

  2. Overview • Introduction • Research Goal • Proposed approach • Results • Conclusion & Future work

  3. Introduction • [Figure: router datapath — packet headers are processed, the IP lookup engine matches the destination IP address against the routing table (in DRAM) to obtain the next hop, and packets wait in the packet queue]

  4. Introduction • HW and SW solutions for IP lookup • Software solutions are unable to match link speeds • Hardware solutions can accommodate today’s link speeds • TCAMs are the most popular hardware device • They consume up to 15 W per chip (4-8 chips) • Increased cooling costs and fewer ports

  5. Current Approach • Power Reduction in TCAM • Partitioning of TCAM Array [Infocom’03, Hot Interconnect’02] • Compaction (minimization) [Micro’02] • Update techniques [Micro’02] • Routing update • TCAM updates

  6. Bottleneck with existing approaches • Power reduction • Number of entries enabled is not bounded • Does not avoid storing redundant information • Update • Minimization techniques are not incremental • Update time is not independent of routing table size

  7. Motivation • Solution for bounded and reduced power consumption • Truly incremental Routing and TCAM update

  8. Contributions • A pipelined architecture for IP Lookup • New prefix properties (prefix aggregation and prefix expansion) • Upper bound on number of entries enabled (256 x 3) • Novel Page filling, memory management and incremental update techniques

  9. Solution: Prefix properties • Prefix Aggregation • Example: 128.194.1.1/32, 128.194.1.2/32, 128.194.1.8/30, 128.194.1.16/28 aggregate under 128.194.1.0/24 • 128.194.1.0/24 is the LCS for the given set of prefixes (rounded to the nearest octet) • Prefixes aggregated based on the LCS mostly have the same next hop • Gives a bound on the number of prefixes minimized (256)
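
A minimal sketch of the aggregation step, assuming Python, a bit-string representation of prefixes, and illustrative function names (not the paper's implementation):

```python
# Illustrative prefix aggregation: find the longest common substring (LCS) of a
# group of prefixes, rounded down to the nearest octet, as in the slide example.
# Prefixes are given as (dotted-quad, length) pairs, e.g. 128.194.1.16/28.

def to_bits(addr: str, length: int) -> str:
    """Convert a dotted-quad prefix to its significant bit string."""
    value = 0
    for octet in addr.split("."):
        value = (value << 8) | int(octet)
    return format(value, "032b")[:length]

def lcs_cube(prefixes):
    """Longest common prefix of all entries, rounded down to an octet boundary."""
    bits = [to_bits(a, l) for a, l in prefixes]
    common = 0
    for chars in zip(*bits):            # walk bit positions shared by all prefixes
        if len(set(chars)) > 1:
            break
        common += 1
    common = (common // 8) * 8          # round down to the nearest octet
    return bits[0][:common]             # this cube covers every prefix in the group

group = [("128.194.1.1", 32), ("128.194.1.2", 32),
         ("128.194.1.8", 30), ("128.194.1.16", 28)]
print(lcs_cube(group))                  # prints the 24-bit cube for 128.194.1.0/24
```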

  10. Solution: Prefix properties • TABLE I. Comparison of prefix compaction using the prefix aggregation property and Espresso-II for the attcanada and bbnplanet routers

  11. Solution: Prefix Properties • Prefix expansion • Prefixes having the same length can be minimized together • To increase minimization, extend prefixes of different lengths to the nearest octet by adding don’t-cares • Extending to the nearest octet is also useful for incremental update • Example: 100101 → 100101XX, 1011011 → 1011011X, 1011111 → 1011111X, 1011 → 1011XXXX
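
A short sketch of the expansion step under the same bit-string assumption (the function name is illustrative):

```python
# Illustrative prefix expansion: pad a prefix with don't-cares ('X') up to the
# nearest octet boundary so that prefixes of nearby lengths can be minimized
# together, reproducing the example on the slide.

def expand_to_octet(prefix: str) -> str:
    """Extend a bit-string prefix to the next multiple of 8 with don't-cares."""
    target = ((len(prefix) + 7) // 8) * 8   # nearest octet boundary >= length
    return prefix + "X" * (target - len(prefix))

for p in ["100101", "1011011", "1011111", "1011"]:
    print(p.ljust(8), "->", expand_to_octet(p))
# 100101   -> 100101XX
# 1011011  -> 1011011X
# 1011111  -> 1011111X
# 1011     -> 1011XXXX
```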

  12. Solution: Prefix properties • Overlapping prefixes • Prefixes of length < 8 are not present in the routing table • Hence the number of prefixes matching an IP address is ≤ 25 (at most one per length from 8 to 32) • This property is used to selectively enable a bounded number of TCAM entries (256 × 3)
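
A tiny sketch of where the bound comes from, enumerating the candidate prefixes of one address (illustrative only):

```python
# Illustrative: with no prefixes shorter than 8 bits, an IP address has at most
# 32 - 8 + 1 = 25 candidate matching prefixes (at most one per length 8..32).

def candidate_prefixes(addr_bits: str):
    """All prefixes of the address that could appear in the routing table."""
    return [addr_bits[:length] for length in range(8, 33)]

addr = format(0x80C20101, "032b")        # 128.194.1.1 as a 32-bit string
print(len(candidate_prefixes(addr)))     # 25
```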

  13. Solution: Architecture • [Figure: segmented architecture for routing lookup using TCAM — a 1st level indexed by the first w1 = 8 bits (variable-sized segments 1.x, 2.x, …, 127.x, 128.x, …, 254.x, 255.x) and a 2nd level holding the remaining 24 bits] • Two-level architecture: w1 bits in the 1st level and 32 − w1 bits in the 2nd level • The segment size corresponding to the first w1 (= 8) bits is variable • Power is bounded by the segment size
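
A minimal sketch of the two-level lookup idea (data layout and names are illustrative, not the hardware design): the first w1 = 8 bits select a segment, and only that segment is searched in the second level.

```python
# Illustrative two-level lookup: the first w1 = 8 bits of the address pick a
# segment, and only that segment's prefixes are searched (longest match wins).
# In EaseCAM the selected segment corresponds to the power-enabled TCAM region.

W1 = 8

def build_segments(prefixes):
    """Group (bit-string, nexthop) prefixes longer than w1 by their first octet."""
    segments = {}
    for bits, nexthop in prefixes:
        segments.setdefault(bits[:W1], []).append((bits, nexthop))
    return segments

def lookup(segments, addr_bits):
    """Search only the segment selected by the first w1 bits; return best next hop."""
    best = None
    for bits, nexthop in segments.get(addr_bits[:W1], []):
        if addr_bits.startswith(bits) and (best is None or len(bits) > len(best[0])):
            best = (bits, nexthop)
    return best[1] if best else None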

  14. Solution: Architecture • Memory Compaction • Apply the prefix properties to remove redundancies • Apply pruning, prefix aggregation and minimization in succession • Put all prefixes of length < w1 into a bucket (rarely occurring prefixes) • [Table: total number of entries after compaction]
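
A hedged sketch of the pruning pass, taking "pruning" in its usual compaction sense (drop a prefix whose nearest covering prefix has the same next hop); this is an assumption about the step, not the paper's exact procedure:

```python
# Illustrative pruning: a prefix can be removed if its nearest (longest) ancestor
# prefix in the table maps to the same next hop, since removing it does not
# change any lookup result. Prefixes are (bit-string, nexthop) pairs.

def prune(prefixes):
    table = {bits: nh for bits, nh in prefixes}
    kept = []
    for bits, nexthop in prefixes:
        parent_nh = None
        for l in range(len(bits) - 1, -1, -1):   # search longest ancestor first
            if bits[:l] in table:
                parent_nh = table[bits[:l]]
                break
        if parent_nh != nexthop:                  # keep only if removal would change lookups
            kept.append((bits, nexthop))
    return kept
```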

  15. Solution: Architecture • Paged TCAM architecture • Group the prefixes of length > w1 based on their LCS • The LCS values (cubes) cover the prefixes • The cubes now correspond to the page IDs • Prefixes covered by a cube are stored in the actual pages • Pages formed using the LCS as page ID can result in under-utilization

  16. Architecture Block Diagram • [Figure: the 32-bit IP address is compared against page-table comparators holding page IDs; the matching page table enables one page of (32 − w1)-bit entries (pages I … I + Cmax), while prefixes of length < w1 are held in a bucket of N·α entries] • Pages formed using the LCS as page ID can result in under-utilization

  17. How to avoid under-utilization? • LCS aggregation • Aggregate prefixes having different LCS by modifying the cube (example: cubes 100 and 101 aggregate into 10*) • Set the page size to an optimal value – avoid pages that are too large or too small • Observe: the maximum size of a page can be 256, based on the above property
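
A small sketch of the aggregation in the example above: two cubes that differ in exactly one bit merge into one cube with a don't-care in that position (a standard logic-minimization step; the function name is illustrative).

```python
# Illustrative cube aggregation: merge two cubes that differ in exactly one bit
# into a single cube with a don't-care, e.g. 100 and 101 -> 10X (i.e. 10*).

def merge_cubes(a: str, b: str):
    if len(a) != len(b):
        return None
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) != 1:
        return None                      # not mergeable into a single cube
    i = diff[0]
    return a[:i] + "X" + a[i + 1:]

print(merge_cubes("100", "101"))         # 10X
```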

  18. Solution: Page Filling Algorithm • Page filling heuristics (2) • Generate cubes such that each covers the maximum number of prefixes while keeping the page size < 256 • Aggregate the page IDs in the page tables and store them in comparators for a 0th-level lookup • Find the total memory consumed (pages, page tables and comparators) for different values of w1 • Get the optimal value of w1 and page size β for which total memory is the least

  19. Solution: Page Filling • The page filling heuristics ensure that: • No page has more than β·γ entries, where γ is the page fill factor • The number of cubes that cover all the prefixes is minimal • Total memory consumption is the least for a specific value of w1 and β
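
A greedy sketch consistent with these constraints (the β and γ values are placeholders, and the paper's actual heuristic may differ; cube aggregation as in the earlier sketch would be applied first to merge small groups):

```python
# Illustrative greedy page filling: walk prefixes grouped by cube and close a
# page whenever it would exceed beta * gamma entries. This captures the
# constraints on the slide, not the paper's exact heuristic.

BETA, GAMMA = 256, 0.9                   # page size and fill factor (assumed values)

def fill_pages(groups):
    """groups: dict cube -> list of prefixes. Returns a list of (cube, page)."""
    limit = int(BETA * GAMMA)
    pages = []
    for cube, prefixes in groups.items():
        page = []
        for p in prefixes:
            if len(page) == limit:       # page full: start a new page for this cube
                pages.append((cube, page))
                page = []
            page.append(p)
        if page:
            pages.append((cube, page))
    return pages
```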

  20. Architecture Block Diagram • [Figure: the same block diagram, highlighting the power-enabled blocks in EaseCAM — only a bounded set of page entries (or the bucket) is enabled per lookup]

  21. Solution: Architecture • Bucket • Prefixes of length < w1 are stored in the bucket • The word length of the bucket is 32 • Either the bucket or the pages are searched during each lookup in the 2nd level

  22. Solution: Architecture • Empirical model for memory • α: fraction of total entries in the bucket • αf: bucket fill factor • γ: page fill factor • Cmax: number of page IDs in the page table • N: the total number of entries • Pagemax: total number of pages • βw1: the optimal page size • Minimum memory requirement = βw1 · Pagemax · (32 − w1)/32 + Pagemax + Pagemax/Cmax + N·α/αf
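
A direct transcription of the model as a sketch; every parameter value below is a placeholder for illustration only, not a measurement from the paper:

```python
# Illustrative evaluation of the empirical memory model from the slide:
#   M_min = beta_w1 * Page_max * (32 - w1)/32 + Page_max
#           + Page_max / C_max + N * alpha / alpha_f

def min_memory(beta_w1, page_max, w1, c_max, n, alpha, alpha_f):
    return (beta_w1 * page_max * (32 - w1) / 32    # prefix storage in pages
            + page_max                              # page-table entries
            + page_max / c_max                      # comparator entries
            + n * alpha / alpha_f)                  # bucket for short prefixes

# Placeholder parameters (assumptions, not the paper's figures):
print(min_memory(beta_w1=256, page_max=400, w1=8,
                 c_max=16, n=100_000, alpha=0.01, alpha_f=0.5))
```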

  23. Incremental Updates • 100s updates/sec and 10 updates/sec after routing flaps • Insertion • If the length of the prefix > w1: • Minimize the prefix and find the new cube • The number of prefixes minimized is < 256 • Update the page table and comparator if required • Update the TCAM with the changed entries • Both the TCAM insertion time and the minimization time are bounded

  24. Solution: Incremental Update • Deletion • Delete the prefix from the TCAM • Update the page table entry and comparator if required • The total number of prefixes minimized is < 256 • The TCAM update time is also bounded
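
A hedged sketch of both operations against the paged structure sketched earlier (helper names and the flat list representation are illustrative, not the paper's data structures):

```python
# Illustrative incremental update against a paged table: because a cube covers at
# most 256 prefixes, re-minimizing the affected page is bounded work.

def insert(pages, page_table, prefix, nexthop, w1=8):
    cube = prefix[:w1]                       # cube that should cover the prefix
    page = page_table.get(cube)
    if page is None:                         # no covering cube yet: open a new page
        page = []
        page_table[cube] = page
        pages.append(page)
    page.append((prefix, nexthop))           # re-minimization of <=256 entries would
                                             # run here (bounded time)

def delete(page_table, prefix, w1=8):
    page = page_table.get(prefix[:w1])
    if page is not None:                     # drop the entry; page table/comparator
        page[:] = [(p, nh) for p, nh in page if p != prefix]   # updated if needed
```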

  25. Solution: Incremental Update • [Figure: comparison of incremental update time]

  26. Solution: Memory Management • Managing page overflow • Cause: a low value of γ • Pages with the same cube are recomputed • Free pages available in the TCAM are used • Comparators are also updated when required

  27. Results • [Figures: power consumption per lookup for the bbnplanet and attcanada routers]

  28. Results • Case study • Memory requirements (γ = 1 and α = 1) • [Figure: reduction in memory requirements]

  29. Results: Access time • Pre-estimation using Cacti 3.0 on a CAM structure • [Figure: reduction in access time]

  30. Results: Power • Pre-estimation using Cacti 3.0 on a CAM structure • [Figure: reduction in power]

  31. Conclusion • Significant reduction in memory consumption based on prefix compaction • Pipelined architecture to store prefixes to achieve bounded power consumption • Efficient memory management and incremental update techniques

  32. Future work • Apply Cacti model to TCAM structure • Identify/design low-power TCAM cell • Consider classification together with IP-lookup • Fast on-chip logic minimization • Explore parallel architectures & algorithms for IP processing.

  33. Thank You !! Questions?
