
FPGA Based String Matching for Network Processing Applications

Janardhan Singaraju, John A. Chandy. ENGG*3050 RCS Winter 2014, March 24, 2014. Presented by: Justin Riseborough, Albert Tirtariyadi.


Presentation Transcript


  1. FPGA Based String Matching for Network Processing Applications • Janardhan Singaraju, John A. Chandy • ENGG*3050 RCS Winter 2014 • March 24, 2014 • Presented by: Justin Riseborough, Albert Tirtariyadi

  2. Content • Introduction • String Lookup Cache (Architectures, System Interaction, Systems Comparison) • Network Intrusion Detection (Architectures, System Interaction, Implementations) • Critique

  3. Keywords • Network processing • String matching • Content Addressable Memory (CAM) & Cache • Bottlenecks • Fixed-Size/Non-Fixed-Size keys • Cascading, propagating • Parallelism

  4. Introduction • String matching is used in search engines and in network intrusion detection • Network processing applications require frequent string matching for specific keywords • As networks get faster, it becomes more difficult for general-purpose processors (GPPs) to keep up • Bottlenecks are found in memory and in slow algorithms and implementation methods

  5. Current Implementations • Software algorithms • Rabin-Karp: compares hashes of inputs instead of direct character matching • Knuth-Morris-Pratt: character-by-character matching; skips positions already known not to match • Boyer-Moore: uses pre-computed functions to determine shifting distance • Hardware implementations • Finite automata methods: translate finite automata graphs to FPGA circuitry • CAMs • Caches and lookup tables • Cellular automata • Finite state machines
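
To make the Rabin-Karp bullet concrete, here is a minimal Python sketch of the idea: a rolling hash of each text window is compared with the pattern hash, and characters are compared only when the hashes agree. The function name, the base, and the modulus are illustrative assumptions and do not come from the paper.

```python
def rabin_karp(text: str, pattern: str, base: int = 256, mod: int = 1_000_003) -> list[int]:
    """Return every index where pattern occurs in text (illustrative sketch)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the leading character in a window
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    hits = []
    for i in range(n - m + 1):
        # Hashes are compared first; characters are checked only on a hash hit.
        if p_hash == w_hash and text[i:i + m] == pattern:
            hits.append(i)
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            w_hash = ((w_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return hits


print(rabin_karp("AFOOBARCD", "FOO"))
```

The explicit character check after a hash hit guards against hash collisions, which is why the hash comparison alone is not treated as a match.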

  6. Section I String Lookup Cache

  7. String Lookup Cache • Hardware implementation based on CAMs, cellular automata and caching • Caches retain frequently used values, reducing the need to constantly look up address values • Compatible with parallel processing, prefix sharing and pattern partitioning • Very high throughput with low area overhead • Drawback of CAMs and hardware caches is the reliance on fixed-size keys • Implementations for non-fixed-size keys require additional overhead

  8. System Architecture

  9. Content Addressable Memory • Hardware implementation of 2D [associative] arrays/ADT • In VLSI, the cells are transistors • In an FPGA, storage cells are registers, comparators are XOR gates

  10. CAM as Character Match Array (CMA) • Takes characters from the network processor on successive clock cycles • Each column corresponds to one character of a keyword • The input character is applied simultaneously to all n columns • A column's match signal goes high if all input bits match the stored bits • A storage cell is used to indicate the end of a keyword
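
The column behaviour described on this slide can be mimicked in software. The following Python sketch is only an analogy for the hardware, assuming one stored character per column and an XOR-style bit comparison; the names `column_match`, `cma_match_signals`, `cells`, and `end_of_keyword` are hypothetical.

```python
def column_match(stored: str, incoming: str) -> bool:
    """A column matches when every bit agrees, i.e. the XOR of the codes is zero."""
    return (ord(stored) ^ ord(incoming)) == 0


def cma_match_signals(cells: list[str], ch: str) -> list[bool]:
    """Apply one input character to all columns at once and collect match signals."""
    return [column_match(cell, ch) for cell in cells]


# Assumed layout: two keywords, FOO and BAR, stored back to back.
cells = list("FOOBAR")
end_of_keyword = [False, False, True, False, False, True]  # marks each last column

for ch in "FOO":  # one input character per clock cycle
    print(ch, cma_match_signals(cells, ch))
```

Note that the character `O` raises the match signal on both `O` columns at once; deciding which of those matches is meaningful is left to the logic that consumes the signals.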

  11. Processor Element (PE) Array • An array of finite state machines that carries out the approximate match algorithm • May contain multiple keywords from the CAM • Takes the match signals from the CAM and sets a PE flag, which is forwarded to subsequent PEs • Evaluates entire input strings in time linear in the size of the input stream
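
As a rough software model of flag forwarding, the Python sketch below handles exact matching only (the approximate match algorithm mentioned on the slide is not modelled). It assumes one PE per stored character, a `starts` marker on the first PE of each keyword, and an `ends` marker on the last; all of these names are hypothetical.

```python
def pe_step(cells: list[str], starts: list[bool], prev_flags: list[bool], ch: str) -> list[bool]:
    """One clock cycle: each PE combines its column match with the previous PE's flag."""
    new_flags = []
    for i, cell in enumerate(cells):
        col_match = cell == ch
        # The first PE of a keyword needs no carry; others need the neighbour's flag.
        carry_in = True if starts[i] else prev_flags[i - 1]
        new_flags.append(col_match and carry_in)
    return new_flags


def run(cells: list[str], starts: list[bool], ends: list[bool], stream: str) -> list[tuple[int, int]]:
    """Return (cycle, PE index) pairs where a whole keyword has just matched."""
    flags = [False] * len(cells)
    hits = []
    for t, ch in enumerate(stream):
        flags = pe_step(cells, starts, flags, ch)
        hits += [(t, i) for i, flag in enumerate(flags) if flag and ends[i]]
    return hits


cells = list("FOOBAR")  # keywords FOO and BAR stored back to back (assumed layout)
starts = [True, False, False, True, False, False]
ends = [False, False, True, False, False, True]
print(run(cells, starts, ends, "AFOOBARCD"))
```

Each input character is handled in one pass over the PEs, which in hardware happens in parallel, so the number of cycles grows linearly with the stream length.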

  12. CMA and PE Interaction

  13. Map Table and Outputs • The map table takes the PE# and outputs the address of the value or an indirect pointer to the value object • The map table has as many slots as there are PEs • Long words can leave holes in the map table
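
One plausible reading of the holes, sketched in Python: if there is one slot per PE and only the PE at the end of a keyword ever reports a match, then every other slot for that keyword goes unused. This layout is an assumption made for illustration, and the addresses are made up.

```python
from typing import Optional


def build_map_table(keywords: list[str], values: list[int]) -> list[Optional[int]]:
    """One slot per PE; only the slot of each keyword's last PE holds a value."""
    table: list[Optional[int]] = []
    for word, value in zip(keywords, values):
        table.extend([None] * (len(word) - 1))  # unused slots ("holes")
        table.append(value)
    return table


table = build_map_table(["FOO", "BAR"], [0x1000, 0x2000])  # hypothetical addresses
print(table)
print(hex(table[2]))  # a match reported by PE 2 maps to the first value
```

Under this assumption a keyword of length k occupies k slots but uses only one, so longer keywords waste more of the table.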

  14. System Interaction

  15. Implementations Comparison

  16. Section II Network Intrusion Detection

  17. Network Intrusion Detection • The process of identifying and analyzing packets that may contain threats to the organization’s network • A time-consuming process whose cost grows quickly as the rule set (the set of signatures) grows • String matching is the most computationally intensive part of intrusion detection • Every incoming packet is compared against several pre-defined signatures

  18. Problems in the CAM Architecture • CAM-based designs cannot easily handle regular expressions • NIDS signatures are not of a fixed size • (e.g., the CAM contains FOO and BAR and the input stream is AFOOBARCD. In a 3-character setup, comparisons are made against AFO, OBA and RCD; none of these match, so the stream slips right through the detection system) • CAM arrays are very large in area
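
The FOO/BAR example can be reproduced in a few lines of Python. The sketch contrasts comparison on fixed 3-character boundaries with comparison at every offset; the function names are hypothetical and the code models only the alignment issue, not the CAM hardware.

```python
def aligned_chunks(stream: str, width: int) -> list[str]:
    """Cut the stream on fixed boundaries, as an aligned-keyword design would."""
    return [stream[i:i + width] for i in range(0, len(stream), width)]


def aligned_match(stream: str, signatures: list[str], width: int) -> list[str]:
    """Signatures found when only aligned chunks are compared."""
    return [chunk for chunk in aligned_chunks(stream, width) if chunk in signatures]


def sliding_match(stream: str, signatures: list[str]) -> list[tuple[int, str]]:
    """Signatures found when every starting offset is examined."""
    return [(i, s) for i in range(len(stream)) for s in signatures if stream.startswith(s, i)]


stream, signatures = "AFOOBARCD", ["FOO", "BAR"]
print(aligned_chunks(stream, 3))             # ['AFO', 'OBA', 'RCD']
print(aligned_match(stream, signatures, 3))  # []
print(sliding_match(stream, signatures))     # [(1, 'FOO'), (4, 'BAR')]
```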

  19. Proposed Solution • Use discrete comparators instead of CAMs • This sacrifices the ability to update signatures dynamically, a fair tradeoff because signatures change relatively infrequently • Use p rows of comparators for parallelism, so that several characters are matched in one clock cycle • Remove the aligned-keyword approach, since incoming streams may not be aligned to a fixed size boundary
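
A small Python model can show the effect of p comparator rows: the stream is consumed p characters per simulated clock cycle, and the match state is carried from row to row and from cycle to cycle, so a signature may start at any offset. This is a behavioural sketch for a single signature with hypothetical names, not the circuit from the paper.

```python
def match_p_per_cycle(signature: str, stream: str, p: int) -> tuple[int, list[int]]:
    """Return (cycles used, stream positions where the signature ends)."""
    flags = [False] * len(signature)  # match state carried between rows and cycles
    ends, cycles = [], 0
    for base in range(0, len(stream), p):
        cycles += 1
        for row, ch in enumerate(stream[base:base + p]):  # the p comparator rows
            # Position i matches if its character matches and position i-1 matched
            # on the previous character; position 0 needs no carry.
            flags = [
                (sig_ch == ch) and (i == 0 or flags[i - 1])
                for i, sig_ch in enumerate(signature)
            ]
            if flags[-1]:
                ends.append(base + row)
    return cycles, ends


print(match_p_per_cycle("FOO", "AFOOBARCD", 1))  # 9 cycles
print(match_p_per_cycle("FOO", "AFOOBARCD", 4))  # 3 cycles, same match position
```

The match positions do not depend on p; only the number of cycles changes, which is the point of adding rows.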

  20. System Architecture

  21. Processor Architecture

  22. Processor Architecture

  23. Processor Element Flow • Start at the beginning of the signature • Each decision is based on the previous PE and the current PE • If both the previous signal and the current signal are matches, propagate the match signal toward the end of the signature • At the end of the signature, if the entire signature matches, raise the sig_match output
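
The flow above reduces to a small piece of Boolean logic per PE. The Python sketch below is one way to write it down; `sig_match` and `sig_end` follow the names used on the slides, while `pe`, `char_match`, `prev_match`, and `sig_start` are hypothetical names introduced here.

```python
def pe(char_match: bool, prev_match: bool, sig_start: bool, sig_end: bool) -> tuple[bool, bool]:
    """One processor element: returns (match_out, sig_match)."""
    # The first PE of a signature needs no incoming match; later PEs do.
    match_out = char_match and (sig_start or prev_match)
    # The signature as a whole matches only at its last PE.
    sig_match = match_out and sig_end
    return match_out, sig_match


# Walk the signature '144' against the input '144', one character per step.
signature, carry, sig_match = "144", False, False
for step, ch in enumerate("144"):
    carry, sig_match = pe(
        char_match=(ch == signature[step]),
        prev_match=carry,
        sig_start=(step == 0),
        sig_end=(step == len(signature) - 1),
    )
print(sig_match)  # True
```

This walk feeds one character per step for simplicity and does not reproduce the two-cycle timing of the slide example.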

  24. Signature Match Processor Example • Input string ‘144’ is processed over 2 clock cycles • ‘1’ is checked in the first cycle and sets off a match signal into the SMA • ‘4’ is checked in the second cycle and sets off a match signal into the SMA • The match signal for ‘1’ is present from the previous clock cycle

  25. Signature Match Processor Example • The ‘4’ is duplicated, so it simply propagates the first match signal to the second as a carry • Since this is the end of the signature, the output is a match due to the propagated match signals && sig_end

  26. Address Output Logic • In order for the SMP to be useful, we also need to know which signatures caused the match • This is handled by the word match buffer, which maintains the position of the signature match • When the last character being processed has been reached, the match address output logic begins working on the buffer entries

  27. Address Output Logic • A binary tree is used to locate the matching signatures • Decoding starts, and a signal is sent to the control circuitry stating that there are matches • A pointer then propagates up the tree, generating one bit of the final address at each level based on the matches • Binary trees are fast and efficient: processing time is ~M cycles, where M is the number of matches
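
As a software analogue of the tree decoding, the Python sketch below builds an OR-tree over the word match buffer and emits one address bit per tree level for each match. It walks from the root toward the leaves, which is a simplification chosen for readability and not necessarily the direction used in the paper's circuit; `match_addresses` and its buffer layout are hypothetical.

```python
def match_addresses(buffer: list[bool]) -> list[int]:
    """Return the indices of set buffer entries, lowest first, via an OR-tree."""
    n = len(buffer)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two buffer size"
    # Heap layout: tree[1] is the root, leaves live at tree[n] .. tree[2n - 1].
    tree = [False] * n + list(buffer)
    for node in range(n - 1, 0, -1):
        tree[node] = tree[2 * node] or tree[2 * node + 1]
    addresses = []
    while tree[1]:  # the root signals that at least one match remains
        node, addr = 1, 0
        while node < n:
            bit = 0 if tree[2 * node] else 1  # prefer the left subtree
            node = 2 * node + bit
            addr = (addr << 1) | bit          # one address bit per level
        addresses.append(addr)
        tree[node] = False                    # consume this match
        node //= 2
        while node >= 1:                      # refresh the OR values above it
            tree[node] = tree[2 * node] or tree[2 * node + 1]
            node //= 2
    return addresses


print(match_addresses([False, False, True, False, False, True, False, False]))
```

The outer loop runs once per match, which mirrors the slide's estimate that processing time scales with the number of matches M.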

  28. FPGA Implementation • As parallelism increases, throughput increases, but clock frequency decreases due to added complexity • As the number of characters increases, area increases while frequency and throughput decrease
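
The first trade-off can be checked with simple arithmetic. Assuming 8-bit characters and a design that consumes a fixed number of characters every clock cycle, throughput is characters per cycle times bits per character times clock frequency. The clock figures in this Python sketch are invented for illustration and are not results from the paper.

```python
def throughput_gbps(chars_per_cycle: int, clock_mhz: float, bits_per_char: int = 8) -> float:
    """Throughput in Gbit/s for a design that consumes chars_per_cycle each clock."""
    return chars_per_cycle * bits_per_char * clock_mhz / 1000.0


# Hypothetical numbers: quadrupling the rows costs some clock speed,
# yet the throughput still rises.
print(throughput_gbps(1, 250.0))
print(throughput_gbps(4, 200.0))
```

Throughput keeps rising with more rows only while the frequency loss stays smaller than the gain in characters per cycle.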

  29. Implementation Comparison

  30. Critique • New terms and unknown works referred to • Difficult to follow in some areas due to inconsistencies and how the topic is presented • Lots of procedure / methodology on implementation • Very detailed works • Good examples to strengthen theoretical explanations • Implementation data given for comparison purposes

  31. Questions?

  32. References • All figures and information used in this presentation are taken from the article • Janardhan Singaraju, John A. Chandy, FPGA Based String Matching For Network Processing, ScienceDirect Microprocessors and Microsystems, December 14, 2007
