Software Defined Measurement for Data Centers
Masoud Moshref, Minlan Yu, Ramesh Govindan
Motivation
• Management policies such as
  • Traffic engineering
  • Accounting
  • Troubleshooting
• Need measurements in
  • Different time-scales
  • Multiple granularities of flows
  • Single/Multiple switches
• Can we encapsulate the measurement tasks in a controller module?

Software Defined Measurement
• Select the right primitive at switches
  • Counters in OpenFlow rules
  • Sketches in hash-based switches
  • Sampling (NetFlow, sFlow)
  • Programmable switches
• Based on
  • Traffic properties (stability)
  • Measurement task properties
    • Time-scale
    • Local/Global view
• Use resources efficiently based on
  • Resource/Accuracy tradeoff

[Figure: controller with TE, Accounting, and SDM modules; labels: Configure resources, Fetch statistics]

Hierarchical Heavy Hitters
• Definition:
  • The longest IP prefixes
  • That contribute a large amount of traffic (> threshold)
  • After excluding any HHH descendants in the prefix tree
• For traffic engineering, accounting, anomaly detection

[Figure: prefix tree; legend: Hierarchical heavy hitter, Heavy hitter]

Flow-based Switches
• For slowly-varying traffic and large time-scale
• At switches: uses TCAM entries
• At controller: pick which prefixes to monitor, given a limit on the number of counters
  • Max-Cover algorithm
    • Split the prefix with maximum traffic
    • Merge siblings with total minimum traffic
    • Stop if no siblings have total traffic < the traffic of the maximum prefix

Hash-based Switches
• For variable traffic and large time-scale
• At switches: sketches
  • Multiple hash functions
  • SRAM counters
  • Hierarchical Count-Min sketch
• At controller: restricted resource allocation

[Figure: Count-Min sketch; a packet is hashed by H1 to H4 into a counter array of width w and depth d]

Programmable Switches
• Because of large control traffic at small time-scales
  • Give more responsibilities to the switches
  • The right division of labor between the controller and switches?
• Find heavy hitters for each IP prefix length at switches using
  • Sketches (Count-Min sketch)
  • Counting algorithms (Space Saving)
• How to do a global task?

Discussion
• Multiple switches
  • Distribute labor on the path of flows
  • Compose measured data at the controller
• Multiple tasks
  • Distribute resources among tasks
  • Use joint information to save resources
• Multiple time-scales
• New primitives for programmable switches
  • Heap for Space Saving counting algorithm

Resource/Accuracy Tradeoff for Flow-based vs Hash-based Switches
• Flow-based: Max-Cover
• Hash-based: Count-Min sketch
• Equal switch resource cost
  • TCAM = 80 × SRAM
• 80x less bandwidth for flow-based
• 2x accuracy for sketch-based for small threshold
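
The hierarchical heavy hitter definition given earlier can be made concrete with a small exact computation. The sketch below is only an illustration under stated assumptions: it has every per-address counter available (which real switches do not), it treats the source IPv4 address as the flow key, and the function name `hierarchical_heavy_hitters` and the sample traffic are hypothetical. It walks the prefix tree from /32 upward; a prefix whose unclaimed traffic exceeds the threshold is reported, and its traffic is not passed on to its ancestors.

```python
import ipaddress
from collections import defaultdict

ADDRESS_BITS = 32  # IPv4 only in this sketch


def hierarchical_heavy_hitters(flows, threshold):
    """Return [(prefix, residual_volume)], longest prefixes first.

    flows maps an IPv4 address string to its traffic volume.
    """
    # Start at the leaves: one entry per full /32 address.
    level = defaultdict(int)
    for addr, volume in flows.items():
        level[int(ipaddress.IPv4Address(addr))] += volume

    found = []
    for length in range(ADDRESS_BITS, -1, -1):
        parents = defaultdict(int)
        for value, volume in level.items():
            if volume > threshold:
                # Report this prefix; its traffic is excluded from ancestors.
                network = ipaddress.ip_network(
                    (value << (ADDRESS_BITS - length), length))
                found.append((str(network), volume))
            else:
                # Not heavy on its own: pass the traffic to the parent prefix.
                parents[value >> 1] += volume
        level = parents
    return found


# Hypothetical traffic volumes, for illustration only.
sample_flows = {
    "10.1.1.1": 60,
    "10.1.1.2": 30,
    "10.1.1.3": 30,
    "10.2.0.1": 20,
    "10.3.0.1": 20,
    "10.200.0.1": 20,
}
result = hierarchical_heavy_hitters(sample_flows, threshold=50)
print(result)
```

With these sample values, 10.0.0.0/8 carries 180 units in total, but it is reported with only the 60 units that remain after its two heavy descendants are excluded.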
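
For the hash-based side of the comparison, a minimal Count-Min sketch can be written in a few lines. This is a sketch under assumptions, not the authors' implementation: the class name `CountMinSketch`, the use of salted BLAKE2b as the family of hash functions, and the sample packet counts are all illustrative choices. It keeps `depth` rows of `width` counters, mirroring the d and w of the figure, and answers a query with the minimum counter across rows, so an estimate can exceed the true count but never falls below it.

```python
import hashlib


class CountMinSketch:
    """depth rows of width counters; each row uses its own hash function."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Assumption: salting BLAKE2b with the row number stands in for
        # the independent hash functions H1..Hd.
        digest = hashlib.blake2b(
            key.encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Every packet updates exactly one counter in each row.
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


# Hypothetical packet counts, for illustration only.
true_counts = {"10.1.1.1": 60, "10.1.1.2": 30, "10.2.0.1": 20, "10.3.0.1": 5}
sketch = CountMinSketch(width=64, depth=4)
for address, packets in true_counts.items():
    sketch.add(address, packets)

for address, packets in true_counts.items():
    print(address, "true:", packets, "estimate:", sketch.estimate(address))
```

Shrinking `width` saves SRAM counters at the price of more collisions and larger overestimates, which is the resource/accuracy tradeoff the slides refer to.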