Characterizing Flows in Large Wireless Data Networks. Xiaoqiao (George) Meng, UCLA H.Y. Starsky Wong, UCLA Yuan Yuan, UMD Songwu Lu, UCLA. Objectives of Study. To quantitatively characterize data flows in wireless networks To help understand application behavior in wireless networks.
Characterizing Flows in Large Wireless Data Networks Xiaoqiao (George) Meng, UCLA H.Y. Starsky Wong, UCLA Yuan Yuan, UMD Songwu Lu, UCLA
Objectives of Study • To quantitatively characterize data flows in wireless networks • To help understand application behavior in wireless networks [Figure: access points connected to the Internet]
Flow Traffic in WLAN • What is a flow? A TCP connection, i.e., a stream of packets • Study flow traffic at the access point [Figure: access point connected to the Internet]
Outline • Motivations • Related work • Three issues • Methodology and datasets • Flow modeling results • Applying models • Summary, future work
Motivations • Realistic flow models are needed to evaluate protocol designs • Flows capture user demand and application behavior • Compared to packet-level studies, flow-level analysis hides the effects of MAC and channel errors
Related Work • User mobility & network usage • Tang et al., Mobicom’99 • Tang et al., Mobicom’00 • Kotz et al., Mobicom’02 • Balachandran et al., Sigmetrics’02 • Balazinska et al., Mobisys’03 • Modeling wireless networks • Konrad et al., WINET’03 • Yoon et al., Mobicom’03 • Jardosh et al., Mobicom’03
Issue #1: Non-stationarity • Flow arrival pattern changes over time • Smaller number of users • Diurnal cycle of human activities
Issue #2: Location Dependence • The behavior of flow traffic is location specific [Figure: dormitory and office networks connected to the Internet]
Issue #3: Mobility • A flow may traverse different APs when the user is roaming [Figure: a static flow and a roaming flow across AP1, AP2, and AP3, connected to the Internet]
Methodology • Synthesizing data traces to obtain both temporal and spatial information for flows • Tcpdump: temporal info • Syslog: associated AP info • Statistical methods • Match real data with statistical distributions • quantile-quantile plot • Cross-validation whenever possible • Check model validity using multiple datasets
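The trace-synthesis step above can be pictured as a join between flow records and AP association events. The following is a minimal sketch, not the authors' tooling: the record layouts `(start_ts, mac, nbytes)` and `(timestamp, mac, ap)`, the function names, and the sample values are all hypothetical, chosen only to show how temporal information (from tcpdump) might be combined with location information (from syslog).

```python
from bisect import bisect_right
from collections import defaultdict

def build_association_index(syslog_events):
    """Group (timestamp, mac, ap) association events by MAC, in time order."""
    index = defaultdict(list)
    for ts, mac, ap in sorted(syslog_events):
        index[mac].append((ts, ap))
    return index

def ap_at(index, mac, ts):
    """Return the AP the host most recently associated with at time ts, or None."""
    events = index.get(mac, [])
    times = [t for t, _ in events]
    pos = bisect_right(times, ts)
    return events[pos - 1][1] if pos else None

def label_flows(flows, syslog_events):
    """Attach an AP label to each (start_ts, mac, nbytes) flow record."""
    index = build_association_index(syslog_events)
    return [(ts, mac, nbytes, ap_at(index, mac, ts)) for ts, mac, nbytes in flows]

# Hypothetical sample data (assumed formats, not taken from the datasets).
syslog_events = [(0.0, "aa:01", "AP1"), (50.0, "aa:01", "AP2"), (10.0, "bb:02", "AP3")]
flows = [(5.0, "aa:01", 1200), (60.0, "aa:01", 800), (2.0, "bb:02", 400)]
labeled = label_flows(flows, syslog_events)
```

A flow that starts before any association event for its host is labeled `None`, which makes gaps between the two traces visible instead of silently guessing an AP.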
Datasets • 4 settings: campus, conference, corporate, department
Modeling Flows • How to model static flows at an AP? • Is there spatial correlation for static flows across APs? • How to model roaming flows? Technique, results, interpretation and causes
Modeling Static Flow • Characterized by inter-arrival time and flow data size • Model inter-arrival time • Power spectral analysis • Match against statistical distributions • Regression • Weibull regression model characterizes inter-arrival times at both fine and coarse time granularities • Coexistence of long and short inter-arrival times, effect of time-of-day
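To illustrate matching inter-arrival times against a Weibull distribution, here is a minimal sketch that fits the shape and scale by least squares on the linearized CDF, ln(-ln(1 - F(x))) = k ln x - k ln λ. This is a common textbook fitting method and is only an assumption here; the slides do not say how the Weibull regression model was estimated. The function name and the synthetic parameters (shape 0.6, scale 2.0) are hypothetical.

```python
import math
import random

def fit_weibull(samples):
    """Estimate Weibull (shape, scale) by linear regression on the Weibull plot."""
    xs = sorted(s for s in samples if s > 0)
    n = len(xs)
    log_x = [math.log(x) for x in xs]
    # Median-rank estimate of the empirical CDF at each ordered sample.
    log_y = [math.log(-math.log(1 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    shape = sxy / sxx                            # slope of the fitted line
    scale = math.exp(mean_x - mean_y / shape)    # from intercept = -shape * ln(scale)
    return shape, scale

# Synthetic inter-arrival gaps; random.weibullvariate takes (scale, shape).
rng = random.Random(0)
gaps = [rng.weibullvariate(2.0, 0.6) for _ in range(5000)]
shape, scale = fit_weibull(gaps)
```

A fitted shape below 1 gives a distribution with more mass at both very short and very long gaps than an exponential, which is one way to read the "coexistence of long and short inter-arrival times" noted above.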
Modeling Static Flow – cont. • Model flow data size • Match against statistical distributions • Lognormal distribution gives the best match • There exist flows with large chunks of data • Consistent with file size distribution in the Internet (Downey et al., Sigmetrics’01)
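A lognormal fit to flow data sizes can be sketched with the standard maximum-likelihood estimates: the mean and population standard deviation of the log-sizes. The function names and the synthetic parameters (mu 8.0, sigma 2.0) below are hypothetical and are not values reported in the study.

```python
import math
import random
import statistics

def fit_lognormal(sizes):
    """Maximum-likelihood (mu, sigma) of a lognormal fitted to positive sizes."""
    logs = [math.log(s) for s in sizes if s > 0]
    mu = statistics.mean(logs)
    sigma = statistics.pstdev(logs, mu)  # the MLE uses the population std
    return mu, sigma

def lognormal_median(mu):
    """Median of a lognormal with location parameter mu."""
    return math.exp(mu)

def lognormal_mean(mu, sigma):
    """Mean of a lognormal; it exceeds the median whenever sigma > 0."""
    return math.exp(mu + sigma ** 2 / 2)

# Synthetic flow sizes in bytes (illustrative parameters only).
rng = random.Random(1)
sizes = [rng.lognormvariate(8.0, 2.0) for _ in range(20000)]
mu, sigma = fit_lognormal(sizes)
```

The gap between `lognormal_mean` and `lognormal_median` grows quickly with sigma, which reflects a small number of flows carrying large chunks of data.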
Spatial Correlation: Flow inter-arrival time across APs • Use q-q plot to measure similarity • High-degree similarity within the same subnet, low-degree similarity across subnets • Users in geographic proximity are more likely to exhibit similar usage behavior
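A q-q comparison between two APs pairs up matching quantiles of their samples; points close to the line y = x indicate similar distributions. The sketch below is one possible way to turn that into a number. The deviation score `qq_deviation` is a hypothetical summary introduced for illustration, not a metric from the study, and the sample data is synthetic.

```python
import random
import statistics

def qq_points(sample_a, sample_b, n_quantiles=99):
    """Pair matching quantiles of two samples (the points of a q-q plot)."""
    qa = statistics.quantiles(sample_a, n=n_quantiles + 1)
    qb = statistics.quantiles(sample_b, n=n_quantiles + 1)
    return list(zip(qa, qb))

def qq_deviation(points):
    """Mean relative distance of q-q points from the line y = x (0 = identical)."""
    total = 0.0
    for x, y in points:
        denom = max(abs(x), abs(y)) or 1.0  # avoid dividing by zero
        total += abs(x - y) / denom
    return total / len(points)

# Synthetic inter-arrival times for three hypothetical APs.
rng = random.Random(2)
ap1 = [rng.expovariate(1.0) for _ in range(5000)]
ap2 = [rng.expovariate(1.0) for _ in range(5000)]   # same behavior as ap1
ap3 = [rng.expovariate(0.2) for _ in range(5000)]   # five times longer gaps
same = qq_deviation(qq_points(ap1, ap2))
different = qq_deviation(qq_points(ap1, ap3))
```

Here `same` comes out much smaller than `different`, mirroring the contrast between APs within one subnet and APs in different subnets.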
Spatial Correlation Flow data size across APs • Use q-q plot to measure similarity • High-degree similarity among all APs • Internet file size distribution does not depend on AP
Cross-validation Using UCSD01 dataset • Flow inter-arrival time: Weibull regression model matches in 78% of time intervals • Flow data size: Lognormal distribution gives the best match
Roaming Flows • 0.13% of flows are roaming, while 60% of hosts have mobility • Only 4.6% of users generate traffic while they are physically moving • Model roaming flows • Hand-off frequency • Residing time
Modeling Roaming Flows • Hand-off frequency: matches a geometric distribution • The hand-off decision process is memoryless • Residing time: matches a Weibull distribution • Both short and long residing times exist
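The link between a geometric hand-off count and a memoryless process can be checked numerically: for a geometric variable K, P(K > m + n | K > m) = P(K > n). The sketch below assumes counts start at 1 (a roaming flow has at least one hand-off), which is an assumption about the model's support; the function names and the parameter p = 0.4 are hypothetical.

```python
import math
import random

def fit_geometric(counts):
    """MLE of p for a geometric distribution on {1, 2, ...}: p = 1 / mean."""
    return len(counts) / sum(counts)

def survival(counts, n):
    """Empirical P(K > n)."""
    return sum(1 for c in counts if c > n) / len(counts)

def conditional_survival(counts, m, n):
    """Empirical P(K > m + n | K > m)."""
    beyond_m = [c for c in counts if c > m]
    return sum(1 for c in beyond_m if c > m + n) / len(beyond_m)

def sample_geometric(rng, p):
    """Inverse-transform sample from a geometric distribution on {1, 2, ...}."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - p)) + 1

# Synthetic hand-off counts (illustrative parameter only).
rng = random.Random(3)
handoffs = [sample_geometric(rng, 0.4) for _ in range(20000)]
p_hat = fit_geometric(handoffs)
unconditional = survival(handoffs, 2)
conditional = conditional_survival(handoffs, 3, 2)
```

If `conditional` and `unconditional` agree closely on real data, the number of hand-offs already made says little about how many more will follow.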
Applying Models in TCP • Compare TCP throughput between • Static scenario used in the literature • Realistic scenario: dynamic flow arrivals generated from the derived models • 40% difference
Applying Models in Scheduling • Compare flow-level performance of Proportional Fair Scheduling (Borst, Infocom’03) between • Assumed scenario: Poisson arrivals and constant flow data size • Realistic scenario: using derived models • Per-flow delay differs by up to 99%
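The two input scenarios can be sketched as synthetic workload generators: one with Poisson arrivals and constant flow size, one with Weibull inter-arrival times and lognormal flow sizes. This sketch only produces the inputs and does not simulate the scheduler. All parameter values and function names are placeholders for illustration, not the fitted values from the study.

```python
import random
import statistics

def poisson_constant_workload(rng, n, rate, size):
    """Flows as (arrival_time, size): exponential gaps, constant size."""
    t, flows = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)
        flows.append((t, size))
    return flows

def weibull_lognormal_workload(rng, n, scale, shape, mu, sigma):
    """Flows as (arrival_time, size): Weibull gaps, lognormal sizes."""
    t, flows = 0.0, []
    for _ in range(n):
        t += rng.weibullvariate(scale, shape)
        flows.append((t, rng.lognormvariate(mu, sigma)))
    return flows

def gap_cv(flows):
    """Coefficient of variation of inter-arrival gaps (about 1 for Poisson)."""
    times = [t for t, _ in flows]
    gaps = [b - a for a, b in zip([0.0] + times[:-1], times)]
    return statistics.pstdev(gaps) / statistics.mean(gaps)

rng = random.Random(4)
assumed = poisson_constant_workload(rng, 5000, rate=1.0, size=10_000)
realistic = weibull_lognormal_workload(rng, 5000, scale=0.5, shape=0.5, mu=8.0, sigma=2.0)
cv_assumed = gap_cv(assumed)
cv_realistic = gap_cv(realistic)
```

With a Weibull shape below 1, the gaps are burstier than in the Poisson case (`cv_realistic` is well above `cv_assumed`), which is the kind of difference that can change flow-level delay under a scheduler.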
Summary • Flow-level characterization: understand and quantitatively model wireless flows • Main results • Simple models for both static and roaming flows • Flows in the same subnet exhibit spatial similarity • Applying our models reveals a large performance gap relative to commonly assumed scenarios
Future Work • Evolution of models with new traces • Check model validity using latest dataset • Composite packet-level modeling • Compose flow model, packet behavior within a flow and channel model
Thank you! Flow models available at http://www.cs.ucla.edu/wing
Lessons Learned • Systematic data collection • Not ad hoc data collection • Need to model both temporal and spatial dynamics • Synthesizing traces is useful • Signal processing techniques are useful