
Preventing Encrypted Traffic Analysis

6 January 2011. Preventing Encrypted Traffic Analysis. Nabíl Adam Schear University of Illinois at Urbana-Champaign Department of Computer Science nschear2@illinois.edu. Committee: Nikita Borisov (UIUC-ECE) Karen L. Bintz (Los Alamos National Laboratory) Matthew Caesar (UIUC-CS)



  1. 6 January 2011 Preventing Encrypted Traffic Analysis Nabíl Adam Schear University of Illinois at Urbana-Champaign Department of Computer Science nschear2@illinois.edu Committee: Nikita Borisov (UIUC-ECE) Karen L. Bintz (Los Alamos National Laboratory) Matthew Caesar (UIUC-CS) Carl A. Gunter (UIUC-CS) David M. Nicol (UIUC-ECE) UNCLASSIFIED Open Release LA-UR 11-01317 For a copy of the slides or dissertation: http://helious.net/

  2. Encrypted Traffic Analysis • Encrypted protocols provide strong confidentiality through encryption of content • But – encryption does not mask packet sizes and timing • Privacy can be breached by traffic analysis attacks • Traffic analysis can recover: • Browsed websites through encrypted proxies • Keystrokes in real-time systems • Language and phrases in VoIP • Identity in anonymity systems • Embedded protocols in tunnels • Who needs defense against traffic analysis? • Privacy seeking users and enterprises with VPNs, SSL, VoIP, anonymity networks etc.

  3. SSH Traffic Analysis Attack [Diagram: Bob's computer connects to a corporate network's SSH gateway; Bob logs into the gateway, starts a login to internal server A, and types the password for A: UIUC. From keystroke timing alone, the attacker infers that Bob typed U-I-U-C.]

  4. Defense Detection Attack [Diagram: Alice wants to use Tor from China and connects to a private Tor bridge on port 443 to reach the Tor relay network. A censor inspects the encrypted flow, finds the traffic does not match real HTTPS (signature: Tor), and detects the bridge.]

  5. Approach: Realistic Mimicry with TrafficMimic • Tunneling real data over cover traffic • Force the attacker to see cover packet sizes and timings, not real ones • Attacker cannot separate real data from cover padding because of encryption • Use a realistic model to generate cover traffic • Simultaneously prevent both types of attack • Less overhead and vulnerability than constant-rate techniques

  6. Thesis Statement “Tunneling real data through realistic cover traffic models is a robust defense that provides balanced performance and security against powerful traffic analysis attacks.”

  7. TrafficMimic Goals • Offer the user choices for performance-versus-security trade-offs • Quantify risk and performance gain • Robustly defend against traffic analysis and defense detection attacks by a powerful adversary • Favor an adversary with resources and access • Retain realism, practicality, and usability in implementation and evaluation • Ensure that real users can benefit from TrafficMimic

  8. Outline • Introduction • TrafficMimic design and implementation • Independent cover traffic evaluation • Simulation and modeling performance study • Improving performance with biasing • Conclusions and future work

  9. Cover Traffic Tunneling [Diagram: the user's packet (header1, data encrypted under EA) bound for Google is carried inside a tunnel packet (header2, data plus padding encrypted under EB). Header1 is no longer visible, the real data size is not leaked, and the packet timing is changed.]

  10. Constant Rate Cover Traffic • Current state-of-the-art defense • Effective at masking real protocol activity • Drawbacks: • Overhead in excess bytes and delay • Vulnerable to defense detection • Packet drops identify the flow in mix systems

  11. Realistic Cover Traffic • Goal 1: Generate a cover traffic tunnel that prevents the attacker from recovering the tunneled protocol • Goal 2: Generate traffic of protocol X that is indistinguishable from real X traffic [Diagram: a learning phase builds models from real traffic; during playback, proxies at each end of the tunnel shape the encrypted cover traffic from those models.]

  12. Requirements for Secure Traffic Generation • Network agnostic • Closed-loop, TCP-based, reactive to live network conditions • Use Swing [Vishwanath06] to capture empirical distributions of protocol features from a network trace • Quality training • Sanitize training network trace data of anomalies • Care in selecting training data to avoid training-based attacks • Synchronized playback • Secure cover traffic generation is bidirectional • Requires both statistical and heuristic consistency • Single node controls all traffic generation
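The empirical-distribution playback described above can be sketched with inverse transform sampling over an ECDF: a uniform [0,1) draw indexes into the sorted observations. This is an illustrative sketch, not the Swing implementation; the packet sizes below are invented.

```python
import random

def make_ecdf_sampler(observations):
    """Build a sampler that draws from the empirical distribution of
    `observations` via inverse transform sampling: a uniform draw in
    [0, 1) indexes into the sorted observations."""
    sorted_obs = sorted(observations)
    n = len(sorted_obs)
    def sample(u=None):
        u = random.random() if u is None else u
        return sorted_obs[min(int(u * n), n - 1)]
    return sample

# Hypothetical cover packet sizes learned from a training trace
draw = make_ecdf_sampler([64, 64, 128, 512, 1500, 1500, 1500])
print(draw(0.0), draw(0.99))  # → 64 1500
```

Because the sampler only ever returns observed values, generated cover traffic stays inside the trained distribution.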

  13. TrafficMimic Design • Tunnel uses SOCKS/HTTP or port forwarding • Cover traffic specified by type, size, and timing • Master controls all cover traffic generation • Real data encoded into messages • Message data split into fragments that contain a message chunk, model traffic, and padding

  14. Tor Integrations • Protects individual Tor SSL links from traffic analysis • OP-to-OR links, OR-to-OR links, OP-to-Bridge links • Addresses the blocking-resistant design [Dingledine06] • No code changes required to any Tor components • Integrate TM end-to-end in Tor with a small code change to the OR • Can advertise TM ability through the existing Tor directory • Solves Alice's Tor problem

  15. Outline • Introduction • TrafficMimic design and implementation • Independent cover traffic evaluation • Simulation and modeling performance study • Improving performance with biasing • Conclusions and future work

  16. Evaluation Setup • System Implementation tested on real Internet wide-area links • Montreal (MN) to London (UK) • Urbana (IL) to San Diego (CA) • Cover protocol models: • HTTPS, SMTP, and SSH • Fixed 28kbps constant rate model • Three sets of network traces (~1 hour each) • CAIDA, Jan/Feb 2009, equinix-chicago monitor • UNC, April 20/29, 2003, campus border link • LANL, Sept 2009, external gateway link

  17. Protocol Classification Attack • Distill connections into network-agnostic feature vectors • Feature selection based, in part, on prior work • Use supervised learning to identify unknown protocols • Weighted K-nearest-neighbor algorithm for classification, K=3 • Inter-cluster and distance-threshold anomaly detection schemes [Charts: accuracy for CAIDA/CAIDA and CAIDA/LANL train/test pairs] • First to evaluate a realistic attack • Accuracy suffers ~10% with a different test network, consistent with [Wright06]
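The distance-weighted K-nearest-neighbor vote used by the classifier can be sketched as below. The feature vectors are invented for illustration and are not the dissertation's feature set.

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k=3):
    """Classify a feature vector by a distance-weighted vote of its
    k nearest labeled neighbors (closer neighbors weigh more)."""
    dists = sorted(
        (math.dist(vec, query), label) for vec, label in train
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + 1e-9)  # inverse-distance weighting
    return max(votes, key=votes.get)

# Toy feature vectors: (mean packet size, mean inter-arrival ms)
train = [((1200.0, 5.0), "https"), ((1100.0, 7.0), "https"),
         ((90.0, 200.0), "ssh"), ((80.0, 150.0), "ssh"),
         ((600.0, 30.0), "smtp"), ((650.0, 25.0), "smtp")]
print(weighted_knn(train, (1000.0, 10.0)))  # → https
```

An anomaly threshold, as on the slide, would simply reject a query whose distance to all k neighbors exceeds a trained bound.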

  18. Classifying Cover Traffic • 73% accuracy fooling the classifier with realistic cover traffic • 91% of the best rate for real traffic • Constant-rate schemes are easily detected with anomaly algorithms • What if we let the attacker train a binary classifier on TrafficMimic-generated traffic? • With independent HTTPS training/test sets and Internet links • Accuracy: 49.4% (worse than random guessing)

  19. 100 KB Transfer over Cover Traffic • Need to put real load on cover traffic to test performance • HTTPS-resp and SMTP: high bandwidth, low overhead • HTTPS-req and SSH: asymmetric bytes sent/received • Poor performance and high overhead

  20. Web Site Load Time Slowdown • HTTPS provides best real/cover match • SMTP (not shown) very poor • due to long wait times and relatively small response stream • Performance impact is considerable when compared with native

  21. Outline • Introduction • TrafficMimic design and implementation • Independent cover traffic evaluation • Simulation and modeling performance study • Improving performance with biasing • Conclusions and future work

  22. Performance Study • Deeper understanding of performance with simulation/modeling • Questions • Cover traffic tunneling impact on user experience? • Overhead compared to transmitting without cover traffic? • Dependence on relationship between real and cover traffic? • We assess these impacts with: • tunnel-free network properties derived from simulation • real trace-driven protocol models • analytic models of cover traffic tunneling

  23. Simulation Model • Use a bidirectional model of traffic based on HTTP • Collect real HTTP traffic patterns from UNC traces • Requests are normally distributed • Responses are not: heavy tails • Use clustering to create response categories [Diagram: simulated dumbbell network; client and server each connect over a 1.5 Mbit/s, 20 ms link to a 100 Mbit/s, 50 ms core link] • Larger cover sizes have higher TCP efficiency • Conversely, large cover sizes delay real traffic • On startup, waiting for the next cover to begin • On the final chunk, waiting for the end

  24. Developing an Analytic Model • Create bidirectional HTTP-like model based on • On-Off renewal processes • How do we model the real delay of sending data? • Three components: startup, request, response

  25. Model Validation • Use tunnel-free network data from simulation to validate tunneled response delay • Need to make some simplifying assumptions about the data • Normally distributed responses • Exponentially distributed inter-session timing • Capture all network link properties with a single transmission cost • Model error: 3.57%; simulation error: 3.97% • Larger real load yields higher model accuracy

  26. Investigating Slowdown • Use a discrete-time Markov chain • Find slowdown: the steady-state probability of real transmission, conditioned on availability of real data • A decreasing function of cover session utilization • Real size larger than cover yields higher slowdown • Increasing cover sizes with high utilization yields the best performance
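The steady-state computation behind the slowdown analysis can be sketched with power iteration on a transition matrix. The 2-state chain below is a hypothetical simplification, not the dissertation's chain.

```python
def steady_state(P, iters=500):
    """Steady-state distribution of a row-stochastic transition matrix
    P, found by repeatedly applying the chain (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical 2-state chain: state 0 = real data moving inside cover,
# state 1 = real data waiting for the next cover session to begin
P = [[0.8, 0.2],
     [0.5, 0.5]]
pi = steady_state(P)
print(round(pi[0], 4))  # → 0.7143 (fraction of time real data moves)
```

Conditioning this steady-state probability on real data being available gives the slowdown measure the slide describes.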

  27. Performance Observations • Best when there is plenty of cover traffic to carry the real traffic • Similar to the intuition for constant-rate cover traffic • But mismatches between real and cover are painful • Cover too small: wait time hurts • Cover too large: real traffic has to wait for the next cover to begin • Even when the size ratio is favorable, low utilization yields high slowdown • Waiting times dominate the effects of padding and network transmission

  28. Outline • Introduction • TrafficMimic design and implementation • Independent cover traffic evaluation • Simulation and modeling performance study • Improving performance with biasing • Conclusions and future work

  29. Performance Enhancements • So far: prevent attack using independent cover traffic • Attacker inference cannot recover information from the real flow • Can we relax strict independence without sacrificing security? • Against both traffic analysis and defense detection • Need to quantify the security impact • Concept: influence the traffic generation process with biasing

  30. Recall: Model-Based Traffic Generation • Collect ECDFs of structural features • Sample features using: • a uniform [0,1] random variable • inverse transform sampling • Bias parameter selection given the current real state • Biased samples still come from the empirical data [Diagram: Swing feature ECDFs for user sessions, applications, protocols, and connections; the real state drives a bias step that yields an optimized sample]

  31. Biasing: Two Techniques • Functional • Create a probability distribution that parameterizes the biasing effect • Derive the CDF by integration and invert it to select samples • Algorithmic • Select an optimized structural sample with an iterative algorithm • Goals: • Avoid splitting real objects • Minimize overhead • Minimize waiting

  32. Functional Biasing Example: Probability Split • PDF given by [equation] • x0 is the optimal point for perfect biasing • Given by the current/recent needs of the real traffic • Derive the inverse CDF F^-1 of the distribution by integration • Parameter p controls the height of the density left of x0 • Standardize the bias factor's relation to p
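One concrete form the probability-split idea could take is a piecewise-uniform density on [0, 1] that puts mass p below the optimal point x0 and 1 - p above it; its inverse CDF is then closed-form. This is a sketch of the concept under that assumed density, not the dissertation's exact PDF.

```python
def prob_split_inverse_cdf(u, x0, p):
    """Inverse CDF of an assumed piecewise-uniform density on [0, 1]
    with probability mass p on [0, x0) and 1 - p on [x0, 1].
    Small p steers inverse-transform samples toward x0 and above."""
    if u < p:
        return (u / p) * x0                      # land left of x0
    return x0 + ((u - p) / (1 - p)) * (1 - x0)   # land at or right of x0

# With p = 0.1, almost all samples land at or above x0 = 0.7
samples = [prob_split_inverse_cdf(u / 100, x0=0.7, p=0.1)
           for u in range(100)]
print(sum(s >= 0.7 for s in samples))  # → 90
```

Feeding the uniform draws from model-based traffic generation through this function is exactly the "derive the CDF by integration and invert" recipe on the previous slide.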

  33. Functional Biasing Distributions

  34. Algorithmic Biasing • Directly try to achieve biasing goals with algorithm • Try Try Again (TTA) • Select sample from empirical CDF • If greater than real buffer size, use it • If not, repeat up to R times • Linear algorithm takes first sample that does not split • Optimal algorithm finds minimum of all R samples that does not split
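The Try Try Again procedure above can be sketched directly. This follows the slide's description of the optimal variant (consider all R draws, keep the minimum that does not split); the cover-size values are invented.

```python
import random

def tta_optimal(cover_sizes, real_size, R=10, rng=random):
    """Optimal Try-Try-Again: draw up to R candidate cover sizes from
    the empirical distribution and keep the smallest candidate that
    avoids splitting the real object (candidate >= real_size).  The
    linear variant would instead stop at the first non-splitting draw.
    Falls back to the final draw if every candidate would split."""
    best, last = None, None
    for _ in range(R):
        last = rng.choice(cover_sizes)
        if last >= real_size:
            best = last if best is None else min(best, last)
    return best if best is not None else last

# Hypothetical cover-size distribution learned from a trace
sizes = [200, 500, 1000, 5000, 20000]
print(tta_optimal(sizes, real_size=900, rng=random.Random(2)))
```

Keeping the smallest non-splitting size serves both goals at once: no split, and the least padding overage among the R candidates.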

  35. Algorithmic Biasing Illustrated [Charts: samples drawn by Unbiased, Geometric, Linear TTA, and Optimal TTA; the optimal value is shown in red]

  36. Simulator • Simulate 20,000 real objects being sent by a variable number of cover sizes • Estimate performance with: • the number of cover sessions (num splits) • the excess padding needed in the final cover (overage) • Simulation outputs results as tuples of these values • Denote the corresponding random variables Qsim and Csim

  37. Attacking Biasing • Deduce the real size given an observation of the cover size using: • Bayesian inference • Maximum likelihood estimation • A real attack contains a history of multiple cover sizes • Approximate the real attack by inferring the estimate qest from individual ci observations • The theoretical attack is used for quantifying the security impact of biasing • In practice it does not recover actionable information (accuracy ~5%)

  38. Quantifying Information Leakage • Use information theory: mutual information • Intuitively, given two RVs Qsim and Qest: measure how much knowing one variable reduces our uncertainty about the other • Measured in bits per sample pair • Mathematically: I(Qsim; Qest) = Σ p(q, q') log2 [ p(q, q') / (p(q) p(q')) ]
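The bits-per-sample-pair measure can be sketched as a plug-in estimate over observed (Qsim, Qest) pairs, counting joint and marginal frequencies. The pairs below are toy data, not experiment output.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from paired samples:
    sum over observed (x, y) of p(x,y) * log2(p(x,y) / (p(x) p(y)))."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

# A perfectly dependent binary pair leaks 1 bit; an independent pair leaks 0
dependent = [(0, 0), (1, 1)] * 50
independent = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
print(mutual_information(dependent), mutual_information(independent))  # → 1.0 0.0
```

A biasing scheme is safer the closer its (Qsim, Qest) mutual information sits to the independent case.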

  39. Mutual Information Estimator • Our Bayesian inference: [equation], where m represents the additional information about the distribution of Csim • Now consider a simplified inference function Qg: [equation] • Qest has strictly more information than Qg, thus: [inequality] • MI computed from the ci is an order of magnitude faster to compute

  40. HTTPS Biasing Performance • Biasing can improve performance by a factor of 2 • Significantly fewer splits than constant MTU • Information leakage is higher with linear functional techniques • Lowest MI with the algorithmic and expBias techniques [Charts: security vs. performance]

  41. SSH over SMTP Responses • Swapping causes large information leakage in exchange for huge performance improvement • SMTP large request stream, small protocol-only response stream • SSH smaller request stream, variable response stream • Causes SMTP protocol model to “swap” sent/received ratio

  42. HTTP Request over HTTP Response • Over-provisioned cover model • Minimal impact on session performance • Several techniques can reduce overhead

  43. Defense Detection • Recall: using realistic models prevents the adversary from detecting TrafficMimic • But biasing destroys the pristine cover model! • Solution: sample without replacement within a window of L samples • Replenish when the subset is exhausted • The distribution is consistent with the original within L observations
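The windowed without-replacement idea can be sketched as follows: pre-draw L values from the model distribution, let the biasing step consume them without replacement, and replenish once the window is empty. Any L consecutive outputs are then a permutation of L i.i.d. model draws, so the aggregate distribution is preserved no matter how biasing reorders them. This is an illustrative sketch under those assumptions, not the TrafficMimic code; the `prefer` hook stands in for a biasing criterion.

```python
import random

class WindowedSampler:
    """Sample without replacement inside a window of L model draws,
    replenishing when the window is exhausted."""

    def __init__(self, population, L, rng=random):
        self.population = list(population)
        self.L = L
        self.rng = rng
        self.window = []

    def draw(self, prefer=None):
        if not self.window:  # replenish an exhausted window
            self.window = [self.rng.choice(self.population)
                           for _ in range(self.L)]
        if prefer is None:
            pick = self.window[0]
        else:
            # Biasing hook: smallest remaining sample that fits `prefer`
            fits = [s for s in self.window if s >= prefer]
            pick = min(fits) if fits else max(self.window)
        self.window.remove(pick)  # consume without replacement
        return pick
```

Because biasing can only reorder the window, not reshape it, a distribution test over any window of at most L observations sees the original model.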

  44. Defense Detection Attack: Kolmogorov-Smirnov Test • Attacker checks the observed distribution within a window W • A smaller W lets the attacker make a defense-detection determination more quickly • Attacker gains a detection advantage when L > W • Defender gains a performance advantage with a larger L • Vary L and show the maximum attack W with 95% confidence of proving that the observed distribution is abnormal • Because of noise in real distributions, the attacker must limit W to 50 • While the defender can safely use up to L = 300
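The attacker's test statistic can be sketched in a few lines: the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the empirical CDFs of the observed window and a reference sample of the cover protocol. This is a minimal sketch of the statistic itself, without the critical-value lookup that would give the 95%-confidence decision.

```python
import bisect

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum vertical
    gap between the empirical CDFs of samples a and b, evaluated at
    every observed point."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, t) / len(a)
            - bisect.bisect_right(b, t) / len(b))
        for t in a + b
    )

# Identical samples show no gap; disjoint samples show the maximum gap
print(ks_statistic([1, 2, 3], [1, 2, 3]),
      ks_statistic([1, 2, 3], [10, 20, 30]))  # → 0.0 1.0
```

The slide's trade-off falls out of the sample sizes: the critical gap shrinks roughly as 1/sqrt(W), so a small attacker window W is too noisy to flag a defender who replenishes within L ≤ 300.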

  45. Performance of Limiting L (OptimalTTA on HTTPS) • Larger-than-expected improvement even with very small L • Performance continues to improve modestly: 7.8% fewer sessions at L = 5000 • Conservatively use L = 100 for further experiments

  46. Sampling Without Replacement • Much lower information leakage compared to sampling with replacement • Very low Bayesian attack accuracy • Performance improvement due to splitting avoided when subset is still large (early optimization) • 55% increase in splits without replacement

  47. Trade-Offs • Simulations can help drive traffic combination choices • Relative importance of MI, overhead, and splits varies

  48. Bias Implementation • Critical parameter to biasing: current real state • Master needs slave and local event loop buffer state to select biased traffic parameters • Maintain synchronous control of traffic generation at master • Feedback • Report real buffer state to master model thread through traffic confirmations used for model synchronization • Potential for stale information, especially from slave • Split Biasing • Pre-sample R samples from distribution in model thread and defer parameter selection until just before transmission • For algorithmic techniques only where R is relatively small

  49. Bulk Transfer over HTTP-resp • Linear algorithms can best use the distribution tails • Due to x0 ≈ 1 for much of the transfer • Algorithmic approaches have a smaller working set (< 10) • Still provide modest improvements in bandwidth • Safest approaches for unknown/variable traffic combinations

  50. Bulk Transfer with OptimalTTA • 3.5-5.5x improvement in bandwidth • Small SSH request stream not well suited to bulk transfer • Bidirectional overhead reduction for HTTPS and SSH • SMTP already minimal overhead because of small response stream
