
Automating Analysis of Large-Scale Botnet Probing Events




  1. Automating Analysis of Large-Scale Botnet Probing Events Zhichun Li, Anup Goyal, Yan Chen and Vern Paxson* Lab for Internet and Security Technology (LIST) Northwestern University * UC Berkeley / ICSI

  2. Motivation [Figure: botnets across the IPv4 space probing an enterprise network] • Administrators ask: does this attack specifically target us? • Can we answer this question with only limited information observed locally in the enterprise?

  3. Motivation • Can we infer the probing strategy used by botnets? • Can we infer whether a botnet probing attack specifically targets a certain network, or whether we are just part of a larger, indiscriminate attack? • Can we extrapolate global properties of the botnet given limited local information?

  4. Agenda • Motivation • Basic framework • Discover the botnet probing strategies • Extrapolate global properties • Evaluation • Conclusions

  5. Botnet Probing Events • Big spikes in the number of probers are mainly caused by botnets

  6. System Framework See the paper for subtle system details.

  7. Agenda • Motivation • Basic framework • Discover the botnet probing strategies • Extrapolate global properties • Evaluation • Conclusions

  8. Discover the Botnet Probing Strategies • Use statistical tests to understand probing strategies • Leverage existing statistical tests • Monotonic trend checking: detect whether bots probe the IP space monotonically • Uniformity checking: detect whether bots scan the IP range uniformly • Design our own tests • Hitlist (liveness) checking: detect whether bots avoid the dark IP space • Dependency checking: do the bots scan independently, or are they coordinated?

  9. Design Space

  10. Hitlist Checking • Configure the sensor to be half darknet and half honeynet • Use the metric θ = (# sources seen in darknet) / (# sources seen in honeynet) • Threshold: 0.5
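
The θ metric on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the sample addresses are hypothetical, and the direction of the comparison (θ below the 0.5 threshold suggesting a hitlist that avoids dark space) is an assumption, since the slide gives only the threshold value.

```python
def hitlist_theta(darknet_sources, honeynet_sources):
    """Ratio of distinct sources seen in the darknet half to the honeynet half."""
    dark = set(darknet_sources)
    live = set(honeynet_sources)
    if not live:
        raise ValueError("no sources observed in the honeynet half")
    return len(dark) / len(live)


def looks_like_hitlist(theta, threshold=0.5):
    # Assumption: few sources touching the dark half (theta below the
    # threshold) suggests the bots were given a list of live addresses.
    return theta < threshold


# Hypothetical sample data: 2 sources hit the dark half, 5 hit the live half.
dark = ["10.0.0.1", "10.0.0.2"]
live = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
theta = hitlist_theta(dark, live)
```

With this sample data θ is 2/5, which falls under the threshold.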

  11. Agenda • Motivation • Basic framework • Discover the botnet probing strategies • Extrapolate global properties • Global scan scope, total # of bots, total # of scans, total scan rate for each bot • Evaluation • Conclusions

  12. Extrapolate Global Properties: Basic Ideas and Validation • Observe packet fields that change in a predictable pattern across consecutive probes • IPID: a field in the IP header used for fragment reassembly • Ephemeral port number: the source port used by bots • Both increment by a fixed amount per scan • Validation • IPID continuity: all versions of Windows and MacOS • Ephemeral port number continuity: botnet source code study • Agobot, Phatbot, Spybot, SDbot, rxBot, etc. • Control experiments with NAT

  13. Estimate Global Scan Rate of Each Bot [Figure: axes IPID and T] • Count the IPID and ephemeral port number changes • Recover from overflow (wraparound) of the IPID and ephemeral port number • Estimate the rate with linear regression when the correlation coefficient > 0.99 • Counter overestimation: use the smaller of the two estimates
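
The steps on this slide can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names, the `port_span` parameter, and the sample values are hypothetical. It assumes observations are sorted by time, that a counter wraps at most once between two consecutive observations, and that each scan advances a counter by one (the slides say only "a fixed #").

```python
def unwrap(values, modulus=65536):
    """Undo counter wraparound so the series is non-decreasing."""
    out = [values[0]]
    offset = 0
    for prev, cur in zip(values, values[1:]):
        if cur < prev:          # counter wrapped between two observations
            offset += modulus
        out.append(cur + offset)
    return out


def fit_rate(times, counts):
    """Least-squares slope and Pearson correlation of counts against times."""
    n = len(times)
    if n < 2:
        return None, 0.0
    mt = sum(times) / n
    mc = sum(counts) / n
    sxx = sum((t - mt) ** 2 for t in times)
    syy = sum((c - mc) ** 2 for c in counts)
    sxy = sum((t - mt) * (c - mc) for t, c in zip(times, counts))
    if sxx == 0 or syy == 0:
        return None, 0.0
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


def estimate_scan_rate(times, ipids, ports, port_span=65536, min_r=0.99):
    """Smaller of the IPID-based and port-based slopes that fit well enough."""
    estimates = []
    for series, modulus in ((ipids, 65536), (ports, port_span)):
        slope, r = fit_rate(times, unwrap(series, modulus))
        if slope is not None and r > min_r:
            estimates.append(slope)
    return min(estimates) if estimates else None


# Hypothetical observations of one bot; the IPID wraps past 65535 once.
times = [0, 10, 20, 30]
ipids = [65000, 65300, 64, 364]
ports = [1000, 1200, 1400, 1600]
rate = estimate_scan_rate(times, ipids, ports)
```

Here the IPID series grows by 30 per time unit and the port series by 20, so the sketch keeps the smaller value, mirroring the "counter overestimation" bullet.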

  14. Extrapolate Global Scan Scope [Figure: botnets scanning the IPv4 space] • Scans from bot i observed locally: n_i = 100 • Total scans from bot i: scan rate R_i × scan time T_i = 100 × 1000 = 100,000 • Local/global ratio: n_i / (R_i × T_i) • Aggregate over multiple bots
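
The arithmetic on this slide can be worked through in code. The sketch below rests on assumptions not stated in the slides: that n_i is the number of scans from bot i seen locally, that scanning is uniform so the global scope scales as sensor size divided by the local/global ratio, and that multiple bots are aggregated by pooling their counts. The function names and the sensor size of 2,560 addresses (ten /24 networks) are illustrative.

```python
def local_global_ratio(local_scans, scan_rate, scan_time):
    """Fraction of a bot's total scans that landed on the local sensor."""
    total = scan_rate * scan_time
    if total <= 0:
        raise ValueError("scan rate and scan time must be positive")
    return local_scans / total


def extrapolate_scope(sensor_addresses, bots):
    """Estimated number of addresses in the global scan range.

    bots is a list of (n_i, R_i, T_i) tuples. Pooling the bots by summing
    is one possible aggregation, assumed here for illustration.
    """
    local = sum(n for n, _, _ in bots)
    total = sum(r * t for _, r, t in bots)
    if local <= 0:
        raise ValueError("no local scans observed")
    return sensor_addresses * total / local


# Slide example: n_i = 100, R_i = 100, T_i = 1000.
ratio = local_global_ratio(100, 100, 1000)
scope = extrapolate_scope(2560, [(100, 100, 1000)])
```

With the slide's numbers the ratio is 0.001, so under these assumptions a 2,560-address sensor would correspond to a scan range of about 2.56 million addresses.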

  15. Extrapolate Global # of Bots • Idea: similar to Mark and Recapture • Assumption: all bots have the same global scan range • Example: bots observed in the first half m1 = 1000, in the second half m2 = 1000, in both halves m12 = 250 • Estimated total number of bots: M = m1 × m2 / m12 = 1000 × 1000 / 250 = 4000
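
The mark-and-recapture estimate on this slide takes only a few lines. In this sketch the function name and the synthetic bot identifiers are hypothetical; the formula M = m1 × m2 / m12 is the one given on the slide.

```python
def estimate_total_bots(first_half, second_half):
    """Mark-and-recapture estimate M = m1 * m2 / m12 from two sets of bot IDs."""
    s1, s2 = set(first_half), set(second_half)
    both = s1 & s2
    if not both:
        raise ValueError("no bots seen in both halves; estimate is undefined")
    return len(s1) * len(s2) / len(both)


# Synthetic IDs chosen to match the slide: m1 = m2 = 1000, m12 = 250.
first = range(0, 1000)
second = range(750, 1750)
total = estimate_total_bots(first, second)
```

The synthetic sets overlap in 250 identifiers, which reproduces the slide's estimate of 4000 bots.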

  16. Agenda • Motivation • Basic framework • Discover the botnet probing strategies • Extrapolate global properties • Evaluation • Conclusions

  17. Dataset • Based on a honeynet of ten /24 networks at a national lab (LBNL) • 293 GB of packet traces over 24 months (2006-07) • 203 botnet probing events observed in total • Average number of observed bots per event: 980 • Mainly targeting SMB/WINRPC, VNC, Symantec, MSSQL, HTTP, Telnet • Size of the system: 13,900 lines: Bro (6,000), Python (4,000), C++ (2,500), R (1,400)

  18. Property Checking Results • More than 80% of events show uniform scanning • Validated the results through visualization and found them highly accurate

  19. Extrapolation Results • Most of the extrapolated global scopes are of /8 size, which means the botnets do not specifically target the enterprise (LBNL) • Validation based on DShield data • DShield: the largest Internet alert repository • Find the /8 prefixes in DShield with sufficient source (bot) overlap with the honeynet events • Due to the incompleteness of DShield data, 12 events were validated • Calculate the scan scope in each /8 based on the sensor coverage ratio

  20. Extrapolation Validation • Define the scope factor as max(DShield/Honeynet, Honeynet/DShield) • 75% within 1.35 • All within 1.5 [Figure: CDF of the scope factor]
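
The scope factor is a symmetric ratio, so it is always at least 1 and equals 1 only when the two estimates agree. A minimal sketch follows; the function names and the sample scope values are hypothetical and are not taken from the dataset.

```python
def scope_factor(dshield_scope, honeynet_scope):
    """max(DShield/Honeynet, Honeynet/DShield): 1.0 means perfect agreement."""
    if dshield_scope <= 0 or honeynet_scope <= 0:
        raise ValueError("scopes must be positive")
    return max(dshield_scope / honeynet_scope, honeynet_scope / dshield_scope)


def fraction_within(factors, bound):
    """Share of events whose scope factor does not exceed the bound."""
    return sum(f <= bound for f in factors) / len(factors)


# Made-up (DShield, honeynet) scope pairs, for illustration only.
pairs = [(2.0, 3.0), (4.0, 4.0), (5.0, 4.0), (1.0, 1.25)]
factors = [scope_factor(d, h) for d, h in pairs]
share = fraction_within(factors, 1.35)
```

A helper like `fraction_within` yields one point on the CDF that the slide plots.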

  21. Conclusions • Developed a set of statistical approaches to assess four properties of botnet probing strategies • Designed approaches to extrapolate the global properties of a scan event from a limited local view • Through real-world validation based on DShield, we show that our scheme is promisingly accurate

  22. Backup

  23. Event size distribution

  24. Extrapolate the scope [Figure labels: probes observed locally; local/global ratio; estimated global probing rate; probing time window]

  25. Monotonic trend checking • Goal: detect whether the bots probe the IP space monotonically • E.g., simple sequential probing • Technique: • Mann-Kendall trend test • Intuition: check whether the aggregated sign value, the sum of sign(A_{i+1} − A_i), falls outside the range that randomness can produce • When most (>80%) senders in an event follow a trend, we label the event as following a trend
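
A sketch of the trend check follows, using the standard Mann-Kendall statistic, which sums signs over all ordered pairs rather than only adjacent ones as the slide's intuition suggests. Several choices here are assumptions: the normal approximation without tie correction, the significance level `alpha=0.05` (the slide gives none for this test), and all function names.

```python
from math import sqrt
from statistics import NormalDist


def mann_kendall(values, alpha=0.05):
    """Return (S, z, has_trend) for a sequence of probed addresses."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = values[j] - values[i]
            s += (d > 0) - (d < 0)           # sign of the difference
    var = n * (n - 1) * (2 * n + 5) / 18     # variance of S, ignoring ties
    if s > 0:
        z = (s - 1) / sqrt(var)
    elif s < 0:
        z = (s + 1) / sqrt(var)
    else:
        z = 0.0
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return s, z, p < alpha


def event_follows_trend(per_sender_addresses, min_fraction=0.8):
    """Label the event as trending when most senders show a trend."""
    flags = [mann_kendall(addrs)[2] for addrs in per_sender_addresses]
    return sum(flags) / len(flags) > min_fraction


sequential = list(range(20))                  # simple sequential probing
zigzag = [0, 9, 1, 8, 2, 7, 3, 6, 4, 5]       # no consistent direction
event = [sequential] * 9 + [zigzag]
```

In this example nine of ten senders probe sequentially, so the event clears the 80% bar from the slide.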

  26. Uniformity Checking • Goal: detect whether the botnet scans the IP range uniformly • Technique: • Chi-square test • Intuition: put the addresses into bins; the number of scans observed in each bin should be similar • Significance level of 0.5%
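
The binning idea can be sketched with the standard library alone. The statistic is the usual Pearson chi-square against equal expected counts. To avoid an external dependency, the critical value uses the Wilson-Hilferty approximation to the chi-square quantile, which is a substitution made for this sketch and not something the slides specify. The bin count of 10 and the function names are also assumptions.

```python
from statistics import NormalDist


def chi_square_statistic(addresses, lo, hi, bins=10):
    """Pearson chi-square of per-bin scan counts against a uniform expectation."""
    counts = [0] * bins
    width = (hi - lo + 1) / bins
    for a in addresses:
        idx = min(int((a - lo) / width), bins - 1)
        counts[idx] += 1
    expected = len(addresses) / bins
    return sum((c - expected) ** 2 / expected for c in counts)


def critical_value(df, alpha=0.005):
    """Approximate upper-alpha chi-square quantile (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1 - alpha)
    k = 2 / (9 * df)
    return df * (1 - k + z * k ** 0.5) ** 3


def looks_uniform(addresses, lo, hi, bins=10, alpha=0.005):
    """True when uniformity is not rejected at the given significance level."""
    stat = chi_square_statistic(addresses, lo, hi, bins)
    return stat <= critical_value(bins - 1, alpha)


# Addresses are integer offsets into a hypothetical 2,560-address sensor.
even = list(range(2560))              # every address scanned once
skewed = list(range(256)) * 10        # all scans fall in the first bin
```

The default `alpha=0.005` corresponds to the 0.5% significance level on the slide.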

  27. Dependency Checking • Goal: do the bots try to stay out of each other's way? • Idea: count the number of addresses that receive zero scans and compare it with the confidence interval of the independent random case
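
One way to sketch this idea is to model the independent random case as scans thrown uniformly at random onto the sensor's addresses, for which the mean and variance of the number of untouched addresses have closed forms (the classical occupancy problem). The uniform model, the normal approximation, the 95% confidence level, and the function names are all assumptions of this sketch; the slides do not say how the interval is computed.

```python
from statistics import NormalDist


def zero_count_interval(num_addresses, num_scans, confidence=0.95):
    """Interval for the count of unscanned addresses under independent
    uniform scanning, via a normal approximation."""
    n, m = num_addresses, num_scans
    mean = n * (1 - 1 / n) ** m
    # Var = E[Z^2] - E[Z]^2, with E[Z^2] = E[Z] + n(n-1)(1 - 2/n)^m
    var = n * (n - 1) * (1 - 2 / n) ** m + mean - mean ** 2
    sd = max(var, 0.0) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sd, mean + z * sd


def scans_look_independent(scan_targets, num_addresses, confidence=0.95):
    """True when the observed zero-scan count fits the independent case."""
    unscanned = num_addresses - len(set(scan_targets))
    lo, hi = zero_count_interval(num_addresses, len(scan_targets), confidence)
    return lo <= unscanned <= hi


# Hypothetical coordinated bots that sweep 2,560 addresses without gaps.
coordinated = [i % 2560 for i in range(5000)]
lo, hi = zero_count_interval(2560, 5000)
```

With 5,000 independent random scans over 2,560 addresses, roughly 363 addresses would be expected to stay untouched, so a sweep that leaves none untouched falls well outside the interval and points to coordination.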
