Netflow Data-Mining Techniques

Presentation Transcript


  1. Netflow Data-Mining Techniques • Chris Poetzel, Argonne National Laboratory, cpoetzel@anl.gov • Scott Pinkerton

  2. Netflow Data Mining • Argonne Background Information • Sliding Window Analysis • Using Contextual Knowledge to adjust data-mining • Incident Investigation • Integration, Integration, Integration • Future • Conclusions ESCC Meeting

  3. ANL Background • Utilize OSU’s Flow-Tools written by Mark Fullmer • Collecting from 14 different routers/switches at ANL-East • ~600 GB currently stored and growing • 1-year retention period desired – backing off as we add devices • Current collection/analysis station: IBM 360, RedHat Linux, 8 GB RAM, four 1.6 Mhz CPUs

  4. Sliding Window Analysis • The raw volume of Netflow data can make data-mining long and cumbersome • Implemented a 5-minute sliding window for analysis • Every minute, check the previous 5 minutes of data (via cron jobs) • Reduces processing time (~20 secs) • Catches the vast majority of scans/probes in near real-time
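
The window selection on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (dictionaries with a `start` timestamp), the function name `flows_in_window`, and the sample addresses are all hypothetical, and real records would come from the flow collector rather than a hand-written list.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)  # window length taken from the slide


def flows_in_window(flows, now, window=WINDOW):
    """Return the flows whose start time falls in [now - window, now)."""
    start = now - window
    return [f for f in flows if start <= f["start"] < now]


# Hypothetical sample records (documentation address ranges, arbitrary date).
now = datetime(2005, 1, 1, 12, 0)
flows = [
    {"start": now - timedelta(minutes=7), "src": "192.0.2.1", "dst": "198.51.100.5", "dport": 22},
    {"start": now - timedelta(minutes=3), "src": "192.0.2.1", "dst": "198.51.100.6", "dport": 22},
    {"start": now - timedelta(seconds=30), "src": "192.0.2.9", "dst": "198.51.100.7", "dport": 80},
]

# A job scheduled every minute would call this with the current time.
recent = flows_in_window(flows, now)
```

Because the job runs every minute over a five-minute window, consecutive runs overlap by four minutes, so a burst that straddles a minute boundary is still seen whole by at least one run.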

  5. Contextual Knowledge • Which way is the data flowing? • Contextual knowledge will affect what we search for & what we do with the results • [Diagram labels: Source, Destination]

  6. OUT -> IN • Receive many class B/C scans a day • Only watch for scans on open FW ports • Dynamically read FW config every ½ hour to determine open ports in FW • Use Netflow data to look for scans on open FW ports • Fast Scans: Script executed every minute looking at past 5 minutes of data to catch fast scanners • Slow Scans: Script run every hour looking at previous 24 hours of data to catch slow scanners • Once a scanner is detected, send its IP for FW shun
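
One way to picture the OUT -> IN check is to count, per external source and open port, how many distinct internal hosts were touched. The sketch below is not the authors' script; the function name `find_scanners`, the tuple layout `(src, dst, dport)`, and the threshold `min_hosts` are assumptions made for illustration, and the set of open ports is assumed to have been read from the firewall configuration elsewhere.

```python
from collections import defaultdict


def find_scanners(flows, open_ports, min_hosts=20):
    """Return source IPs that touched at least min_hosts distinct
    destination hosts on any single open firewall port.

    flows: iterable of (src, dst, dport) tuples (hypothetical layout).
    open_ports: set of ports the firewall currently allows inbound.
    min_hosts: illustrative threshold, not a value from the slides.
    """
    targets = defaultdict(set)  # (src, dport) -> set of destination IPs
    for src, dst, dport in flows:
        if dport in open_ports:  # scans on closed ports are ignored
            targets[(src, dport)].add(dst)
    return sorted({src for (src, _port), dsts in targets.items()
                   if len(dsts) >= min_hosts})
```

The same function could serve both jobs described on the slide: called every minute on five minutes of flows for fast scanners, and every hour on 24 hours of flows for slow scanners, possibly with a different threshold for each.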

  7. IN -> OUT • Looking for problem machines at the Lab – 1st approximation is to look at machines which have contacted a large # of Internet hosts in a short period of time • Can indicate a compromised/infected machine • Exclude a number of internal machines based on a priori knowledge • Email servers, domain controllers, network scanning machines (ignore)
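
The "large number of Internet hosts" approximation amounts to a fan-out count per internal source, with known-busy machines excluded. A minimal sketch follows; the function name `high_fanout_hosts`, the `(src, dst)` tuple layout, the single internal network, and the `max_peers` threshold are all assumptions for illustration, not details from the slides.

```python
import ipaddress
from collections import defaultdict


def high_fanout_hosts(flows, internal_net, excluded, max_peers=100):
    """Map each internal source to its count of distinct external peers,
    keeping only sources that exceed max_peers.

    flows: iterable of (src, dst) address strings (hypothetical layout).
    internal_net: CIDR string for the site's address space.
    excluded: internal IPs known in advance to be busy (mail servers, etc.).
    """
    net = ipaddress.ip_network(internal_net)
    peers = defaultdict(set)  # internal src -> set of external dst
    for src, dst in flows:
        if src in excluded:
            continue
        if ipaddress.ip_address(src) in net and ipaddress.ip_address(dst) not in net:
            peers[src].add(dst)
    return {src: len(dsts) for src, dsts in peers.items() if len(dsts) > max_peers}
```

The window of flows passed in determines what "a short period of time" means, so the threshold only makes sense relative to the window length chosen.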

  8. IN -> IN • Requires collection on multiple internal switches/routers • Detect internal scanning • Cron job runs every hour • Infected host scanning local subnet/supernet • Detect unauthorized internal network scans • Post-mortem forensic value • What did an internally compromised machine do once it was compromised? • Track down cross-contamination

  9. OUT -> OUT • May not apply to every site • Co-location personal or transport traffic constitute OUT -> OUT traffic on a network • Scans in the OUT <-> OUT direction are detected and the appropriate network admin/security personnel are notified

  10. Incident Investigation 1/2 • What to do when an incident happens? (Besides pull your hair out) • Netflow Data is invaluable in cyber security investigations. • Start by classifying IP addresses into a taxonomy • Possible Bad Guy • Possible Victims • Possible Intermediary (stepping stone, rootkit resource site, etc) • This process can be aided by host syslog, etc.
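
A first-pass version of this taxonomy can be derived mechanically from flow pairs once one suspect address is known. The sketch below is a deliberately crude heuristic and not the authors' procedure: the function name `classify_hosts`, the `(src, dst)` layout, and the rule that any external peer of a possible victim is a possible intermediary are all assumptions. In practice that last rule would also flag ordinary sites the victim visited, so its output is a list to review, not a verdict.

```python
def classify_hosts(flows, bad_guys, internal_hosts):
    """Label IPs as possible bad guy, victim, or intermediary.

    flows: iterable of (src, dst) address strings (hypothetical layout).
    bad_guys: set of external IPs already suspected.
    internal_hosts: set of the site's own IPs.
    """
    flows = list(flows)
    labels = {ip: "possible bad guy" for ip in bad_guys}

    # Internal hosts that exchanged traffic with a suspect, in either direction.
    victims = set()
    for src, dst in flows:
        if src in bad_guys and dst in internal_hosts:
            victims.add(dst)
        elif dst in bad_guys and src in internal_hosts:
            victims.add(src)
    for ip in victims:
        labels[ip] = "possible victim"

    # External hosts, not already labeled, that talked to a possible victim.
    for src, dst in flows:
        for host, peer in ((src, dst), (dst, src)):
            if host in victims and peer not in labels and peer not in internal_hosts:
                labels[peer] = "possible intermediary"
    return labels
```

Restricting `flows` to a time range around the incident, and cross-checking against host syslog as the slide suggests, would cut down the number of false intermediaries.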

  11. Incident Investigation 2/2 • By identifying the possible victims, the process of containment and clean-up becomes much easier • Netflow has become an invaluable tool for our cyber security team

  12. Integration³ • To improve Signal-to-Noise ratio of cyber security events, correlating netflow data with other data sources has been very helpful • IDS logs • ARP/CAM Tables – MAC “persistence” • Firewall Logs • DHCP/VPN Logs • Host based Syslog

  13. IDS & Netflow Logs • Used to cross-validate an IDS alarm and a Netflow alarm against each other • IDS alarms usually give specific points of attack • Netflow can be used to provide background or framework of attack • Netflow + IDS can provide a better perspective of cyber security events • Store IDS and Netflow logs in the same directory structure to make searching easier
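
Cross-validating an IDS alarm against flow data usually means pulling every flow that involves the alarmed address within some time margin of the alarm. The following is a sketch under assumptions: the dictionary layouts for alerts and flows, the function name `flows_around_alert`, and the ten-minute default margin are hypothetical and would need to match the actual log formats in use.

```python
from datetime import datetime, timedelta


def flows_around_alert(alert, flows, margin=timedelta(minutes=10)):
    """Return flows involving the alert's IP within +/- margin of the alert.

    alert: dict with "time" (datetime) and "ip" (str), hypothetical layout.
    flows: iterable of dicts with "start", "src", and "dst" keys.
    """
    earliest = alert["time"] - margin
    latest = alert["time"] + margin
    return [
        f for f in flows
        if earliest <= f["start"] <= latest and alert["ip"] in (f["src"], f["dst"])
    ]
```

The IDS alarm supplies the specific point of attack; the returned flows supply the surrounding context, such as which other hosts the same address contacted before and after.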

  14. VPN/Dialup Scan/Virus Detection • Marriage of many data sources • Each Dialup/VPN login initiates a virus scan of the connected host • Dialup/VPN connected host is monitored via netflow for outbound scanning activity • If a remotely connected host is determined to be infected with a virus or behaving maliciously, the connection is terminated and the user account is locked • All actions are performed via automated scripts, with no human intervention

  15. Future • Host Profiling Via Netflow • Determine what “normal” behavior for a host is and then alert when it varies from the norm • Some IDS products are attempting this approach (Network Flight Recorder, Lancope) • Visualization of Netflow Data • Charts, Graphs, Animations of Network Conversations • Work Being done by NCSA • Better Integration with other data sources
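
The host-profiling idea can be illustrated with the simplest possible baseline: compare a host's current value of some per-interval metric (for example, distinct peers per hour) with the mean and standard deviation of its own history. This is a toy sketch, not a description of what the named products do; the function name `is_anomalous`, the multiplier `k`, and the floor of 1.0 on the deviation are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, current, k=3.0):
    """Return True when current deviates from the host's historical mean
    by more than k standard deviations.

    history: past per-interval values for one host (hypothetical metric).
    A floor of 1.0 on the deviation avoids alerting on tiny changes when
    the history is perfectly flat.
    """
    if len(history) < 2:
        return False  # not enough data to define "normal"
    mu = mean(history)
    sigma = max(stdev(history), 1.0)
    return abs(current - mu) > k * sigma
```

A real profile would likely track several metrics per host and account for time-of-day patterns, which a single mean and deviation cannot capture.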

  16. Conclusions • Collecting Netflow data to support cyber security activities is tremendously helpful. • It is an invaluable data source for performing post-mortem forensic analysis, as well as an extremely helpful tool for performing real-time detection, notification, and active response, such as blocking an IP address.

  17. Thanks • Chris Poetzel • cpoetzel@anl.gov • 630-252-7431 • Scott Pinkerton • pinkerton@anl.gov • 630-252-9770
