Data Mining and Intrusion Detection Alan Hunt Will Fletcher Auburn University
Outline • Intrusion Detection Systems • Data Mining • Data Mining and Intrusion Detection • Data Mining Traffic Analysis to Determine and Predict User Behavior • A Priest, a Rabbi, an Intrusion Detection System, a Data Miner and a Graduate Student Walk into a Bar • The Bartender Says: I’m sorry, we don’t serve miners • Resources • Questions?
Intrusions • Intrusions are actions aimed to compromise the confidentiality, integrity, and/or availability of a computer or computer network. • Solution: Intrusion Detection Systems
Intrusion Detection Systems • Monitor network or host activity, looking for suspicious behavior. • Various approaches • Network-based intrusion detection (NIDS) – monitors network traffic • Host-based intrusion detection (HIDS) – monitors a single host • Signature-based (similar to antivirus software), also known as “misuse detection” • Anomaly detection
Intrusion Detection • Limitations of Signature based IDS • Signature database has to be manually revised for each new type of discovered intrusion • They cannot detect emerging threats • Substantial latency in deployment of newly created signatures • Limitations of Anomaly Detection • False Positives – alert when no attack exists. Typically, anomaly detection is prone to a high number of false alarms due to previously unseen legitimate behavior. • Data Overload • The amount of data for analysts to examine is growing too large. This is the problem that data mining looks to solve. • Lack of Adaptability
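The two limitations above can be illustrated with a minimal sketch. The signature strings, the traffic numbers, and the 3-standard-deviation threshold are all assumptions chosen for illustration, not taken from any real IDS.

```python
from statistics import mean, stdev

# Hypothetical signature database: substrings of known-bad requests.
SIGNATURES = {"/etc/passwd", "cmd.exe"}

def signature_alert(payload):
    """Misuse detection: alert only if a known signature appears."""
    return any(sig in payload for sig in SIGNATURES)

def anomaly_alert(history, value, k=3.0):
    """Anomaly detection: alert if value lies more than k standard
    deviations from the mean of previously observed values."""
    mu, sigma = mean(history), stdev(history)
    return abs(value - mu) > k * sigma

# A known attack is caught; a new one with no signature yet is missed.
known_attack = signature_alert("GET /../../etc/passwd")
novel_attack = signature_alert("GET /new-exploit?x=1")

# Requests per minute observed during training (illustrative numbers).
baseline = [48, 52, 50, 47, 53, 51, 49, 50]
# A legitimate but previously unseen burst, e.g. a nightly backup job,
# still triggers an alert: a false positive.
false_positive = anomaly_alert(baseline, 120)
```

The signature check misses anything outside its database, while the anomaly check flags any unfamiliar behavior, legitimate or not.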
Data Mining • Data Mining - Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases [Han and Kamber 2005]. • Data mining is used to sort through the tremendous amounts of data stored by automated data collection tools. • Extracts rules, regularities, patterns, and constraints from databases.
Data Mining Techniques • Association rule mining • Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories. • Sequence or path analysis • looking for patterns where one event leads to another later event • Classification • predicts categorical class labels • classifies data (constructs a model) based on the training set and the values (class labels) in a classifying attribute and uses it in classifying new data
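As a rough illustration of association rule mining, the sketch below computes support and confidence over a handful of made-up connection records. It enumerates itemsets of size 1 and 2 by brute force rather than using a pruning algorithm such as Apriori; the record contents and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each is a set of connection attributes.
records = [
    {"service=http", "flag=SF"},
    {"service=http", "flag=SF"},
    {"service=http", "flag=REJ"},
    {"service=ftp", "flag=SF"},
]

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets of size 1 and 2
    whose support (fraction of transactions) meets min_support."""
    counts = Counter()
    for t in transactions:
        for size in (1, 2):
            for items in combinations(sorted(t), size):
                counts[items] += 1
    n = len(transactions)
    return {items: c / n for items, c in counts.items() if c / n >= min_support}

def confidence(support, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = tuple(sorted(antecedent + consequent))
    return support[both] / support[antecedent]

support = frequent_itemsets(records, min_support=0.5)
# Rule: service=http -> flag=SF
conf = confidence(support, ("service=http",), ("flag=SF",))
```

Here the rule "service=http implies flag=SF" holds in two of the three http records, so its confidence is 2/3.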
Data Mining Techniques • Cluster analysis • Grouping a set of data objects into clusters. Objects in same cluster are similar. • Forecasting • Discovering patterns in data that can lead to reasonable predictions about the future
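Cluster analysis can be sketched with a small k-means loop. This is one common clustering method, used here only as an example; the points (imagined as connection duration and kilobytes transferred) and the starting centroids are assumptions.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Index of the nearest centroid by squared Euclidean distance.
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Recompute centroids; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical (duration, kilobytes) pairs for six connections.
points = [(1, 2), (2, 1), (1, 1), (50, 60), (52, 58), (51, 61)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (100, 100)])
```

The short, small connections end up in one cluster and the long, large ones in the other, which is the sense in which objects in the same cluster are similar.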
Data Mining and Intrusion Detection • Data mining can help automate the process of investigating intrusion detection alarms. • Data mining on historical audit data and intrusion detection alarms can reduce future false alarms.
Data Mining and Intrusion Detection • [Julisch and Dacier 2002] apply data mining to historical intrusion detection alarms to gain “new and actionable insights”. • Insights can be used to reduce the number of future alarms to be dealt with. • Use a clustering technique on previously mined knowledge to handle intrusion detection alarms efficiently
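The idea of grouping historical alarms can be sketched as follows. This is a simplified illustration, not the algorithm from the cited paper: it generalizes each source address to a /24 subnet and counts alarms per (signature, subnet) group. The alarm fields and values are hypothetical.

```python
from collections import Counter

# Hypothetical historical alarms; field names are assumptions.
alarms = [
    {"signature": "PORT_SCAN", "src": "192.168.1.10"},
    {"signature": "PORT_SCAN", "src": "192.168.1.11"},
    {"signature": "PORT_SCAN", "src": "192.168.1.12"},
    {"signature": "SQL_INJECTION", "src": "10.1.2.3"},
]

def generalize(alarm):
    """Replace the source address with its /24 subnet so that
    alarms with a likely common cause fall into the same group."""
    subnet = alarm["src"].rsplit(".", 1)[0] + ".0/24"
    return (alarm["signature"], subnet)

groups = Counter(generalize(a) for a in alarms)
largest_group, size = groups.most_common(1)[0]
```

A large group, such as repeated scans from one subnet, points an analyst at a single cause that can be fixed or filtered, which removes many future alarms at once.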
Data Mining and Intrusion Detection • Method proposed by Lee, Stolfo, and Mok • Process raw audit data into ASCII network events • Summarize into connection records (attributes such as service, duration, flags, etc.) • Apply data mining algorithms to connection records to compute frequent sequential patterns • Classification algorithms then used to inductively learn the detection models
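The first three steps above might look roughly like this. It is a toy sketch under stated assumptions, not the authors' implementation: the event tuple layout, the field names, and the choice of "consecutive service pairs per source host" as the sequential pattern are all hypothetical, and the final classification step is omitted.

```python
from collections import Counter, defaultdict

# Hypothetical parsed events: (time, connection_id, src_host, service, flag)
events = [
    (0.0, "c1", "10.0.0.5", "http", "S0"),
    (0.4, "c1", "10.0.0.5", "http", "SF"),
    (1.0, "c2", "10.0.0.5", "ftp", "SF"),
    (2.0, "c3", "10.0.0.9", "http", "SF"),
    (2.5, "c4", "10.0.0.9", "ftp", "SF"),
    (3.0, "c5", "10.0.0.7", "smtp", "SF"),
]

def connection_records(events):
    """Summarize events into one record per connection."""
    grouped = defaultdict(list)
    for e in events:
        grouped[e[1]].append(e)
    records = []
    for evs in grouped.values():
        evs.sort()  # order by timestamp
        first, last = evs[0], evs[-1]
        records.append({
            "start": first[0],
            "src": first[2],
            "service": first[3],
            "duration": last[0] - first[0],
            "flag": last[4],  # final connection state
        })
    return sorted(records, key=lambda r: r["start"])

def frequent_service_pairs(records, min_count):
    """Count consecutive (service, next_service) pairs per source host
    and keep those seen at least min_count times."""
    by_src = defaultdict(list)
    for r in records:
        by_src[r["src"]].append(r["service"])
    pairs = Counter()
    for services in by_src.values():
        pairs.update(zip(services, services[1:]))
    return {p: c for p, c in pairs.items() if c >= min_count}

records = connection_records(events)
patterns = frequent_service_pairs(records, min_count=2)
```

Patterns such as "http followed by ftp" could then serve as features for a classifier trained on labeled normal and intrusive connections.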
Data Mining and Behavior • Detecting Behavior • Data mining has been used to predict behavior • Modify these techniques to identify anonymous users on a network • Predict future needs based on past patterns
Data Mining and Behavior • For Example • User A typically creates a lot of ssh traffic to a particular server • User B checks her email and receives large files via FTP after lunch • User C refreshes the Slashdot homepage 10 times per minute for 8 hours
Data Mining and Behavior • Research Questions • Can this behavior be correctly predicted? • Can users be differentiated based solely on network traffic?
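One way to explore the second question is to compare an anonymous session against per-user traffic profiles. The sketch below uses cosine similarity over protocol shares; the profiles, the numbers, and the choice of similarity measure are assumptions for illustration, not results.

```python
from math import sqrt

# Hypothetical profiles: share of each user's traffic by protocol.
profiles = {
    "user_a": {"ssh": 0.8, "http": 0.2},
    "user_b": {"smtp": 0.3, "ftp": 0.6, "http": 0.1},
    "user_c": {"http": 1.0},
}

def cosine(u, v):
    """Cosine similarity between two sparse protocol-share vectors."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def identify(session, profiles):
    """Return the name of the profile most similar to the session."""
    return max(profiles, key=lambda name: cosine(session, profiles[name]))

# An anonymous session dominated by ssh traffic.
session = {"ssh": 0.7, "http": 0.3}
guess = identify(session, profiles)
```

In this toy data the session is attributed to user_a. Whether real traffic separates users this cleanly is exactly the open research question.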
References • R. Sekar, A. Gupta, J. Frullo, T. Shanbhag, A. Tiwari, H. Yang, S. Zhou. Specification-based anomaly detection: a new approach for detecting network intrusions. Proceedings of the 9th ACM Conference on Computer and Communications Security, November 2002. • Klaus Julisch, Marc Dacier. Mining intrusion detection alarms for actionable knowledge. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2002. • C. Warrender, S. Forrest, B. Pearlmutter. Detecting intrusions using system calls: alternative data models. Proceedings of the 1999 IEEE Symposium on Security and Privacy, 9-12 May 1999, pp. 133-145. • Wenke Lee, Salvatore J. Stolfo, Kui W. Mok. Mining in a data-flow environment: experience in network intrusion detection. Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, August 1999. • Karlton Sequeira, Mohammed Zaki. ADMIT: anomaly-based data mining for intrusions. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 386-395. • www.cs.sfu.ca/~han/bk/1intro.ppt • http://netsecurity.about.com/cs/hackertools/a/aa030504.htm • http://www.sans.org/resources/idfaq/host_based.php • http://www.symantec.com/symadvantage/016/pad.html