Data Mining: An Introduction

Data Mining: An Introduction. Wing Kee Ho, Xiaohua Luan. Outline of today's presentation: definition of data mining; comparison of data mining vs. DBMS; sample data mining tasks in daily life; data mining development.


Presentation Transcript


  1. Data Mining: An Introduction. Wing Kee Ho, Xiaohua Luan

  2. Outline of Today's Presentation • Definition of data mining • Comparison of data mining vs. DBMS • Sample data mining tasks in daily life • Data mining development

  3. Definition of Data Mining • The nontrivial extraction of implicit, previously unknown, and potentially useful information from data

  4. Why do we need to mine the data? • Too much data and too little information • There is a need to extract useful information from the data and to interpret the data

  5. Comparison of DBMS and Data Mining • DBMS: the query is written in SQL; the output is precise and is a subset of the database • Data Mining: the query is not expressed in a precise query language; the output is fuzzy and is not a subset of the database

  6. Data Mining or DBMS? • Last month's sales for each product • The profit forecast for next month • List of customers who let their policies lapse • The characteristics of customers who let their policies lapse
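The first item in the quiz above is a typical DBMS task: the answer is an exact aggregate of rows already stored. A minimal sketch using Python's built-in `sqlite3` module follows. The `sales` table, its columns, and its rows are invented for illustration and do not come from the presentation.

```python
import sqlite3

# Hypothetical in-memory table; schema and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, sale_month TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("salsa", "2024-05", 30.0),
        ("chips", "2024-05", 20.0),
        ("salsa", "2024-05", 15.0),
        ("bread", "2024-04", 12.0),
    ],
)


def sales_by_product(connection, month):
    """Return {product: total amount} for one month.

    The result is a precise aggregate over stored rows, which is the
    kind of question a DBMS answers directly.
    """
    rows = connection.execute(
        "SELECT product, SUM(amount) FROM sales "
        "WHERE sale_month = ? GROUP BY product ORDER BY product",
        (month,),
    ).fetchall()
    return dict(rows)


may_sales = sales_by_product(conn, "2024-05")
print(may_sales)
```

By contrast, the profit forecast and the characteristics of lapsed customers cannot be read off as a subset of stored rows; they have to be inferred, which is where data mining comes in.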

  7. Sample Data Mining Examples • Association Rules • Clustering • Time-Series Forecasting

  8. Association Rules: Harris Teeter as an example

  9. Association Rules, cont. • Objective: identify items that occur together • Support is the share of transactions that contain every item in a set • Support of {salsa, chips} is 80% • Support of {bread, milk} is 60% • The results are useful for shelving, merchandising, and pricing.
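The support figures above can be reproduced with a few lines of Python. This is only a sketch: the five baskets below are made up so that the salsa-and-chips and bread-and-milk itemsets come out at 80% and 60%; the actual Harris Teeter transactions are not in the transcript.

```python
def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


# Invented transactions, chosen only to match the percentages on the slide.
baskets = [
    {"salsa", "chips", "bread", "milk"},
    {"salsa", "chips", "bread", "milk"},
    {"salsa", "chips", "bread", "milk"},
    {"salsa", "chips"},
    {"eggs"},
]

salsa_chips = support({"salsa", "chips"}, baskets)
bread_milk = support({"bread", "milk"}, baskets)
print(f"{salsa_chips:.0%} {bread_milk:.0%}")
```

A real association-rule miner would also search over candidate itemsets and apply a minimum-support threshold; this sketch only scores itemsets that are handed to it.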

  10. Clustering: market segmentation as an example • Each point represents the characteristics of a customer

  11. Clustering, cont. • Objective: group members that have similar characteristics together • Widely applied in fraud detection, business and finance, and science
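One common way to group similar customers is k-means clustering. The presentation does not say which algorithm it has in mind, so the sketch below is an assumption, and the customer points (age, monthly spend) are invented. Starting centroids are passed in explicitly so the result is repeatable.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for i in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, assign(points, centroids)


# Invented customers: (age, monthly spend).
customers = [(25, 20), (27, 22), (30, 25), (55, 80), (60, 85), (58, 90)]
centroids, labels = kmeans(customers, [customers[0], customers[3]])
print(labels)
```

Each label is a market segment; in practice the features would be scaled first so that no single characteristic dominates the distance.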

  12. Statistical Analysis • Regression: housing price plotted against area (sq. feet) • Time Series: sales volume plotted against time
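Both charts on this slide can be approximated by fitting a straight line. The sketch below uses ordinary least squares; the house areas, prices, and monthly sales figures are invented, and a straight-line trend is a deliberately naive stand-in for real time-series forecasting, which would also handle seasonality and noise.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Regression: invented housing prices against area in square feet.
areas = [1000, 1500, 2000, 2500]
prices = [200000, 275000, 350000, 425000]
price_slope, price_intercept = fit_line(areas, prices)
price_at_1800 = price_intercept + price_slope * 1800

# Time series: invented monthly sales volumes, extrapolated one month ahead.
months = [1, 2, 3, 4]
sales = [100, 110, 120, 130]
sales_slope, sales_intercept = fit_line(months, sales)
forecast_month_5 = sales_intercept + sales_slope * 5

print(price_at_1800, forecast_month_5)
```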

  13. More data mining techniques • Decision Tree • Neural Network • Combination of several data mining techniques

  14. Implications for different interested parties • Database users: new skills to learn to secure your job! • Database developers: develop new functions and better interfaces • General public: less privacy, or more convenience?
