Data Mining

Data Mining. Enterprise systems infrastructure and architecture DT211 4. Note for next year: the data mining material in Laudon and Laudon and the Kopec paper gives a few more good ideas.

anthea

Presentation Transcript


  1. Data Mining Enterprise systems infrastructure and architecture DT211 4

  2. Note for next year: the data mining material in Laudon and Laudon and the Kopec paper gives a few more good ideas.

  3. Data Mining • The process of extracting valid, previously unknown, comprehensible, and actionable information from large databases and using it to make crucial business decisions. • Involves the analysis of data and the use of software techniques for finding hidden and unexpected patterns and relationships in sets of data.

  4. Data Mining • Data mining tools use techniques such as AI to help: • Predict future trends • Segment datasets • Discover “product” associations • This allows businesses to make proactive, knowledge-driven decisions.

  5. Data mining: A.I. techniques • The most commonly used A.I. techniques in data mining are: • Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. • Nearest neighbour method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset. Sometimes called the k-nearest neighbour technique. It is a classification technique and should not be confused with clustering. • Rule induction: The extraction of useful if-then rules from data based on statistical significance. • Artificial neural networks: Predictive models that learn through training and resemble biological neural networks in structure.
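
The nearest neighbour method described in slide 5 can be illustrated with a minimal sketch. The dataset, the feature choice (age and income in thousands), the Euclidean distance measure and the function name `knn_classify` are all assumptions made for illustration; the slides do not specify any of them.

```python
from collections import Counter
import math

def knn_classify(history, record, k=3):
    """Classify `record` by majority vote among its k most similar historical records."""
    # history is a list of (features, label) pairs; features are numeric tuples.
    # Assumption: similarity is measured with plain Euclidean distance.
    by_distance = sorted(history, key=lambda item: math.dist(item[0], record))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical historical dataset: (age, income in thousands) -> bought a policy?
history = [
    ((25, 30), "no"),
    ((30, 35), "no"),
    ((45, 55), "yes"),
    ((50, 60), "yes"),
    ((48, 58), "yes"),
]

# Classify a new prospect from the classes of the 3 most similar past customers.
prediction = knn_classify(history, (47, 56), k=3)
```

In practice the features would usually be rescaled first so that no single attribute dominates the distance; that step is left out here to keep the sketch short.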

  6. How Data Mining Works • For example, say that you are the director of marketing for an insurance company and you would like to acquire some new customers. • You could just go out and mail coupons at random to the general population. However, you would not achieve the required result. • Alternatively, as the marketing director you have access to a lot of information about all of your customers: their age, sex, income range and credit card insurance.

  7. How Data Mining Works • The goal in prospecting is to make some decisions about the information in the lower right hand quadrant based on the model that we build going from Customer General Information to Customer Proprietary Information.

  8. An Algorithm for Building Decision Trees • Consider the following decision tree algorithm: 1. Let T be the set of training instances. 2. Choose an attribute that best differentiates the instances in T. 3. Create a tree node whose value is the chosen attribute. - Create child links from this node, where each link represents a unique value for the chosen attribute. - Use the child link values to further subdivide the instances into subclasses. 4. For each subclass created in step 3: - If the instances in the subclass satisfy predefined criteria, or if the set of remaining attribute choices for this path is null, specify the classification for new instances following this decision path. - If the subclass does not satisfy the criteria and there is at least one attribute to further subdivide the path of the tree, let T be the current set of subclass instances and return to step 2.
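
The four steps in slide 8 can be sketched as a short recursive function. Several things here are assumptions rather than part of the slide: the attribute that "best differentiates" is taken to be the one leaving the fewest misclassified instances after the split, the "predefined criteria" are taken to mean that all instances in a subclass share one class, and the training data and names (`build_tree`, `classify`, `life_insurance`) are hypothetical.

```python
from collections import Counter

def majority(instances, target):
    """Most common value of the target attribute among the instances."""
    return Counter(row[target] for row in instances).most_common(1)[0][0]

def split(instances, attribute):
    """Group instances into subclasses by their value for the attribute."""
    groups = {}
    for row in instances:
        groups.setdefault(row[attribute], []).append(row)
    return groups

def errors_after_split(instances, attribute, target):
    """Instances misclassified if each subclass predicts its majority class."""
    total = 0
    for subset in split(instances, attribute).values():
        best = majority(subset, target)
        total += sum(1 for row in subset if row[target] != best)
    return total

def build_tree(instances, attributes, target):
    classes = {row[target] for row in instances}
    # Step 4, first case: criteria met or no attributes left, so make a leaf.
    if len(classes) == 1 or not attributes:
        return majority(instances, target)
    # Step 2: choose the attribute that best differentiates the instances.
    chosen = min(attributes, key=lambda a: errors_after_split(instances, a, target))
    remaining = [a for a in attributes if a != chosen]
    # Step 3: one node for the chosen attribute, one child link per value.
    # Step 4, second case: recurse with T set to each subclass.
    return {chosen: {value: build_tree(subset, remaining, target)
                     for value, subset in split(instances, chosen).items()}}

def classify(tree, record):
    """Follow child links until a leaf (a class label) is reached."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))
        tree = tree[attribute][record[attribute]]
    return tree

# Hypothetical training instances T.
training = [
    {"income": "50-60K", "age": "under 40", "life_insurance": "yes"},
    {"income": "50-60K", "age": "40+", "life_insurance": "yes"},
    {"income": "under 50K", "age": "under 40", "life_insurance": "no"},
    {"income": "under 50K", "age": "40+", "life_insurance": "yes"},
    {"income": "over 60K", "age": "under 40", "life_insurance": "no"},
    {"income": "over 60K", "age": "40+", "life_insurance": "no"},
]

tree = build_tree(training, ["income", "age"], "life_insurance")
```

With this data the root node tests income, and only the "under 50K" branch needs a further split on age. Published algorithms such as ID3 typically use information gain to choose the attribute; the error count used here is a simpler stand-in. Note that `classify` raises `KeyError` for an attribute value that never appeared in training.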

  9. How Data Mining Works • For instance, a simple model for an insurance company might be: • Customers who earn between 50K and 60K have a life insurance policy. • This model could then be applied to the general population to target those people for the life insurance promotion. • The tree can be more complex, e.g. see figure opposite.
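
As a rough illustration of applying the simple model in slide 9 to the general population, the sketch below filters a list of prospects by the income rule. The prospect records, the field names and the choice of inclusive bounds are assumptions for illustration only.

```python
def likely_life_insurance_buyer(person):
    """Simple model from the slide: income between 50K and 60K.

    Assumption: both ends of the range are treated as inclusive.
    """
    return 50_000 <= person["income"] <= 60_000

# Hypothetical records drawn from the general population.
general_population = [
    {"name": "A", "income": 42_000},
    {"name": "B", "income": 55_000},
    {"name": "C", "income": 60_000},
    {"name": "D", "income": 75_000},
]

# Target only the prospects the model flags for the life insurance promotion.
targets = [p["name"] for p in general_population if likely_life_insurance_buyer(p)]
```

Here only prospects B and C would receive the mailing, rather than the whole population.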

  10. Data Mining Operations • Data mining operations include: • Predictive modelling: decision trees, regression analysis… • Database segmentation: clustering techniques • Link analysis: decision trees, association rules

  11. Predictive Modelling: Simple decision tree example • Applications of predictive modelling include direct marketing, using techniques such as decision trees. • Predictive modelling uses observations to form a model of the important characteristics of some phenomenon, e.g. the traits associated with those who will buy property.

  12. Database Segmentation • The aim is to partition a database into an unknown number of segments, or clusters, of similar records. • Uses clustering techniques to group data. • Applications of database segmentation include credit card fraud…
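
One common clustering technique is k-means, sketched below. It is offered only as an example of grouping similar records: unlike the segmentation described in slide 12, k-means needs the number of clusters k to be chosen in advance. The data points, the seeding with the first k records and the function name `kmeans` are assumptions for illustration.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means; returns (centroids, assignments)."""
    # Assumption: seed the centroids with the first k records.
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each record to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the records assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignments

# Hypothetical two-attribute records forming two visible groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]

centroids, assignments = kmeans(points, k=2)
```

Each entry in `assignments` gives the segment number of the corresponding record, so records with the same number belong to the same cluster.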

  13. Database Segmentation using a Scatterplot

  14. Link Analysis • Aims to establish links between records, or sets of records, in a database; one such example would be association discovery…. • Applications include product affinity analysis. • Finds items that imply the presence of other items in the same event.

  15. Link Analysis - Association Discovery • Affinities between items are represented by association rules. • e.g. ‘When a customer rents property for more than 2 years and is more than 25 years old, in 40% of cases the customer will buy a property. This association happens in 35% of all customers who rent properties.’
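
The two percentages in slide 15 correspond to what are usually called the confidence (40%) and support (35%) of an association rule. The sketch below computes both on a hypothetical set of renter records constructed to reproduce those figures. Reading the 35% as the share of all renting customers for whom both the conditions and the purchase hold is an interpretation of the slide's wording, and the field names are assumptions.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    both = sum(1 for r in records if antecedent(r) and consequent(r))
    matching = sum(1 for r in records if antecedent(r))
    support = both / len(records)                        # rule holds / all records
    confidence = both / matching if matching else 0.0    # rule holds / antecedent holds
    return support, confidence

# Hypothetical renter records, built so the rule matches the slide's figures.
renters = (
    [{"years_renting": 3, "age": 30, "bought": True}] * 14
    + [{"years_renting": 4, "age": 41, "bought": False}] * 21
    + [{"years_renting": 1, "age": 22, "bought": False}] * 5
)

support, confidence = support_and_confidence(
    renters,
    antecedent=lambda r: r["years_renting"] > 2 and r["age"] > 25,
    consequent=lambda r: r["bought"],
)
```

With these 40 records, 35 satisfy the conditions and 14 of those went on to buy, giving a confidence of 14/35 and a support of 14/40.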

  16. Examples of Applications of Data Mining • Retail / Marketing: • Predicting response to mailing campaigns • Market basket analysis • Banking: • Detecting patterns of fraudulent credit card use • Insurance: • Claims analysis • Medicine: • Identifying successful medical therapies for different illnesses

  17. Data mining in conclusion • Two critical factors for success with data mining are: • a large, well-integrated data warehouse and • a well-defined understanding of the business process within which data mining is to be applied (e.g. customer prospecting, retention, campaign management etc.).

  18. Sample question types • Discuss, using suitable examples, how data mining can contribute to companies making proactive, knowledge-driven decisions that could help with the formulation of a company's strategy.
