MKT 700 Business Intelligence and Decision Models

MKT 700 Business Intelligence and Decision Models • Week 8: Algorithms and Customer Profiling (1)

Classification and Prediction

SPSS Direct Marketing

SPSS Analysis

Major Algorithms

Euclidean Distance

Euclidean Distance for Continuous Variables • Pythagorean distance (two dimensions): $d = \sqrt{a^2 + b^2}$ • Euclidean space (three dimensions): $d = \sqrt{a^2 + b^2 + c^2}$ • Euclidean distance (general case): $d = \left[\sum_i (d_i)^2\right]^{1/2}$
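A minimal Python sketch of the general formula, in which each d_i is the difference between two observations on one variable. The function name and the sample points are illustrative assumptions, not part of the course material.

```python
import math

def euclidean_distance(p, q):
    """Square root of the summed squared differences between two points."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical points: a 3-4-5 right triangle in two and three dimensions
print(euclidean_distance((0, 0), (3, 4)))        # 5.0
print(euclidean_distance((1, 2, 3), (4, 6, 3)))  # 5.0
```

In practice, variables measured on different scales are usually standardized first so that no single variable dominates the distance.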

Pearson’s Chi-Square

Contingency Table

Observed and theoretical Frequencies

Chi-Square:

Statistical Inference • DF: (4 columns – 1) × (2 rows – 1) = 3 • Critical values for DF = 3: 6.251 at p = .10, 7.815 at p = .05 • Value shown on the slide: 3.032
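To make the test concrete, here is a sketch that computes Pearson's chi-square and its degrees of freedom from a table of observed frequencies. The 2 × 4 table is hypothetical (the slide's own table did not survive in this transcript), and the function name is an assumption.

```python
def pearson_chi_square(observed):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Theoretical (expected) frequency under independence
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical 2 rows x 4 columns table (e.g. response by region)
table = [[30, 25, 20, 25],
         [20, 25, 30, 25]]
chi2, df = pearson_chi_square(table)
print(chi2, df)        # 4.0 3
print(chi2 > 7.815)    # False: not significant at p = .05 with DF = 3
```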

Log Likelihood Chi-Square

Log Likelihood • Cluster distance on probability distributions • Applicable to both categorical and continuous variables

Contingency Table (Observed Frequencies)

Contingency Table (Expected Frequencies)

Chi-Square: p < 0.05; DF = 1; Critical value = 3.84
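A sketch of the log likelihood (likelihood-ratio) chi-square, assuming the slides refer to the usual statistic G² = 2 Σ O ln(O/E) computed from observed (O) and expected (E) frequencies. The 2 × 2 counts are hypothetical and the function name is an assumption.

```python
import math

def log_likelihood_chi_square(observed):
    """Return (G-squared, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if obs > 0:  # a zero cell contributes nothing to the sum
                g2 += 2 * obs * math.log(obs / expected)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return g2, df

# Hypothetical 2 x 2 table: responders / non-responders in two groups
g2, df = log_likelihood_chi_square([[30, 20], [20, 30]])
print(df)          # 1
print(g2 > 3.84)   # True: exceeds the critical value for DF = 1 at p < 0.05
```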

Log Likelihood Distance & Probability

ANOVA, F Statistics

F-Statistics • For metric or continuous variables • Compare explained (in the model) and unexplained variances (errors)

ANOVA • Group Comparisons: Are errors (discrepancies between observations and the overall mean) explained by group membership or by some other (random) effect?

Variance • SS is the Sum of Squares • DF = N – 1 • VAR = SS/DF • SD = √VAR
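The four quantities can be traced step by step in a short sketch. The data values and the function name are hypothetical.

```python
import math

def variance_summary(values):
    """Return (SS, DF, VAR, SD) for a sample of numbers."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((x - mean) ** 2 for x in values)  # Sum of Squares around the mean
    df = n - 1                                 # degrees of freedom
    var = ss / df                              # sample variance
    sd = math.sqrt(var)                        # standard deviation
    return ss, df, var, sd

# Hypothetical purchase counts for eight customers (mean = 5)
ss, df, var, sd = variance_summary([2, 4, 4, 4, 5, 5, 7, 9])
print(ss, df)  # 32.0 7
```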

Oneway ANOVA

F = MSS(Between) / MSS(Within)
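The ratio MSS(Between)/MSS(Within) can be computed by hand as in the following sketch. The three groups are hypothetical and the function name is an assumption. SPSS and Excel report the same quantities in their ONEWAY output.

```python
def oneway_anova_f(groups):
    """Return the F statistic: mean square between groups / mean square within."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Explained variation: group means around the overall mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Unexplained variation: observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical spending scores for three customer groups
f_stat = oneway_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 3))  # 13.0
```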

ONEWAY (Excel or SPSS)

Profiling

Customer Profiling • Who is likely to buy or not respond? • Who is likely to buy what product or service? • Who is in danger of lapsing?

Profiling/Decision Tree • SPSS Direct Marketing → Customer Profiling • SPSS Analysis → Classification → Decision Tree • CHAID (Chi-square Automatic Interaction Detection) • CART (Classification and Regression Trees)

Use of Decision Trees • Classify observations from a target binary or nominal variable → Segmentation • Predictive response analysis from a target numerical variable → Behaviour • Decision support rules → Processing

Decision Tree

Example: dmdata.sav • Underlying Theory → χ²

CHAID Algorithm: Selecting Variables • Example • Regions (4), Gender (3, including Missing), Age (6, including Missing) • For each variable, collapse categories to maximize chi-square test of independence: Ex: Region (N, S, E, W, *) → (WSE, N*) • Select most significant variable • Go to next branch … and next level • Stop growing if … estimated χ² < theoretical χ²
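The category-collapsing step can be illustrated with a simplified sketch: for a binary target, find the pair of categories whose response patterns differ least (smallest pairwise chi-square), since these are the natural candidates to merge. The counts and function names are hypothetical. A full CHAID implementation compares adjusted p-values and repeats the merge until the remaining categories differ significantly, which this sketch does not do.

```python
from itertools import combinations

def pearson_chi_square(observed):
    """Pearson chi-square for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum((obs - row_totals[i] * col_totals[j] / n) ** 2
               / (row_totals[i] * col_totals[j] / n)
               for i, row in enumerate(observed)
               for j, obs in enumerate(row))

def most_similar_pair(counts):
    """Return (chi-square, category_a, category_b) for the least different pair."""
    best = None
    for a, b in combinations(counts, 2):
        chi2 = pearson_chi_square([counts[a], counts[b]])
        if best is None or chi2 < best[0]:
            best = (chi2, a, b)
    return best

# Hypothetical [responders, non-responders] counts per region
region_counts = {"N": [40, 60], "S": [20, 80], "E": [22, 78], "W": [18, 82]}
chi2, a, b = most_similar_pair(region_counts)
print(a, b)  # S E: the first pair a merge step would consider collapsing
```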

CART (Nominal Target) • Nominal Targets: • GINI (Impurity Reduction or Entropy) • Squared probability of node membership • Gini = 0 when targets are perfectly classified • Gini Index = $1 - \sum_i p_i^2$ • Example • Prob: Bus = 0.4, Car = 0.3, Train = 0.3 • Gini = 1 – (0.4^2 + 0.3^2 + 0.3^2) = 0.660
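The Gini example can be checked with a few lines of Python. The function name is an assumption; the probabilities are those of the Bus/Car/Train example.

```python
def gini_index(probabilities):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1 - sum(p ** 2 for p in probabilities)

# Bus = 0.4, Car = 0.3, Train = 0.3
print(round(gini_index([0.4, 0.3, 0.3]), 3))  # 0.66

# A perfectly classified node contains a single class
print(gini_index([1.0, 0.0, 0.0]))            # 0.0
```

CART evaluates candidate splits by how much they reduce this impurity in the child nodes relative to the parent node.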

CART (Metric Target) • Continuous Variables: Variance Reduction (F-test)

Comparative Advantages (From Wikipedia) • Simple to understand and interpret • Requires little data preparation • Able to handle both numerical and categorical data • Uses a white box model easily explained by Boolean logic • Possible to validate a model using statistical tests • Robust

Where to get help? http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/index.jsp

Top line from Chapter 13 -1 • Analytics helps you to predict which recipients of your direct mail will buy your products, and which are not likely to buy. At $500 per thousand pieces, analytics can save you a lot of money. • Analytics is not as useful for e-mail marketing. The cost of appending data and the modeling often results in a loss, since the cost of mailing is only $6 per thousand. • Predictive models are based on previous promotions. You add demographic data (age, income, value of home, etc.) to a sample of your file and determine the differences between responders and non-responders. • Predictive modeling uses multiple regressions. It results in an algorithm—a mathematical formula that can be used to “score” any direct mailing file that has demographics appended, and predict, before you mail, which ones are going to respond. • Modeling does not always work. Sometimes what makes people buy is not based on demographics.

Top line from Chapter 13 -2 • Analytics can be used to reduce unsubscribes. If you have done LTV and know the value of your subscribers, you can calculate how much analytics would save you by not mailing unwanted material to some subscribers. • Very few e-mail marketers are doing any predictive modeling today, with good reason. • Direct mail gets higher response rates than e-mail partly because the shelf life of a direct mail piece or catalog can be weeks or months. An e-mail’s shelf life is one day or less. • Modeling can be useful for cross-sales—determining what other products your customers might buy. • Next-best product analytics and churn predictive analytics can be very profitable.

Top line from Chapter 13 -3 • CHAID is very useful for dividing your database into segments containing people with different interests and response rates. • Descriptive analytics is useful for advertising campaigns, but seldom useful for direct mail. • Clickstream data analysis can be very useful in planning the layout of a Web site or an e-mail. • Key performance indicators (KPIs) can help you determine the relative success of e-mail programs.
