Regression Analysis

Regression Analysis Module 3

Regression

[Figure: scatter plot of the dependent variable against the independent variable (x)]

Regression is the attempt to explain the variation in a dependent variable using the variation in independent variables. A regression describes the association between the variables; on its own it does not establish causation. If the independent variable(s) sufficiently explain the variation in the dependent variable, the model can be used for prediction.

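Simple Linear Regression

$y' = b_0 + b_1 x \pm \varepsilon$

where $b_0$ is the y-intercept, $b_1 = \Delta y / \Delta x$ is the slope, and $\varepsilon$ is the error.

[Figure: fitted line on axes of dependent variable (y) against independent variable (x), with the intercept $b_0$, the slope $b_1$, and the error $\varepsilon$ marked]

The output of a regression is a function that predicts the dependent variable based upon values of the independent variables. Simple regression fits a straight line to the data.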
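As a minimal sketch of how the intercept and slope of such a line can be obtained, the Python below uses the standard closed-form least squares formulas. The function name `fit_line` and the five data points are illustrative assumptions, not values from the slides.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line y' = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = y_mean - b1 * x_mean
    return b0, b1


# Illustrative data only (hypothetical, not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
print(f"y' = {b0:.2f} + {b1:.2f} * x")
```

For this made-up data the slope works out to about 0.6 and the intercept to about 2.2.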

Simple Linear Regression

[Figure: one observation $y$ and its prediction $\hat{y}$ on the fitted line, plotted as dependent variable against independent variable (x)]

The function will make a prediction for each observed data point. The observation is denoted by $y$ and the prediction is denoted by $\hat{y}$.

Simple Linear Regression

[Figure: the prediction error $\varepsilon$ shown as the vertical distance between the observation $y$ and the prediction $\hat{y}$]

For each observation, the variation can be described as:

$y = \hat{y} + \varepsilon$

Actual = Explained + Error
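The decomposition can be checked numerically. The sketch below assumes a line with hypothetical coefficients (b0 = 2.2, b1 = 0.6) and illustrative data; neither comes from the slides.

```python
# Assumed coefficients for illustration (hypothetical, not from the slides).
b0, b1 = 2.2, 0.6

xs = [1, 2, 3, 4, 5]  # illustrative values of the independent variable
ys = [2, 4, 5, 4, 5]  # illustrative observations (y)

predictions = [b0 + b1 * x for x in xs]            # y-hat for each observation
errors = [y - p for y, p in zip(ys, predictions)]  # epsilon for each observation

for y, p, e in zip(ys, predictions, errors):
    # Actual = Explained + Error, up to floating point rounding.
    assert abs(y - (p + e)) < 1e-12
    print(f"actual={y}  explained={p:.1f}  error={e:+.1f}")
```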

Regression

[Figure: candidate lines through a scatter plot of the dependent variable against the independent variable (x)]

A least squares regression selects the line with the lowest total sum of squared prediction errors. This sum is called the Sum of Squares of Error, or SSE.
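To make the selection criterion concrete, the following sketch computes the SSE for a few candidate lines and picks the smallest. The data and the candidate lines are illustrative assumptions; the first candidate happens to be the least squares line for this data, and the others are arbitrary alternatives.

```python
def sse(b0, b1, xs, ys):
    """Sum of squared prediction errors for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data only (hypothetical, not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Candidate lines as (b0, b1) pairs.
candidates = [(2.2, 0.6), (2.0, 0.6), (2.2, 0.7), (1.0, 1.0)]

results = {line: sse(line[0], line[1], xs, ys) for line in candidates}
best = min(results, key=results.get)

for line, value in results.items():
    print(f"b0={line[0]}, b1={line[1]}: SSE={value:.2f}")
print("lowest SSE:", best)
```

A real least squares fit searches over all possible lines rather than a short list; the closed-form solution does this without enumeration.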

Calculating SSR

[Figure: the mean $\bar{y}$ drawn as a horizontal line on a plot of the dependent variable against the independent variable (x)]

The Sum of Squares Regression (SSR) is the sum of the squared differences between the prediction for each observation and the mean of the observed y values, $\bar{y}$.

Regression Formulas

The Total Sum of Squares (SST) is equal to SSR + SSE. Mathematically,

$SSR = \sum (\hat{y} - \bar{y})^2$ (measure of explained variation)

$SSE = \sum (y - \hat{y})^2$ (measure of unexplained variation)

$SST = SSR + SSE = \sum (y - \bar{y})^2$ (measure of total variation in y)
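The three sums can be computed directly from the observations and the predictions. The sketch below uses illustrative data and predictions from an assumed least squares line (y-hat = 2.2 + 0.6x); the identity SST = SSR + SSE is expected to hold for a least squares fit that includes an intercept, not for an arbitrary line.

```python
def sums_of_squares(observed, predicted):
    """Return (SSR, SSE, SST) for observed values y and predictions y-hat."""
    y_bar = sum(observed) / len(observed)
    ssr = sum((p - y_bar) ** 2 for p in predicted)                # explained
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))  # unexplained
    sst = sum((y - y_bar) ** 2 for y in observed)                 # total
    return ssr, sse, sst


# Illustrative observations (hypothetical, not from the slides).
observed = [2, 4, 5, 4, 5]
# Predictions of the assumed least squares line at x = 1..5.
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

ssr, sse, sst = sums_of_squares(observed, predicted)
print(f"SSR={ssr:.2f}  SSE={sse:.2f}  SST={sst:.2f}")
```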

The Coefficient of Determination

The proportion of total variation (SST) that is explained by the regression (SSR) is known as the Coefficient of Determination, and is often referred to as $R^2$.

$R^2 = \frac{SSR}{SST} = \frac{SSR}{SSR + SSE}$

The value of $R^2$ can range between 0 and 1; the higher its value, the larger the share of the variation in y that the regression explains. It is often expressed as a percentage.
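A short sketch of the calculation, using hypothetical sums of squares (SSR = 3.6, SSE = 2.4) chosen only for illustration:

```python
def r_squared(ssr, sse):
    """Coefficient of determination: explained share of total variation."""
    return ssr / (ssr + sse)


# Hypothetical sums of squares, for illustration only.
r2 = r_squared(ssr=3.6, sse=2.4)
print(f"R^2 = {r2:.2f} ({r2:.0%} of the variation explained)")
```

With these inputs the regression explains about 60% of the variation in y.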

Standard Error of Regression

The Standard Error of a regression is a measure of its variability. It can be used in a similar manner to standard deviation, allowing for prediction intervals. $\hat{y} \pm 2$ standard errors gives an approximately 95% prediction interval, and $\hat{y} \pm 3$ standard errors gives an interval of roughly 99%.

Standard Error is calculated by taking the square root of the average squared prediction error:

$\text{Standard Error} = \sqrt{\frac{SSE}{n - k}}$

where $n$ is the number of observations in the sample and $k$ is the total number of variables in the model.
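The formula and the resulting interval can be sketched as follows. The inputs (SSE = 2.4, n = 5, k = 2, and a prediction of 4.0) are hypothetical; k = 2 assumes a simple regression with one dependent and one independent variable.

```python
import math


def standard_error(sse, n, k):
    """Square root of SSE divided by (n - k)."""
    if n <= k:
        raise ValueError("need more observations than variables")
    return math.sqrt(sse / (n - k))


def prediction_interval(y_hat, se, multiplier=2):
    """Return (low, high) as y-hat plus or minus `multiplier` standard errors."""
    return y_hat - multiplier * se, y_hat + multiplier * se


# Hypothetical inputs, for illustration only.
se = standard_error(sse=2.4, n=5, k=2)
low, high = prediction_interval(y_hat=4.0, se=se)
print(f"standard error = {se:.3f}")
print(f"approx. 95% interval: {low:.2f} to {high:.2f}")
```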

The output of a simple regression is the coefficient $\beta$ and the constant $A$. The equation is then:

$y = A + \beta x + \varepsilon$

where $\varepsilon$ is the residual error. $\beta$ is the change in the dependent variable for each unit change in the independent variable. Mathematically:

$\beta = \frac{\Delta y}{\Delta x}$

Multiple Linear Regression

More than one independent variable can be used to explain variance in the dependent variable, as long as they are not linearly related. A multiple regression takes the form:

$y = A + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$

where $k$ is the number of variables, or parameters.
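One way to estimate $A$ and the $\beta$ coefficients is to solve the least squares normal equations. The sketch below does this in plain Python with Gaussian elimination; the helper names (`solve`, `fit_multiple`) and the noise-free data (built from y = 1 + 2*X1 + 3*X2) are assumptions for illustration. In practice a statistics package or spreadsheet would normally be used instead.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple(rows, ys):
    """Return [A, beta_1, ..., beta_k] from the normal equations X'X b = X'y."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the constant A
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve(xtx, xty)


# Synthetic, noise-free data built from y = 1 + 2*X1 + 3*X2 (illustrative only).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
ys = [9, 8, 19, 18, 29]

coefficients = fit_multiple(rows, ys)
print([round(c, 6) for c in coefficients])
```

If two independent variables were perfectly linearly related, the matrix X'X would be singular and the division by the pivot would fail, which is why the variables must not be linearly related.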

Multicollinearity

Multicollinearity is a condition in which at least two independent variables are highly linearly correlated. It makes the coefficient estimates unstable, and when the correlation is perfect the least squares solution is not unique, so regression software may fail or report an error.

A correlation table can suggest which independent variables may be significant. Generally, an independent variable that has a correlation of more than 0.3 with the dependent variable and less than 0.7 with any other independent variable can be included as a possible predictor.
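The screening rule can be sketched in a few lines of Python. The function names, the use of absolute correlation values, and the data are all assumptions made for illustration; the 0.3 and 0.7 thresholds come from the rule of thumb above.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def screen_predictors(predictors, y, min_with_y=0.3, max_with_other=0.7):
    """Return names passing both thresholds (absolute correlations assumed)."""
    keep = []
    for name, values in predictors.items():
        relevant = abs(pearson(values, y)) > min_with_y
        independent = all(
            abs(pearson(values, other)) < max_with_other
            for other_name, other in predictors.items()
            if other_name != name
        )
        if relevant and independent:
            keep.append(name)
    return keep


# Illustrative data only: x1 and x2 move together, x3 does not.
predictors = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 11],
    "x3": [5, 1, 4, 2, 3],
}
y = [2, 4, 5, 4, 5]

candidates = screen_predictors(predictors, y)
print(candidates)
```

Read literally, the rule flags both members of a highly correlated pair (here x1 and x2). Which one, if either, to keep is a judgment call that the rule itself does not settle.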

Nonlinear Regression

Nonlinear functions can also be fit as regressions. Common choices include Power, Logarithmic, Exponential, and Logistic, but any continuous function can be used.
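As one hedged example, an exponential curve y = a * exp(b * x) can be fit by taking logarithms and running a straight-line fit on (x, ln y). This is a common shortcut rather than a general nonlinear solver: it minimizes squared error in ln y, not in y, and it requires every y to be positive. The function name and the synthetic, noise-free data are assumptions for illustration.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) via a least squares line through (x, ln y)."""
    logs = [math.log(y) for y in ys]  # requires every y > 0
    n = len(xs)
    x_mean = sum(xs) / n
    l_mean = sum(logs) / n
    b = (sum((x - x_mean) * (l - l_mean) for x, l in zip(xs, logs))
         / sum((x - x_mean) ** 2 for x in xs))
    a = math.exp(l_mean - b * x_mean)
    return a, b


# Synthetic, noise-free data generated from y = 2 * exp(0.5 * x).
xs = [0, 1, 2, 3, 4]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

a, b = fit_exponential(xs, ys)
print(f"y = {a:.3f} * exp({b:.3f} * x)")
```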

Regression Output in Excel

Conjoint Analysis

• Conjoint analysis is concerned with understanding how people make choices between products, services, or combinations of the two, so that businesses can design new products or services that better meet customers' underlying needs. Fulfilling customers' wishes profitably requires that companies understand which aspects of their product and service are most valued by the customer. Conjoint analysis is considered to be one of the best methods for achieving this purpose. It consists of designing and conducting specific experiments among customers with the purpose of modeling their purchasing decisions. Our techniques compute mathematical values to explain consumer behavior (how much value is placed on price, location, features, and so on) and then correlate this data with demographic, lifestyle, or other consumer profiles. A software-driven regression analysis of data obtained from actual customers makes accurate reporting and analysis possible.

Survey Analytics Conjoint Module

• Conjoint analysis is used to study the factors that influence customers' purchasing decisions. Products possess attributes such as price, color, ingredients, guarantee, environmental impact, predicted reliability, and so on. Conjoint analysis is based on a main effects analysis-of-variance model. Subjects provide data about their preferences for hypothetical products defined by attribute combinations. Conjoint analysis decomposes the judgment data into components, based on qualitative attributes of the products. A numerical part-worth utility value is computed for each level of each attribute. Large part-worth utilities are assigned to the most preferred levels, and small part-worth utilities are assigned to the least preferred levels. The attributes with the largest part-worth utility range are considered the most important in predicting preference. Conjoint analysis is a statistical model with an error term and a loss function.
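A minimal sketch of the part-worth idea, under simplifying assumptions that are not from the slides: respondents give numeric ratings (rather than rankings), every combination of two attributes is rated once (a balanced full-factorial design), and the attributes, levels, and ratings are invented. In that balanced case the main effects estimate of a level's part-worth is the mean rating at that level minus the overall mean.

```python
from collections import defaultdict


def part_worths(profiles, ratings):
    """Main effects part-worths for a balanced design: level mean minus grand mean."""
    grand_mean = sum(ratings) / len(ratings)
    by_level = defaultdict(list)
    for profile, rating in zip(profiles, ratings):
        for attribute, level in profile.items():
            by_level[(attribute, level)].append(rating)
    worths = defaultdict(dict)
    for (attribute, level), values in by_level.items():
        worths[attribute][level] = sum(values) / len(values) - grand_mean
    return dict(worths)


def importance(worths):
    """Share of each attribute's part-worth range in the total of all ranges."""
    ranges = {a: max(lv.values()) - min(lv.values()) for a, lv in worths.items()}
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}


# Hypothetical profiles and ratings (invented for illustration).
profiles = [
    {"price": "low", "brand": "A"},
    {"price": "low", "brand": "B"},
    {"price": "low", "brand": "C"},
    {"price": "high", "brand": "A"},
    {"price": "high", "brand": "B"},
    {"price": "high", "brand": "C"},
]
ratings = [9, 7, 5, 6, 4, 2]

worths = part_worths(profiles, ratings)
weights = importance(worths)
print(worths)
print(weights)
```

Here brand has the larger part-worth range, so under the rule above it would be considered the more important attribute for this invented respondent.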

Survey Analytics is a web-based service for conducting online surveys. With the Survey Analytics Conjoint module you can collect the data and simulate it through our conjoint simulator. In the survey, you can ask the respondent to arrange a list of combinations of product attributes in decreasing order of preference. Once this ranking is obtained, you can use our advanced simulator to produce a graphical representation of your data. This method is efficient in the sense that the survey does not need to be conducted using every possible combination of attributes. The utilities can be determined using a subset of possible attribute combinations. From these results one can predict the desirability of the combinations that were not tested.
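The prediction step can be sketched as follows, assuming an additive main effects model in which the utility of a combination is a base value plus the part-worth of each of its levels. The base value, the part-worths, and the list of tested combinations are hypothetical, and this is not the Survey Analytics simulator itself.

```python
from itertools import product

# Hypothetical base utility and part-worths (assumed already estimated).
base = 5.5
part_worths = {
    "price": {"low": 1.5, "high": -1.5},
    "brand": {"A": 2.0, "B": 0.0, "C": -2.0},
    "warranty": {"1 year": -0.5, "3 years": 0.5},
}


def predict(combination):
    """Additive utility: base plus the part-worth of each chosen level."""
    return base + sum(part_worths[a][level] for a, level in combination.items())


attributes = list(part_worths)
all_combinations = [
    dict(zip(attributes, levels))
    for levels in product(*(part_worths[a] for a in attributes))
]

# Hypothetical subset that respondents actually evaluated.
tested = [
    {"price": "low", "brand": "A", "warranty": "1 year"},
    {"price": "high", "brand": "B", "warranty": "3 years"},
    {"price": "low", "brand": "C", "warranty": "3 years"},
    {"price": "high", "brand": "A", "warranty": "1 year"},
]

untested = [c for c in all_combinations if c not in tested]
ranked = sorted(untested, key=predict, reverse=True)

for combination in ranked[:3]:
    print(predict(combination), combination)
```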

Link • https://www.surveyanalytics.com/conjoint-analysis-example.html
