Multiple Regression – Basic Relationships • Purpose of multiple regression • Different types of multiple regression • Standard multiple regression • Steps in solving standard multiple regression problems
Purpose of multiple regression • The purpose of multiple regression is to analyze the relationship between metric or dichotomous independent variables and a metric dependent variable. • If there is a relationship, using the information in the independent variables will improve our accuracy in predicting values for the dependent variable.
Types of multiple regression • There are three types of multiple regression, each of which is designed to answer a different question: • Standard multiple regression is used to evaluate the relationships between a set of independent variables and a dependent variable. • Hierarchical, or sequential, regression is used to examine the relationships between a set of independent variables and a dependent variable, after controlling for the effects of some other independent variables on the dependent variable. • Stepwise, or statistical, regression is used to identify the subset of independent variables that has the strongest relationship to a dependent variable.
Standard multiple regression - 1 • In standard multiple regression, all of the independent variables are entered into the regression equation at the same time. • The minimum expectation for multiple regression is that there is a statistically significant relationship between the set of independent variables and the dependent variable. An F test is used to determine if the relationship can be generalized to the population represented by the sample. • Multiple R and R² measure the strength of the relationship between the set of independent variables and the dependent variable.
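As a rough illustration of entering all independent variables at the same time, the sketch below fits a two-predictor regression by ordinary least squares and reports the b coefficients, R², and Multiple R. The function name `fit_two_predictor_regression` and the small data set are hypothetical and are not taken from the SPSS example in these slides; the closed-form solution shown only covers the two-predictor case.

```python
from statistics import mean


def fit_two_predictor_regression(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2, both predictors entered together."""
    m1, m2, my = mean(x1), mean(x2), mean(y)

    # Centered sums of squares and cross-products
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)

    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear")

    # Solve the two normal equations for the slopes, then the intercept
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2

    # R^2 is the share of the variation in y explained by the predictors
    r_squared = (b1 * s1y + b2 * s2y) / syy
    return {
        "b0": b0,
        "b1": b1,
        "b2": b2,
        "r_squared": r_squared,
        "multiple_r": r_squared ** 0.5,
    }


# Hypothetical data constructed so that y = 1 + 2*x1 + 3*x2 exactly
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]
print(fit_two_predictor_regression(x1, x2, y))
```

Because the hypothetical data follow the equation exactly, the fit recovers the coefficients and R² is 1 up to rounding; real survey data would give a smaller R².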
Standard multiple regression - 2 • If there is an overall relationship between the set of independent variables and the dependent variable, we interpret the individual relationships of the independent variables. • A t-test is used to evaluate the individual relationship between each independent variable and the dependent variable. • If the relationship is statistically significant, its impact on the dependent variable is stated in terms of direction: for a positive coefficient, higher scores on the independent variable are associated with higher scores on the dependent variable; for a negative coefficient, higher scores on the independent variable are associated with lower scores on the dependent variable.
Standard multiple regression - 3 • If there is an overall relationship between the set of independent variables and the dependent variable, we can ask which of the statistically significant predictors has the largest influence on the dependent variable, that is, which one makes the largest difference in the value of the dependent variable. • The b coefficients represent the change in the dependent variable for a one-unit change in the independent variable. However, we cannot compare the b coefficients directly because they are scaled in different units. • The beta coefficients, in contrast, are standardized for comparison. The variable with the largest absolute value for beta (positive or negative) has the largest influence on the value of the dependent variable.
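The link between the two kinds of coefficient can be sketched as follows: a standardized beta is the b coefficient rescaled by the ratio of the predictor's standard deviation to the dependent variable's standard deviation. The helper name `standardized_beta` and the numbers are hypothetical; they are only meant to show the rescaling.

```python
from statistics import stdev


def standardized_beta(b, x_values, y_values):
    """Convert an unstandardized b coefficient to a standardized beta.

    beta = b * (sample SD of the predictor) / (sample SD of the dependent variable)
    """
    return b * stdev(x_values) / stdev(y_values)


# Hypothetical predictor and dependent variable, with an assumed b of 2
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
print(standardized_beta(2, x, y))
```

Once every predictor is expressed this way, the coefficients share a common scale (standard deviations), which is what makes them comparable.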
Plan for regression assignments • In this class, we will focus on the basic evaluation of relationships in standard multiple regression. • In the next class, we will include the evaluation of assumptions and outliers, and validation analysis to produce a more complete standard multiple regression solution. • In the following class, we will look at alternate methods for including variables in multiple regression: hierarchical multiple regression and stepwise multiple regression.
Question 1 To answer the first question, we examine the level of measurement for each variable listed in the problem. Multiple regression requires that the dependent variable be metric and the independent variables be metric or dichotomous.
Answer 1 "Frequency of attendance at religious services" [attend] is ordinal, which satisfies the metric level of measurement requirement for the dependent variable if we follow the convention of treating ordinal level variables as metric. "Strength of religious affiliation" [reliten] and "frequency of prayer" [pray] are also ordinal, which satisfies the metric or dichotomous level of measurement requirement for independent variables under the same convention. Since some data analysts do not agree with this convention, a note of caution should be included in our interpretation. True with caution is the correct answer.
Question 2 Having satisfied the level of measurement requirements, we turn our attention to the sample size requirements. To answer this question, and those after it, we need to compute the standard multiple regression in SPSS.
Request a standard multiple regression To compute a multiple regression in SPSS, select the Regression | Linear command from the Analyze menu.
Specify the variables and selection method First, move the dependent variable attend to the Dependent text box. Second, move the independent variables reliten and pray to the Independent(s) list box. Third, select the method for entering the variables into the analysis from the drop down Method menu. In this example, we accept the default of Enter for direct entry of all variables, which produces a standard multiple regression. Fourth, click on the Statistics… button to specify the statistics options that we want.
Specify the statistics output options First, mark the checkbox for Estimates on the Regression Coefficients panel. Second, mark the checkboxes for Model Fit and Descriptives. Third, click on the Continue button to close the dialog box.
Request the regression output Click on the OK button to request the regression output.
Answer 2 In the Descriptive Statistics table in the SPSS output, we see the number of cases with valid data for all of the variables included in our analysis. With 2 independent variables, the ratio of cases to independent variables satisfies both the minimum sample size requirement (at least 5 to 1) and the preferred sample size requirement (at least 15 to 1).
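The sample size check can be sketched as a small function. The thresholds of 5 to 1 and 15 to 1 come from the steps summarized at the end of these slides; the function name `sample_size_answer` and the case counts in the example are hypothetical, since the actual number of valid cases appears only in the SPSS output.

```python
def sample_size_answer(n_cases, n_independent, minimum_ratio=5, preferred_ratio=15):
    """Classify a sample by its ratio of valid cases to independent variables."""
    ratio = n_cases / n_independent
    if ratio < minimum_ratio:
        return "Inappropriate application of a statistic"
    if ratio < preferred_ratio:
        return "True with caution"
    return "True"


# Hypothetical case counts with 2 independent variables
for n in (8, 20, 30):
    print(n, sample_size_answer(n, 2))
```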
Question 3 In order for the finding about overall relationship to be true, it must satisfy two conditions. First, the F test for the regression must be statistically significant at the stated alpha level. Second, the strength of the relationship must be correctly stated. If the relationship is true, but involves ordinal variables, a caution is added.
Overall Relationship Between Independent Variables and the Dependent Variable - 1 The probability of the F statistic (49.824) for the overall regression relationship is <0.001, less than or equal to the level of significance of 0.05. We reject the null hypothesis that there is no relationship between the set of independent variables and the dependent variable (R² = 0). We support the research hypothesis that there is a statistically significant relationship between the set of independent variables and the dependent variable.
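For readers who want to see where the F statistic comes from, the sketch below computes it from R², the number of cases, and the number of independent variables, using the standard relationship F = (R²/k) / ((1 − R²)/(n − k − 1)). The function name `f_statistic` and the example inputs are hypothetical and are not the values from the SPSS output. The standard library has no F distribution, so the sketch stops at the statistic and its degrees of freedom and does not compute the probability.

```python
def f_statistic(r_squared, n_cases, n_predictors):
    """Return (F, df_regression, df_residual) for the overall regression test."""
    df_regression = n_predictors
    df_residual = n_cases - n_predictors - 1
    if df_residual <= 0:
        raise ValueError("need more cases than predictors plus one")
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    # Explained variation per predictor over unexplained variation per residual df
    f_value = (r_squared / df_regression) / ((1 - r_squared) / df_residual)
    return f_value, df_regression, df_residual


# Hypothetical inputs: R^2 of 0.5 from 23 cases and 2 independent variables
print(f_statistic(0.5, 23, 2))
```

Under the null hypothesis R² = 0 the statistic is 0, and it grows as the predictors explain more of the variation relative to the residual degrees of freedom.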
Overall Relationship Between Independent Variables and the Dependent Variable - 2 The Multiple R for the relationship between the set of independent variables and the dependent variable is 0.689, which would be characterized as strong using the rule of thumb that a correlation less than or equal to 0.20 is characterized as very weak; greater than 0.20 and less than or equal to 0.40 is weak; greater than 0.40 and less than or equal to 0.60 is moderate; greater than 0.60 and less than or equal to 0.80 is strong; and greater than 0.80 is very strong.
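The rule of thumb above translates directly into a lookup. This is a minimal sketch; the function name `correlation_strength` is hypothetical, and the cut points are the ones stated on this slide.

```python
def correlation_strength(r):
    """Label a correlation using the rule of thumb from the slide."""
    size = abs(r)
    if size <= 0.20:
        return "very weak"
    if size <= 0.40:
        return "weak"
    if size <= 0.60:
        return "moderate"
    if size <= 0.80:
        return "strong"
    return "very strong"


print(correlation_strength(0.689))  # strong
```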
Answer 3 We satisfied both conditions: the F test for the regression was statistically significant and the strength of the relationship was correctly identified. A caution results from the inclusion of ordinal variables.
Question 4 In order for findings about individual relationships to be true, they must satisfy two conditions. First, the t test for the b coefficient must be statistically significant at the stated alpha level. Second, the statement of the relationship must be correct. If the relationship is true, but involves ordinal variables, a caution is added.
Relationship of Individual Independent Variable to Dependent Variable - 1 Based on the statistical test of the b coefficient (t = 5.857, p<0.001) for the independent variable "strength of religious affiliation" [reliten], the null hypothesis that the slope or b coefficient was equal to 0 was rejected. The research hypothesis that there was a relationship between strength of religious affiliation and frequency of attendance at religious services was supported.
Relationship of Individual Independent Variable to Dependent Variable - 2 To check whether the statement of the relationship is correct or not, we need to understand the pattern of the coding for the variable when it is ordinal level of measurement. Higher numeric values for strength of religious affiliation meant that survey respondents have been more strongly affiliated with their religion.
Relationship of Individual Independent Variable to Dependent Variable - 3 Higher numeric values for frequency of attendance at religious services meant that survey respondents have attended religious services more often.
Relationship of Individual Independent Variable to Dependent Variable - 4 The positive sign of the b coefficient (1.138) meant the relationship between the numeric values for strength of religious affiliation and frequency of attendance at religious services was a direct relationship, implying that higher numeric values for the independent variable (strength of religious affiliation) were associated with higher numeric values for the dependent variable (frequency of attendance at religious services). The correct statement in the relationship is: "survey respondents who have been more strongly affiliated with their religion have attended religious services more often".
Answer 4 While the hypothesis test supports the existence of a relationship, the statement of the relationship in the problem is opposite to the correct statement, so the answer to the question is false.
Question 5 The next question asks us to evaluate the relationship for the second independent variable. There will be a separate question for each of the independent variables.
Relationship of Individual Independent Variable to Dependent Variable - 1 Based on the statistical test of the b coefficient (t = 4.145, p<0.001) for the independent variable "frequency of prayer" [pray], the null hypothesis that the slope or b coefficient was equal to 0 was rejected. The research hypothesis that there was a relationship between frequency of prayer and frequency of attendance at religious services was supported.
Relationship of Individual Independent Variable to Dependent Variable - 2 To check whether the statement of the relationship is correct or not, we need to understand the pattern of the coding for the variable when it is ordinal level of measurement. Higher numeric values for frequency of prayer meant that survey respondents have prayed more often.
Relationship of Individual Independent Variable to Dependent Variable - 3 The positive sign of the b coefficient (0.554) meant the relationship between frequency of prayer and frequency of attendance at religious services was a direct relationship, implying that higher numeric values for the independent variable (frequency of prayer) were associated with higher numeric values for the dependent variable (frequency of attendance at religious services). The correct statement in the relationship is: "survey respondents who have prayed more often have attended religious services more often".
Answer 5 The hypothesis test supports the existence of the relationship, and the statement of the relationship in the problem is correct, so the answer to the question is true. A caution results from the inclusion of ordinal variables.
Question 6 The next question asks us to identify which predictor has the largest effect on the dependent variable. The largest effect is operationally defined as the largest change in the dependent variable associated with a one-standard-deviation change in an independent variable, which is what the standardized beta coefficients measure.
Independent Variable with Largest Effect on the Dependent Variable - 1 To answer this question, we look for the largest value in the column of standardized beta coefficients, irrespective of sign. In this example, the beta coefficient of 0.465 for strength of affiliation is larger than the beta coefficient of 0.329 for how often the respondent prays.
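Picking the predictor with the largest beta, irrespective of sign, can be sketched as below. The beta values are the ones reported on this slide; the function name `largest_effect` is hypothetical.

```python
def largest_effect(betas):
    """Return the variable whose standardized beta is largest in absolute value."""
    return max(betas, key=lambda name: abs(betas[name]))


# Standardized beta coefficients reported in the slide
betas = {"reliten": 0.465, "pray": 0.329}
print(largest_effect(betas))  # reliten
```

Comparing absolute values matters because a strongly negative beta indicates a larger influence than a weakly positive one.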
Answer 6 The answer to the question is true because the correct variable was identified as having the largest influence on the dependent variable. A caution results from the inclusion of ordinal variables.
Steps in answering questions about standard multiple regression - 1 Question: Do the variables included in the analysis satisfy the level of measurement requirements? • Is the dependent variable metric and are the independent variables metric or dichotomous? • No: incorrect application of a statistic. • Yes: continue to the sample size question.
Steps in answering questions about standard multiple regression - 2 Question: Do the number of variables and cases satisfy the sample size requirements? • Compute the standard multiple regression in SPSS. • Is the ratio of cases to independent variables at least 5 to 1? No: inappropriate application of a statistic. • If yes, is the ratio of cases to independent variables at the preferred sample size of at least 15 to 1? No: true with caution. Yes: true.
Steps in answering questions about standard multiple regression - 3 Question: Finding about the overall relationship between the dependent variable and the independent variables. • Is the probability of the F test of the regression less than or equal to the level of significance? No: false. • If yes, is the strength of the relationship for the included variables interpreted correctly? No: false. • If yes, are ordinal variables included in the relationship? Yes: true with caution. No: true.
Steps in answering questions about standard multiple regression - 4 Question: Finding about the individual relationship between an independent variable and the dependent variable. • Is the probability of the t test between each IV and the DV less than or equal to the level of significance? No: false. • If yes, is the direction of the relationship between the IV and the DV interpreted correctly? No: false. • If yes, are ordinal variables included in the relationship? Yes: true with caution. No: true.
Steps in answering questions about standard multiple regression - 5 Question: Finding about the independent variable with the largest impact on the dependent variable. • Does the stated variable have the largest beta coefficient (ignoring sign)? No: false. • If yes, are ordinal variables included in the relationship? Yes: true with caution. No: true.
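The decision steps for the overall-relationship and individual-relationship questions share one pattern: a significance check, a check that the finding is stated correctly, and a caution for ordinal variables. A minimal sketch of that pattern follows. The function name `relationship_answer` is hypothetical, and the probability of 0.0005 is only a stand-in for the "p<0.001" reported in the worked example.

```python
def relationship_answer(p_value, alpha, statement_correct, has_ordinal):
    """Apply the three-step decision pattern used for relationship findings."""
    if p_value > alpha:
        return "False"  # the test is not statistically significant
    if not statement_correct:
        return "False"  # strength or direction is misstated
    return "True with caution" if has_ordinal else "True"


# Question 4 in the worked example: significant, but the direction was misstated
print(relationship_answer(0.0005, 0.05, statement_correct=False, has_ordinal=True))

# Question 5 in the worked example: significant and correctly stated, ordinal variables
print(relationship_answer(0.0005, 0.05, statement_correct=True, has_ordinal=True))
```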