MGSC 372 Lecture 10: Fall 2016 Semester Notes 10

Management Science
MGSC 372
Brian Smith

The residuals are more spread out on the right side of the chart than on the left side: this indicates heteroscedasticity.

Stabilizing Transformations

Possible transformations to help eliminate or reduce heteroscedasticity are $\sqrt{y}$ or $\ln(y)$.

Comment on Transformation of Data

There are many possible transformations of the form $y^k$, where $k$ can be any real exponent.

Examples: $y^{1/2} = \sqrt{y}$; $y^{-2} = \dfrac{1}{y^2}$; $y^{-1} = \dfrac{1}{y}$; $y^{-3}$; etc.

Example

Salary vs. Years of Experience for a random sample of 50 social workers.

Exp (X)    Salary (Y)
7          26075
28         79370
23         65726
18         41983
19         62308
15         41154
...        ...

Second Order Model

[Minitab output: second order model]
[Residual plot: evidence of heteroscedasticity]

Summary of Second Order Model

Regression equation: SALARY = 20242 + 522 EXP + 53.0 EXPSQ
$R^2 = 81.6\%$

But the residual plot shows signs of heteroscedasticity.

Second Order Model with Logarithmic Transformation

[Minitab output: model including logarithmic transformation and quadratic term]
[Residual plot: no evidence of heteroscedasticity]

Summary of Second Order Model ln(Salary) vs. EXP and EXPSQ

Regression equation: ln(SALARY) = 9.84 + 0.0497 EXP + 0.000009 EXPSQ
$R^2 = 86.4\%$, $R_a^2 = 85.8\%$

The residual plot shows that the log transformation has significantly reduced heteroscedasticity. But the coefficient of EXPSQ is not significant (p-value = 0.98).

First Order Model with Logarithmic Transformation

[Minitab output: first order model with logarithmic transformation]

Note: Same $R^2$ value and higher $R_a^2$ value.

Interpreting the Model

$$\ln \hat{y} = 9.84 + 0.05x$$
$$\hat{y} = e^{9.84 + 0.05x} = e^{9.84} e^{0.05x} = 18769.72\, e^{0.05x}$$

Experience    Predicted Salary
0             $18769.72
5             $24100.80
10            $30946.04
15            $39735.50
20            $51021.39

A Test for Heteroscedasticity

Divide the sample observations into two subgroups based on the values of $\hat{y}$ or, equivalently in this example, the value of $x$ (since for the fitted model $\hat{y}$ increases as $x$ increases). Examination of the data shows that approximately one-half of the 50 observations fall below $x = 20$.
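As a quick check on the Predicted Salary table in the Interpreting the Model step, the back-transformation can be sketched in a few lines of Python. This is only an illustration using the rounded coefficients quoted in the notes (9.84 and 0.05); the function names `predicted_salary` and `growth_factor` are made up for this example and do not come from the course material.

```python
import math

# Rounded coefficients of the first-order log model quoted in the notes:
# ln(y_hat) = 9.84 + 0.05 * x
INTERCEPT = 9.84
SLOPE = 0.05


def predicted_salary(years_experience: float) -> float:
    """Back-transform the log-scale prediction to dollars: y_hat = e^(9.84 + 0.05x)."""
    return math.exp(INTERCEPT + SLOPE * years_experience)


def growth_factor(extra_years: float) -> float:
    """Multiplicative change in predicted salary after extra_years more years."""
    return math.exp(SLOPE * extra_years)


# Reproduce the rows of the Predicted Salary table.
for x in (0, 5, 10, 15, 20):
    print(f"{x:>2} years: ${predicted_salary(x):,.2f}")
```

Because the model is linear in ln(salary), each additional year multiplies the predicted salary by the same factor, $e^{0.05} \approx 1.051$, which is roughly a 5.1% increase per year. Small differences in the last cent relative to the table can arise from rounding.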
Testing for Equal Variances

We next calculate the variances of the observations in subgroups 1 and 2 and perform a test of hypothesis for the ratio of the variances.

$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1 \qquad H_1: \frac{\sigma_1^2}{\sigma_2^2} \neq 1$$

Subgroups 1 and 2

[Data listing: subgroups 1 and 2]

Calculating $s_1^2$ and $s_2^2$ for SAL vs. EXP, EXPSQ

[Minitab output for each subgroup]

The variance estimate for each subgroup is the mean square error of its regression: $MSE_1 = s_1^2$ and $MSE_2 = s_2^2$.

Testing the Hypothesis of Equal Variances

$$H_0: \frac{\sigma_{larger}^2}{\sigma_{smaller}^2} = 1 \qquad H_1: \frac{\sigma_{larger}^2}{\sigma_{smaller}^2} \neq 1$$

Test statistic:
$$F = \frac{s_{larger}^2}{s_{smaller}^2} = \frac{MSE_{larger}}{MSE_{smaller}} = \frac{94711023}{31576998} = 2.99937$$

Critical value: $F_{0.025,\,23,\,21} = 2.37$

Accept $H_0$ if $F \le 2.37$.
Reject $H_0$ if $F > 2.37$.

Since $F = 2.99937$ is greater than 2.37, we reject the hypothesis of equal variances. Therefore, the quadratic model has residuals that exhibit heteroscedasticity.

Checking the Normality Assumption

Important note: Moderate departures from the normality assumption will generally not invalidate the results of a regression analysis. We can say that regression analysis is robust with regard to the normality assumption. If a graphical display of the data (stem-and-leaf plot, histogram, etc.) is not badly skewed and has one major central peak, we can be confident in using the model.

Checking for Normality

We will use the model: ln(Salary) = 9.84 + 0.05 Exp

The histogram of the residuals shows that the distribution is mound-shaped and reasonably symmetric. Therefore, we suspect that the normality assumption is satisfied. However, this claim is subjective, so we need a more formal approach.

Normal Probability Plot

The normal probability plot graphs the residuals against the expected values of the residuals under the assumption of normality. If the assumption of normality is true, then each residual should approximately equal its expected value, resulting in a straight-line graph.
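The arithmetic in the Testing the Hypothesis of Equal Variances step can be checked with a short Python sketch. The two MSE values and the critical value 2.37 are copied from the notes rather than recomputed from an F table, and the helper names `variance_ratio` and `reject_equal_variances` are hypothetical names chosen for this illustration.

```python
# Values copied from the notes; nothing here is re-derived from the raw data.
MSE_LARGER = 94711023   # MSE of the subgroup with the larger residual variance
MSE_SMALLER = 31576998  # MSE of the subgroup with the smaller residual variance
F_CRITICAL = 2.37       # F_{0.025, 23, 21} as quoted in the notes


def variance_ratio(mse_a: float, mse_b: float) -> float:
    """Return the F statistic with the larger variance estimate in the numerator."""
    larger, smaller = max(mse_a, mse_b), min(mse_a, mse_b)
    return larger / smaller


def reject_equal_variances(f_stat: float, f_crit: float) -> bool:
    """Decision rule from the notes: reject H0 when F exceeds the critical value."""
    return f_stat > f_crit


f_stat = variance_ratio(MSE_LARGER, MSE_SMALLER)
print(f"F = {f_stat:.5f}, critical value = {F_CRITICAL}")
print("Reject H0 (equal variances):", reject_equal_variances(f_stat, F_CRITICAL))
```

Placing the larger MSE in the numerator guarantees $F \ge 1$, so only the upper-tail critical value is needed even though the alternative hypothesis is two-sided.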
The Anderson-Darling Statistic

The AD statistic is used to test the hypothesis:

$H_0$: Distribution is normal
$H_1$: Distribution is not normal

If the p-value for the AD statistic is $\ge 0.05$, there is no reason to conclude that the distribution is not approximately normal.

Conclusion