# 09 Midterm Solutions

University of British Columbia

Economics

ECON 351

Stephanie Lluis

Winter

Description

University of Waterloo
Department of Economics
ECON 321 – 001 S09
Assignment # 4
Due: July 23, 11:59pm
solutions in brown, grading scheme in blue
1. Use the 321 survey data to investigate study hours for students in the 321 class.
a) Suppose we believe the population model to be:
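$$\text{hrsstudy}_i = \beta_0 + \beta_1\,\text{yrsschool}_i + \beta_2\,\text{skipped}_i + \beta_3\,\text{cheatexam}_i + \beta_4\,(\text{cheatexam}_i \times \text{skipped}_i) + u_i$$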
Examine your data set & run this regression (and any other regression you think appropriate). Is
multicollinearity (perfect or imperfect) a problem? Explain why or why not. (2)
Theoretically, there is no reason to suppose perfect multicollinearity in this regression. However, we might expect
imperfect multicollinearity between skipped and cheatexam, or between either of them and the interaction term. We can
investigate whether multicollinearity is a problem by looking at the correlations between the variables and/or by
running the full specification and smaller ones as well. (½ pt)
The commands: gen chtskip = cheatexam*skipped, then corr yrsschool skipped cheatexam chtskip
Gives the following results:
             | yrssch~l  skipped cheate~m  chtskip
-------------+------------------------------------
   yrsschool |   1.0000
     skipped |   0.1237   1.0000
   cheatexam |   0.0229   0.1029   1.0000
     chtskip |  -0.0236   0.4354   0.5965   1.0000
So while the correlations with the interaction term are moderate (0.44 with skipped and 0.60 with cheatexam), all other
correlations are quite low, and all are well below 0.7.
Running the regression we obtain:
      Source |       SS       df       MS              Number of obs =      37
-------------+------------------------------           F(  4,    32) =    2.09
       Model |  21.5060862     4  5.37652155           Prob > F      =  0.1048
    Residual |  82.2009408    32   2.5687794           R-squared     =  0.2074
-------------+------------------------------           Adj R-squared =  0.1083
       Total |  103.707027    36  2.88075075           Root MSE      =  1.6027
------------------------------------------------------------------------------
    hrsstudy |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   yrsschool |  -.0220612   .0393665    -0.56   0.579    -.1022481    .0581257
     skipped |  -.2904062   .1548842    -1.87   0.070     -.605895    .0250825
   cheatexam |  -1.382985   .7067532    -1.96   0.059    -2.822594    .0566241
     chtskip |   .2791986   .3285231     0.85   0.402    -.3899811    .9483782
       _cons |   3.376675   .5307142     6.36   0.000     2.295646    4.457705
------------------------------------------------------------------------------
This shows that skipped and cheatexam are marginally significant (p-values of 0.070 and 0.059, so significant at the
10% level), while the interaction term is not (p-value of 0.402). So we likely do not have any issues with
multicollinearity. We may wish to double check by dropping the interaction term. If we do so, we find that the
coefficients on skipped and cheatexam remain relatively stable, and while the p-values rise, there does not seem to be
any further indication that multicollinearity is a problem.
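The summary statistics in the regression header follow directly from the sums of squares, so they can be checked by hand. The short Python sketch below is only an arithmetic check on the numbers printed above, not part of the original solution; the variable names are chosen here for illustration.

```python
import math

# Values copied from the ANOVA block of the regression output above
ss_model = 21.5060862
ss_resid = 82.2009408   # this is the SSR asked for in part b)
n_obs = 37
k = 4                   # number of regressors, excluding the constant

ss_total = ss_model + ss_resid
df_resid = n_obs - k - 1            # 32 residual degrees of freedom

r_squared = ss_model / ss_total
adj_r_squared = 1 - (ss_resid / df_resid) / (ss_total / (n_obs - 1))
f_stat = (ss_model / k) / (ss_resid / df_resid)
root_mse = math.sqrt(ss_resid / df_resid)

print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F({k}, {df_resid})      = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.4f}")
```

The printed values should agree with the header of the Stata output to the displayed precision.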
(½ pt for discussion of what they find in the data; either the corr or the regression discussion is fine. 1 pt for
interpretation of these results in terms of whether multicollinearity is an issue.)
b) Run the following regression (regardless of your answer to part a) and report the results
(coefficients, standard errors of coefficients, #obs, R², and SSR). (3)
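$$\text{hrsstudy}_i = \beta_0 + \beta_1\,\text{yrsschool}_i + \beta_2\,\text{skipped}_i + \beta_3\,\text{cheatexam}_i + \beta_4\,(\text{cheatexam}_i \times \text{skipped}_i) + u_i$$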
Number of obs = 37 (½ pt)
SSR 82.2009408 (½ pt)
R-squared = 0.2074 (½ pt)
Adj R-squared = 0.1083
(1½ pt for below, in any readable format)
------------------------------------------------------------------------------
    Variable |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
yrsschool | -.0220612 .0393665 -0.56 0.579 -.1022481 .0581257
skipped | -.2904062 .1548842 -1.87 0.070 -.605895 .0250825
cheatexam | -1.382985 .7067532 -1.96 0.059 -2.822594 .0566241
chtskip | .2791986 .3285231 0.85 0.402 -.3899811 .9483782
_cons | 3.376675 .5307142 6.36 0.000 2.295646 4.457705
------------------------------------------------------------------------------
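Each t statistic in the table is the coefficient divided by its standard error, and each confidence interval is the coefficient plus or minus a critical value times the standard error. The sketch below reproduces those columns from the reported coefficients and standard errors. The critical value 2.0369 (two-sided 5% level, 32 degrees of freedom) is an assumption taken from standard t tables, not a number given in the source, and the helper name `summarize` is hypothetical.

```python
# Coefficients and standard errors copied from the table above
results = {
    "yrsschool": (-0.0220612, 0.0393665),
    "skipped":   (-0.2904062, 0.1548842),
    "cheatexam": (-1.382985,  0.7067532),
    "chtskip":   (0.2791986,  0.3285231),
    "_cons":     (3.376675,   0.5307142),
}

# Assumed two-sided 5% critical value of the t distribution with 32 df
# (from standard tables; not stated in the source)
T_CRIT_32 = 2.0369


def summarize(coef, se, t_crit=T_CRIT_32):
    """Return (t statistic, CI lower bound, CI upper bound)."""
    t_stat = coef / se
    return t_stat, coef - t_crit * se, coef + t_crit * se


for name, (coef, se) in results.items():
    t_stat, lower, upper = summarize(coef, se)
    print(f"{name:>10} | t = {t_stat:6.2f} | 95% CI = [{lower:9.5f}, {upper:9.5f}]")
```

Intervals that contain zero, such as the one for chtskip, correspond to coefficients that are not significant at the 5% level.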
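c) Interpret the coefficient on cheatexam*skipped from part b). (2)
The coefficient on cheatexam*skipped tells us how the effect of skipping classes on hours studied (the
slope) differs for those who have cheated on exams versus those who have not cheated on exams. (1 pt)
Specifically, each 1 unit increase in the number of classes skipped per week is associated with 0.29 fewer hours
studied per night for those who have not cheated on an exam, while for those who have cheated the slope is 0.28 higher
(-0.290 + 0.279 = -0.011), i.e. essentially zero. Note that the -1.38 in the table is the coefficient on cheatexam
itself, not on the interaction, and that the interaction coefficient is not statistically significant
(p-value of 0.402). (1 pt)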
(You could also answer this in terms of how the effect of cheating on exams on hours studied differs with the
number of classes skipped per week, but this is messier as skipped is non-binary.)
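The role of the interaction coefficient is easiest to see by differentiating the regression with respect to skipped. In the sketch below, the labels $\hat\beta_2$ (coefficient on skipped) and $\hat\beta_4$ (coefficient on the interaction) are notation assumed here, and cheatexam is treated as a 0/1 variable; the numbers are the rounded estimates from part b).

```latex
\[
\frac{\partial\,\widehat{\text{hrsstudy}}_i}{\partial\,\text{skipped}_i}
  = \hat\beta_2 + \hat\beta_4\,\text{cheatexam}_i
  = \begin{cases}
      -0.290 & \text{if } \text{cheatexam}_i = 0,\\[2pt]
      -0.290 + 0.279 = -0.011 & \text{if } \text{cheatexam}_i = 1.
    \end{cases}
\]
```

So the interaction coefficient, 0.279, is the difference between the two slopes, not the slope for either group.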
d) Suppose that you think skipped, and possibly yrsschool, should have a non-linear relationship
with hrsstudy. Test for functional form misspecification. Describe the test you ran, present the
statistics and interpret the results. (4)
(1 pt for describing test/method (either RESET or DM), 1 pt for presenting results, 2 points for interpretation. Only
one test is required, not both)
You could run either the RESET or the Davidson-MacKinnon test. In the latter case, you run the original regression
but with all non-binary x variables in logs instead of levels, save the fitted values, and then add those fitted values
as a covariate in the original regression. Note that if the x variables contain many zeros, these observations become
missing when the variables are logged. One option is to not log variables with many zeros. Whatever you do for this
assignment is fine as long as you are clear.
In the RESET test, you would run the original regression, save the fitted values, and re-run the original regression
with the squared and cubed fitted values added as regressors. The F-test on the joint significance of the coefficients
on these fitted-value terms tells you whether or not non-linearities of that form are present. Alternatively, you could
use the built-in Stata command (e.g. estat ovtest, which adds the 2nd through 4th powers of the fitted values) and
generates an F-stat and p-value for you. If you do this, your results are:
      Source |       SS       df       MS              Number of obs =      37
-------------+------------------------------           F(  3,    33) =    2.57
       Model |  19.6507552     3  6.55025174           Prob > F      =  0.0708
    Residual |  84.0562718    33  2.54715975           R-squared     =  0.1895
-------------+------------------------------           Adj R-squared =  0.1158
       Total |  103.707027    36  2.88075075           Root MSE      =   1.596
-----
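The joint F-test used by RESET is the usual restricted-versus-unrestricted comparison of sums of squared residuals. The sketch below writes that formula as a small function; the name `nested_f_stat` is hypothetical. Because the RESET sums of squares are not shown in this document, the usage example applies the function to the two regressions that are shown, on the assumption that the three-regressor output above is the specification without chtskip.

```python
def nested_f_stat(ssr_restricted, ssr_unrestricted, q, df_unrestricted):
    """F statistic for testing q restrictions between two nested OLS models.

    For RESET, the restricted model is the original regression, the
    unrestricted model adds the powers of the fitted values, q is the number
    of added powers, and df_unrestricted = n - k - 1 - q.
    """
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_unrestricted
    return numerator / denominator


# Residual sums of squares copied from the two outputs in this document.
# Assumption: the 3-regressor model is the 4-regressor model minus chtskip.
ssr_full = 82.2009408    # 4 regressors, 32 residual df
ssr_small = 84.0562718   # 3 regressors, 33 residual df

f_drop = nested_f_stat(ssr_small, ssr_full, q=1, df_unrestricted=32)

# With a single restriction, F should equal the squared t statistic
# on the dropped variable in the full model.
t_chtskip = 0.2791986 / 0.3285231

print(f"F for dropping chtskip = {f_drop:.3f}")
print(f"t^2 on chtskip         = {t_chtskip ** 2:.3f}")
```

Both numbers come out at roughly 0.72, which is consistent with the assumption about the smaller model and with the interaction term being insignificant. A RESET statistic would be computed the same way once the augmented regression's SSR is available.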
