MATH 2412 Lecture Notes - Lecture 10: Scatter Plot, Decision Rule, Null Hypothesis

5 Jul 2018
Linear regression with one regressor:
In Week 4, we focused on a linear regression with one regressor, which is often referred
to as a simple linear regression model.
Specification of a linear regression with one regressor:
This regression is specified as follows:
yᵢ = β₀ + β₁xᵢ + uᵢ,  i = 1, 2, …, n
i denotes the i-th observation
n is the total number of observations
x is the sole independent variable (regressor)
y is the dependent variable (regressand)
u is the error term (captures variables other than x that are omitted from the model)
The above model assumes that the direction of causality runs from x (the regressor) to y (the
regressand). The model is referred to as a regression of y (the regressand) on x (the
regressor).
The model also defines the population regression line (because it applies to the
population), in which case β₀ and β₁ are the parameters of the population regression line.
Recall that a parameter is a numerical quantity computed for the entire population; its
value is always fixed, whether known or unknown.
β₀ is the intercept parameter (i.e. the value of the dependent variable y when the
independent variable x is equal to zero).
β₁ is the slope parameter (indicates the effect of increasing x by one unit on the
dependent variable);
β₁ = 0 indicates no relationship between x and y; β₁ > 0 indicates a positive
relationship; and β₁ < 0 indicates a negative relationship.
The presence of only one regressor in a simple linear regression model may lead to
omitted variable bias, i.e. bias in the estimation of the intercept and slope parameters
as a result of excluding other potentially important determinants of y from the
regression equation.
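A small simulation can make omitted variable bias concrete. This is a hypothetical sketch (the coefficients and data are made up, not from the lecture): the true model has a second regressor z that is correlated with x, and a simple regression that omits z produces a biased slope on x.

```python
import numpy as np

# Hypothetical simulation (coefficients are made up, not from the notes).
# True model: y depends on x AND z, with z correlated with x.
rng = np.random.default_rng(42)
n = 10_000
x = rng.normal(size=n)
z = 0.8 * x + rng.normal(size=n)     # z moves with x
u = rng.normal(size=n)
y = 1.0 + 2.0 * x + 3.0 * z + u      # true slope on x is 2.0

# Simple regression of y on x alone (z omitted from the model)
slope = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
print(slope)  # far from the true 2.0: the omitted z loads onto x's coefficient
```

Because cov(x, z) = 0.8·var(x) in this setup, the estimated slope is pulled toward 2 + 3 × 0.8 = 4.4 instead of the true 2.0.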
The method of ordinary least squares (OLS)
OLS involves determining the values of β̂₀ and β̂₁ that minimize the sum of squared
residuals. β̂₀ and β̂₁ are used to set up the sample regression line, since sample data are
used to evaluate them. Hence, β̂₀ and β̂₁ define sample statistics whose values
may vary from sample to sample (i.e. they are random variables, the probability
distributions of which are called sampling distributions; and they have standard errors
too!!).
If we solve the problem of minimizing the sum of squared residuals for the model
yᵢ = β₀ + β₁xᵢ + uᵢ using calculus, we get:

β̂₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²  (sums running from i = 1 to n),

the OLS estimator for β₁ (the slope).
It is also appropriate to divide both the numerator and the denominator by (n − 1) (it
cancels out anyway!), in which case the numerator becomes the sample covariance between
x and y and the denominator becomes the sample variance of x. Hence, β̂₁ can also be
obtained by dividing the covariance between x and y by the variance of x.
β̂₀ = ȳ − β̂₁x̄, the OLS estimator of β₀ (the intercept).
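As a quick numerical check, the two equivalent slope formulas and the intercept formula can be sketched in Python with NumPy; the data points here are illustrative and not from the notes.

```python
import numpy as np

# Illustrative sample data (made up for this example, not from the notes)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# Slope: sum of (x_i - x_bar)(y_i - y_bar) over sum of (x_i - x_bar)^2
beta1_hat = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()

# Equivalent form: sample covariance of x and y over sample variance of x
# (the (n - 1) in each cancels out)
beta1_alt = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)

# Intercept: y_bar - beta1_hat * x_bar
beta0_hat = y_bar - beta1_hat * x_bar

print(beta1_hat, beta1_alt, beta0_hat)
```

For these made-up points the two slope formulas agree (β̂₁ = 1.96) and the intercept works out to β̂₀ = 0.14.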
Predicted values
Regression models are more attractive than the correlation coefficient or the covariance
for purposes of studying relationships owing to the ability to predict the mean value of
y given x.
For the model yᵢ = β₀ + β₁xᵢ + uᵢ, the predicted value of yᵢ conditional on the value
of xᵢ is given by:

ŷᵢ = β̂₀ + β̂₁xᵢ
One important property of predicted values:
1. The mean of the predicted values of the dependent variable is always equal to the mean
of the actual values of the dependent variable when an intercept parameter β₀ is included
in the model, i.e.

(1/n) Σ ŷᵢ = (1/n) Σ yᵢ  (sums running from i = 1 to n)
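This property is easy to verify numerically. A minimal sketch assuming NumPy, with made-up data (not from the notes):

```python
import numpy as np

# Made-up sample data (not from the notes)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS fit: slope = cov(x, y) / var(x), intercept = y_bar - slope * x_bar
b1 = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
b0 = y.mean() - b1 * x.mean()

# Predicted values from the sample regression line
y_hat = b0 + b1 * x

# With an intercept in the model, mean of predictions equals mean of actuals
print(y_hat.mean(), y.mean())
```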
Residuals
Recall that ordinary least squares involves minimizing the sum of squared residuals.
A residual is obtained by subtracting the predicted value of y (which is read from the
estimated regression line) from the actual value of y (which is read from the actual point
on the scatter diagram), i.e.

ûᵢ = yᵢ − ŷᵢ
A residual could be positive (if the point on the scatter diagram is above the regression
line) or negative (if the point on the scatter diagram lies below the regression line). To
ensure that negative residuals do not cancel out positive residuals, it makes sense to
minimize the sum of squared residuals, which ordinary least squares entails.
Two other important properties of residuals:
1. The mean of the residuals is always equal to zero when an intercept β₀ is included in
the model, i.e. (1/n) Σ ûᵢ = 0  (sum running from i = 1 to n).
2. If we multiply each residual by the corresponding value of the regressor x and add
them up, we always get a sum of zero when an intercept β₀ is included in the model, i.e.
Σ ûᵢxᵢ = 0  (sum running from i = 1 to n).
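Both residual properties can be checked numerically. A sketch assuming NumPy, with made-up data (not from the notes):

```python
import numpy as np

# Made-up sample data (not from the notes)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS fit with an intercept
b1 = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)   # residuals: actual minus predicted

print(u_hat.sum())          # ~0 when an intercept is included
print((u_hat * x).sum())    # ~0 as well
```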
The three sums of squares
The so-called three sums of squares, i.e. the total sum of squares (TSS), the explained
sum of squares (ESS) and the sum of squared residuals (SSR), are important for both
hypothesis testing and assessing the goodness-of-fit of the model.
1. Total sum of squares (TSS) = Σ (yᵢ − ȳ)², the sum of squared deviations of actual y
from its mean (sum running from i = 1 to n).
2. Explained sum of squares (ESS) = Σ (ŷᵢ − ȳ)², the sum of squared deviations of
predicted y from its mean (sum running from i = 1 to n).
Note: Mean of predicted y = mean of actual y; the explained sum of squares is also called
the regression sum of squares.
3. Sum of squared residuals (SSR) = Σ (yᵢ − ŷᵢ)², the sum of squared differences between
actual and predicted y (sum running from i = 1 to n).
Note: The sum of squared residuals is also called the error sum of squares.
The three sums of squares are related as follows:
TSS=ESS+SSR
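The decomposition TSS = ESS + SSR can be verified numerically. A sketch assuming NumPy, with made-up data (not from the notes):

```python
import numpy as np

# Made-up sample data (not from the notes)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS fit with an intercept
b1 = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

tss = ((y - y.mean()) ** 2).sum()      # total sum of squares
ess = ((y_hat - y.mean()) ** 2).sum()  # explained sum of squares
ssr = ((y - y_hat) ** 2).sum()         # sum of squared residuals

print(tss, ess + ssr)  # the two should match
```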
Presentation of the three sums of squares
An Analysis of Variance (ANOVA) table is used:
Format of ANOVA Table (Simple Linear Regression)
Source of variation     degrees of freedom   Sum of Squares   Mean Square   F-ratio
Explained/Regression    1                    ESS              ESS/1         (ESS/1) / (SSR/(n−2))
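The F-ratio in the table's first row, (ESS/1)/(SSR/(n − 2)), can be computed directly. A sketch assuming NumPy, with made-up data (not from the notes):

```python
import numpy as np

# Made-up sample data (not from the notes)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# OLS fit and the relevant sums of squares
b1 = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
ess = ((y_hat - y.mean()) ** 2).sum()
ssr = ((y - y_hat) ** 2).sum()

# F-ratio from the ANOVA table: (ESS / 1) / (SSR / (n - 2))
f_ratio = (ess / 1) / (ssr / (n - 2))
print(f_ratio)
```

For these nearly collinear made-up points the F-ratio is large (roughly 1253), reflecting a very tight linear fit.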