OMIS 2010 Chapter Notes - Chapter 17: Linear Regression, Interval Estimation, Variance

Published on 6 Oct 2011
School: York University
Department: Operations Management and Information System
Course: OMIS 2010
Chapter 17: Simple Linear Regression and Correlation
17.1 Introduction
This chapter deals with the problem objective of analyzing the relationship between two variables.
You are expected to know the following:
1. Why the model includes the error variable, ε.
2. How to calculate the sample regression coefficients b₀ and b₁ (the estimates of β₀ and β₁).
3. How to interpret the coefficients.
4. The four required conditions to perform the statistical inference.
5. How to calculate SSE and s_ε.
6. How to test and estimate β₁.
7. How to calculate and interpret the coefficient of determination.
8. How to distinguish between and calculate the interval estimate of the expected value of y and the prediction interval of y.
9. How to calculate and test the Spearman rank coefficient of correlation.
17.2 Model
In this section, we discussed why the model includes the linear part y = β₀ + β₁x plus the error variable, ε. You should have a clear understanding of how the error variable is measured and of the definition of the y-intercept β₀ and the slope β₁.
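The role of the error variable can be sketched numerically. The following is a minimal simulation, assuming Python with numpy; the parameter values, sample size, and noise level below are illustrative choices, not figures from the text:

```python
import numpy as np

# Minimal sketch of the model y = beta0 + beta1*x + epsilon.
# All numbers below are hypothetical, chosen only for illustration.
rng = np.random.default_rng(0)

beta0, beta1, sigma = -14.0, 3.7, 5.0       # assumed population parameters
x = rng.uniform(8, 17, size=100)            # e.g., years of education
epsilon = rng.normal(0.0, sigma, size=100)  # error variable: mean 0, constant spread
y = beta0 + beta1 * x + epsilon             # observed values scatter around the line

print(y.shape)
```

The error variable ε is what makes observed points deviate from the straight line β₀ + β₁x; without it, every point would fall exactly on the line.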
17.3 Least Squares Method
We calculate the sample regression coefficients using the following formulas:
b₁ = cov(X, Y) / s_x²

b₀ = ȳ − b₁x̄
Example 17.1
An educational economist wants to establish the relationship between an individual’s income and
education. He takes a random sample of 10 individuals and asks for their income (in $1,000s) and edu-
cation (in years). The results are shown below. Find the least squares regression line.
x (education) y (income)
11 25
12 33
11 22
15 41
8 18
10 28
11 32
12 24
17 53
11 26
Solution
Note that we've labeled education x and income y because income is affected by education. Our first step is to calculate the sums Σxᵢ, Σxᵢ², Σyᵢ, Σyᵢ², and Σxᵢyᵢ. The sum Σyᵢ² is not required in the least squares method but is usually needed for other techniques involved with regression. We find

Σxᵢ = 118   Σxᵢ² = 1,450   Σyᵢ = 302   Σyᵢ² = 10,072   Σxᵢyᵢ = 3,779
'
Next, we compute the covariance and the variance of x:

cov(X, Y) = Σ(xᵢ − x̄)(yᵢ − ȳ) / (n − 1)
          = [Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n] / (n − 1)
          = [3,779 − (118)(302)/10] / (10 − 1)
          = 23.93

s_x² = Σ(xᵢ − x̄)² / (n − 1)
     = [Σxᵢ² − (Σxᵢ)²/n] / (n − 1)
     = [1,450 − (118)²/10] / (10 − 1)
     = 6.40

Therefore,

b₁ = cov(X, Y) / s_x² = 23.93 / 6.40 = 3.74
The sample means are

x̄ = Σxᵢ / n = 118 / 10 = 11.8
ȳ = Σyᵢ / n = 302 / 10 = 30.2

We can now compute the y-intercept:

b₀ = ȳ − b₁x̄ = 30.2 − (3.74)(11.8) = −13.93
Thus, the least squares regression line is
ŷ = −13.93 + 3.74x
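As a check on the arithmetic, the same calculation can be reproduced in a few lines. This sketch assumes Python with numpy and follows the shortcut formulas used above:

```python
import numpy as np

# Data from Example 17.1: education (years) and income (in $1,000s)
x = np.array([11, 12, 11, 15, 8, 10, 11, 12, 17, 11], dtype=float)
y = np.array([25, 33, 22, 41, 18, 28, 32, 24, 53, 26], dtype=float)

n = len(x)
cov_xy = ((x * y).sum() - x.sum() * y.sum() / n) / (n - 1)  # sample covariance
s2_x = ((x ** 2).sum() - x.sum() ** 2 / n) / (n - 1)        # sample variance of x

b1 = cov_xy / s2_x             # slope
b0 = y.mean() - b1 * x.mean()  # y-intercept

print(round(b1, 2), round(b0, 2))  # 3.74 -13.93
```

For example, the fitted line predicts an income of ŷ = −13.93 + 3.74(12) ≈ 31.0 (thousand dollars) for an individual with 12 years of education.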
Example 17.2
Interpret the coefficients of Example 17.1.
Solution
The sample slope b₁ = 3.74 tells us that, on average, each additional year of education raises an individual's income by $3.74 thousand (i.e., $3,740). The y-intercept is b₀ = −13.93. This value has no practical meaning here: whenever the range of the observed values of x does not include zero, it is usually pointless to try to interpret the y-intercept.