Chapter 17: Simple Linear Regression and Correlation
This chapter dealt with the problem objective of analyzing the relationship between two variables.
You are expected to know the following:
1. Why the model includes the error variable, \(\varepsilon\).
2. How to calculate the sample regression coefficients \(b_0\) and \(b_1\).
3. How to interpret the coefficients.
4. The four required conditions to perform the statistical inference.
5. How to calculate SSE and \(s_\varepsilon\).
6. How to test and estimate \(\beta_1\).
7. How to calculate and interpret the coefficient of determination.
8. How to distinguish between and calculate the interval estimate of the expected value of y and the
prediction interval of y.
9. How to calculate and test the Spearman rank coefficient of correlation.
In this section, we discussed why the model includes the linear part \(y = \beta_0 + \beta_1 x\) plus the error
variable, \(\varepsilon\). You should have a clear understanding of how the error variable is measured and of the
definition of the y-intercept \(\beta_0\) and the slope \(\beta_1\).
17.3 Least Squares Method
We calculate the sample regression coefficients using the following formulas:
\[ b_1 = \frac{\mathrm{cov}(X,Y)}{s_x^2}, \qquad b_0 = \bar{y} - b_1 \bar{x} \]
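As a minimal sketch, the coefficient formulas can be written as a short Python function. The function name `least_squares_line` and the four data points are hypothetical and chosen only for illustration; the points lie exactly on the line y = 1 + 2x so the result is easy to check by hand.

```python
from statistics import mean


def least_squares_line(x, y):
    """Return (b0, b1) for the fitted line y-hat = b0 + b1 * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sample covariance and sample variance of x; both divide by n - 1.
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    if var_x == 0:
        raise ValueError("x must take at least two distinct values")
    b1 = cov_xy / var_x          # slope
    b0 = y_bar - b1 * x_bar      # y-intercept
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x (not from the text).
b0, b1 = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

Because both the covariance and the variance divide by n - 1, that factor cancels in the slope; it is kept here only so that the code mirrors the formulas term by term.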
An educational economist wants to establish the relationship between an individual's income and
education. He takes a random sample of 10 individuals and asks for their income (in $1,000s) and
education (in years). The results are shown below. Find the least squares regression line.
x (education)    y (income)
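Note that we've labeled education x and income y because income is affected by education. Our first
step is to calculate the sums \(\sum x_i\), \(\sum x_i^2\), \(\sum y_i\), \(\sum y_i^2\), and \(\sum x_i y_i\). The sum \(\sum y_i^2\) is not required
in the least squares method but is usually needed for other techniques involved with regression. We find
\[ \sum x_i = 118, \qquad \sum x_i^2 = 1{,}450, \qquad \sum y_i = 302, \qquad \sum x_i y_i = 3{,}779 \]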
Next, we compute the covariance and the variance of x:
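\[ \mathrm{cov}(X,Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1} = \frac{1}{n-1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right] = \frac{1}{10-1}\left[3{,}779 - \frac{(118)(302)}{10}\right] = 23.93 \]
\[ s_x^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right] = \frac{1}{10-1}\left[1{,}450 - \frac{(118)^2}{10}\right] = 6.40 \]
The sample slope is therefore
\[ b_1 = \frac{\mathrm{cov}(X,Y)}{s_x^2} = \frac{23.93}{6.40} = 3.74 \]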
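The sample means are
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{118}{10} = 11.8, \qquad \bar{y} = \frac{\sum y_i}{n} = \frac{302}{10} = 30.2 \]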
We can now compute the y-intercept
\[ b_0 = \bar{y} - b_1 \bar{x} = 30.2 - (3.74)(11.8) = -13.93 \]
Thus, the least squares regression line is
\[ \hat{y} = -13.93 + 3.74x \]
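The arithmetic of this example can be re-checked from the summations alone. The sketch below uses only the sums reported in the example (n = 10, sum of x = 118, sum of y = 302, sum of x squared = 1,450, sum of xy = 3,779); the variable names are illustrative.

```python
# Summations reported in the worked example.
n = 10
sum_x, sum_y = 118, 302
sum_x2, sum_xy = 1450, 3779

# Shortcut formulas for the sample covariance and the sample variance of x.
cov_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)
var_x = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Sample means, then slope and y-intercept.
x_bar, y_bar = sum_x / n, sum_y / n
b1 = cov_xy / var_x
b0 = y_bar - b1 * x_bar

print(f"cov(X,Y) = {cov_xy:.2f}, s_x^2 = {var_x:.2f}")
print(f"y-hat = {b0:.2f} + {b1:.2f}x")
```

The code carries the unrounded slope into the intercept calculation, whereas the text rounds the slope to 3.74 first; in this example both routes give the same coefficients to two decimal places.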
Interpret the coefficients of Example 17.1.
The sample slope \(b_1 = 3.74\) tells us that on average, for each additional year of education, an
individual's income rises by $3.74 thousand. The y-intercept is \(b_0 = -13.93\). It should be obvious that this
value has no meaning: it would be the income of an individual with zero years of education, and it is
negative. Recall that whenever the range of the observed values of x does not include zero,
it is usually pointless to try to interpret the meaning of the y-intercept.
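A small sketch may make the two interpretations concrete. It assumes the fitted coefficients from the example (slope 3.74, y-intercept -13.93, which follows from 30.2 - (3.74)(11.8)); the function name `predicted_income` and the education values 12 and 13 are hypothetical inputs chosen for illustration.

```python
def predicted_income(years_of_education, b0=-13.93, b1=3.74):
    """Predicted income in $1,000s from the fitted line y-hat = b0 + b1 * x."""
    return b0 + b1 * years_of_education


# One additional year of education changes the prediction by exactly the slope.
diff = predicted_income(13) - predicted_income(12)
print(f"Change for one extra year: {diff:.2f} (in $1,000s)")

# At x = 0 the line returns the y-intercept, a negative income, which is
# why the intercept has no practical interpretation here.
at_zero = predicted_income(0)
print(f"Prediction at zero years: {at_zero:.2f} (in $1,000s)")
```

The difference between any two predictions one year apart equals the slope regardless of the starting value, which is what "on average, for each additional year" means for a straight line.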
Question: How do I know which is the dependent variable and which is the independent
variable?
Answer: The dependent variable is the one that we want to forecast or analyze. The
independent variable is hypothesized to affect the dependent variable. In
Example 17.1, we wish to analyze income, and we choose as the variable that
most affects income the individual's education. Hence, we label income y and
education x.
17.1 Fifteen observations were taken to estimate a simple regression model. The following
summations were produced:
\[ \sum x = 50, \qquad \sum x^2 = 250, \qquad \sum y = 100, \qquad \sum y^2 = 1{,}100, \qquad \sum xy = 500 \]
Find the least squares regression line.
17.2 The manager of a large furniture store wanted to determine the effectiveness of her advertising.
The furniture store regularly runs several ads per month in the local newspaper. The manager
wanted to know if the number of ads influenced the number of customers. During the past eight