ECO220Y1 Lecture Notes - Lecture 5: Ordinary Least Squares, Analysis Of Variance, Scatter Plot
ECO220
Lecture 5
May 22, 2018
1
Chapter 7: Introduction to Simple Regression
The simplest function is a straight line function.
i.e., we assume
$y = f(x) = b_0 + b_1 x$
as the simplest function relating x and y.
We do not know the values of b0 and b1.
We select a random sample of size n. Observations are (x1, y1), (x2, y2), …, (xn, yn).
Sample data are:
i    x    y
1    x1   y1
2    x2   y2
3    …    …
n    xn   yn
Plot (xi, yi) on a graph
At x = xi, the actual y-value is yi; the y-value based on the straight line is ŷi.
i.e., at x = xi,
yi = actual y-value
ŷi = estimated y-value
ei = error on y-value = yi - ŷi
For all the pts. below the red line, the errors are negative
For all the pts. above the red line, the errors are positive
Clearly ei can be positive, negative or zero.
ei is called the error or residual.
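As a small illustration of these definitions, the sketch below computes ŷi and ei for a made-up sample and one candidate line. The data and the values of b0 and b1 are hypothetical (not from the lecture), chosen only so that the residuals come out positive, negative and zero.

```python
# Hypothetical sample (not from the lecture): five (x, y) observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# A candidate line y_hat = b0 + b1 * x; these coefficients are assumed values.
b0, b1 = 1.0, 1.0


def residuals(xs, ys, b0, b1):
    """Return e_i = y_i - y_hat_i for each observation."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


for x, y, e in zip(xs, ys, residuals(xs, ys, b0, b1)):
    # A positive residual means the point lies above the line, negative below.
    position = "above" if e > 0 else "below" if e < 0 else "on"
    print(f"x={x}, y={y}, y_hat={b0 + b1 * x}, e={e}: point is {position} the line")
```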
Question: How do we draw the straight line (shown in red in the above picture)?
Intuitively, the best line is the line with the total error equal to a minimum.
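However, Total Error = $\sum_{i=1}^{n} e_i = \sum_{i=1}^{n} (y_i - \hat{y}_i)$, in which positive and negative errors cancel.
Therefore, we cannot use this to minimize error.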
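To see why total error fails as a criterion, the sketch below evaluates the sum of the residuals for three quite different lines. The data and the candidate lines are hypothetical, not from the lecture. Each line passes through the point of sample means, so the positive and negative errors cancel in every case and the total cannot tell the lines apart.

```python
# Hypothetical sample (not from the lecture); its means are x_bar = 3, y_bar = 4.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def total_error(xs, ys, b0, b1):
    """Sum of residuals e_i = y_i - (b0 + b1 * x_i)."""
    return sum(y - (b0 + b1 * x) for x, y in zip(xs, ys))


# Three assumed candidate lines (b0, b1), all passing through (3, 4).
candidates = {"flat": (4.0, 0.0), "rising": (2.2, 0.6), "falling": (7.0, -1.0)}

for name, (b0, b1) in candidates.items():
    # Each total is zero up to floating-point rounding.
    print(name, round(total_error(xs, ys, b0, b1), 10))
```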
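Define Sum of squared error = SSE
As SSE = $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \ge 0$, positive and negative errors no longer cancel.
The straight line that minimizes SSE is the best line to relate x and y.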
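A minimal sketch of SSE as a criterion, using a hypothetical sample and three assumed candidate lines (none of them from the lecture). Because each error is squared before summing, the candidates get clearly different scores, and the line with the smallest SSE is the best of the three.

```python
# Hypothetical sample (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sse(xs, ys, b0, b1):
    """Sum of squared errors for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Assumed candidate lines (b0, b1), for illustration only.
candidates = {"flat": (4.0, 0.0), "rising": (2.2, 0.6), "falling": (7.0, -1.0)}

scores = {name: sse(xs, ys, b0, b1) for name, (b0, b1) in candidates.items()}
for name, score in scores.items():
    print(f"{name}: SSE = {score:.2f}")

# The candidate with the smallest SSE fits these points best.
best_name = min(scores, key=scores.get)
print("best of the three:", best_name)
```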
Method of Least Squares (Method of Ordinary Least Squares, OLS)
SSE = Sum of squared error OR Sum of squared residual
Find b0 and b1, so that SSE is a minimum
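A sketch of where the two equations come from, assuming SSE = Σ(yi − b0 − b1 xi)², i.e. the sum of the squared residuals ei = yi − ŷi with ŷi = b0 + b1 xi. At a minimum, the partial derivatives of SSE with respect to b0 and b1 are both zero:

```latex
\begin{align*}
\frac{\partial\,\mathrm{SSE}}{\partial b_0}
  &= -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0 \\
\frac{\partial\,\mathrm{SSE}}{\partial b_1}
  &= -2 \sum_{i=1}^{n} x_i \,(y_i - b_0 - b_1 x_i) = 0
\end{align*}
```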
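Two equations, two unknowns (b0 and b1). We can solve for b0 and b1 given xi, yi from the
sample data. The results are
$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$ and $b_0 = \bar{y} - b_1 \bar{x}$,
where $\bar{x}$ and $\bar{y}$ are the sample means of x and y.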
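A minimal numerical sketch of the least-squares solution on hypothetical data (not from the lecture). It uses the standard closed form, with the slope as the sum of cross-deviations over the sum of squared x-deviations and the intercept as ȳ − b1·x̄, then checks that nudging either coefficient only increases SSE. The function names `ols_fit` and `sse` are illustrative choices.

```python
# Hypothetical sample (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def ols_fit(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return b0, b1


def sse(xs, ys, b0, b1):
    """Sum of squared errors for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


b0, b1 = ols_fit(xs, ys)
best = sse(xs, ys, b0, b1)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {best:.3f}")

# Moving either coefficient away from the fitted value makes SSE larger.
for db0, db1 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    worse = sse(xs, ys, b0 + db0, b1 + db1)
    print(f"shift ({db0:+.1f}, {db1:+.1f}): SSE = {worse:.3f}")
```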