STAT1008 Study Guide - Final Guide: Dependent And Independent Variables, Scatter Plot

17 May 2018
2.6 Two Quantitative Variables: Linear Regression
The Regression Line
The regression line provides a model of a linear association between two variables. We can
use the regression line on a scatterplot to give a predicted value of the response variable,
based on a given value of the explanatory variable.
The estimated regression line is $\hat{y} = a + bx$.
Unlike correlation, for linear regression it does matter which is the explanatory and which is
the response variable.
Note: The regression line to predict y from x is not the same as the regression line to predict
x from y.
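A minimal sketch of this point, using made-up data and a hypothetical helper fit_line (neither comes from the course material). The helper uses the standard least squares formulas, slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). Swapping the roles of x and y gives a different line:

```python
def fit_line(x, y):
    """Return (a, b) for the least squares line y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Made-up illustrative data (not from the study guide)
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

a_yx, b_yx = fit_line(x, y)  # line for predicting y from x
a_xy, b_xy = fit_line(y, x)  # line for predicting x from y

# Rewrite x_hat = a_xy + b_xy * y in the form y = ... so both lines can be
# compared on the same scatterplot: its slope there is 1 / b_xy.
slope_if_inverted = 1 / b_xy

print("slope when predicting y from x:", b_yx)
print("slope of the x-from-y line drawn on the same axes:", slope_if_inverted)
```

The two slopes only agree when the points lie exactly on a line, so the choice of explanatory and response variable changes the fitted line.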
Predicted and Actual Values
The observed response value, y, is the response value observed for a particular data point.
The predicted response value, $\hat{y}$, is the response value that would be predicted for a
given x value, based on a model.
The best-fitting line is the one that makes the predicted values closest to the actual values.
Residuals
The residual for each data point = observed - predicted = $y - \hat{y}$
The residual is also the signed vertical distance from each point to the line.
Points above the line have positive residuals and points below the line have negative residuals.
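As a small illustration, the sketch below computes residuals for a few points. The data are made up, and the intercept and slope are assumed for the example (they happen to be the least squares values for this data):

```python
# Made-up data and an assumed line y_hat = a + b*x (illustrative only)
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
a, b = 0.6, 0.8

residuals = []
for xi, yi in zip(x, y):
    y_hat = a + b * xi        # predicted response for this x
    residual = yi - y_hat     # observed minus predicted
    residuals.append(residual)
    position = "above" if residual > 0 else "below" if residual < 0 else "on"
    print(f"x={xi}: observed={yi}, predicted={y_hat:.1f}, "
          f"residual={residual:+.1f} ({position} the line)")
```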
Least Squares Line
The least squares line is the line which minimises the sum of squared residuals.
Minimise $\sum_i (y_i - \hat{y}_i)^2$
Rely on technology for finding the least squares line.
"least squares line" = "regression line"
If we add up all the residuals from the regression line, the sum will always be zero.
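To see what "minimises the sum of squared residuals" means, the sketch below tries many candidate lines on a coarse grid and keeps the one with the smallest sum. This is only an illustration under stated assumptions: the data are made up and chosen so that the true least squares line falls exactly on the grid, and the function name sum_squared_residuals is hypothetical. Statistical software finds the line exactly rather than by searching.

```python
# Made-up illustrative data (not from the study guide)
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

def sum_squared_residuals(a, b):
    """Sum of (observed - predicted)^2 for the line y_hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Candidate intercepts and slopes: 0.0, 0.1, ..., 2.0
candidates = [(i / 10, j / 10) for i in range(21) for j in range(21)]

# Keep the candidate line with the smallest sum of squared residuals
best_a, best_b = min(candidates, key=lambda ab: sum_squared_residuals(*ab))

# For the least squares line, the plain (unsquared) residuals sum to zero
residual_sum = sum(yi - (best_a + best_b * xi) for xi, yi in zip(x, y))

print("best intercept and slope on the grid:", best_a, best_b)
print("sum of residuals (zero up to rounding):", residual_sum)
```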
Slope and Intercept
For the estimated regression line $\hat{y} = a + bx$, a is the intercept and b is the slope.
Slope: the change in predicted y for every one-unit increase in x.
Intercept: predicted y value when x = 0.
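These two interpretations can be checked directly. The sketch below uses hypothetical values for the intercept and slope and a hypothetical function named predict:

```python
a, b = 0.6, 0.8  # hypothetical intercept and slope

def predict(x):
    """Predicted response for a given x, using y_hat = a + b*x."""
    return a + b * x

at_zero = predict(0)                # the intercept: predicted y when x = 0
per_unit = predict(4) - predict(3)  # the slope: change in predicted y per unit of x

print("predicted y at x = 0:", at_zero)
print("change in predicted y when x goes from 3 to 4:", per_unit)
```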
Regression Cautions
1. Do not use the regression equation or line to predict outside the range of x values available
in your data.
o If none of the x values are anywhere near 0, then the intercept is meaningless.
o It is helpful to think about units when interpreting a regression equation.
2. Although the regression line can be calculated for any set of paired quantitative variables, it
is only appropriate to use a regression line when there is a linear trend in the data.
3. Outliers have a strong influence on the regression line.
o Data points for which the explanatory value is an outlier are often called influential
points.
4. Higher values of x may lead to higher (or lower) predicted values of y, but this does not
mean that changing x will cause y to increase or decrease.
o Causation can only be determined if the values of the explanatory variable were
assigned randomly, as in a randomised experiment (which is rarely the case for a
continuous explanatory variable).
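The influence of outliers (caution 3) can be seen in a small sketch. The data, the added point and the helper fit_line are all made up for illustration. A single point with an extreme x value is enough to change the slope dramatically:

```python
def fit_line(x, y):
    """Return (a, b) for the least squares line y_hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Made-up illustrative data with a clear upward trend
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

# Same data plus one made-up point whose x value is far from the rest
x_with = x + [15]
y_with = y + [2]

a_without, b_without = fit_line(x, y)
a_with, b_with = fit_line(x_with, y_with)

print(f"slope without the extreme point: {b_without:.3f}")
print(f"slope with the extreme point:    {b_with:.3f}")
```

Here one extra point pulls the slope from clearly positive to roughly flat, which is why it is worth checking the scatterplot for influential points before interpreting a regression line.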
find more resources at oneclass.com