STA215H5 Lecture Notes - Lecture 8: Confounding, Dependent And Independent Variables, Bee Sting

6 Jun 2018
STA215; Chapter 8
Recall; Residuals
A residual is the difference between an observed value of the response variable and the
value predicted by the regression line. That is, a residual is the prediction error that
remains after we have chosen the regression line.
Residual = observed y − predicted y. We will write this as: residual = y − ŷ.
By hand;
To find the residuals by hand, you will need to find the prediction, ŷ, for each
observation, y. Then subtract the prediction from the observed value: y − ŷ.
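The by-hand steps can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the data values and the helper names `least_squares_line` and `residuals` are made up for the example, and the line is fitted with the usual least-squares formulas (slope = Sxy / Sxx, intercept = ȳ − slope · x̄).

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys):
    """Return the list of residuals y - y-hat, one per observation."""
    intercept, slope = least_squares_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope = least_squares_line(xs, ys)
resids = residuals(xs, ys)

for x, y, e in zip(xs, ys, resids):
    y_hat = intercept + slope * x
    print(f"x={x:.1f}  observed y={y:.1f}  predicted y={y_hat:.2f}  residual={e:+.2f}")
```

For these made-up numbers the fitted line is ŷ = 2.2 + 0.6x, so the first observation (x = 1, y = 2) has prediction 2.8 and residual 2 − 2.8 = −0.8.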
NOTE: The correlation between the residuals and x is 0 (up to round-off error).
This is a special property of least-squares regression.
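This property can be checked numerically. The sketch below uses hypothetical data and hand-written helpers (`fit_line` and `correlation` are illustrative names, not from the lecture); it fits the least-squares line, computes the residuals, and then computes the correlation between the residuals and x.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def correlation(us, vs):
    """Pearson correlation coefficient r between two equal-length lists."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    suv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    suu = sum((u - u_bar) ** 2 for u in us)
    svv = sum((v - v_bar) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)


# Hypothetical data for illustration only.
xs = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
ys = [3.0, 7.0, 8.0, 12.0, 11.0, 16.0]

a, b = fit_line(xs, ys)
resids = [y - (a + b * x) for x, y in zip(xs, ys)]

r_resid_x = correlation(resids, xs)
print("sum of residuals:", sum(resids))            # zero up to round-off
print("correlation(residuals, x):", r_resid_x)     # zero up to round-off
```

The printed values will typically be tiny numbers on the order of floating-point round-off rather than exactly 0, which is what "up to round-off error" refers to.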
Influential Observations
An observation is influential for a statistical calculation if removing it would significantly
change the result of the calculation.
The result of a statistical calculation may be of little practical use if it depends strongly on
a few influential observations.
Points that are outliers in either the x or the y direction of a scatterplot are often
influential for the correlation. Points that are outliers in the x direction are often influential
for the least-squares regression line.
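One way to see influence in practice is to refit the line with and without a given point and compare the results. The sketch below does this for the slope. All numbers are hypothetical (they are not the metabolic-rate data used in the example that follows), and `fit_line` is an illustrative helper. Point A is an outlier in the y direction near the middle of the x range; Point B is an outlier in the x direction.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical base data lying close to a straight line.
base_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
base_y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

# Two hypothetical extra points.
point_a = (3.0, 9.0)    # outlier in y, x value inside the range of the data
point_b = (15.0, 5.0)   # outlier in x, far to the right of the other points

_, slope_base = fit_line(base_x, base_y)
_, slope_a = fit_line(base_x + [point_a[0]], base_y + [point_a[1]])
_, slope_b = fit_line(base_x + [point_b[0]], base_y + [point_b[1]])

print(f"slope, base data only:   {slope_base:.3f}")
print(f"slope, base + Point A:   {slope_a:.3f}  (change {slope_a - slope_base:+.3f})")
print(f"slope, base + Point B:   {slope_b:.3f}  (change {slope_b - slope_base:+.3f})")
```

With these made-up values, adding Point B changes the slope far more than adding Point A, which matches the statement that outliers in the x direction are often influential for the least-squares line.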
Example;
a) Make a scatterplot of the data that is suitable for predicting metabolic rate from
body mass, with two new points added. Point A: mass 42 kilograms, metabolic
rate 1500 calories. Point B: mass 70 kilograms, metabolic rate 1400 calories.
b) In which direction is each of these points an outlier?
Point A lies above the other points; that is, the metabolic rate is higher
than we expect for the given body mass. Point B lies to the right of the
other points; that is, it is an outlier in the x (mass) direction, and the
metabolic rate is lower than we would expect for the given body mass.
c) Add three least-squares regression lines to your plot: one for the original 12
women, one for the original women plus Point A, and one for the original women
plus Point B.
d) Which new point is more influential for the regression line? Explain why.