# STAT 100 Chapter 15

3 Apr 2015

Chapter 15 – Describing Relationships: Regression, Prediction, and Causation
Regression line – straight line that describes how a response variable y changes as an explanatory variable x changes
o Often used to predict the value of y for a given value of x
Want to draw a line that is close to the points in the vertical (y) direction
o Need to find the equation of the line that comes closest to the points in the vertical direction
Least-squares regression of y on x – line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible
o Look at the vertical distances of the points from the regression line, square them, and move the line until the sum of the squares is the smallest it can be for any line
y = a + bx
o x: explanatory variable; y: response variable
o b: slope of the line (the amount by which y changes when x increases by one unit)
o a: intercept (the value of y when x = 0)
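As a small illustration of the least-squares idea, the following Python sketch fits y = a + bx to a made-up data set. The notes do not give formulas for a and b; the closed-form expressions used here are the standard ones, and the function names `least_squares` and `sum_sq_vertical` and the data values are assumptions chosen for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Standard closed-form solution (not stated in the notes).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sum_sq_vertical(xs, ys, a, b):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data, not from the notes.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
best = sum_sq_vertical(xs, ys, a, b)

# Moving the line away from the fitted one should only increase the sum.
nudged = [
    sum_sq_vertical(xs, ys, a + da, b + db)
    for da in (-0.5, 0.5)
    for db in (-0.1, 0.1)
]
print(f"a = {a:.2f}, b = {b:.2f}, sum of squares = {best:.2f}")
print("every nudged line is worse:", all(s > best for s in nudged))
```

Shifting the intercept or slope in any direction raises the sum of squares, which is what "as small as possible" means in the definition above.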
Computers make prediction easy and automatic, but anything done automatically is often done thoughtlessly
o A computer cannot decide which variable is explanatory and which is the response; the two choices give two different lines
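To see why the choice of explanatory variable matters, this sketch fits both regressions to the same made-up data: y on x (minimizing vertical distances) and x on y (minimizing horizontal distances). The helper `fit` and the data values are illustrative assumptions, not part of the notes.

```python
def fit(explanatory, response):
    """Least-squares (intercept, slope) for response = a + b*explanatory."""
    n = len(explanatory)
    e_bar = sum(explanatory) / n
    r_bar = sum(response) / n
    cross = sum((e - e_bar) * (r - r_bar)
                for e, r in zip(explanatory, response))
    spread = sum((e - e_bar) ** 2 for e in explanatory)
    b = cross / spread
    return r_bar - b * e_bar, b


# Made-up illustrative data, not from the notes.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a_yx, b_yx = fit(xs, ys)  # y = a_yx + b_yx * x
a_xy, b_xy = fit(ys, xs)  # x = a_xy + b_xy * y

# Solve the second line for y so the two slopes can be compared directly.
slope_xy_as_y = 1 / b_xy
print(f"y on x: slope {b_yx:.2f}")
print(f"x on y, solved for y: slope {slope_xy_as_y:.2f}")
```

The two slopes differ unless the points fall exactly on a line, so the analyst, not the software, has to decide which variable is being predicted.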
We often use several explanatory variables to predict a response
Statistical methods of predicting a response all share some basic properties of least-squares regression lines
o Prediction is based on fitting some “model” to a set of data
o Prediction works best when the model fits the data closely; if the data do not have a strong pattern, prediction may be very inaccurate
o Prediction outside the range of the available data is risky; this is referred to as extrapolation
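The risk of extrapolation can be sketched with made-up data that follow a curve (here y = x², an assumption chosen purely for illustration). A line fitted over x = 0 to 5 is tolerable inside that range but far off well outside it. The names `least_squares` and `predict` are hypothetical helpers.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Made-up data: the true relationship is curved, but we only observe x in 0..5.
xs = [0, 1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

a, b = least_squares(xs, ys)


def predict(x):
    """Prediction from the fitted straight line."""
    return a + b * x


# Compare the line's prediction with the true value inside and outside the data.
inside_error = abs(predict(3) - 3 ** 2)
outside_error = abs(predict(20) - 20 ** 2)
print(f"error at x = 3 (inside the data range): {inside_error:.1f}")
print(f"error at x = 20 (extrapolation): {outside_error:.1f}")
```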
Correlation – measures the direction and strength of a straight-line relationship; regression – draws a line to describe the relationship
o Closely connected, even though regression requires choosing an explanatory variable and correlation does not
o Both are strongly affected by outliers
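A quick way to see the effect of an outlier is to compute the correlation and the regression slope with and without one unusual point. The data below are made up for illustration, the formulas are the standard ones (the notes do not state them), and `summarize` is a hypothetical helper.

```python
from math import sqrt


def summarize(xs, ys):
    """Return (r, slope) for the correlation and least-squares slope of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy), sxy / sxx


# Five made-up points lying exactly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_clean, slope_clean = summarize(xs, ys)

# The same points plus one outlier at (6, 0).
r_outlier, slope_outlier = summarize(xs + [6], ys + [0])

print(f"without outlier: r = {r_clean:.2f}, slope = {slope_clean:.2f}")
print(f"with outlier:    r = {r_outlier:.2f}, slope = {slope_outlier:.2f}")
```

A single point is enough to pull both the correlation and the slope far from the values the other five points would give.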
The usefulness of a regression line for prediction depends on the correlation between the variables
o Square of the correlation, r² – proportion of the variation in the values of y that is explained by the least-squares regression of y on x
o When there is a straight-line relationship, some of the variation in y is accounted for by the fact that as x changes it pulls y along with it
o Useful to give r² as a measure of how successful the regression was in explaining the response
o Perfect correlation (r = 1 or r = -1) means the points lie exactly on a line
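The "proportion of variation explained" reading of r² can be checked numerically. This sketch, using made-up data and standard formulas that the notes do not spell out, computes r directly and also computes 1 minus the ratio of leftover variation (around the line) to total variation (around the mean of y); the two should agree. All variable names are illustrative.

```python
from math import sqrt

# Made-up illustrative data, not from the notes.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y

# Correlation and its square.
r = sxy / sqrt(sxx * syy)
r_squared = r ** 2

# Least-squares line of y on x and the variation it leaves unexplained.
b = sxy / sxx
a = y_bar - b * x_bar
unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
explained_share = 1 - unexplained / syy

print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
print(f"share of variation in y explained by the line = {explained_share:.3f}")
```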
Statistics and causation
o 1. A strong relationship between two variables does not always mean that changes in one variable cause changes in the other
o 2. The relationship between two variables is often influenced by other variables lurking in the background
o 3. The best evidence for causation comes from randomized comparative experiments
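Point 2 can be sketched with a constructed example. Here a hypothetical lurking variable z drives both x and y, and x has no effect on y at all, yet x and y are strongly correlated. The coefficients (2 and 3), the "noise" patterns, and the helper `correlation` are all assumptions made up for illustration.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# z is a hypothetical lurking variable that drives both x and y.
z = list(range(10))
noise_x = [1, -1] * 5
noise_y = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]
x = [2 * zi + e for zi, e in zip(z, noise_x)]
y = [3 * zi + e for zi, e in zip(z, noise_y)]

r_xy = correlation(x, y)

# Strip out the part of x and y that comes from z (known here by construction).
x_left = [xi - 2 * zi for xi, zi in zip(x, z)]
y_left = [yi - 3 * zi for yi, zi in zip(y, z)]
r_left = correlation(x_left, y_left)

print(f"correlation of x and y: {r_xy:.2f}")
print(f"correlation after removing z's contribution: {r_left:.2f}")
```

In real data the contribution of a lurking variable is not known in advance, which is one reason randomized comparative experiments give better evidence of causation than observed correlations do.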
“Nonsense correlations”: correlations that are real but lead to the nonsensical conclusion that changing one of the variables causes changes in the other