McMaster University
Aaron Childs
Fall

Stats 2B03: Statistical Methods for Science
Chapter 9: Simple Linear Regression and Correlation
9.2 The Regression Model
- Assumptions underlying simple linear regression:
  - Values of the independent variable X are said to be fixed. This means
    that the values of X are preselected by the investigator, so that in the
    collection of the data they are not allowed to vary from these
    preselected values. In this model, X is referred to by some writers as a
    non-random variable and by others as a mathematical variable. This
    assumption classifies our model as the classical regression model.
    Regression analysis can also be carried out on data in which X is a
    random variable.
  - The variable X is measured without error. Since no measuring
    procedure is perfect, this means that the magnitude of the
    measurement error in X is negligible.
  - For each value of X there is a subpopulation of Y values. For the usual
    inferential procedures of estimation and hypothesis testing to be
    valid, these subpopulations must be normally distributed. So that
    these procedures can be presented, it will be assumed that the Y
    values are normally distributed.
  - The variances of the subpopulations of Y are all equal and denoted by
    $\sigma^2$.
  - The means of the subpopulations of Y all lie on the same straight line.
    This is known as the assumption of linearity. This assumption may be
    expressed symbolically as $\mu_{y|x} = \beta_0 + \beta_1 x$, where
    $\mu_{y|x}$ is the mean of the subpopulation of Y values for a
    particular value of X, and $\beta_0$ and $\beta_1$ are called the
    population regression coefficients. Geometrically, $\beta_0$ and
    $\beta_1$ represent the y-intercept and slope, respectively, of the
    line on which all of the means are assumed to lie.
  - The Y values are statistically independent. In other words, in drawing
    the sample, it is assumed that the values of Y at one value of X in no
    way depend on the values of Y chosen at another value of X.
- Regression model:
  - $y = \beta_0 + \beta_1 x + \epsilon$, where
    $\epsilon = y - (\beta_0 + \beta_1 x) = y - \mu_{y|x}$ is
    the amount by which y deviates from the mean of the
    subpopulation of Y values from which it is drawn.
  - The $\epsilon$'s for each subpopulation are normally distributed with a
    variance equal to the common variance of the subpopulations of Y values.
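
The model $y = \beta_0 + \beta_1 x + \epsilon$ can be made concrete with a small simulation. The following is only a sketch under the assumptions listed in 9.2: the X values are fixed in advance, and each Y value is its subpopulation mean plus a normal error with a common standard deviation. The function name `simulate_regression` and all numeric values are hypothetical, chosen for illustration.

```python
import random


def simulate_regression(x_values, beta0, beta1, sigma, n_per_x=1, seed=0):
    """Draw (x, y) pairs from y = beta0 + beta1*x + eps, eps ~ N(0, sigma^2).

    x_values are fixed (preselected), matching the classical regression model.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    data = []
    for x in x_values:
        mean_y = beta0 + beta1 * x  # mean of the Y subpopulation at this x
        for _ in range(n_per_x):
            eps = rng.gauss(0.0, sigma)  # same sigma for every subpopulation
            data.append((x, mean_y + eps))
    return data


# Illustrative (made-up) parameter values.
sample = simulate_regression([1, 2, 3, 4, 5], beta0=2.0, beta1=0.5,
                             sigma=1.0, n_per_x=3)
for x, y in sample[:3]:
    print(f"x={x}, y={y:.3f}")
```

Setting `sigma=0.0` makes every point fall exactly on the line of subpopulation means, which is a quick way to see what the error term contributes.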
9.3 The Sample Regression Equation
- The variable designated by Y is sometimes called the response variable and X
is sometimes called the predictor variable
- Steps in regression analysis:
  - Assume initially that X and Y are linearly related.
  - Determine whether or not the assumptions underlying a linear
    relationship are met in the data available for analysis.
  - Obtain the equation for the line that best fits the sample data.
  - Evaluate the equation to obtain some idea of the strength of the
    relationship and the usefulness of the equation for predicting and
    estimating.
  - If the data appear to conform satisfactorily to the linear model, use the
    equation obtained from the sample data to predict and to estimate.
- Example:
  - Scatter plot
  - The least-squares line: $y = a + bx$, where a is the intercept and b is
    the slope computed from the sample
  - Obtaining the least-squares line:
- The least-squares criterion:
  - Most of the observed points deviate from the line by varying amounts.
  - The sum of the squared vertical deviations of the observed data points
    ($y_i$) from the least-squares line is smaller than the sum of the squared
    vertical deviations of the data points from any other line.
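
The least-squares criterion can be checked numerically. The sketch below uses the standard closed-form estimates, b = Sxy / Sxx and a = ȳ − b·x̄, which are not written out in these notes, and then compares the sum of squared vertical deviations for the fitted line with that of a few other lines. The function names and the data are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimises the SSE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b


def sum_squared_deviations(xs, ys, a, b):
    """Sum of squared vertical deviations of the points from y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
best = sum_squared_deviations(xs, ys, a, b)
print(f"a={a:.4f}, b={b:.4f}, SSE={best:.4f}")

# Any other line should give a sum of squares at least as large.
for da, db in [(0.1, 0.0), (0.0, 0.1), (-0.2, 0.05)]:
    other = sum_squared_deviations(xs, ys, a + da, b + db)
    print(f"shifted by ({da}, {db}): SSE={other:.4f}")
```

Trying a handful of alternative lines does not prove the criterion, but it shows the comparison the criterion is making.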
9.4 Evaluating the Regression Equation
- When $H_0: \beta_1 = 0$ is not rejected:
  - If $\beta_1$ is zero, sample data drawn from the population will yield
    regression equations that are of little or no value for prediction and
    estimation purposes.
  - Even though we assume that the relationship between X and Y is
    linear, it may be that the relationship could be described better by
    some nonlinear model.
  - Although the relationship between X and Y may be linear, it may not be
    strong enough for X to be of much value in predicting and estimating
    Y, or the relationship between X and Y may not be linear.
- When $H_0: \beta_1 = 0$ is rejected:
  - Either the relationship is linear and of sufficient strength to justify
    the use of sample regression equations to predict and estimate Y for
    given values of X, or there is a good fit of the data to a linear model,
    but some curvilinear model might provide an even better fit.
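
One common way to test $H_0: \beta_1 = 0$ is a t test on the sample slope; these notes do not spell out the test statistic, so the following is a sketch that assumes the usual form t = b / s_b with n − 2 degrees of freedom, where s_b = sqrt(MSE / Sxx) and MSE = SSE / (n − 2). The function name, the data, and the choice of a two-sided 5% level are all illustrative assumptions.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (t, df) for testing H0: beta1 = 0 in simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)           # estimate of the common variance sigma^2
    se_b = math.sqrt(mse / sxx)   # standard error of the slope
    # Note: a perfect fit gives se_b == 0 and this division would fail.
    return b / se_b, n - 2


# Made-up illustrative data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

t_stat, df = slope_t_statistic(xs, ys)
# Two-sided 5% critical value of t with 3 degrees of freedom,
# taken from a standard t table (assumed significance level).
critical = 3.182
reject = abs(t_stat) > critical
print(f"t={t_stat:.2f}, df={df}, reject H0: {reject}")
```

As the bullets above point out, rejecting or failing to reject the hypothesis does not by itself settle whether a linear model is the best description of the data.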
- The coefficient of determination:
  - One way to evaluate the strength of the regression equation is to
    compare the scatter of the points about the regression line with the
    scatter about $\bar{y}$, the mean of the sample values of Y.
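
The comparison described here can be sketched in a few lines. The code measures the scatter about the fitted line by the sum of squared residuals (SSE) and the scatter about the sample mean by the total sum of squares (SST), then reports the coefficient of determination in its standard form, r² = 1 − SSE / SST. The function name and the data are hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """Return (sse, sst, r2) for the least-squares line through the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Scatter about the regression line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Scatter about the sample mean of Y.
    sst = sum((y - y_bar) ** 2 for y in ys)
    return sse, sst, 1.0 - sse / sst


# Made-up illustrative data.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

sse, sst, r2 = coefficient_of_determination(xs, ys)
print(f"SSE={sse:.4f}, SST={sst:.4f}, r^2={r2:.4f}")
```

When the points lie close to the line, SSE is small relative to SST and r² is near 1; when the line explains little, the two sums are similar and r² is near 0.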
- The total deviation:
  - $y_i - \bar{y}$
- The expla
