
Aaron Childs

Stats 2B03: Statistical Methods for Science
Chapter 9: Simple Linear Regression and Correlation

9.2 The Regression Model

- Assumptions underlying simple linear regression:
  • Values of the independent variable X are said to be fixed. This means that the values of X are preselected by the investigator, so that in the collection of the data they are not allowed to vary from these preselected values. In this model, X is referred to by some writers as a non-random variable and by others as a mathematical variable. This assumption classifies our model as the classical regression model. Regression analysis can also be carried out on data in which X is a random variable.
  • The variable X is measured without error. Since no measuring procedure is perfect, this means that the magnitude of the measurement error in X is negligible.
  • For each value of X there is a subpopulation of Y values. For the usual inferential procedures of estimation and hypothesis testing to be valid, these subpopulations must be normally distributed. So that these procedures may be presented, it will be assumed that the Y values are normally distributed.
  • The variances of the subpopulations of Y are all equal and denoted by $\sigma^2$.
  • The means of the subpopulations of Y all lie on the same straight line. This is known as the assumption of linearity. This assumption may be expressed symbolically as $\mu_{y|x} = \beta_0 + \beta_1 x$, where $\mu_{y|x}$ is the mean of the subpopulation of Y values for a particular value of X, and $\beta_0$ and $\beta_1$ are called the population regression coefficients. Geometrically, $\beta_0$ and $\beta_1$ represent the y-intercept and slope, respectively, of the line on which all of the means are assumed to lie.
  • The Y values are statistically independent. In other words, in drawing the sample, it is assumed that the values of Y at one value of X in no way depend on the values of Y chosen at another value of X.
- Regression model: $y = \beta_0 + \beta_1 x + \epsilon$
  • $\epsilon = y - (\beta_0 + \beta_1 x) = y - \mu_{y|x}$, the amount by which y deviates from the mean of the subpopulation of Y values from which it is drawn
  • The $\epsilon$'s for each subpopulation are normally distributed with a variance equal to the common variance $\sigma^2$ of the subpopulations of Y values

9.3 The Sample Regression Equation

- The variable designated by Y is sometimes called the response variable and X is sometimes called the predictor variable
- Steps in regression analysis:
  • Assume initially that X and Y are linearly related
  • Determine whether or not the assumptions underlying a linear relationship are met in the data available for analysis
  • Obtain the equation for the line that best fits the sample data
  • Evaluate the equation to obtain some idea of the strength of the relationship and the usefulness of the equation for predicting and estimating
  • If the data appear to conform satisfactorily to the linear model, use the equation obtained from the sample data to predict and to estimate
- Example:
  • Scatter plot
  • The least-squares line: $y = a + bx$
  • Obtaining the least-squares line:
- The least-squares criterion:
  • Most of the observed points deviate from the line by varying amounts
  • The sum of the squared vertical deviations of the observed data points ($y_i$) from the least-squares line is smaller than the sum of the squared vertical deviations of the data points from any other line

9.4 Evaluating the Regression Equation

- When $H_0: \beta_1 = 0$ is not rejected:
  • If $\beta_1$ is zero, sample data drawn from the population will yield regression equations that are of little or no value for prediction and estimation purposes
  • Even though we assume that the relationship between X and Y is linear, it may be that the relationship could be described better by some nonlinear model
  • We may conclude either that, although the relationship between X and Y may be linear, it is not strong enough for X to be of much value in predicting and estimating Y, or that the relationship between X and Y is not linear
- When $H_0: \beta_1 = 0$ is rejected, we may conclude either that:
  • The relationship is linear and of sufficient strength to justify the use of sample regression equations to predict and estimate Y for given values of X; or
  • There is a good fit of the data to a linear model, but some curvilinear model might provide an even better fit
- The coefficient of determination:
  • One way to evaluate the strength of the regression equation is to compare the scatter of the points about the regression line with the scatter about $\bar{y}$, the mean of the sample values of Y
- The total deviation:
  • $y_i - \bar{y}$
- The expla
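
The least-squares criterion of 9.3 and the comparison of scatter described in 9.4 can be checked numerically. The sketch below is illustrative only and rests on several assumptions that are not in these notes: the closed-form expressions used for a and b are the standard textbook least-squares formulas, the coefficient of determination is computed with the usual definition (one minus the ratio of scatter about the line to scatter about the sample mean of Y), the function names are hypothetical, and the data are made up.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x fitted by least squares.

    Uses the standard closed-form solution (an assumption; the notes
    do not show the formulas).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def squared_vertical_deviations(xs, ys, a, b):
    """Sum of squared vertical deviations of the points from y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def coefficient_of_determination(xs, ys, a, b):
    """Compare scatter about the line with scatter about the mean of Y."""
    y_bar = sum(ys) / len(ys)
    scatter_about_mean = sum((y - y_bar) ** 2 for y in ys)
    scatter_about_line = squared_vertical_deviations(xs, ys, a, b)
    return 1 - scatter_about_line / scatter_about_mean


# Hypothetical data, invented purely for illustration
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
r2 = coefficient_of_determination(xs, ys, a, b)

# Least-squares criterion: any other line has a larger sum of squared
# vertical deviations than the fitted line.
fitted = squared_vertical_deviations(xs, ys, a, b)
shifted = squared_vertical_deviations(xs, ys, a + 0.1, b)
tilted = squared_vertical_deviations(xs, ys, a, b + 0.05)

print(f"a = {a:.4f}, b = {b:.4f}, r^2 = {r2:.4f}")
print(fitted < shifted and fitted < tilted)
```

Shifting the intercept or tilting the slope away from the fitted values increases the sum of squared vertical deviations, which is what the least-squares criterion asserts. A value of the coefficient of determination close to 1 indicates that the points scatter much less about the line than about the mean of Y.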