
# Chapter Seven PSY201

14 Pages

School: University of Toronto St. George
Department: Psychology
Course: PSY201H1
Professor: Kristie Dukewich
Semester: Fall

## Chapter 7: Linear Regression

- Regression and correlation are closely related:
  - both involve the relationship between two variables
  - both use the same set of basic data: paired scores taken from the same (or matched) subjects
- Correlation is concerned with the magnitude and direction of the relationship; regression focuses on using the relationship for prediction.
- Prediction is easy when the relationship is perfect:
  - if the relationship is perfect, all the points fall on a straight line, and all we need to do is derive the equation of that line and use it for prediction
  - perfect relationship = perfect prediction: every predicted value exactly equals the observed value, and prediction error equals zero
  - the situation is more complicated when the relationship is imperfect
- **Regression**: a topic that considers using the relationship between two or more variables for prediction.
- **Regression line**: a best-fitting line used for prediction.

## Prediction and Imperfect Relationships

- e.g., in a given scatter plot of data:
  - the relationship is imperfect, positive, and linear
  - the problem for prediction: how to determine the single straight line that best describes the data
  - the solution most often used is to construct the line that minimizes errors of prediction according to a least-squares criterion; this line is called the **least-squares regression line**
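As a concrete illustration of the least-squares criterion, the sketch below (using a small set of hypothetical paired scores, not the textbook's data) fits the least-squares line and checks that its total squared error Σ(Y − Y')^2 is smaller than that of an arbitrarily chosen alternative line:

```python
# Least-squares regression of Y on X, with hypothetical paired scores.
X = [1, 2, 3, 4, 5, 6]
Y = [2.0, 2.5, 2.2, 3.6, 3.4, 4.1]
N = len(X)

mean_x = sum(X) / N
mean_y = sum(Y) / N

# Regression constants: b_Y = SS_XY / SS_X, then a_Y = mean(Y) - b_Y * mean(X).
ss_x = sum((x - mean_x) ** 2 for x in X)
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
b_y = ss_xy / ss_x
a_y = mean_y - b_y * mean_x

def total_squared_error(slope, intercept):
    """Sum of (Y - Y')^2 over all points: the least-squares criterion."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(X, Y))

# The least-squares line has smaller total squared error than other candidates.
assert total_squared_error(b_y, a_y) < total_squared_error(0.5, 1.0)
assert total_squared_error(b_y, a_y) < total_squared_error(b_y, a_y + 0.1)
```

Any other slope or intercept raises Σ(Y − Y')^2, which is the sense in which this single line "best fits" the points.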
- The least-squares regression line is shown with the data; the vertical distance between each point and the line represents the error in prediction.
- If we let Y' = the predicted Y value and Y = the actual value, then Y − Y' is the error for each point.
- The total error in prediction does not equal Σ(Y − Y'), because some Y' values will be greater than Y and some will be less: there will be both positive and negative error scores, and a simple algebraic sum of these would cancel.
- The situation is similar to measures of average dispersion:
  - in deriving the equation for the standard deviation, we squared X − X̄ to overcome the fact that positive and negative deviation scores cancel each other
  - the same solution works here: instead of just summing Y − Y', first compute (Y − Y')^2 for each score, which removes the negative values and eliminates the cancellation problem
- If we minimize Σ(Y − Y')^2, we minimize the total error of prediction.
- **Least-squares regression line**: the prediction line that minimizes the total error of prediction, according to the least-squares criterion of Σ(Y − Y')^2.
- For any linear relationship there is only one line that minimizes Σ(Y − Y')^2, so there is only one least-squares regression line for each linear relationship.
- We use the least-squares regression line because it gives the greatest overall accuracy in prediction:
  - e.g., another prediction line is drawn in Figure 7.2(b); it was picked arbitrarily and is just one of an infinite number that could have been drawn
  - it does better on some points (A and B) and worse on others (C and D)
  - considering all the points, the line of (a) clearly fits them better than the line of (b): the total error in prediction, Σ(Y − Y')^2, is less for the least-squares regression line than for the line in (b)
- The total error in prediction is less for the least-squares regression line than for any other possible prediction line, which is why it is used.

## Constructing the Least-Squares Regression Line: Regression of Y on X

- The equation of the least-squares regression line for predicting Y given X is

  Y' = b_Y X + a_Y

  the general equation of a straight line we have been using all along; a_Y and b_Y are called regression constants.
- This line is called the regression line of Y on X, or the regression of Y on X (predicting Y given X).
- The b_Y regression constant is equal to

  b_Y = SS_XY / SS_X = [ΣXY − (ΣX)(ΣY)/N] / [ΣX^2 − (ΣX)^2/N]

  and the a_Y constant is a_Y = Ȳ − b_Y X̄.
- Since we need the b_Y constant to determine the a_Y constant, we find b_Y first and then a_Y; once both are found, they are substituted into the regression equation.
- e.g., IQ and GPA (Table 7.2): the equation for Y' can be used to predict a student's GPA knowing only the student's IQ score.
- Suppose a student's IQ is 124; what is the predicted GPA? Y' = 0.074X − 7.006 = 0.074(124) − 7.006 = 2.17.

## Regression of X on Y

- So far we have been predicting Y scores from X scores: we derived a regression line that enables us to predict Y given X (the regression line of Y on X). It is also possible to predict X given Y.
- To predict X given Y, we must derive a new regression line; we cannot use the regression equation for predicting Y given X.
  - e.g., for IQ (X) and GPA (Y), Y' = 0.074X − 7.006 cannot be used to predict IQ given GPA; new regression constants must be derived, because the old regression line was derived to minimize errors in the Y variable
- Minimizing the Y' errors and minimizing the X' errors will not lead to the same regression line. The exception occurs when the relationship is perfect rather than imperfect: then both regression lines coincide, forming the single line that hits all the points.
- The regression line for predicting X from Y is sometimes called the regression line of X on Y, or the regression of X on Y.
- Using IQ and GPA again, a linear regression equation can be derived for predicting IQ (X) given GPA (Y). This line differs from the line predicting Y given X; two different lines are expected when the relationship is imperfect.
- Although different equations do exist for computing the second regression line, they are seldom used; instead, it is common practice to designate the predicted variable as Y' and the given variable as X. If we wanted to predict IQ from GPA, we would designate IQ as the Y' variable and GPA as the X variable, and then use the regression equation for predicting Y given X.

## Measuring Prediction Errors: The Standard Error of Estimate

- The regression line represents our best estimate of the Y scores given their corresponding X values; unless the relationship between X and Y is perfect, most of the actual Y values will not fall on the regression line.
- When the relationship is imperfect, there will necessarily be prediction errors, so it is useful to know their magnitude.
- Quantifying prediction errors involves computing the **standard error of estimate**:
  - it is much like the standard deviation: it gives a measure of the average deviation of the prediction errors about the regression line
  - the regression line can be considered an estimate of the mean of the Y values at each X value (a mean that changes with X)
- With the standard deviation, the sum of the deviations Σ(X − X̄) equaled zero, so we had to square the deviations to obtain a meaningful average; the situation is the same with the standard error of estimate:
  - since the sum of the prediction errors, Σ(Y − Y'), equals 0, we must square them also
  - the average is then obtained by summing the squared values, dividing by N − 2, and taking the square root of the quotient
- The equation for the standard error of estimate for predicting Y given X is

  s_(Y|X) = sqrt[ Σ(Y − Y')^2 / (N − 2) ]

  note that we divide by N − 2 rather than the N − 1 used with the standard deviation.
- In determining the b_Y regression coefficient, we already calculated SS_X and the cross-product term, so a computational form of the equation uses these quantities:

  s_(Y|X) = sqrt{ [SS_Y − (ΣXY − (ΣX)(ΣY)/N)^2 / SS_X] / (N − 2) }

- e.g., calculate the standard error of estimate for the GPA and IQ data (Tables 7.1 and 7.2), letting GPA be the Y variable and IQ the X variable (predicting GPA given IQ):
  - SS_X = 936.25, SS_Y = 7.022, ΣXY − (ΣX)(ΣY)/N = 69.375, N = 12
  - substituting into the equation for the standard error of estimate for predicting Y given X gives s_(Y|X) = 0.43
- This measure has been computed over all the Y scores; for it to be meaningful, we must assume that the variability of Y remains constant as we go from one X score to the next. This is the assumption of **homoscedasticity**.
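The chapter's worked numbers can be checked with a short script. This is a sketch using only the summary quantities quoted in the notes (SS_X, SS_Y, ΣXY − (ΣX)(ΣY)/N, and N); the intercept a_Y = −7.006 is taken directly from the notes, since the means needed to derive it are not listed here:

```python
import math

# Summary quantities for the IQ (X) and GPA (Y) data (Tables 7.1 and 7.2).
ss_x = 936.25   # SS_X
ss_y = 7.022    # SS_Y
sp = 69.375     # sum(XY) - (sum X)(sum Y) / N
n = 12

# Slope of the regression of Y on X: b_Y = SP / SS_X.
b_y = sp / ss_x
print(round(b_y, 3))        # 0.074

# Predicted GPA for IQ = 124, using the rounded slope and the intercept
# quoted in the notes' equation Y' = 0.074X - 7.006.
a_y = -7.006
y_pred = round(b_y, 3) * 124 + a_y
print(round(y_pred, 2))     # 2.17

# Standard error of estimate for predicting Y given X (divide by N - 2).
s_est = math.sqrt((ss_y - sp ** 2 / ss_x) / (n - 2))
print(round(s_est, 2))      # 0.43
```

All three printed values match the chapter's results, which also confirms that the computational form of the standard-error equation is consistent with the definitional one.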