PYB110 Week Four Revision Notes
Regression Line or Line of Best Fit
Whilst a correlation can tell us the direction and strength of the relationship between two variables,
the regression line lets us actually plot the line that the correlation metaphorically draws
through the data. We can then use this line to predict scores on our DV. The correlation coefficient
is a measure of how closely the data points fall to the regression line.
Why is it Called the Line of Best Fit?
It is called this because the line is drawn in the position that minimises the sum of the squared
vertical distances between the data points and the line. In other words, the line of best fit or
regression line takes into account every data point and is placed in the position which is, overall,
as close as possible to all of the points at once.
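To see what "best fit" means in practice, here is a minimal sketch that tries lots of candidate lines and keeps the one with the smallest total squared distance to the points. The scores are made up purely for illustration, and the function name sum_squared_errors and the grid of candidate lines are assumptions of this sketch, not part of the unit material.

```python
# Hypothetical scores, invented purely for illustration.
x_scores = [1, 2, 3, 4, 5]  # IV (X)
y_scores = [2, 4, 5, 4, 5]  # DV (Y)


def sum_squared_errors(intercept, slope, xs, ys):
    """Add up the squared vertical distance from each point to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Candidate lines: intercepts 0.0 to 5.0 and slopes -2.0 to 2.0, in steps of 0.1.
candidates = [(i / 10, j / 10) for i in range(0, 51) for j in range(-20, 21)]

# The "best fit" candidate is the one with the smallest total squared distance.
best_intercept, best_slope = min(
    candidates,
    key=lambda line: sum_squared_errors(line[0], line[1], x_scores, y_scores),
)

print("Best intercept:", best_intercept)
print("Best slope:", best_slope)
```

Trial and error like this is only for illustration. In practice the slope and intercept are calculated directly with formulas rather than searched for.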
How can we Calculate the Regression Line?
In order to do so, we need two important pieces of information:
The slope of the line
o An indication of the gradient or ‘steepness’ of the line.
o AKA regression coefficient.
o Essentially it is the number of units by which the Y variable (dependent) changes for
every one unit that the X variable (independent) increases.
The slope of the line is calculated using this formula:
$$b = \frac{\sum (X - M_X)(Y - M_Y)}{\sum (X - M_X)^2}$$
Essentially the top part of this formula is the same as the top part of the correlation formula: the
sum of the cross-products of the deviation scores. Then we are simply dividing it by the sum of
squared deviations for the X variable.
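As a rough worked example, the slope calculation (sum of cross-products of deviations, divided by the sum of squared deviations for X) might be sketched like this. The scores are hypothetical, and the helper names mean and slope are just labels chosen for this sketch.

```python
# Hypothetical scores, invented purely for illustration.
x_scores = [1, 2, 3, 4, 5]  # IV (X)
y_scores = [2, 4, 5, 4, 5]  # DV (Y)


def mean(values):
    """Arithmetic mean (M) of a list of scores."""
    return sum(values) / len(values)


def slope(xs, ys):
    """Regression coefficient b: cross-products over squared X deviations."""
    m_x, m_y = mean(xs), mean(ys)
    # Top part: sum of (X - M_X)(Y - M_Y) across all pairs of scores.
    cross_products = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
    # Bottom part: sum of squared deviations for X.
    ss_x = sum((x - m_x) ** 2 for x in xs)
    return cross_products / ss_x


b = slope(x_scores, y_scores)
print("Slope b:", b)
```

For these made-up scores the cross-products sum to 6 and the squared X deviations sum to 10, so b works out to 0.6: Y is predicted to rise by 0.6 units for every one-unit increase in X.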
The Y-axis intercept
o The intercept is the point at which the regression line crosses (or intercepts) the Y-axis.
o It is the predicted value of Y when the value of X is zero.
o It can be positive or negative.
The Y-axis intercept is calculated using this formula:
$$a = M_Y - (b)(M_X)$$
Here we are simply multiplying the slope (b) by the mean of our IV (X) and then subtracting that
from the mean of our DV (Y).
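The intercept step can be sketched in the same way. The scores below are hypothetical, the slope of 0.6 is assumed to have already been worked out for them, and the helper names mean and intercept are labels chosen for this sketch.

```python
# Hypothetical scores, invented purely for illustration.
x_scores = [1, 2, 3, 4, 5]  # IV (X)
y_scores = [2, 4, 5, 4, 5]  # DV (Y)


def mean(values):
    """Arithmetic mean (M) of a list of scores."""
    return sum(values) / len(values)


def intercept(xs, ys, b):
    """Y-axis intercept a: mean of Y minus (slope times mean of X)."""
    return mean(ys) - b * mean(xs)


b = 0.6  # Assumed: slope already calculated for these hypothetical scores.
a = intercept(x_scores, y_scores, b)
print("Intercept a:", round(a, 2))
```

Here the mean of X is 3 and the mean of Y is 4, so a is 4 minus (0.6)(3), which is about 2.2. That is the predicted Y score when X is zero.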
Once we have these two pieces of information, we are able to put them into a formula to calculate
how to draw our regression line.
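Putting the two pieces together, a minimal sketch of going from raw scores to a predicted score might look like this. It assumes the usual straight-line form (predicted Y equals the intercept plus the slope times X), uses hypothetical scores, and the function names fit_line and predict are inventions of this sketch.

```python
# Hypothetical scores, invented purely for illustration.
x_scores = [1, 2, 3, 4, 5]  # IV (X)
y_scores = [2, 4, 5, 4, 5]  # DV (Y)


def fit_line(xs, ys):
    """Return (a, b): the Y-axis intercept and slope of the regression line."""
    m_x = sum(xs) / len(xs)
    m_y = sum(ys) / len(ys)
    cross_products = sum((x - m_x) * (y - m_y) for x, y in zip(xs, ys))
    ss_x = sum((x - m_x) ** 2 for x in xs)
    b = cross_products / ss_x  # slope
    a = m_y - b * m_x          # intercept
    return a, b


def predict(a, b, x):
    """Predicted Y score for a given X score."""
    return a + b * x


a, b = fit_line(x_scores, y_scores)

# Two predicted points are enough to draw the line on a scatterplot.
low_point = (min(x_scores), predict(a, b, min(x_scores)))
high_point = (max(x_scores), predict(a, b, max(x_scores)))
print("Draw the line through:", low_point, "and", high_point)

# Predicted DV score for a new person whose X score is 6.
print("Predicted Y when X = 6:", round(predict(a, b, 6), 2))
```

Because the line is straight, we only need two predicted points to draw it, and any other X score can be fed into predict in the same way.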
The Regression Equation
$$\hat{Y} = a + (b)(X)$$
Using this formula, we can predict what a person’s score on the Y variable is likely to be if we know
their score on the X variable and the relationship between the X and Y variable. We can then plot