Chapter 2

Chapter 2 (After Midterm)

University of Toronto Scarborough
Statistics
STAB22H3
Moras
Winter

Chapter 2.3 - 2.6

2.3 Regression
- explains the relationship between two variables only when one variable helps explain or predict the other
- Regression line: a straight line that describes how a response variable y changes as the explanatory variable x changes
- used to predict the value of y for a given value of x
- 2.12: increase in energy use is the explanatory variable and fat gain is the response variable

Fitting a line = line of best fit
- ŷ = a + bx, where b is the slope and a is the intercept (the value of ŷ when x = 0)
- b is the rate of change: as x increases by one unit, ŷ changes by the slope value
- ŷ = predicted value, whereas y = actual observation
- Extrapolation: use of the regression line for prediction far outside the range of x values used to fit the line
- not so accurate, because we cannot say whether the graph continues to increase or decrease, or whether the relationship remains linear at extreme values

Least-squares Regression
- ŷ = a + bx
- method of fitting a line to a scatterplot
- it is not resistant to outliers
- a line that is as close as possible to all the points in the vertical direction
- error = observed - predicted = y - ŷ
- e > 0 when the observed value is greater than the prediction, and e < 0 when it is smaller
- makes the prediction errors as small as possible
- the least-squares regression line of y on x is the line that makes the sum of the squares of the errors as small as possible
- for this line the errors also sum to zero: e1 + e2 + ... + en = 0

To calculate (do not round values that are used in further calculations):
- b = r (s_y / s_x)
- a = ȳ - b x̄

Interpreting the regression
- when changing units, the correlation does not change; however, the least-squares line does change
- the least-squares regression line always passes through the point (x̄, ȳ)
- when both variables are standardized (mean = 0 and s = 1), the regression line passes through the origin and has slope = r
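
The slope and intercept formulas, and the properties listed under "Interpreting the regression", can be checked numerically. The following is a minimal Python sketch, not part of the course material: the function name least_squares_line and the five data points are made up for illustration, and the standard deviations are assumed to be sample standard deviations (dividing by n - 1).

```python
from statistics import mean, stdev


def least_squares_line(x, y):
    """Return (a, b, r) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)
    # Correlation r: sum of products of standardized values, divided by n - 1.
    r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
            for xi, yi in zip(x, y)) / (n - 1)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept
    return a, b, r


# Hypothetical data, invented only to exercise the formulas.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b, r = least_squares_line(x, y)

# Errors (residuals): observed minus predicted.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standardize both variables (mean 0, standard deviation 1) and refit.
zx = [(xi - mean(x)) / stdev(x) for xi in x]
zy = [(yi - mean(y)) / stdev(y) for yi in y]
a_z, b_z, _ = least_squares_line(zx, zy)

print(f"slope b = {b:.4f}, intercept a = {a:.4f}, r = {r:.4f}")
print(f"sum of errors = {sum(residuals):.2e}")
print(f"prediction at x-bar = {a + b * mean(x):.4f}, y-bar = {mean(y):.4f}")
print(f"standardized fit: slope = {b_z:.4f}, intercept = {a_z:.2e}")
```

Up to floating-point rounding, the errors sum to zero, the prediction at x̄ equals ȳ, and the standardized fit has slope r and intercept 0. Intermediate values are kept at full precision, in line with the note about not rounding values that are used in further calculations.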
