STAT1008 Study Guide - Final Guide: Confidence Interval, Observational Error, Dependent And Independent Variables

17 May 2018
Inference for Slope and Correlation
Simple Linear Model
The population/true simple linear model is: $y = \beta_0 + \beta_1 x + \varepsilon$
$\beta_0$ and $\beta_1$ are unknown parameters.
Estimate the least squares line: $\hat{y} = b_0 + b_1 x$
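As a small illustration of fitting the least squares line, the sketch below computes b0 and b1 from the usual closed-form formulas. The function name `least_squares_line` and the data values are made up for this example; they are not from the course notes.

```python
def least_squares_line(x, y):
    """Return (b0, b1) for the fitted line y-hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx           # sample slope
    b0 = y_bar - b1 * x_bar  # sample intercept
    return b0, b1


# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(x, y)
print(f"y-hat = {b0:.3f} + {b1:.3f} x")
```

The sample values b0 and b1 are estimates of the unknown population parameters; a different sample would give a slightly different line.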
Inference for the Slope
When the conditions for a simple linear model are reasonably met, we find:
Confidence Interval for Slope
Confidence interval for the population slope = $b_1 \pm t^* \cdot SE$
Where b1 is the slope for the least squares line for the sample and SE is the standard error of
the slope.
t* uses n-2 degrees of freedom
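A minimal sketch of this interval, assuming the standard formula for the standard error of the slope (residual standard error on n-2 df divided by the square root of the sum of squared x deviations), which the notes do not spell out. The data are made up, and the t* value of 3.182 (95% confidence, 3 df) is read from a t-table rather than computed.

```python
import math


def slope_and_se(x, y):
    """Return (b1, SE) for the least squares slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual sum of squares, then residual standard error on n - 2 df
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    return b1, s / math.sqrt(sxx)


def slope_confidence_interval(x, y, t_star):
    """Return (low, high) = b1 -/+ t* x SE."""
    b1, se = slope_and_se(x, y)
    return b1 - t_star * se, b1 + t_star * se


# Made-up data: n = 5, so t* uses n - 2 = 3 degrees of freedom
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_STAR_95_DF3 = 3.182  # assumption: taken from a t-table (95%, 3 df)

low, high = slope_confidence_interval(x, y, T_STAR_95_DF3)
print(f"95% CI for the slope: ({low:.3f}, {high:.3f})")
```

In practice b1 and SE would normally be read from regression output rather than computed by hand.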
T-Test For Correlation
Test statistic: $t = \dfrac{b_1 - 0}{SE} = \dfrac{b_1}{SE}$

H0: $\beta_1 = 0$ vs Ha: $\beta_1 \neq 0$
We are testing the significance of the regression: "Is X an important predictor of Y?"
We estimate the SE with bootstrap/randomisation distributions.
Bootstrap from the original data with replacement and fit the regression line to the new data.
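The bootstrap procedure described above can be sketched as follows: resample the (x, y) pairs with replacement, refit the slope each time, and use the standard deviation of the bootstrap slopes as the estimated SE. The function names, the number of resamples, the seed and the data are all assumptions made for this example.

```python
import random
import statistics


def slope(pairs):
    """Least squares slope for (x, y) pairs, or None if every x is the same."""
    if len({px for px, _ in pairs}) < 2:
        return None  # slope is undefined when x does not vary
    n = len(pairs)
    x_bar = sum(px for px, _ in pairs) / n
    y_bar = sum(py for _, py in pairs) / n
    sxx = sum((px - x_bar) ** 2 for px, _ in pairs)
    sxy = sum((px - x_bar) * (py - y_bar) for px, py in pairs)
    return sxy / sxx


def bootstrap_se(x, y, reps=2000, seed=0):
    """Estimate the SE of the slope from `reps` bootstrap resamples."""
    rng = random.Random(seed)
    pairs = list(zip(x, y))
    slopes = []
    while len(slopes) < reps:
        # Sample n pairs with replacement from the original data
        resample = rng.choices(pairs, k=len(pairs))
        b1 = slope(resample)
        if b1 is not None:  # skip degenerate resamples
            slopes.append(b1)
    return statistics.stdev(slopes)


# Made-up illustrative data
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.3, 3.8, 6.1, 8.4, 9.6, 12.2, 13.9, 16.3]

print(f"bootstrap SE of slope: {bootstrap_se(x, y):.4f}")
```

Resampling whole (x, y) pairs keeps each response attached to its own predictor value, which is what lets the refitted slopes reflect sampling variability in the relationship.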
Test for Slope
$y = \beta_0 + \beta_1 x + \varepsilon$
Ho: $\beta_1 = 0$ (no linear relationship)
Ha: $\beta_1 \neq 0$ (or 1-tail) (some relationship)
$t = \dfrac{b_1}{SE}$
b1 and SE come from computer output.
Find p-value using t-distribution with n-2 df.
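A rough sketch of the whole test, using only the Python standard library. Statistical software would normally supply b1, SE and the p-value; here the two-tailed p-value is approximated by numerically integrating the t density with Simpson's rule, which is a stand-in chosen for this example. The SE formula, function names and data are assumptions and do not come from the notes.

```python
import math


def slope_t_statistic(x, y):
    """Return (t, df) where t = b1 / SE and df = n - 2."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b1 / se, n - 2


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=20000):
    """Approximate P(|T| >= |t_stat|); `steps` must be even (Simpson's rule)."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t_stat|)
    return max(0.0, 1 - 2 * area)


# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t_stat, df = slope_t_statistic(x, y)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, two-tailed p = {p_value:.5f}")
```

For a 1-tail alternative, half of the two-tailed value applies when the sign of t agrees with the direction of Ha.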
Test for Correlation
Ho: $\rho = 0$
Ha: $\rho \neq 0$ (or 1-tail)
$t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$
The t-test for slope and t-test for correlation are identical.
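To see that the two tests agree, the sketch below computes both t statistics on the same made-up data: b1/SE for the slope, and r·sqrt(n-2)/sqrt(1-r^2) for the correlation. The formulas are the standard ones and are assumed here; the function names and data are illustrative only.

```python
import math


def _sums(x, y):
    """Return (n, Sxx, Syy, Sxy) for paired data."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return n, sxx, syy, sxy


def t_for_slope(x, y):
    """t = b1 / SE, with SE from the residual standard error."""
    n, sxx, syy, sxy = _sums(x, y)
    b1 = sxy / sxx
    sse = syy - b1 * sxy  # residual sum of squares
    se = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return b1 / se


def t_for_correlation(x, y):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    n, sxx, syy, sxy = _sums(x, y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Made-up illustrative data
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3.1, 2.4, 5.0, 4.2, 6.8, 5.9, 8.7, 7.5]

t_slope = t_for_slope(x, y)
t_corr = t_for_correlation(x, y)
print(f"t (slope) = {t_slope:.6f}, t (correlation) = {t_corr:.6f}")
```

Both statistics use n-2 degrees of freedom, so the p-values match as well.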
Coefficient of Determination, $R^2$
Recall that for correlation: $-1 \le r \le 1$.
If we square the correlation, $r^2$, we get a number between 0 and 1 that can be interpreted as a
percentage.
$R^2$ = proportion of variability in response variable Y that is "explained" by the model based on
the predictor X.
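A short check of this interpretation, assuming the usual definition of the explained proportion as 1 - SSE/SSTotal for a simple linear model. The sketch computes that proportion and the squared correlation separately on made-up data so the two can be compared; the function names are illustrative.

```python
import math


def correlation(x, y):
    """Sample correlation r between x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """Proportion of variability in y explained by the fitted line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / ss_total


# Made-up illustrative data
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3.1, 2.4, 5.0, 4.2, 6.8, 5.9, 8.7, 7.5]

r = correlation(x, y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}, R^2 = {r_squared(x, y):.4f}")
```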
Checking Conditions for a Simple Linear Model
For a simple linear model, we assume the errors (ε) are randomly distributed above and below
the line.
Look at a scatterplot with regression line on it.
Watch out for:
Document Summary

The population/true simple linear model is $y = \beta_0 + \beta_1 x + \varepsilon$, where $\beta_0$ and $\beta_1$ are unknown parameters, estimated by the least squares line $\hat{y} = b_0 + b_1 x$. The confidence interval for the population slope is $b_1 \pm t^* \cdot SE$, where t* uses n-2 degrees of freedom. The test of H0: $\beta_1 = 0$ vs Ha: $\beta_1 \neq 0$ uses the test statistic $t = b_1 / SE$. The test for correlation uses Ho: $\rho = 0$ vs Ha: $\rho \neq 0$ (or 1-tail). For a simple linear model, we assume the errors ($\varepsilon$) are randomly distributed above and below the line.
9.2 ANOVA for Regression (Analysis of Variance)
$y = \beta_0 + \beta_1 x + \varepsilon$, that is, data = model + error.
SSModel is the variability in y explained by the model.
H0: $\beta_1 = 0$ (model is ineffective), Ha: $\beta_1 \neq 0$ (model is effective).
F = MSModel/MSE
To find a p-value for the ANOVA F-statistic, create a randomisation distribution (keep one variable fixed and randomly reorder the other variable), or use a theoretical distribution.
The F-distribution has degrees of freedom for both the numerator and the denominator.
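The randomisation approach to the ANOVA F-statistic can be sketched as below: keep x fixed, shuffle y many times, and record how often the shuffled F is at least as large as the observed F. The degrees of freedom (1 for the model, n-2 for error) are the standard ones for a single predictor and are assumed here; the function names, the number of shuffles, the seed and the data are made up for this example.

```python
import math
import random


def f_statistic(x, y):
    """F = MSModel / MSE for a simple linear model (df: 1 and n - 2)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_model = ss_total - sse
    if sse == 0:
        return math.inf  # perfect fit: no error variability
    return (ss_model / 1) / (sse / (n - 2))


def randomisation_p_value(x, y, reps=2000, seed=0):
    """Proportion of shuffles whose F is at least the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(x, y)
    shuffled = list(y)
    count = 0
    for _ in range(reps):
        rng.shuffle(shuffled)  # reorder y while x stays fixed
        if f_statistic(x, shuffled) >= observed:
            count += 1
    return count / reps


# Made-up illustrative data
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.3, 3.8, 6.1, 8.4, 9.6, 12.2, 13.9, 16.3]

print(f"F = {f_statistic(x, y):.2f}")
print(f"randomisation p-value = {randomisation_p_value(x, y):.4f}")
```

Shuffling y breaks any real association with x, so the shuffled F values show what the statistic looks like when the model is ineffective.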