STA302H1 Midterm: STA302 University of Toronto St George Midterm Cheat Sheet Alfred Benn

22 Jun 2018
I. Simple Linear Regression Model
1. Model: , ,   
Parameters: , ,
Variables: X (known/explanatory/predictor variable)
Y (known random/response/dependent variable)
(random error in )
 ,     
2. Model Assumptions:
i. y is related to x by the simple linear regression model
$Y_i = \beta_0 + \beta_1 x_i + e_i$,
i.e., $E(Y \mid X = x_i) = \beta_0 + \beta_1 x_i$
ii. The errors are independent of each other
iii. The errors have a common variance $\sigma^2$
iv. The errors are normally distributed with a mean of 0 and
variance $\sigma^2$, that is $e \mid X \sim N(0, \sigma^2)$
v. Values of the predictor variable are known
fixed constants
3. Residual
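i. $\hat{e}_i = y_i - \hat{y}_i$, where $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ is the fitted value
ii. Residual sum of squares: $\mathrm{RSS} = \sum_{i=1}^{n} \hat{e}_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
iii. Usually, $\hat{e}_i \neq 0$. The least squares estimates are the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize RSS; only if every point is on the least squares line is $\hat{e}_i = 0$ for all $i$, so that $\mathrm{RSS} = 0$.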
4. Least Square Estimates
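i. $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
ii. $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \dfrac{SXY}{SXX}$
* Equivalent forms: $\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \dfrac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
iii. Estimate of $\sigma^2$: $s^2 = \dfrac{\mathrm{RSS}}{n - 2} = \dfrac{1}{n - 2} \sum_{i=1}^{n} \hat{e}_i^2 = \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - 2}$
(divide by $n - 2$ because 2 parameters are estimated: $\beta_0$, $\beta_1$)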
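The least squares formulas can be checked numerically. The following is a minimal sketch in plain Python; the function name `least_squares` and the five data points are made up for illustration and are not from the course material.

```python
def least_squares(x, y):
    """Return (b0, b1, s2): intercept, slope and estimate of sigma^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # SXX and SXY as defined in the least squares formulas
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                # slope estimate
    b0 = y_bar - b1 * x_bar       # intercept estimate
    # Residual sum of squares, divided by n - 2 (two estimated parameters)
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    return b0, b1, s2


# Hypothetical illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, s2 = least_squares(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, s^2 = {s2:.3f}")
```

For this data, SXY = 6 and SXX = 10, so the slope is 0.6, the intercept is 4 - 0.6(3) = 2.2, and RSS = 2.4 gives s^2 = 0.8.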
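5. Inference of $\beta_0$, $\beta_1$
i. $\hat{\beta}_1 = \sum_{i=1}^{n} c_i y_i$, $c_i = \dfrac{x_i - \bar{x}}{SXX}$, where $SXX = \sum_{i=1}^{n} (x_i - \bar{x})^2$
($\hat{\beta}_1$ is a linear combination of $y_i$)
$E(\hat{\beta}_1 \mid X) = \beta_1$, $\mathrm{Var}(\hat{\beta}_1 \mid X) = \dfrac{\sigma^2}{SXX}$
($\hat{\beta}_1$ is an unbiased estimator of $\beta_1$)
$\hat{\beta}_1 \mid X \sim N\left(\beta_1, \dfrac{\sigma^2}{SXX}\right)$, $\mathrm{se}(\hat{\beta}_1) = \dfrac{s}{\sqrt{SXX}}$
Hypothesis test: whether x and y have a linear relationship
$H_0: \beta_1 = 0$ (x and y have no linear relationship), $H_a: \beta_1 \neq 0$
If $H_0$ is true, $T = \dfrac{\hat{\beta}_1}{\mathrm{se}(\hat{\beta}_1)} \sim t_{n-2}$
Reject $H_0$ when $|T| > t_{\alpha/2,\, n-2}$ (the upper $\alpha/2$ critical value of $t_{n-2}$), or when the p-value $< \alpha$
$100(1 - \alpha)\%$ confidence interval for $\beta_1$:
$\hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \, \mathrm{se}(\hat{\beta}_1)$
ii. $E(\hat{\beta}_0 \mid X) = \beta_0$, $\mathrm{Var}(\hat{\beta}_0 \mid X) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{SXX} \right)$
($\hat{\beta}_0$ is an unbiased estimator of $\beta_0$)
$\hat{\beta}_0 \mid X \sim N\left(\beta_0, \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{SXX} \right)\right)$, $\mathrm{se}(\hat{\beta}_0) = s \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{SXX}}$
Hypothesis test: $H_0: \beta_0 = \beta_0^0$, $H_a: \beta_0 \neq \beta_0^0$
*If $H_0$ is true, $T = \dfrac{\hat{\beta}_0 - \beta_0^0}{\mathrm{se}(\hat{\beta}_0)} \sim t_{n-2}$
$100(1 - \alpha)\%$ confidence interval for $\beta_0$:
$\hat{\beta}_0 \pm t_{\alpha/2,\, n-2} \, \mathrm{se}(\hat{\beta}_0)$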
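The t-test and confidence interval for the slope and intercept can be sketched as follows. This is an illustrative sketch only: the helper names `fit_summary` and `t_inference` and the data are hypothetical, and the critical value 3.182 is the t-table value $t_{0.025,\,3}$ (alpha = 0.05, n - 2 = 3), supplied by hand because the Python standard library has no t quantile function.

```python
import math


def fit_summary(x, y):
    """Return b0, b1, s, SXX, n and x_bar for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    return b0, b1, s, sxx, n, x_bar


def t_inference(estimate, se, t_crit, null_value=0.0):
    """Return the t statistic, the two-sided decision and the interval."""
    t_stat = (estimate - null_value) / se
    reject = abs(t_stat) > t_crit
    ci = (estimate - t_crit * se, estimate + t_crit * se)
    return t_stat, reject, ci


# Hypothetical illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT = 3.182  # assumed table value t_{0.025, 3}

b0, b1, s, sxx, n, x_bar = fit_summary(x, y)
se_b1 = s / math.sqrt(sxx)                        # se of the slope
se_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)   # se of the intercept

t1, reject1, ci1 = t_inference(b1, se_b1, T_CRIT)
t0, reject0, ci0 = t_inference(b0, se_b0, T_CRIT)
print(f"slope: T = {t1:.3f}, reject H0: {reject1}, CI = ({ci1[0]:.3f}, {ci1[1]:.3f})")
print(f"intercept: T = {t0:.3f}, reject H0: {reject0}, CI = ({ci0[0]:.3f}, {ci0[1]:.3f})")
```

For this data the slope statistic is about 2.12, which is below 3.182, so $H_0: \beta_1 = 0$ is not rejected at the 5% level; consistently, the 95% interval for the slope contains 0.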
6. Confidence Interval for the Population Regression Line (at a
given value of x*)
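$\hat{y}^* = \hat{\beta}_0 + \hat{\beta}_1 x^*$ estimates $E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$
$E(\hat{y}^* \mid X) = \beta_0 + \beta_1 x^*$, $\mathrm{Var}(\hat{y}^* \mid X) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{SXX} \right)$
$100(1 - \alpha)\%$ confidence interval for $E(Y \mid X = x^*)$:
$\hat{y}^* \pm t_{\alpha/2,\, n-2} \, s \sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{SXX}}$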
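As a rough illustration of the interval for the regression line at a given x*, the sketch below uses plain Python. The function name `mean_response_ci`, the data, and the choice x* = 4 are assumptions for illustration; the critical value 3.182 is the t-table value $t_{0.025,\,3}$ entered by hand.

```python
import math


def mean_response_ci(x, y, x_star, t_crit):
    """Return (y_hat, (lower, upper)) for E(Y | X = x_star)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    y_hat = b0 + b1 * x_star
    # Half-width grows as x_star moves away from x_bar
    half = t_crit * s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)
    return y_hat, (y_hat - half, y_hat + half)


# Hypothetical illustrative data and evaluation point
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat, (lower, upper) = mean_response_ci(x, y, x_star=4, t_crit=3.182)
print(f"y_hat = {y_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval is narrowest at x* equal to the sample mean of x, where the second term under the square root vanishes.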
7. Prediction Interval for the Actual Value of y
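i. Confidence interval: reported for a parameter ($\beta_0$, $\beta_1$,
$E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$)
Prediction interval: reported for the value of a random variable
(value range of $Y^* = \beta_0 + \beta_1 x^* + e^*$)
*Prediction interval is wider than the confidence interval, because it also accounts for the variance $\sigma^2$ of the new error $e^*$
ii. $E(Y^* - \hat{y}^* \mid X) = 0$, $\mathrm{Var}(Y^* - \hat{y}^* \mid X) = \sigma^2 \left( 1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{SXX} \right)$
$100(1 - \alpha)\%$ prediction interval for $Y^*$:
$\hat{y}^* \pm t_{\alpha/2,\, n-2} \, s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{SXX}}$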
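To see why the prediction interval is wider, the sketch below computes both half-widths at the same x*. The function name `interval_half_widths`, the data, and x* = 4 are hypothetical; 3.182 is the hand-entered t-table value $t_{0.025,\,3}$.

```python
import math


def interval_half_widths(x, y, x_star, t_crit):
    """Return (y_hat, ci_half, pi_half) at x_star."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))
    y_hat = b0 + b1 * x_star
    d = 1 / n + (x_star - x_bar) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(d)       # for E(Y | X = x_star)
    pi_half = t_crit * s * math.sqrt(1 + d)   # for a new observation Y*
    return y_hat, ci_half, pi_half


# Hypothetical illustrative data and evaluation point
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat, ci_half, pi_half = interval_half_widths(x, y, x_star=4, t_crit=3.182)
print(f"y_hat = {y_hat:.3f}")
print(f"confidence interval: +/- {ci_half:.3f}")
print(f"prediction interval: +/- {pi_half:.3f}")
```

The only difference between the two half-widths is the extra 1 under the square root, so the prediction interval is always the wider of the two.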
8. Analysis of Variance: to test whether there is a linear association
between y and x (ANOVA, using F-test)
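i. The F-test is mainly used in multiple linear regression, but it also
applies to the simple linear regression model
ii. Hypothesis test $H_0: \beta_1 = 0$; $H_a: \beta_1 \neq 0$
*If $H_0$ is true, $F = \dfrac{SSreg / 1}{RSS / (n - 2)} \sim F_{1,\, n-2}$
*Reject $H_0$ at level $\alpha$ if $F > F_{\alpha,\, 1,\, n-2}$
iii. Total sample variability: $SST = SYY = \sum_{i=1}^{n} (y_i - \bar{y})^2$, $df = n - 1$
Variability explained by the model: $SSreg = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$, $df = 1$, $MSreg = SSreg / 1$
Unexplained (or error) variability: $RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$, $df = n - 2$, $MSE = RSS / (n - 2) = s^2$
* $SST = SSreg + RSS$, $F = MSreg / MSE$
iv. Coefficient of determination: $R^2 = \dfrac{SSreg}{SST} = 1 - \dfrac{RSS}{SST}$, $0 \le R^2 \le 1$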
9. Pearson (Sample) Correlation Coefficient: symmetric measure of
linear association between x and y
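i. $r = \dfrac{SXY}{\sqrt{SXX \cdot SYY}} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$, $-1 \le r \le 1$
ii. $r = \pm 1$: all points fall exactly on a line
$r = 0$: no linear relationship
$r > 0$: positive relationship between x and y
$r < 0$: negative relationship between x and y
* $\hat{\beta}_1 = \dfrac{SXY}{SXX} = r \sqrt{\dfrac{SYY}{SXX}} = r \dfrac{s_y}{s_x}$, so the slope $\hat{\beta}_1$ and $r$ have the same sign
iii. $R^2 = r^2$: holds in simple linear regression only
* $R^2 = \dfrac{SSreg}{SST} = \dfrac{\hat{\beta}_1^2 \, SXX}{SYY} = \dfrac{SXY^2}{SXX \cdot SYY} = r^2$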
iv. Given ,
and
and
*
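The relationships between the correlation coefficient, the slope and R^2 can be verified numerically. The sketch below is illustrative only: the function name `correlation_summary` and the data are hypothetical.

```python
import math


def correlation_summary(x, y):
    """Return r, the slope b1, the ratio s_y / s_x and R^2 = SSreg / SST."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    b1 = sxy / sxx
    # Sample standard deviations use n - 1; the factor cancels in the ratio
    sd_ratio = math.sqrt(syy / (n - 1)) / math.sqrt(sxx / (n - 1))
    ssreg = b1 ** 2 * sxx
    r_squared = ssreg / syy
    return r, b1, sd_ratio, r_squared


# Hypothetical illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, b1, sd_ratio, r_squared = correlation_summary(x, y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}, R^2 = {r_squared:.4f}")
print(f"b1 = {b1:.4f}, r * s_y / s_x = {r * sd_ratio:.4f}")
```

Both printed pairs agree: the squared correlation matches R^2, and the slope matches r times the ratio of the sample standard deviations.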
II. Diagnostics and Transformation for Simple Linear Regression
1. Regression Diagnostics: Tools for Checking the Validity of a
Model: i) Standardized residual plots: check the model's validity; ii)
Whether there are leverage points and outliers; iii) If leverage
points exist, determine whether each is a bad leverage point
(assess its influence on the line); iv) Whether the assumption of