STAT3012 Lecture Notes - Lecture 11: Gross Domestic Product, Feature Selection, Stepwise Regression
Lecture 11 - Variable selection: Stepwise, AIC and BIC
New concepts
✷Stepwise variable selection
✷The step command
✷The AIC, Cp and BIC variable selection criteria
Applied Linear Models: Lecture 11 1
Theory – Stepwise variable selection
1. Start with some model, typically null model (with no explanatory variables) or
full model (with all variables).
2. For each variable in the current model, investigate effect of removing it.
3. Remove the least informative variable, unless it is nonetheless supplying
significant information about the response.
4. For each variable not in the current model, investigate effect of including it.
5. Include the most statistically significant variable not currently in model (unless
no significant variable exists).
6. Go to step 2. Stop only when a pass through steps 2–5 leaves the model unchanged.
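The steps above can be sketched as a loop built on drop1() and add1() with partial F-tests. This is only an illustration under stated assumptions: the helper stepwise_F(), the thresholds p_out and p_in passed to it, and the simulated data frame sim are hypothetical names introduced here, not part of the lecture.

```r
# Simulated stand-in data (hypothetical; not the lecture's data set)
set.seed(1)
n <- 60
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 - 1.5 * sim$x2 + rnorm(n)

# Hypothetical helper: stepwise selection using partial F-tests
stepwise_F <- function(start, full, p_out = 0.20, p_in = 0.10, max_iter = 50) {
  current <- start
  all_terms <- attr(terms(full), "term.labels")
  for (i in seq_len(max_iter)) {
    changed <- FALSE
    in_model <- attr(terms(current), "term.labels")

    # Steps 2-3: drop the least informative variable if its p-value exceeds p_out
    if (length(in_model) > 0) {
      d <- drop1(current, test = "F")
      p <- d[["Pr(>F)"]][-1]            # first row is "<none>"
      names(p) <- rownames(d)[-1]
      if (max(p) > p_out) {
        worst <- names(p)[which.max(p)]
        current <- update(current, as.formula(paste(". ~ . -", worst)))
        changed <- TRUE
      }
    }

    # Steps 4-5: add the most significant excluded variable if its p-value is below p_in
    in_model <- attr(terms(current), "term.labels")
    if (length(setdiff(all_terms, in_model)) > 0) {
      a <- add1(current, scope = formula(full), test = "F")
      p <- a[["Pr(>F)"]][-1]            # first row is "<none>"
      names(p) <- rownames(a)[-1]
      if (min(p) < p_in) {
        best <- names(p)[which.min(p)]
        current <- update(current, as.formula(paste(". ~ . +", best)))
        changed <- TRUE
      }
    }

    # Step 6: stop once a full pass makes no change
    if (!changed) break
  }
  current
}

null_sim <- lm(y ~ 1, data = sim)   # start from the null model
full_sim <- lm(y ~ ., data = sim)   # defines the candidate variables
fit_sw <- stepwise_F(null_sim, full_sim)
print(formula(fit_sw))
```

The max_iter cap is a safety guard added for this sketch; choosing p_in below p_out prevents a variable from being dropped and re-added in the same pass.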
✷In R for F-tests: use a combination of add1() and drop1().
✷In R for using AIC or BIC: use the step() command, which runs an automated search.
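As a rough illustration of the automated search, the sketch below calls step() once with its default penalty k = 2 (AIC) and once with k = log(n) (BIC). The simulated data frame sim and the object names null_sim, full_sim, fit_aic and fit_bic are assumptions made for this example and do not come from the lecture.

```r
# Simulated stand-in data (hypothetical; not the lecture's data set)
set.seed(2)
n <- 60
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 - 1.5 * sim$x2 + rnorm(n)

null_sim <- lm(y ~ 1, data = sim)   # no explanatory variables
full_sim <- lm(y ~ ., data = sim)   # all explanatory variables

# Search range: from the null model up to the full model
search_scope <- list(lower = formula(null_sim), upper = formula(full_sim))

# AIC search: default penalty k = 2; trace = 0 suppresses the step-by-step log
fit_aic <- step(null_sim, scope = search_scope, direction = "both", trace = 0)

# BIC search: same call with penalty k = log(n)
fit_bic <- step(null_sim, scope = search_scope, direction = "both",
                k = log(nrow(sim)), trace = 0)

print(formula(fit_aic))
print(formula(fit_bic))
```

Setting trace = 1 (the default) prints each candidate move with its criterion value, which is useful for following the search by hand.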
Example – Cheese data: Stepwise
First pass through algorithm
We will perform stepwise variable selection starting from the ‘null model’ (i.e. the
model with no explanatory variables), using the following exclusion/inclusion
significance levels:
p_out = 0.20 and p_in = 0.10.
A variable is removed if its p-value exceeds p_out and is included if its p-value
is below p_in.
M1 <- lm(taste ~ 1, data = dat)  # null model
Mf <- lm(taste ~ ., data = dat)  # full model
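A first pass from the null model might look like the sketch below. Since the cheese data are not reproduced in these notes, it uses a simulated data frame sim with a response taste and hypothetical predictors x1, x2, x3; the objects M1_sim, Mf_sim and M2_sim are stand-ins for this illustration only. With the lecture's objects, the corresponding call would be add1(M1, scope = formula(Mf), test = "F").

```r
# Simulated stand-in for the cheese data (hypothetical columns and values)
set.seed(3)
n <- 30
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
sim$taste <- 20 + 6 * sim$x1 + 3 * sim$x2 + rnorm(n, sd = 5)

M1_sim <- lm(taste ~ 1, data = sim)  # null model
Mf_sim <- lm(taste ~ ., data = sim)  # full model

p_in <- 0.10

# Steps 2-3: the null model has no variables, so there is nothing to remove.
# Steps 4-5: F-test for adding each candidate variable to the null model.
first_pass <- add1(M1_sim, scope = formula(Mf_sim), test = "F")
print(first_pass)

# Pick the candidate with the smallest p-value (row 1 is "<none>")
p <- first_pass[["Pr(>F)"]][-1]
names(p) <- rownames(first_pass)[-1]
best <- names(p)[which.min(p)]

# Include it only if it meets the inclusion level
if (p[best] < p_in) {
  M2_sim <- update(M1_sim, as.formula(paste(". ~ . +", best)))
} else {
  M2_sim <- M1_sim
}
print(formula(M2_sim))
```

The second pass would then apply drop1(M2_sim, test = "F") against p_out before considering further additions.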