STAT3012 Lecture Notes - Lecture 17: Categorical Variable, Covariate, Scatter Plot
Lecture 17 – Quantitative factors
New concepts
✷Quantitative factor
✷Polynomial regression and ANOVA
✷Nesting of linear effects
✷Bartlett test to assess homoscedasticity assumption
Applied Linear Models: Lecture 17 1
find more resources at oneclass.com
find more resources at oneclass.com
New topic – Quantitative factors
Theory – Factor or numerical variables?
Sometimes it is unclear as to whether a particular explanatory variable should be
regarded as a factor (categorical explanatory variable) or a numerical covariate.
We will see that polynomial regression is the key to understand the problem.
Example – Drug levels
Should the effect of a drug be modelled using a
✷3 level factor
(low, medium and high doses) ⇒1-way ANOVA
or as
✷numerical variable
(0.5, 1.0, 2.0 mg doses) ⇒regression ?
Applied Linear Models: Lecture 17 2
find more resources at oneclass.com
find more resources at oneclass.com
Theory – Polynomial regression
✷Suppose the ith treatment corresponds to a measurement xi∈R.
✷We can
◦plot the sample mean response at each value xiby plotting (xi, Y i•),
◦produce boxplots at each value xi.
✷If we have ttreatments then we have tpoints on the plot.
✷By looking at the way the mean values vary with xiwe can put forward a model
for E(Y|x).
✷Through any tpoints we can fit a polynomial of degree (t−1).
Thus the most general polynomial regression model for this situation is
Yij =β0+β1xi+. . . +βt−1xt−1
i+ǫij, ǫij ∼NID(0, σ2).(1)
Applied Linear Models: Lecture 17 3
find more resources at oneclass.com
find more resources at oneclass.com
Document Summary
Sometimes it is unclear as to whether a particular explanatory variable should be regarded as a factor (categorical explanatory variable) or a numerical covariate. We will see that polynomial regression is the key to understand the problem. Should the e ect of a drug be modelled using a. 3 level factor (low, medium and high doses) 1-way anova or as. Suppose the ith treatment corresponds to a measurement xi r. We can: plot the sample mean response at each value xi by plotting (xi, y i ), produce boxplots at each value xi. If we have t treatments then we have t points on the plot. By looking at the way the mean values vary with xi we can put forward a model for e(y |x). Through any t points we can t a polynomial of degree (t 1). Thus the most general polynomial regression model for this situation is i + ij, ij n id(0, 2).