# Stat331 R-Tutorial3.pdf

27 Pages

Department: Statistics
Course: STAT 331
Professor: Kun Liang
Semester: Fall

Description
Stat 331 Tutorial 3: House Price Example
Kun Liang, [email protected], M3 4201

## Model diagnostics: Why?

[Figure: Anscombe (1973) data, four panels (1 to 4), each plotting y against x.]

## House price example

- The objective of the example was to predict house price based on
  - size of home in square feet (Size)
  - number of bedrooms (Beds)
  - number of bathrooms (Baths)
  - whether the home is new (New; 1 = yes, 0 = no)
  - annual tax bill in dollars (Taxes).
- The data were collected for 100 homes sold in Gainesville, Florida, in fall 2006.
- Consider a multiple regression model of the selling price ($y$) on three explanatory variables: Size ($x_1$), New ($x_2$), and Taxes ($x_3$).

## Read data

    > hp …
    > head(hp)
      case Taxes Beds Baths New  Price Size
    1    1  3104    4     2   0 279900 2048
    2    2  1173    2     1   0 146500  912
    3    3  3076    4     2   0 237700 1654
    4    4  1608    3     2   0 200000 2068
    5    5  1454    3     3   0 159900 1477
    6    6  2997    3     2   1 499900 3153
    > dim(hp)
    [1] 100   7
    > hp …
    > plot(hp$Size, hp$Price, xlab="Size", ylab="Price")

[Figure: scatterplot of Price against Size.]

## Look at data

    > pairs(hp)

[Figure: scatterplot matrix of Price, Size, New, and Taxes.]

## Even better

    > library(car)
    > scatterplotMatrix(hp)

[Figure: scatterplot matrix of Price, Size, New, and Taxes produced by scatterplotMatrix.]

## Linear model

Consider the model

$$\text{Price} = \beta_0 + \beta_1\,\text{Size} + \beta_2\,\text{New} + \beta_3\,\text{Taxes} + \varepsilon$$

    > fit …
    > summary(fit)

    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
    (Intercept) -21353.776  13311.487  -1.604  0.11196
    Size            61.704     12.499   4.937 3.35e-06 ***
    New          46373.703  16459.019   2.818  0.00588 **
    Taxes           37.231      6.735   5.528 2.78e-07 ***

    Residual standard error: 47170 on 96 degrees of freedom
    Multiple R-squared: 0.7896, Adjusted R-squared: 0.783
    F-statistic: 120.1 on 3 and 96 DF, p-value: < 2.2e-16

## Residual vs fitted

    > plot(fitted(fit), rstudent(fit), xlab="fitted", ylab="residuals")

[Figure: studentized residuals against fitted values.]

## Transformation?

    > library(MASS)
    > boxcox(fit, lambda = seq(-1, 1, 1/20))

[Figure: Box-Cox log-likelihood against λ from -1 to 1, with the 95% level marked.]

## Linear model

$$\sqrt{\text{Price}} = \beta_0 + \beta_1\,\text{Size} + \beta_2\,\text{New} + \beta_3\,\text{Taxes} + \varepsilon$$

    > fit2 …
    > summary(fit2)

    Residuals:
         Min       1Q   Median       3Q      Max
    -155.478  -33.695    4.374   31.780  145.241

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 1.887e+02  1.450e+01  13.012  < 2e-16 ***
    Size        6.084e-02  1.362e-02   4.468 2.15e-05 ***
    New         4.712e+01  1.793e+01   2.628     0.01 *
    Taxes       4.476e-02  7.337e-03   6.101 2.21e-08 ***

    Residual standard error: 51.39 on 96 degrees of freedom
    Multiple R-squared: 0.791, Adjusted R-squared: 0.7844
    F-statistic: 121.1 on 3 and 96 DF, p-value: < 2.2e-16

## Residual vs Size

    > plot(hp$Size, rstudent(fit2), xlab="Size", ylab="residuals")

[Figure: studentized residuals against Size.]

## Residual vs Taxes

    > plot(hp$Taxes, rstudent(fit2), xlab="Taxes", ylab="residuals")

[Figure: studentized residuals against Taxes.]

## Residual vs fitted

    > plot(fitted(fit2), rstudent(fit2), xlab="fitted", ylab="residuals")

[Figure: studentized residuals against fitted values, which range from about 300 to 700.]
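The point of the Anscombe (1973) slide can be reproduced with the `anscombe` data frame that ships with base R. The sketch below is not taken from the tutorial; the names `fits`, `coefs`, and `r2` are chosen here for illustration. It fits the same simple regression to each of the four data sets and then plots them, showing that near-identical summaries can hide very different data.

```r
# Anscombe's quartet is available in base R as `anscombe`
# (columns x1..x4 and y1..y4).
data(anscombe)

# Fit y_k ~ x_k separately for each of the four data sets.
fits <- lapply(1:4, function(k) {
  lm(reformulate(paste0("x", k), response = paste0("y", k)), data = anscombe)
})

# Collect intercept, slope, and R-squared for each fit.
coefs <- t(sapply(fits, function(f) unname(coef(f))))
colnames(coefs) <- c("intercept", "slope")
r2 <- sapply(fits, function(f) summary(f)$r.squared)
print(round(cbind(coefs, R2 = r2), 2))

# The numerical summaries agree, but the scatterplots do not.
op <- par(mfrow = c(2, 2))
for (k in 1:4) {
  plot(anscombe[[paste0("x", k)]], anscombe[[paste0("y", k)]],
       xlab = paste0("x", k), ylab = paste0("y", k))
  abline(fits[[k]])
}
par(op)
```

All four rows of the printed table come out at roughly intercept 3.00, slope 0.50, and R-squared 0.67, which is why a residual plot is needed to tell the four situations apart.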
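The Box-Cox step and the follow-up residual plot can be tried without the house price file. The sketch below uses simulated data, not the Gainesville data: `x`, `y`, `fit_raw`, `fit_sqrt`, and `lambda_hat` are hypothetical names, and the response is generated so that its square root is linear in `x`. It assumes only `boxcox()` from MASS with its `plotit` argument and `rstudent()` from base R.

```r
library(MASS)  # provides boxcox()

set.seed(331)
n <- 100
x <- runif(n, 500, 4000)
# Simulated response (an assumption for illustration): sqrt(y) is linear
# in x with constant-variance noise, so lambda = 0.5 is the "right" choice.
y <- (200 + 0.1 * x + rnorm(n, sd = 10))^2

# Fit on the raw scale and profile the Box-Cox log-likelihood over lambda.
fit_raw <- lm(y ~ x)
bc <- boxcox(fit_raw, lambda = seq(-1, 1, 1/20), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]  # grid value maximising the log-likelihood
print(lambda_hat)

# Refit on the square-root scale and inspect studentized residuals.
fit_sqrt <- lm(sqrt(y) ~ x)
plot(fitted(fit_sqrt), rstudent(fit_sqrt),
     xlab = "fitted", ylab = "studentized residuals")
abline(h = c(-2, 0, 2), lty = 2)  # rough reference lines
```

With data generated this way, the maximising λ should fall near 0.5, matching the square-root transformation used for `fit2`. On real data the choice is usually rounded to a convenient value inside the 95% interval shown on the Box-Cox plot.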