October 2016
UNIVERSITY OF TORONTO SCARBOROUGH
Department of Computer and Mathematical Sciences
Midterm Test October 2016
STAB22H3 Statistics I
Duration: 1 hour and 45 minutes
Aids allowed:
- One handwritten letter-sized sheet (both sides) of notes prepared by you
- Non-programmable, non-communicating calculator
Standard normal distribution tables are attached at the end.
This test is based on multiple-choice questions. There are 35 questions. All questions carry equal weight.
1. In a study about poverty, a researcher uses linear regression to predict the number of
murders per million from the percentage of persons living in poverty. She selects a
random sample of 200 metropolitan areas and obtains a regression line with intercept
-30 and slope 2.5 and a coefficient of determination R^2 of 81%.
Which one of the following sentences is correct?
(a) There is a positive association between the percentage of persons living in
poverty and the number of murders per million because the coefficient of
determination is positive.
(b) There is a positive association between the percentage of persons living in
poverty and the number of murders per million because the slope is positive.
(c) We know for sure that an increase in the percentage of persons living in poverty
causes the number of murders per million to increase because the slope is positive.
(d) We know for sure that an increase in the percentage of persons living in poverty
causes the number of murders per million to decrease because the intercept is
negative.
(e) We know for sure that there is no association between the percentage of persons
living in poverty and the number of murders per million because the coefficient
of determination is positive.
Solution:
B: There is a positive association between the percentage of persons living in
poverty and the number of murders per million because the slope is positive.
However, we cannot make a causal inference, i.e. it is impossible to say whether
an increase in the percentage of persons living in poverty causes the number of
murders per million to increase. Another mechanism could be responsible
for the observed association between the number of murders per million and the
percentage of persons living in poverty.
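The sign argument can be checked numerically. The following Python sketch recovers the correlation from the reported coefficient of determination (81%) and the sign of the slope (2.5). The function name signed_correlation is illustrative only and is not part of the test.

```python
import math

def signed_correlation(r_squared, slope):
    """Return r for a simple linear regression: sqrt(R^2) carrying the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# Values from Question 1: R^2 = 81%, slope = 2.5
r = signed_correlation(0.81, 2.5)
print(round(r, 2))  # 0.9
```

A positive r only confirms the direction of the association. It says nothing about whether poverty causes the change in the murder rate.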
2. The relationship between the number of games won by an NHL team and the average
attendance at their home games is analyzed. A regression to predict the average
attendance from the number of games won has an R^2 = 31.4%. The residuals plot
indicated that a linear model is appropriate. What is the correlation between the
average attendance and the number of games won?
(a) 0.099
(b) 0.560
(c) 0.314
(d) 0.686
(e) 0.828
Solution:
B: r = sqrt(R^2) = sqrt(0.314) ≈ 0.560
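As a quick check, the short Python sketch below takes the square root of the reported value 0.314 and picks the closest answer choice. It assumes the positive root, which is consistent with all five choices being positive. The variable names are illustrative.

```python
import math

r_squared = 0.314  # coefficient of determination from Question 2
options = {"a": 0.099, "b": 0.560, "c": 0.314, "d": 0.686, "e": 0.828}

# Assumption: the correlation is positive, so take the positive square root
r = math.sqrt(r_squared)

# Pick the answer choice closest to the computed correlation
closest = min(options, key=lambda k: abs(options[k] - r))
print(closest, round(r, 3))  # b 0.56
```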
3. The relationship between the number of games won by an NHL team and the average
attendance at their home games is analyzed. A regression analysis to predict the
average attendance from the number of games won gives the model
predicted attendance = -2,100 + 193 × wins.
Predict the average attendance of a team with 58 wins.
(a) 11 people
(b) 9,094 people
(c) 13,294 people
(d) 11,194 people
(e) -1,849 people
Solution:
B: -2100 + 193 × 58 = 9094
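The prediction can be reproduced with a few lines of Python. This sketch assumes the intercept is -2,100, which is the value consistent with the stated answer of 9,094. The function name predict_attendance is illustrative.

```python
def predict_attendance(wins, intercept=-2100, slope=193):
    """Evaluate the fitted line: predicted attendance = intercept + slope * wins."""
    return intercept + slope * wins

print(predict_attendance(58))  # 9094
```

With a positive intercept of 2,100 the same calculation gives 13,294, which is choice (c), so the sign of the intercept decides between (b) and (c).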
4. A survey is conducted to find the average weight of cows in a region. A list of all farms
is available for the region, and 50 farms are selected at random. Then the weight of
each cow at the 50 selected farms is recorded.
What is the name of the sampling method that was applied?
(a) Simple random sampling
(b) Stratified sampling
(c) Cluster sampling
(d) Systematic sampling
(e) None of the above
Solution:
C: A sample of farms was randomly selected and a census was performed within
each farm, so cluster sampling was applied (where the clusters are the farms).
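The two-step procedure (randomly select whole farms, then record every cow on them) can be sketched in Python as follows. The farm names, the total of 200 farms, the herd sizes and the cow weights are all made-up values for illustration. Only the figure of 50 selected farms comes from the question.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select n_clusters whole clusters and return every unit in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical region: 200 farms, each with 5 to 40 cows (weights in kg, made up)
data_rng = random.Random(0)
farms = {
    f"farm_{i}": [round(data_rng.gauss(600, 50), 1)
                  for _ in range(data_rng.randint(5, 40))]
    for i in range(200)
}

# Step 1: pick 50 farms at random. Step 2: take a census of each chosen farm.
sample = cluster_sample(farms, n_clusters=50, seed=1)
weights = [w for cows in sample.values() for w in cows]
print(len(sample), len(weights))
```

Randomness enters only at the farm level. Within a selected farm every cow is measured, which is what distinguishes this design from simple random or stratified sampling.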
