# STAB22H3 Study Guide - Final Guide: University Of Toronto Scarborough, Good Luck!!, Scantron Corporation


Published on 16 Sep 2020


UNIVERSITY OF TORONTO SCARBOROUGH

Department of Computer and Mathematical Sciences

Midterm Test October 2016

STAB22H3 Statistics I

Duration: 1 hour and 45 minutes

Last Name: First Name:

Student number:

Aids allowed:

- One handwritten letter-sized sheet (both sides) of notes prepared by you

- Non-programmable, non-communicating calculator

Standard normal distribution tables are attached at the end.

This test is based on multiple-choice questions. There are 35 questions. All questions carry

equal weight. On the Scantron answer sheet, ensure that you enter your last name, first

name (as much of it as fits), and student number (in “Identification”).

Mark in each case the best answer out of the alternatives given (which means the

numerically closest answer if the answer is a number and the answer you obtained

is not given).

Also before you begin, complete the signature sheet, but sign it only when the invigilator

collects it. The signature sheet shows that you were present at the exam.

There are 25 pages including this page and statistical tables. Please check to see you have

all the pages.

Good luck!!

ExamVersion: A


1. In a study about poverty, a researcher uses linear regression to predict the number of

murders per million from the percentage of persons living in poverty. She selects a

random sample of 200 metropolitan areas and obtains a regression line with intercept

-30 and slope 2.5 and a coefficient of determination $R^2$ of 81%.

Which one of the following sentences is correct?

(a) There is a positive association between the percentage of persons living in

poverty and the number of murders per million because the coefficient of

determination is positive.

(b) There is a positive association between the percentage of persons living in

poverty and the number of murders per million because the slope is positive.

(c) We know for sure that an increase in the percentage of persons living in poverty

causes the number of murders per million to increase because the slope is

positive.

(d) We know for sure that an increase in the percentage of persons living in poverty

causes the number of murders per million to decrease because the intercept is

negative.

(e) We know for sure that there is no association between the percentage of persons

living in poverty and the number of murders per million because the coefficient

of determination is positive.

Solution:

B: There is a positive association between the percentage of persons living in

poverty and the number of murders per million because the slope is positive.

However, we cannot make a causal inference, i.e. it is impossible to say whether

an increase in the percentage of persons living in poverty causes the number of

murders per million to increase. Another mechanism could be responsible

for the observed association between the number of murders per million and the

percentage of persons living in poverty.
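The reasoning above can be checked numerically. The following is a minimal sketch, assuming made-up data (the `x`, `y_up`, and `y_down` values and the `fit_line` helper are illustrative and not part of the exam). It fits a least-squares line to an increasing and a decreasing data set and shows that $R^2$ is non-negative in both cases, so only the sign of the slope tells us the direction of the association.

```python
def fit_line(xs, ys):
    """Return (intercept, slope, r_squared) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # square of the correlation, never negative
    return intercept, slope, r_squared


# Hypothetical data, invented for illustration only.
x = [10, 12, 15, 18, 20, 25]
y_up = [1, 4, 9, 14, 22, 30]      # rises with x
y_down = [-v for v in y_up]       # mirror image: falls with x

_, slope_up, r2_up = fit_line(x, y_up)
_, slope_down, r2_down = fit_line(x, y_down)

print("increasing data:", slope_up > 0, r2_up >= 0)
print("decreasing data:", slope_down < 0, r2_down >= 0)
```

Both data sets give the same $R^2$, which is why options (a) and (e) cannot be justified by the coefficient of determination being positive.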

2. The relationship between the number of games won by an NHL team and the average

attendance at their home games is analyzed. A regression to predict the average

attendance from the number of games won has an $R^2$ of 31.4%. The residual plot

indicated that a linear model is appropriate. What is the correlation between the

average attendance and the number of games won?

(a) 0.099

(b) 0.560

(c) 0.314

(d) 0.686

(e) 0.828

Solution:

B: $r = \sqrt{R^2} = \sqrt{0.314} \approx 0.560$
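As a quick check of this arithmetic, here is a small sketch. The function name `correlation_from_r_squared` and its `slope_sign` parameter are hypothetical helpers introduced for illustration. In simple linear regression the correlation has the same sign as the slope; the sketch assumes a positive slope, which is consistent with every answer option being positive.

```python
import math


def correlation_from_r_squared(r_squared, slope_sign=1):
    """Recover r from R^2 in simple linear regression.

    The magnitude is sqrt(R^2); the sign is taken from the slope
    (assumed positive here unless slope_sign is negative).
    """
    return math.copysign(math.sqrt(r_squared), slope_sign)


r = correlation_from_r_squared(0.314)
print(round(r, 3))
```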


3. The relationship between the number of games won by an NHL team and the average

attendance at their home games is analyzed. A regression analysis to predict the

average attendance from the number of games won gives the model

$\widehat{\text{attendance}} = -2{,}100 + 193 \times \text{wins}$. Predict the average attendance of a team with 58 wins.

(a) 11 people

(b) 9,094 people

(c) 13,294 people

(d) 11,194 people

(e) -1,849 people

Solution:

B: $-2100 + 193 \times 58 = 9094$
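The same substitution can be written as a short sketch. The function name `predict_attendance` is a hypothetical helper; the intercept and slope are the values given in the question.

```python
def predict_attendance(wins, intercept=-2100, slope=193):
    """Predicted average attendance from the fitted line in the question."""
    return intercept + slope * wins


predicted = predict_attendance(58)
print(predicted)
```

Two of the distractors correspond to common slips: 11,194 is $193 \times 58$ with the intercept left out, and 13,294 results from adding 2,100 instead of subtracting it.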

4. A survey is conducted to find the average weight of cows in a region. A list of all farms

is available for the region, and 50 farms are selected at random. Then the weight of

each cow at the 50 selected farms is recorded.

What is the name of the sampling method that was applied?

(a) Simple random sampling

(b) Stratified sampling

(c) Cluster sampling

(d) Systematic sampling

(e) None of the above

Solution:

C: A sample of farms was randomly selected and a census was performed within

each farm, so cluster sampling was applied (where the clusters are the farms).
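The two-stage procedure described in the question can be sketched as follows. This is an illustration under stated assumptions: the `farms` dictionary, its farm names, and the cow weights are invented, and `cluster_sample` is a hypothetical helper, not part of the exam.

```python
import random


def cluster_sample(farms, n_clusters, seed=None):
    """Randomly pick whole farms (clusters), then record every cow on each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(farms), n_clusters)   # stage 1: random farms
    weights = [w for farm in chosen for w in farms[farm]]  # stage 2: census within farm
    return chosen, weights


# Hypothetical population: 200 farms with 3 to 6 cows each (weights in kg, made up).
farms = {
    f"farm_{i}": [500 + 10 * i + j for j in range(3 + i % 4)]
    for i in range(200)
}

chosen, weights = cluster_sample(farms, n_clusters=50, seed=22)
mean_weight = sum(weights) / len(weights)
print(len(chosen), len(weights))
```

The key feature is that randomness enters only when the farms are chosen. By contrast, simple random sampling would draw individual cows from the whole region, and stratified sampling would draw some cows from every farm.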
