STAT 302 Midterm: STAT 302 UW Madison Solutions 08

31 Jan 2019
Solutions to Homework 8
Statistics 302 Professor Larget
Textbook Exercises
6.12 Impact of the Population Proportion on SE Compute the standard error for sample
proportions from a population with proportions $p = 0.8$, $p = 0.5$, $p = 0.3$, and $p = 0.1$ using a
sample size of $n = 100$. Comment on what you see. For which proportion is the standard error the
greatest? For which is it the smallest?
Solution
We compute the standard errors using the formula:
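$$p = 0.8: \quad SE = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.8(0.2)}{100}} = 0.040$$
$$p = 0.5: \quad SE = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.5(0.5)}{100}} = 0.050$$
$$p = 0.3: \quad SE = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.3(0.7)}{100}} = 0.046$$
$$p = 0.1: \quad SE = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.1(0.9)}{100}} = 0.030$$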
The largest standard error is at a population proportion of 0.5 (which represents a population split
50-50 between being in the category we are interested in and not being in it). The farther we get from
this 50-50 proportion, the smaller the standard error is. Of the four we computed, the smallest
standard error is at a population proportion of 0.1.
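The four calculations above can be checked with a few lines of Python. This is only a sketch; the function name proportion_se is an illustrative choice and does not come from the textbook.

```python
import math

def proportion_se(p, n):
    """Standard error of a sample proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

n = 100  # sample size from Exercise 6.12
for p in (0.8, 0.5, 0.3, 0.1):
    # Print each standard error to three decimal places.
    print(f"p = {p}: SE = {proportion_se(p, n):.3f}")
```

Because p(1 - p) is largest at p = 0.5, the loop shows the standard error peaking there and shrinking as p moves toward 0 or 1.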
Standard Error from a Formula and a Bootstrap Distribution In Exercise 6.20, use StatKey
or other technology to generate a bootstrap distribution of sample proportions and find the
standard error for that distribution. Compare the result to the standard error given by the Central
Limit Theorem, using the sample proportion as an estimate of the population proportion.
6.20 Proportion of home team wins in soccer, with $n = 120$ and $\hat{p} = 0.583$.
Solution
Using StatKey or other technology to create a bootstrap distribution, we see for one set of 1000
simulations that $SE = 0.045$. (Answers may vary slightly with other simulations.) Using the
formula from the Central Limit Theorem, and using $\hat{p} = 0.583$ as an estimate for $p$, we have
$$SE = \sqrt{\frac{p(1-p)}{n}} \approx \sqrt{\frac{0.583(1-0.583)}{120}} = 0.045$$
We see that the bootstrap standard error and the formula match very closely.
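For readers without StatKey, the comparison can be sketched in Python. The code assumes the sample had 70 home wins out of 120 games (70/120 = 0.583 to three decimals); the exercise gives only the proportion, so that count is an assumption, as are the function name and the seed.

```python
import math
import random
import statistics

def bootstrap_se(successes, n, reps=1000, seed=0):
    """Bootstrap standard error of a sample proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample = [1] * successes + [0] * (n - successes)
    props = []
    for _ in range(reps):
        resample = rng.choices(sample, k=n)  # draw n values with replacement
        props.append(sum(resample) / n)
    # The standard deviation of the bootstrap proportions estimates the SE.
    return statistics.stdev(props)

n = 120
wins = 70  # assumed count, chosen so that wins / n rounds to 0.583
p_hat = wins / n

se_boot = bootstrap_se(wins, n)
se_formula = math.sqrt(p_hat * (1 - p_hat) / n)
print(f"bootstrap SE = {se_boot:.3f}, formula SE = {se_formula:.3f}")
```

The bootstrap value changes a little with the seed and the number of resamples, which is the same variation the solution notes for StatKey.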
6.38 Home Field Advantage in Baseball There were 2430 Major League Baseball (MLB)
games played in 2009, and the home team won in 54.9% of the games. If we consider the games
played in 2009 as a sample of all MLB games, find and interpret a 90% confidence interval for the
proportion of games the home team wins in Major League Baseball.
Solution
To find a 90% confidence interval for p, the proportion of MLB games won by the home team, we
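use $z = 1.645$ and $\hat{p} = 0.549$ from the sample of $n = 2430$ games. The confidence interval is
$$\text{Sample statistic} \pm z \cdot SE$$
$$\hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
$$0.549 \pm 1.645 \sqrt{\frac{0.549(0.451)}{2430}}$$
$$0.549 \pm 0.017$$
$$0.532 \text{ to } 0.566$$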
We are 90% confident that the proportion of MLB games that are won by the home team is between
0.532 and 0.566. This statement assumes that the 2009 season is representative of all Major League
Baseball games. If there is reason to assume that that season introduces bias, then we cannot be
confident in our statement.
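The interval arithmetic can be reproduced with a short Python sketch. The function name proportion_ci is illustrative, and the critical value 1.645 is passed in directly rather than looked up.

```python
import math

def proportion_ci(p_hat, n, z):
    """Normal-based confidence interval for a proportion: p_hat +/- z * SE."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se
    return p_hat - margin, p_hat + margin

# Values from Exercise 6.38: 54.9% home wins in 2430 games, 90% confidence.
low, high = proportion_ci(0.549, 2430, 1.645)
print(f"{low:.3f} to {high:.3f}")
```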
6.50 What Proportion Favor a Gun Control Law? A survey is planned to estimate the
proportion of voters who support a proposed gun control law. The estimate should be within a
margin of error of ±2% with 95% confidence, and we do not have any prior knowledge about the
proportion who might support the law. How many people need to be included in the sample?
Solution
The margin of error we desire is $ME = 0.02$, and for 95% confidence we use $z = 1.96$. Since we
have no prior knowledge about the proportion in support $p$, we use the conservative estimate of
$\tilde{p} = 0.5$. We have:
$$n = \left(\frac{z}{ME}\right)^2 \tilde{p}(1-\tilde{p}) = \left(\frac{1.96}{0.02}\right)^2 0.5(1-0.5) = 2401$$
We need to include 2,401 people in the survey in order to get the margin of error down to within
±2%.
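A sketch of the same sample-size calculation in Python follows. It takes the inputs as decimal strings and uses exact fractions, since ordinary floating-point division of 1.96 by 0.02 can land slightly above 98 and push the rounded-up answer to 2402. The function name and the string-based interface are illustrative assumptions.

```python
import math
from fractions import Fraction

def sample_size_for_proportion(z, margin, p_guess="0.5"):
    """Smallest n with z * sqrt(p(1 - p) / n) <= margin, computed exactly."""
    z, margin, p_guess = Fraction(z), Fraction(margin), Fraction(p_guess)
    n_exact = (z / margin) ** 2 * p_guess * (1 - p_guess)
    # Round up, because a fractional person cannot be sampled.
    return math.ceil(n_exact)

# Values from Exercise 6.50: 95% confidence, margin of error 0.02, p guess 0.5.
print(sample_size_for_proportion("1.96", "0.02"))
```

Using 0.5 as the guess gives the largest possible value of p(1 - p), so the resulting n is enough whatever the true proportion turns out to be.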
6.64 Home Field Advantage in Baseball There were 2430 Major League Baseball (MLB)
games played in 2009, and the home team won the game in 54.9% of the games. If we consider
the games played in 2009 as a sample of all MLB games, test to see if there is evidence, at the 1%
level, that the home team wins more than half the games. Show all details of the test.
Solution
We are conducting a hypothesis test for a proportion $p$, where $p$ is the proportion of all MLB games
won by the home team. We are testing to see if there is evidence that $p > 0.5$, so we have
$$H_0: p = 0.5$$
$$H_a: p > 0.5$$
This is a one-tail test since we are specifically testing to see if the proportion is greater than 0.5.
The test statistic is:
$$z = \frac{\text{Sample statistic} - \text{Null parameter}}{SE} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.549 - 0.5}{\sqrt{\frac{0.5(0.5)}{2430}}} = 4.83.$$
Using the normal distribution, we find a p-value that is zero to five decimal places. This provides
very strong evidence to reject $H_0$ and conclude that the home team wins more than half the games
played. The home field advantage is real!
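The upper-tail test can be sketched in Python using the standard library's normal distribution. The function name z_test_upper is an illustrative choice.

```python
import math
from statistics import NormalDist

def z_test_upper(p_hat, p0, n):
    """One-proportion z-test for Ha: p > p0. Returns (z, p_value)."""
    se = math.sqrt(p0 * (1 - p0) / n)  # SE uses the null proportion p0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)  # area above z in a standard normal
    return z, p_value

# Values from Exercise 6.64: 54.9% home wins in 2430 games, null value 0.5.
z, p_value = z_test_upper(0.549, 0.5, 2430)
print(f"z = {z:.2f}, p-value = {p_value:.5f}")
```

The p-value is far below the 1% significance level, in line with the conclusion above.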
6.70 Percent of Smokers The data in Nutrition Study, introduced in Exercise 1.13 on page 13,
include information on nutrition and health habits of a sample of 315 people. One of the variables
is Smoke, indicating whether a person smokes or not (yes or no). Use technology to test whether
the data provide evidence that the proportion of smokers is different from 20%.
Solution
We use technology to determine that the number of smokers in the sample is 43, so the sample
proportion of smokers is $\hat{p} = 43/315 = 0.1365$. The hypotheses are:
$$H_0: p = 0.20$$
$$H_a: p \neq 0.20$$
The test statistic is:
$$z = \frac{\text{Sample statistic} - \text{Null parameter}}{SE} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.1365 - 0.20}{\sqrt{\frac{0.2(0.8)}{315}}} = -2.82$$
This is a two-tail test, so the p-value is twice the area below -2.82 in a standard normal distribution.
We see that the p-value is 2(0.0024) = 0.0048. This small p-value leads us to reject H0. We find
strong evidence that the proportion of smokers is not 20%.
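A two-tailed version of the test can be sketched in Python, starting from the count of 43 smokers among 315 people reported in the solution. The function name z_test_two_sided is illustrative.

```python
import math
from statistics import NormalDist

def z_test_two_sided(count, n, p0):
    """One-proportion z-test for Ha: p != p0. Returns (z, p_value)."""
    p_hat = count / n
    se = math.sqrt(p0 * (1 - p0) / n)  # SE uses the null proportion p0
    z = (p_hat - p0) / se
    # Two-tailed p-value: twice the area beyond |z| in a standard normal.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

z, p_value = z_test_two_sided(43, 315, 0.20)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```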
6.84 How Old is the US Population? From the US Census, we learn that the average age of all
US residents is 36.78 years with a standard deviation of 22.58 years. Find the mean and standard
deviation of the distribution of sample means for age if we take random samples of US residents of
size:
(a) $n = 10$
(b) $n = 100$
(c) $n = 1000$
Solution
(a) The mean of the distribution is 36.78 years old. The standard deviation of the distribution of
sample means is the standard error:
$$SE = \frac{\sigma}{\sqrt{n}} = \frac{22.58}{\sqrt{10}} = 7.14$$
(b) The mean of the distribution is 36.78 years old. The standard deviation of the distribution of
sample means is the standard error:
$$SE = \frac{\sigma}{\sqrt{n}} = \frac{22.58}{\sqrt{100}} = 2.258$$
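The standard error of the mean for each sample size in the exercise can be computed with a short Python sketch. The function name mean_se is illustrative, and the loop applies the same formula to all three sample sizes listed in the question.

```python
import math

def mean_se(sigma, n):
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 22.58  # population standard deviation of age, in years
for n in (10, 100, 1000):
    print(f"n = {n}: SE = {mean_se(sigma, n):.3f}")
```

The mean of the sampling distribution stays at 36.78 years for every sample size; only the spread shrinks as n grows.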