Where are we?
• Ch. 2–9: Data set and its distribution, statistics, Normal model.
• Ch. 11–13: Collecting data, random samples, randomized
• Ch. 14–16: Probability, random variables, probability
• Ch. 18–28: Statistical inference: what does sample data tell us
about the underlying population. Inferences about parameters
(proportions, means, etc.) in a model for the population
Quick Review of Ch 18:
The Sampling Distribution of a Sample Proportion
Suppose we are interested in just one characteristic that occurs in
the population of interest. For convenience, we will call the
outcome we are looking for “Success”. The population
proportion, p, is obtained by taking the ratio of the number of
successes in the population to the total number of elements in the
population. The sample proportion, $\hat{p}$, is the same ratio computed from a sample of size n.
1) The mean of the sampling distribution of $\hat{p}$ equals p: $\mu_{\hat{p}} = p$
Notation: pˆ is also denoted as
2) The standard deviation of the sampling distribution of $\hat{p}$ is
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
Notation: $\sigma_{\hat{p}}$ is also denoted as $SD(\hat{p})$.
3) When n is large and p is not too close to 0 or 1, the sampling
distribution of pˆ is approximately normal.
Rule of thumb: The sample size is considered to be
sufficiently large if
np ≥ 10 and n(1 – p) ≥ 10
The Sampling Distribution of a Sample Mean
If $\bar{y}$ is the sample average of an SRS of size n drawn from a
population with mean μ and standard deviation σ, then:
1. The mean of the sampling distribution of $\bar{y}$, $\mu_{\bar{y}}$, is equal to μ.
NOTATION: textbook denoted y as
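2. The standard deviation of the sampling distribution of $\bar{y}$, denoted $\sigma_{\bar{y}}$, is
$\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}$
NOTATION: the textbook denotes $\sigma_{\bar{y}}$ as $SD(\bar{y})$.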
3. If random samples of n observations are drawn from any
normally distributed population with mean μ and standard
deviation σ, then the sampling distribution of the mean $\bar{y}$ is
normally distributed, with mean μ and standard deviation $\sigma/\sqrt{n}$.
4. CLT: If random samples of n observations are drawn from
any population with mean μ and standard deviation σ, then
for large n (i.e., n ≥ 30), the sampling distribution of the mean
$\bar{y}$ is approximately normally distributed, with mean μ and standard deviation
$\sigma/\sqrt{n}$.
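The CLT statement can be checked with a small simulation. This is only a sketch: the Exponential(1) population (mean 1, standard deviation 1, strongly right-skewed), the number of repetitions, and the function name `simulate_sample_means` are all assumptions chosen for illustration, not part of the course material.

```python
import random
import statistics

def simulate_sample_means(n, reps=5000, seed=1):
    """Draw `reps` samples of size n from an Exponential(1) population
    (assumed for illustration: mean 1, SD 1) and return the sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

n = 30
means = simulate_sample_means(n)

center = statistics.fmean(means)   # should be close to mu = 1
spread = statistics.stdev(means)   # should be close to sigma / sqrt(n)
theory_sd = 1.0 / n ** 0.5

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f} (theory: {theory_sd:.3f})")
```

Even though the population is far from Normal, the simulated sample means should center near μ with a spread near σ/√n.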
Ch 19 Confidence Intervals for Proportions
We now begin inferential statistics. Its
objective is to use sample data to obtain results about the whole
population.
As a first step, the goal is to describe an underlying population.
Populations are described by models that are
characterized by parameters (mean μ and standard deviation σ, or
probability p for the event of interest).
At this stage we will estimate those characteristics. There are two
different approaches to estimating: Point Estimation and
Interval Estimation.
For Point Estimation, you give one value for a characteristic,
which is hopefully close to the true unknown value.
For Interval Estimation, you give an interval of likely values,
where the width of the interval will depend on the confidence
you require to have in this interval.
Since we base our statement just on a sample, we see later how
to give a measure of accuracy or confidence for the estimate.
A point estimate of a population characteristic is a single number
that is based on sample data and represents the population
characteristic.
Example:
- The sample mean $\bar{y}$ is a point estimate for the population
mean μ.
- The sample proportion $\hat{p}$ is a point estimate of p, the
population probability of Success.
A point estimate gives a single value that is supposed to be close
to the true value of the characteristic but it does NOT tell how
close the estimate is.
Because the value of a point estimate varies from sample to
sample, a point estimate alone is not
enough to describe a parameter. Thus, we introduce the second
type of estimate: the interval estimate.
As an alternative to point estimation we can report not just a
single value for the population characteristic, but an entire
interval of reasonable values based on sample data. These
intervals take error and uncertainty into account. We often
associate an interval estimate with some level of confidence, and the
result is called a confidence interval.
Recall: Both of the sampling distributions for proportions
and means are (approximately) Normal.
For proportions: $\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$
For means: $\bar{y} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
When we don’t know p or σ, we’re stuck, right?
No, we will use sample statistics to estimate these
parameters.
Whenever we estimate the standard deviation of a
sampling distribution, we call it a standard error.
The standard error for a sample proportion:
$SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
The standard error for the sample mean: $SE(\bar{y}) = \frac{s}{\sqrt{n}}$
By the 68-95-99.7% Rule, we know
- about 68% of all samples will have $\hat{p}$’s within 1 SE of p
- about 95% of all samples will have $\hat{p}$’s within 2 SEs of p
- about 99.7% of all samples will have $\hat{p}$’s within 3 SEs of p
We can look at this from $\hat{p}$’s point of view…
Consider the C = 95% level:
- There’s a 95% chance that p is no more than 2 SEs away
from $\hat{p}$.
- So, if we reach out 2 SEs in either
direction of $\hat{p}$, we can be 95% confident that this interval
contains the true proportion.
This is called a 95% confidence interval.
What does “95% Confidence” really mean?
Being "95% confident" means that if you were to construct 100
95% confidence intervals from 100 different samples, you would
expect about 95 of the 100 intervals to capture the true proportion
and about 5 not to capture it.
How can this happen?
Each confidence interval uses a sample statistic to estimate
a population parameter, but since samples vary, the
statistics we use, and thus the confidence intervals we
construct, vary as well.
In conclusion, you cannot be sure that a specific confidence
interval captures the true proportion p.
Our confidence is in the process of constructing the
interval, not in any one interval itself.
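The idea that confidence belongs to the process can be illustrated by simulation. The sketch below assumes a true proportion p = 0.4 and samples of size 500 (both hypothetical choices), builds many 95% intervals, and counts how often they capture p; the function name `coverage` is made up for this example.

```python
import random
from statistics import NormalDist

def coverage(p, n, conf=0.95, reps=2000, seed=7):
    """Fraction of simulated intervals p_hat +/- z* SE(p_hat) that contain p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    hits = 0
    for _ in range(reps):
        # One simulated sample: count the successes among n independent trials.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        se = (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - z_star * se <= p <= p_hat + z_star * se:
            hits += 1
    return hits / reps

rate = coverage(p=0.4, n=500)
print(f"share of intervals capturing p: {rate:.3f}")
```

Any single interval either contains p or it does not; it is the long-run share of intervals that should sit near 95%.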
The following figure shows that some of our confidence
intervals (from 20 random samples) capture the true
proportion (the horizontal line), while others do not:
Margin of Error
- Most confidence intervals are of the form:
Point estimate ± margin of error
= point estimate ± critical value × SE(estimate)
- The more confident we want to be, the larger our margin of
error needs to be (makes the interval wider).
- We need more values in our confidence interval to be more
confident.
- Because of this, every confidence interval is a balance
between certainty and precision.
- The tension between certainty and precision is always
present.
- Fortunately, in most cases we can be both sufficiently
certain and sufficiently precise to make useful statements.
- The most commonly chosen confidence levels (C) are 90%,
95%, and 99% (but any percentage can be used).
- The critical value is how far we need to deviate from the
estimate to capture the central 100C% of the values on the
standard Normal curve.
- The ‘2’ in $\hat{p} \pm 2\,SE(\hat{p})$ (our 95% confidence interval)
came from the 68-95-99.7% Rule.
- Using a table or technology, we find that a more exact
value for our 95% confidence interval is 1.96 instead of 2.
- We call 1.96 the critical value and denote it z*.
Example: To find the central 95% region on a standard
normal curve, you need to cut off 2.5% at each end.
The z* value for C = 0.95 has 97.5% of the area to the left.
Using the z-table, we find z* = 1.96.
- For any confidence level, we can find the corresponding
critical value.
- Commonly used critical values:
Confidence Coefficient C    1 – C    (1 – C)/2    z*
0.90                        0.10     0.05         1.645
0.95                        0.05     0.025        1.96
0.99                        0.01     0.005        2.58
Example: Show that for a 90% confidence interval, the critical
value is 1.645.
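The table lookup can also be done with technology. A minimal sketch using Python's standard library: z* is the point on the standard Normal curve with area 1 – (1 – C)/2 to its left. The helper name `z_critical` is an assumption for illustration.

```python
from statistics import NormalDist

def z_critical(conf):
    """Critical value z* for confidence level `conf` (e.g. 0.90).

    The central area is `conf`, so each tail holds (1 - conf) / 2 and
    z* has area 1 - (1 - conf) / 2 to its left.
    """
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

for conf in (0.90, 0.95, 0.99):
    print(f"C = {conf:.2f}: z* = {z_critical(conf):.3f}")
```

For C = 0.90 this leaves 5% in each tail, so z* is the 95th percentile of the standard Normal model.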
Example:
Consider flipping a coin 1000 times. The results
showed that you flipped 400 heads. Based on this result, what
interval captures the most likely 95% of the values of the actual
proportion of heads? Then check if the coin is fair.
A 100C% Large Sample Confidence Interval for a
Population Proportion p.
Here are the assumptions and the corresponding conditions
you must check before creating a confidence interval for a
proportion:
1) Independence Assumption: You cannot check this by
looking at the data. Instead, we check two conditions to
decide whether independence is reasonable.
- Randomization Condition: Were the data sampled at
random or generated from a properly randomized
experiment? Proper randomization can help ensure
independence.
- 10% Condition: Is the sample size no more than 10%
of the population?
2) Sample Size Assumption: The sample needs to be large
enough for us to be able to use the CLT.
- Success/Failure Condition: We must expect at least 10
“successes” and at least 10 “failures”
- When the conditions are met, we are ready to find the
confidence interval for the population proportion, p.
- The confidence interval is
$\hat{p} \pm z^* \times SE(\hat{p}) = \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- The critical value, z*, depends on the particular confidence
level, C, that you specify.
To answer the coin question, we will calculate a 95% confidence
interval from the data and check if 0.5 (the probability of
HEADs when tossing an unbiased coin) is in the confidence
interval.
NOTE:
Do NOT say:
- There is a probability of 0.95 that p is between 0.37 and
0.43.
- Between 37% and 43% of all tosses appear as HEADs.
- 95% of all random samples of coin tosses will show
between 37% and 43% of HEADs.
- We can be 95% confident that all random samples will
show 40% of HEADs.
- There is a 95% chance that the true proportion of HEADs is
between 37% and 43%.
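The 0.37 to 0.43 interval quoted in these statements can be reproduced from the coin data (400 heads in 1000 flips). This is a sketch of the large-sample interval $\hat{p} \pm z^* SE(\hat{p})$; the function name `proportion_ci` is hypothetical.

```python
from statistics import NormalDist

def proportion_ci(successes, n, conf=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = (p_hat * (1 - p_hat) / n) ** 0.5   # standard error of p_hat
    return p_hat - z_star * se, p_hat + z_star * se

low, high = proportion_ci(400, 1000)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
print("0.5 inside the interval:", low <= 0.5 <= high)
```

Since 0.5 falls outside the interval, the data are not consistent with a fair coin at the 95% confidence level.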
Example:
For a project, a student randomly sampled 182 other students at
a large university to determine if the majority of students were
in favor of a proposal to build a field house. He found that 75
were in favor of the proposal. Find the 95% confidence interval
for p.
Find the 99% confidence interval for p.
Remark: In order to have higher confidence, we need to
accept a larger margin of error, i.e., a wider interval.
A Confidence Interval for Small Samples
When the Success/Failure Condition fails, all is not lost.
A simple adjustment to the calculation lets us make a
confidence interval anyway.
All we do is add four observations, two successes and two
failures.
So instead of $\hat{p} = \frac{y}{n}$, where y is the number of successes, we use the adjusted proportion
$\tilde{p} = \frac{y + 2}{n + 4}$, with adjusted sample size $\tilde{n} = n + 4$.
Now the adjusted interval is $\tilde{p} \pm z^* \sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}}$
The adjusted form gives better performance overall and
works much better for proportions near 0 or 1.
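A sketch of the "add two successes and two failures" adjustment follows. The data (3 successes in 20 trials) are made up purely to show a case where the Success/Failure Condition fails, and `plus_four_ci` is a hypothetical helper name.

```python
from statistics import NormalDist

def plus_four_ci(successes, n, conf=0.95):
    """Adjusted interval: add two successes and two failures, then
    apply the usual formula to the adjusted proportion and sample size."""
    n_tilde = n + 4
    p_tilde = (successes + 2) / n_tilde
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = (p_tilde * (1 - p_tilde) / n_tilde) ** 0.5
    return p_tilde - z_star * se, p_tilde + z_star * se

# Hypothetical small sample: only 3 successes, so fewer than 10 successes.
low, high = plus_four_ci(3, 20)
print(f"adjusted 95% CI: ({low:.3f}, {high:.3f})")

# With 0 observed successes the unadjusted SE would be 0; the adjusted
# interval still has positive width.
low0, high0 = plus_four_ci(0, 10)
print(f"adjusted 95% CI with no successes: ({low0:.3f}, {high0:.3f})")
```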
Choosing the sample size
Recall: the margin of error in the CI for p is: $ME = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
We may like to choose the sample size n to achieve a certain
margin of error, so we solve for n:
$n = \left(\frac{z^*}{ME}\right)^2 \hat{p}(1-\hat{p})$
- $\hat{p}$ is based on a pilot study or on past experience, but we
may not have prior sampling done!
o In that case, use $\hat{p}$ = 0.50. This is conservative as it gives a
margin of error at least as large as the true margin of error.
Example:
Suppose a TV executive would like to find a 95% confidence interval
estimate within 0.03 for the proportion of all households that
watch NYPD Blue regularly. How large a sample is needed if a
prior estimate for p was 0.15?
Example:
Suppose a TV executive would like to find a 95% confidence
interval estimate within 0.03 for the proportion of all households
that watch NYPD Blue regularly. How large a sample is needed
if we have no reasonable prior estimate for p?
Example (Please try it on your own):
To conduct a political poll that is 99% sure of finding the level
of support for the Conservative party to within a margin of
error of 0.01, how large a sample would we need?
$n = \left(\frac{z^*}{ME}\right)^2 p^*(1 - p^*) = \left(\frac{2.576}{0.01}\right)^2 (0.5)(1 - 0.5) = 16589.44$
Thus, to be 99% confident of finding the level of support for
the Conservative party to within 0.01, we need a sample of 16,590 people.
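The sample-size formula is easy to script. In this sketch the critical values are passed in as table values (1.96 and 2.576) so the results match hand calculation; the function name `sample_size` and its parameter names are assumptions. The result is always rounded up, since rounding down would give a margin of error slightly larger than requested.

```python
import math

def sample_size(z_star, margin, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin.

    p_guess = 0.5 is the conservative choice when no prior estimate exists.
    """
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n_exact)

# Political poll: 99% confidence, margin 0.01, no prior estimate.
print(sample_size(2.576, 0.01))

# TV executive: 95% confidence, margin 0.03, with and without a prior estimate.
print(sample_size(1.96, 0.03, p_guess=0.15))
print(sample_size(1.96, 0.03))
```

Comparing the last two calls shows how much a prior estimate far from 0.5 reduces the required sample.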
Ch 23 Inference About Mean
Now that we know how to create confidence intervals and test
hypotheses about proportions, it’d be nice to be able to do the
same for means.
Just as we did before, we will base both our confidence interval
and our hypothesis test on the sampling distribution model.
Recall: If we use the statistic $\bar{y}$ for estimating the population
mean μ, we can use the following information from the CLT in
order to obtain a confidence interval for μ.
- $\mu_{\bar{y}} = \mu$, and the standard deviation of $\bar{y}$ is $\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}$.
- The standard error of $\bar{y}$ is $SE(\bar{y}) = \frac{s}{\sqrt{n}}$, AND
- If the population distribution is Normal, then the
sampling distribution of $\bar{y}$ is also Normal, OR
- If the population distribution is non-Normal but n ≥
30, then we can assume that the sampling distribution of
$\bar{y}$ is approximately Normal.
Gosset’s t
Until now, all statistical tools that were introduced were based
on the assumption that the population standard deviation σ is
known. In practice, this assumption is very artificial and is
almost never fulfilled in a real-life situation.
All procedures introduced until now are based on the Normal
distribution, which requires the population standard deviation σ.
In most situations, σ is unknown and has to be replaced by the
sample standard deviation s, which adds variability to the result. In
order to calculate a confidence interval, we account for this
extra variability by using another distribution, called
the Student’s t-distribution.
The t-distribution only depends on one parameter, which is
called the degrees of freedom (df).
Properties of the t-distribution:
- Its density curves look quite similar to the standard normal
curve. They are symmetric about 0, single-peaked, and
bell-shaped.
- The spread of the t-distributions is a bit larger than that of
the standard normal curve. (As we are now using an
estimate for the population standard deviation, we must
accept slightly more error in our estimation.)
- As degrees of freedom (df) gets bigger, the t-density curve
gets closer to the standard normal density curve. (NOTE:
Table t) In other words, as degrees of freedom increases,
the spread of the corresponding t density curve decreases.
- In fact, the t-model with infinite df is exactly normal.
Remark: The structure of the t table is different from the table
for the standard normal distribution.
- It gives the upper-tail probabilities!
- The probabilities label the columns instead of
being inside the table.
Example: Find t* (the critical value).
a) The t-distribution with 5 df has probability 0.05 to the
right of t*.
b) The t-distribution with 5 df and confidence level of 90%.
c) The one sample t statistic from an SRS with 20
observations has a probability of 0.9 to the left of t*.
A Confidence Interval for a Population Mean (when σ is
unknown)
Assumptions for using the t-statistic:
- Independence Assumption. The data values should be
independent.
- Randomization Condition: The data arise from a random
sample or suitably randomized experiment. Randomly
sampled data (particularly from an SRS) are ideal.
- 10% Condition: When a sample is drawn without
replacement, the sample should be no more than 10% of the
population.
Normal Population Assumption:
- We can never be certain that the data are from a population
that follows a Normal model, but we can check the Nearly
Normal Condition: The data come from a distribution that
is unimodal and symmetric.
- Check by making a histogram or Normal probability plot.
Nearly Normal Condition:
- The smaller the sample size (n < 15 or so), the more closely
the data should follow a Normal model.
- For moderate sample sizes (15 ≤ n ≤ 40 or so), the t works
well as long as the data are unimodal and reasonably
symmetric.
- For large sample sizes (n > 40 or 50), the t methods are safe
to use unless the data are extremely skewed.
A confidence interval for the population mean (when σ is
unknown) is given by
$\bar{y} \pm t^* \frac{s}{\sqrt{n}}$
where t* is the critical value for the t distribution with df = n – 1
and confidence level C. In other words, t* is the upper (1 – C)/2 critical
value for the t(n – 1) distribution.
When Gosset corrected the model for the extra uncertainty,
the margin of error got bigger.
Your confidence intervals will be just a bit wider and
your P-values just a bit larger than they were with the
Normal model.
By using the t-model, you’ve compensated for the extra
variability in precisely the right way.
Example: (Using the battery lifetime example from Ch3)
We have a random sample of n = 4 observations on y = battery
lifetime (hrs): 5.9, 7.3, 6.6, 5.7
NOTE: $\bar{y}$ = 6.375, s = 0.7274 (calculated in Ch3)
Find the 95% confidence interval for the mean battery lifetime.
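One way to carry out this calculation is sketched below. Python's standard library has no t-distribution, so the critical value t* = 3.182 (df = 3, upper-tail probability 0.025) is typed in from Table t; that hard-coded value is an assumption of this sketch, as are the variable names.

```python
import statistics

lifetimes = [5.9, 7.3, 6.6, 5.7]   # battery lifetimes in hours

n = len(lifetimes)
y_bar = statistics.mean(lifetimes)   # sample mean
s = statistics.stdev(lifetimes)      # sample standard deviation (divides by n - 1)

# Table value for df = n - 1 = 3 and 95% confidence (assumed, read from Table t).
t_star = 3.182

margin = t_star * s / n ** 0.5       # t* times SE(y_bar)
low, high = y_bar - margin, y_bar + margin

print(f"y_bar = {y_bar:.3f}, s = {s:.4f}")
print(f"95% CI for the mean lifetime: ({low:.2f}, {high:.2f}) hours")
```

With only four observations the interval is wide: t* = 3.182 is much larger than the z* = 1.96 used when σ is known.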
Example:
A scientist interested in monitoring chemical contaminants in
food, and thereby the accumulation of contaminants in human
diets, selected a random sample of n = 50 male adults. It was
found that the average daily intake of dairy products was $\bar{y}$ =
756 grams with a standard deviation of s = 35 grams.
Find a 95% confidence interval for the mean daily intake of
dairy products for men.
Example: IQ test scores
The IQ test scores of an SRS of 31 girls in Region A are as follows:
113 102 105 … 95
This sample has mean $\bar{y}$ = 105.84 and standard
deviation s = 15. The shape of the population distribution is
unimodal and relatively symmetric.
a) Give a 99% confidence interval for the true mean IQ of
all girls in the district.
b) Give a 90% confidence interval for the true mean IQ of
all girls in the district.
c) If the sample mean of IQ test scores of 20 girls in Region
A is 105.84, give a 90% confidence interval for the true
mean IQ of all girls in the district.
The margin of error $m = t^* \frac{s}{\sqrt{n}}$ gets smaller when
- t* gets smaller, which is the same as a smaller confidence level (1 – α). To
obtain a smaller margin of error, you must accept lower
confidence.
- n gets larger. Increasing the sample size gives more
precision.
- σ gets smaller. The less inherent variation in the
population you are studying, the more accurate your
estimate will be.
NOTE: we can control t* and n, but we cannot control σ.
Example: (Please try it on your own)
Bank    Mean Number of Phone Calls per Hour
A       15.6
B       11.9
C       11.7
Suppose that each sample mean was based on an SRS of n = 50
working hours and that s = 5 is known.
a) Compute a 95% CI for $\mu_A$, the true mean number of
phone calls per hour to Bank A.
With C = 95% and df = 49 (round down to 45), t* = 2.014.
$\bar{y} \pm t^* \frac{s}{\sqrt{n}} = 15.6 \pm 2.014 \cdot \frac{5}{\sqrt{50}} = 15.6 \pm 1.424 = (14.176, 17.024)$
We are 95% confident that $\mu_A$ is between 14.176 and
17.024 phone calls per hour.
b) Compute a 95% CI for the other 2 banks.
95% CI for $\mu_B$:
$\bar{y} \pm t^* \frac{s}{\sqrt{n}} = 11.9 \pm 2.014 \cdot \frac{5}{\sqrt{50}} = 11.9 \pm 1.424 = (10.476, 13.324)$
95% CI for $\mu_C$:
$\bar{y} \pm t^* \frac{s}{\sqrt{n}} = 11.7 \pm 2.014 \cdot \frac{5}{\sqrt{50}} = 11.7 \pm 1.424 = (10.276, 13.124)$
c) A survey claims that Bank A receives more phone calls
than the other Banks. Based on the confidence intervals
from parts (a) and (b), do you agree?
The $\bar{y}$ value for Bank A is so large that its confidence
interval lies entirely to the right of all other CIs. Even
taking random variation into account, the mean number of phone
calls received by Bank A is clearly larger than that of the other Banks.
Example: A researcher found that a 98% confidence interval
for the mean hours per week spent studying by college students
was (13, 17). Which is true?
a) There is a 98% chance that the mean hours per week spent
studying by college students is between 13 and 17 hours.
b) We are 98% confident that the mean hours per week
spent studying by college students is between 13 and 17
hours.
c) Students average between 13 and 17 hours per week
studying on 98% of the weeks.
d) 98% of all students spend between 13 and 17 hours
studying per week.
Ch 23 Inference about the Mean
Previously, population parameters were described, now we will
be checking if claims about the population parameters are true,
or plausible to a given degree.
A company is advertising that the mean lifetime of their light
bulbs is 1000 hours with a standard deviation of 5 hours. A
person suspects the mean lifetime of the light bulbs is less than
1000 hours (the company is lying in their advertisement), so he
picks a sample of 100 light bulbs and finds that the average lifetime
of these 100 light bulbs is $\bar{y}$ = 998.
Based on this result, can we state that:
i) the mean lifetime of this company’s light bulbs
is less than 1000 hours (so this company is lying in their
advertisement), or
ii) the difference between 1000 hours (the claimed average lifetime for
the population) and 998 hours (the average lifetime for the
sample) may have occurred because of sampling
variability?
A hypothesis test is a method for using sample statistics to
decide between two competing claims or hypotheses about a
population parameter. It follows this procedure:
1) Define the variable, the parameter(s) of interest, and any
2) State the null hypothesis $H_0$ and alternative hypothesis $H_a$.
3) Gather the evidence (sample). Based on the data in the
sample, we will calculate a test statistic.
4) Assess the strength of the evidence against the null
hypothesis in favor of the alternative. This will be done by
5) Make a decision based on Step 4.
6) State the conclusion.
State the hypotheses:
The null hypothesis $H_0$ is a claim about a population parameter
that is assumed to be true until it is declared false. It is generally
the hypothesis of “no effect.”
We usually write down the null hypothesis in the form $H_0$:
parameter = hypothesized value.
The alternative hypothesis $H_a$ is a claim about a population
parameter that we accept ONLY when we reject the null
hypothesis. In other words, this is the hypothesis that we are
trying to find evidence for.
Common choices of hypotheses are:
- Two-tailed Test:
o $H_0$: population characteristic = specific value versus
o $H_a$: population characteristic ≠ specific value
- Upper-tailed Test:
o $H_0$: population characteristic = specific value versus
o $H_a$: population characteristic > specific value
- Lower-tailed Test:
o $H_0$: population characteristic = specific value versus
o $H_a$: population characteristic < specific value
- $H_0$: μ = 100 versus $H_a$: μ < 100
- $H_0$: p = 0.25 versus $H_a$: p ≠ 0.25
- We cannot test $H_0$: μ = 100 versus $H_a$: μ > 150, because both hypotheses must refer to the same value.
- We cannot test $H_0$: $\bar{y}$ = 100 against an alternative about $\bar{y}$, because hypotheses are statements about population parameters, not sample statistics.
Example:
You are considering moving to Richmond Hill, and are
concerned about the average one-way commute time to
downtown Toronto. Does the average one-way commute time
exceed 25 minutes? You take a random sample of 50 Richmond
Hill residents and find an average commute time of 29 minutes
with a standard deviation of 7 minutes. Which set of hypotheses
should you test?
A) $H_0$: μ = 25 vs $H_A$: μ > 25
B) $H_0$: μ = 25 vs $H_A$: μ < 25
C) $H_0$: μ = 29 vs $H_A$: μ > 29
D) $H_0$: μ = 25 vs $H_A$: μ ≠ 25
You want to see if the number of minutes cell phone users use
each month has changed from its mean of 120 minutes 2 years
ago. You take a random sample of 100 cell phone users and find
an average of 135 minutes used. Which set of hypotheses
should you test?
A) $H_0$: μ = 120 vs $H_A$: μ > 120
B) $H_0$: μ = 120 vs $H_A$: μ ≠ 120
C) $H_0$: μ = 120 vs $H_A$: μ < 120
D) $H_0$: μ = 135 vs $H_A$: μ ≠ 135
Example:
According to a June 2004 Gallup poll, 28% of Americans “said
there have been times in the last year when they haven’t been
able to afford medical care.” Is this proportion higher for black
Americans than for all Americans? In a random sample of 801
black Americans, 38% reported that there had been times in the
last year when they had not been able to afford medical care.
Which type of hypothesis test would you use?
A. One-tail upper tail
B. One-tail lower tail
C. Two-tail
D. Both A and B
A statistics professor wants to see if more than 80% of her
students enjoyed taking her class. At the end of the term, she
takes a random sample of students from her large class and asks,
in an anonymous survey, if the students enjoyed taking her class.
Which set of hypotheses should she test?
A. $H_0$: p < 0.80 vs $H_A$: p > 0.80
B. $H_0$: p = 0.80 vs $H_A$: p > 0.80
C. $H_0$: p > 0.80 vs $H_A$: p = 0.80
D. $H_0$: p = 0.80 vs $H_A$: p < 0.80
An online catalog company wants on-time delivery for 90% of
the orders they ship. They have been shipping orders via UPS
and FedEx but will switch to a new, cheaper delivery service
(ShipFast) unless there is evidence that this service cannot meet
the 90% on-time goal. As a test the company sends a random
sample of orders via ShipFast, and then makes follow-up phone
calls to see if these orders arrived on time. Which hypotheses
should they test?
A. $H_0$: p < 0.90 vs $H_A$: p > 0.90
B. $H_0$: p = 0.90 vs $H_A$: p > 0.90
C. $H_0$: p > 0.90 vs $H_A$: p = 0.90
D. $H_0$: p = 0.90 vs $H_A$: p < 0.90
Testing $H_0$ vs. $H_a$:
- $H_0$ will be rejected only if the sample evidence strongly
suggests that $H_0$ is false.
- Otherwise $H_0$ will not be rejected.
So there are two possible conclusions:
- reject $H_0$ (accept $H_a$)
- do not reject $H_0$ (When $H_0$ is not rejected, it doesn't
mean strong support for $H_0$, but a lack of strong evidence for
$H_a$.)
Note: these decisions are not symmetric; you can never
say that you accept $H_0$.
Idea: Compare the process to a criminal trial.
The fact is that a person accused of a crime is either guilty or not
guilty.
- To prove someone is guilty, we start by assuming they
are innocent.
- We retain that hypothesis until the facts make it
unlikely beyond a reasonable doubt.
Rejection of $H_0$:
Nonrejection of $H_0$:
Example (con’t):
A company is advertising that the average lifetime of their light
bulbs is 1000 hours with standard deviation of 5 hours. You
might question this, and want to show that in fact the lifetime is
shorter. State the hypotheses. Interpret rejection and
nonrejection of $H_0$ for this example.
You would test:
Rejection of $H_0$:
Nonrejection of $H_0$:
How to make the decision (reject $H_0$ or do not reject $H_0$)
The decision to reject, or not to reject, $H_0$ is based on information
contained in a sample drawn from the population of interest.
Use the sample to:
- Calculate a test statistic (a number that measures how many
standard deviations away the estimate in the sample is from
the hypothesized value of the parameter in $H_0$),
- Use the value of the test statistic and its distribution to
calculate the P-value (the probability of observing a value
of the test statistic as extreme as or more extreme than the one
observed, if $H_0$ is true). In other words, we try to find out
how likely the observed results would be if the
null hypothesis were true.
- When the data are consistent with the model from the null
hypothesis, the P-value is high and we fail to reject the null
hypothesis.
- If the P-value is low enough, we’ll “reject the null
hypothesis,” since what we observed would be very
unlikely were the null model true.
$H_A$: parameter ≠ value (a two-sided alternative)
we are equally interested in deviations on either side
of the null hypothesis value.
For two-sided alternatives, the P-value is the
probability of deviating in either direction from the
null hypothesis value.
The other two alternative hypotheses are called one-sided
alternatives.
A one-sided alternative focuses on deviations from the
null hypothesis value in only one direction.
Thus, the P-value for one-sided alternatives is the
probability of deviating only in the direction of the
alternative away from the null hypothesis value.
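As an illustration of a test statistic and its P-value, the light bulb example ($\bar{y}$ = 998, n = 100, hypothesized mean 1000) can be worked with a Normal model. This sketch treats the advertised standard deviation of 5 hours as known, which is an assumption; the function name `z_test_p_value` and the labels "less", "greater", and "two-sided" are made up for the example.

```python
from statistics import NormalDist

def z_test_p_value(estimate, null_value, sd_estimate, alternative):
    """Return (z, P-value) for a test based on the standard Normal model.

    sd_estimate is the standard deviation of the estimate, e.g. sigma / sqrt(n).
    """
    z = (estimate - null_value) / sd_estimate
    std_normal = NormalDist()
    if alternative == "less":          # lower-tailed: H_a says parameter < value
        p = std_normal.cdf(z)
    elif alternative == "greater":     # upper-tailed: H_a says parameter > value
        p = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":   # H_a says parameter != value
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p

# H_0: mu = 1000 versus H_a: mu < 1000, with sigma = 5 assumed known.
z, p = z_test_p_value(998, 1000, 5 / 100 ** 0.5, "less")
print(f"z = {z:.2f}, P-value = {p:.6f}")

# The same data against a two-sided alternative doubles the tail area.
_, p_two = z_test_p_value(998, 1000, 5 / 100 ** 0.5, "two-sided")
print(f"two-sided P-value = {p_two:.6f}")
```

A sample mean 4 standard deviations below the claimed value would be very unlikely if the null hypothesis were true, so the P-value is tiny.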
P-Values and Decisions:
How small should the P-value be in order for you to reject
the null hypothesis?
It turns out that our decision criterion is context-dependent.
- When we’re screening for a disease and want to be
sure we treat all those who are sick, we may be willing
to reject the null hypothesis of no disease with a fairly
large P-value.
- A longstanding hypothesis, believed by many to be
true, needs stronger evidence (and a correspondingly
small P-value) to reject it.
Your conclusion about any null hypothesis should be
accompanied by the P-value of the test.
If possible, it should also include a confidence interval
for the parameter of interest.
Don’t just declare the null hypothesis rejected or not
rejected.
Report the P-value to show the strength of the
evidence against the hypothesis.
This will let each reader decide whether or not to