COMMERCE 291 – Lecture Notes – Jonathan Berkowitz (copyright, 2014)
Summary of Lectures 11 and 12
In Lecture 9 (textbook: Chapter 9) we discussed a set of rules for computing the mean,
the variance and the standard deviation of combinations of random variables.
Now we combine these rules with the normal distribution (Chapter 9) and learn
applications of the normal distribution and why it is used so often.
Adding Normally Distributed Random Variables
-- If X and Y are random variables each with a normal distribution, X + Y also has a
normal distribution; that is, adding “normals” gives a normal.
For example, if X is N(𝜇𝑋, 𝜎𝑋), Y is N(𝜇𝑌, 𝜎𝑌), and X and Y are independent, then:
X + Y is N(𝜇𝑋 + 𝜇𝑌, √(𝜎𝑋² + 𝜎𝑌²)), and X − Y is N(𝜇𝑋 − 𝜇𝑌, √(𝜎𝑋² + 𝜎𝑌²)).
Example 1. Fred and Barney are playing in a golf tournament. Record-keeping from
previous years shows that Fred’s scores are normally distributed with mean 110 and standard deviation
10, and that Barney’s scores are normally distributed with mean 100 and standard
deviation 8. They play independently. What is the probability that Fred will beat Barney?
Let X = Fred’s score and Y = Barney’s score. Remember that in golf, the low score wins.
Mean(X–Y) = 110 – 100 = 10
Var(X–Y) = 10² + 8² = 164
SD(X–Y) = √164 = 12.81
Pr(X–Y < 0) = Pr(Z < (0 − 10)/12.81) since X–Y is normal
= Pr(Z < −0.78) = 0.2177
There is about a 22% chance that Fred wins.
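As a check, the same calculation can be sketched in a few lines of Python (not part of the course material; `statistics.NormalDist` is from the standard library):

```python
from statistics import NormalDist
from math import sqrt

# D = X - Y is normal with mean 110 - 100 = 10 and SD = sqrt(10^2 + 8^2)
diff = NormalDist(mu=110 - 100, sigma=sqrt(10**2 + 8**2))

# Fred beats Barney when his (lower) score satisfies X - Y < 0
p_fred_wins = diff.cdf(0)
print(round(p_fred_wins, 2))  # about 0.22
```

The small difference from the table answer (0.2177) comes from rounding z to two decimals by hand.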
Example 2.
Gross weights of 8 ounce boxes of cereal are normally distributed with mean 9.60
ounces and standard deviation 0.80 ounces. Boxes are packaged 24 per carton.
Weights of empty cartons are normally distributed with mean 24.00 ounces and standard
deviations 2.20 ounces. Find the chance of a filled carton having a weight between 250
and 260 ounces.
Solution: Let Xi = weight of the i-th cereal box; let T = 𝑋1 + 𝑋2 + ⋯ + 𝑋24.
Then T is normal with mean = 24(9.60) = 230.4; standard deviation = 0.80√24 = 3.92.
Let Y = weight of the empty carton, and let W = weight of the filled carton, so W = T + Y.
Since each X is normal and Y is normal then W is also normal.
The mean of W is 24(9.60)+24.00 = 254.4
To compute the standard deviation of W, first compute the variance (since you can’t add
standard deviations, remember?!)
Variance of W = 24(0.80)² + (2.20)² = 20.20 (assuming box weights and the empty carton
weight are independent).
So standard deviation of W is √20.20 = 4.49.
The probability of interest is:
Pr(250 < 𝑊 < 260) = Pr((250 − 254.4)/4.49 < 𝑍 < (260 − 254.4)/4.49) = Pr(−0.98 < 𝑍 < 1.25) = 0.73
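The same probability can be computed directly (a sketch, using only Python's standard library):

```python
from statistics import NormalDist
from math import sqrt

mean_W = 24 * 9.60 + 24.00           # 254.4
sd_W = sqrt(24 * 0.80**2 + 2.20**2)  # sqrt(20.20), about 4.49

W = NormalDist(mu=mean_W, sigma=sd_W)
p = W.cdf(260) - W.cdf(250)
print(round(p, 2))  # 0.73
```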
NOTE: T = (𝑋1 + 𝑋2 + ⋯ + 𝑋24) is not the same as T* = 24(𝑋1).
Although Mean(T) and Mean(T*) are both equal to 24(9.6), the variances are not the
same. Read this carefully: Var(T) = 24(0.80)² but Var(T*) = 24²(0.80)².
Adding 24 random variables is not the same as taking 24 times one random variable.
Measuring your height 24 times is not the same as measuring 24 different people’s
height once each!
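A quick simulation makes the difference between T and T* concrete (a sketch, not part of the text; printed values are approximate because of sampling noise):

```python
import random
from statistics import stdev

random.seed(1)
n_reps = 20000

# T: sum of 24 independently drawn box weights
T = [sum(random.gauss(9.60, 0.80) for _ in range(24)) for _ in range(n_reps)]
# T*: 24 times a single box weight
T_star = [24 * random.gauss(9.60, 0.80) for _ in range(n_reps)]

print(round(stdev(T), 2))       # near 0.80 * sqrt(24) = 3.92
print(round(stdev(T_star), 2))  # near 24 * 0.80 = 19.2
```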
Back to the Binomial Distribution (briefly) – see text: chapter 9, section 6.
Let X be the count of “successes” in n independent trials where the probability of a
success on each trial is p. Then we say that X is a Binomial random variable (i.e. X has
a binomial probability distribution) with parameters n and p.
The probability of getting k successes in n trials is given by: Pr(X = k) = (n choose k) 𝑝ᵏ(1 − 𝑝)ⁿ⁻ᵏ.
But you don’t need to know this formula, because it is very cumbersome to use if n is not
a small number (say, less than 20), and in our applications (e.g. survey data) the sample
size will be considerably larger.
What you do need to know is that:
E(X) = 𝜇 = np
Var(X) = 𝜎² = 𝑛𝑝(1 − 𝑝)
SD(X) = 𝜎 = √(𝑛𝑝(1 − 𝑝))
(See text for details, if you’re interested in the derivation)
Notation alert: The text uses q instead of 1–p.
Example 1: Consider 100 tosses of a fair coin. If X is the number of heads, then X is
Binomial with p = 0.5; so E(X) = 100(0.5) = 50 and SD(X) = √(100(0.5)(0.5)) = 5.
That is, we expect about 50 heads, plus or minus about 5 heads. From the Empirical (68-
95-99.7) Rule, there is a 95% chance of getting between 40 and 60 heads in 100 tosses of
a fair coin.
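The Empirical Rule figure can be checked against the exact binomial probability (a sketch using Python's `math.comb`, not part of the course material):

```python
from math import comb

# Exact probability of 40 to 60 heads (inclusive) in 100 fair-coin tosses:
# sum the binomial probabilities C(100, k) * 0.5^100 for k = 40, ..., 60
p = sum(comb(100, k) for k in range(40, 61)) / 2**100
print(round(p, 3))  # close to the 95% the Empirical Rule predicts
```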
The Normal Approximation to the Binomial
If X is Binomial and n is large enough then X is also approximately Normal, so we can
use the Normal distribution to approximate Binomial probabilities, as follows:
If X is Binomial, then Pr(𝑎 < 𝑋 < 𝑏) = Pr((𝑎 − 𝑛𝑝)/√(𝑛𝑝𝑞) < 𝑍 < (𝑏 − 𝑛𝑝)/√(𝑛𝑝𝑞))
Rule of Thumb: This approximation works well if np > 10 and nq > 10. Why? See text for details.
Note: Ignore the “continuity correction” – see text if you’re interested.
Example 2: Continuation of Example 1. In 100 tosses of a fair coin what is the probability
of getting between 35 and 65 heads?
Pr(35 < 𝑋 < 65) = Pr((35 − 50)/5 < 𝑍 < (65 − 50)/5) = Pr(−3 < 𝑍 < 3) ≈ 0.997
Example 3. A TV game show auditions contestants by using a 100-question multiple choice
general knowledge quiz. Each question has 5 response choices, exactly one of which is
correct. You pass the audition if you get 30 or more correct answers. If you guess
randomly on every question, what is the probability of passing the audition?
Solution: X, the number of correct answers, is Binomial with p = 0.2 (i.e. 1 out of 5) and n
= 100. Passing the audition means that X ≥ 30. (Don’t worry about X > 30 vs. X ≥ 30; the
difference is handled by the continuity correction, which you can ignore.)
𝜇 = np = 100(0.2) = 20
𝜎 = √(𝑛𝑝𝑞) = √(100(0.2)(0.8)) = 4
Pr(𝑋 > 30) = Pr(𝑍 > (30 − 20)/4) = Pr(𝑍 > 2.5) = 0.0062 or 0.62%.
The chance of passing by guessing is very small, about 1 in 200!
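The normal approximation above can be reproduced in a couple of lines (a sketch using the standard library, not part of the text):

```python
from statistics import NormalDist

n, p = 100, 0.2
mu = n * p                      # 20
sigma = (n * p * (1 - p))**0.5  # 4

# Normal approximation to Pr(X > 30) via standardizing
Z = NormalDist()
p_pass = 1 - Z.cdf((30 - mu) / sigma)
print(round(p_pass, 4))  # 0.0062
```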
Chapter 10: Sampling Distributions
Recall that a statistic is a quantity computed from sample information. A “statistic” is
used to “estimate” a population “parameter”.
If a statistic is based on a random sample, then the statistic is a random variable and so
it has its own probability distribution which we call a “sampling distribution.”
Sampling Distribution – the theoretical distribution (i.e. “histogram”) of the values taken
by a statistic if a large number of samples of the same size were drawn from the same
population. That is, it is the distribution of all possible values of a statistic if there were
many, many, many repetitions of the sampling process!
This will be needed to say how much the sample statistic or estimate could be expected
to change from sample to sample.
We will figure out the sampling distribution of two popular statistics:
• the sample proportion: 𝑝̂ (for categorical variables) – Section 10.3
• the sample mean, 𝑥̅ (for quantitative variables) – Section 10.6
Sampling Distribution for a Proportion (used for categorical data)
Suppose we take repeated samples of size n (that is, repeated sets of n independent
observations) where each observation can be one of only two possible outcomes – let’s
call them, “success” and “failure”. For example, in an opinion poll a “success” could be a
“yes” vote and a “failure” a “no” vote).
For each sample, compute the sample proportion, 𝑝̂ = X/n, where X is the number of
"successes" in the sample. Earlier we learned that X has a Binomial distribution. Now
we are converting the “count” X into a proportion 𝑝̂.
Draw a histogram of all the 𝑝̂'s. The shape of this histogram is called the sampling
distribution of 𝑝̂. It turns out that the shape is... approximately Normal! (This is not
surprising since we already learned that we can use a Normal distribution to
approximate a Binomial distribution.)
The centre of this histogram will be p, the true population proportion; i.e. Mean(𝑝̂) = p.
The spread of this histogram will be: SD(𝑝̂) = √(𝑝(1 − 𝑝)/𝑛).
The text uses the notation 𝑞 = 1 − 𝑝, so we can also write this as SD(𝑝̂) = √(𝑝𝑞/𝑛).
Thus, the sampling distribution for a proportion, 𝑝̂, if the values are a random
sample (i.e. independent), is approximately Normal with mean p and standard
deviation √(𝑝𝑞/𝑛), if n is large enough.
Remember that p is the true proportion in the population (i.e. it is the value of the
parameter of interest here).
When we say the mean of 𝑝̂ is p, we are saying the Expected Value of 𝑝̂ or long-run
average value of 𝑝̂ is p. Hence 𝑝̂ is an unbiased or fair estimate of p, and we can use 𝑝̂
to estimate p.
For example, if a survey of 1000 people shows that 630 answered Yes to a question of
interest, then 𝑝̂ = 630/1000 = 0.63 or 63%. That 63% is an estimate of the true
proportion of Yes responses in the entire population.
When we say that the standard deviation of 𝑝̂ is √(𝑝𝑞/𝑛), we are saying that the typical
distance from 𝑝̂ (your estimate) to p (the truth) is about √(𝑝𝑞/𝑛).
When we say that the sampling distribution is Normal, we are saying that 𝑝̂ will usually
be close to p and occasionally far away, and we can compute the probabilities of how
close or how far away using the normal curve.
Summary: Turn 𝑝̂ into Z by standardizing: 𝑍 = (𝑝̂ − 𝑝)/√(𝑝𝑞/𝑛)
Assumptions and Conditions
As always, the sample must be a RANDOM sample.
How large an n is "large enough"?
• 10% Condition: n should be no more than 10% of the population
• Success/Failure Condition: np > 10 and nq > 10.
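The idea of a sampling distribution for 𝑝̂ can be illustrated by simulation (a sketch, reusing the survey figures p = 0.63 and n = 1000 from the example above; printed values are approximate):

```python
import random
from statistics import mean, stdev

random.seed(2)
n, p, reps = 1000, 0.63, 2000

# Each repetition: draw a sample of n yes/no responses, record the sample proportion
p_hats = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

print(round(mean(p_hats), 3))   # near p = 0.63
print(round(stdev(p_hats), 4))  # near sqrt(p*(1-p)/n), about 0.0153
```

The histogram of these 2000 𝑝̂'s would look approximately Normal, centred at p, just as the theory says.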
Example 1 Revisited. What is the probability of getting between 40 and 60 heads in 100
tosses of a fair coin?
Solution: Previously we answered this by working with X, the count, where E(X) =
100(0.5) = 50 and SD(X) = √(100(0.5)(0.5)) = 5.
Pr(40 < 𝑋 < 60) = Pr((40 − 50)/5 < 𝑍 < (60 − 50)/5) = Pr(−2 < 𝑍 < 2) ≈ 0.95
Now we can turn this into a question about proportions: what is the probability of getting
a proportion of heads between 0.4 and 0.6 in 100 tosses of a fair coin? Once again, n =
100 and p = 0.5.
Pr(0.40 < 𝑝̂ < 0.60) = Pr((0.4 − 0.5)/√(0.5(0.5)/100) < 𝑍 < (0.6 − 0.5)/√(0.5(0.5)/100)) = Pr(−2 < 𝑍 < 2) ≈ 0.95
Example 3 Revisited (see above)
Passing the audition means that the proportion of correct answers must be greater than
or equal to 0.30 (i.e. 30 or more correct answers out of 100).
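Worked on the proportion scale, the calculation gives the same answer as the count version of Example 3 (a sketch using the standard library):

```python
from statistics import NormalDist
from math import sqrt

p, n = 0.2, 100
sd_phat = sqrt(p * (1 - p) / n)   # 0.04

# z = (0.30 - 0.20) / 0.04 = 2.5, the same z as the count version
z = (0.30 - p) / sd_phat
p_pass = 1 - NormalDist().cdf(z)
print(round(p_pass, 4))  # 0.0062, matching the earlier answer
```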