The Empirical Rule
If the population distribution is bell-shaped (symmetrical with a single peak), with mean μ
and standard deviation σ, then:
1. 68.26% of all population units are within (plus or minus) one standard deviation
of the mean and thus lie in the interval [μ −σ, μ + σ]
2. 95.44% of all population units are within (plus or minus) two standard deviations
of the mean and thus lie in the interval [μ −2σ, μ + 2σ]
3. 99.73% of all population units are within (plus or minus) three standard
deviations of the mean and thus lie in the interval [μ −3σ, μ + 3σ]
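The three percentages above can be checked numerically. The sketch below assumes the population is exactly normal and uses the identity P(|X − μ| ≤ kσ) = erf(k/√2); the function name `fraction_within` is only illustrative. The exact values round to 68.27% and 95.45%, so the 68.26% and 95.44% quoted above most likely come from doubling four-decimal normal-table entries.

```python
import math

def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    # For a normal distribution, P(|X - mu| <= k * sigma) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * fraction_within(k):.2f}%")
# within 1 sigma: 68.27%
# within 2 sigma: 95.45%
# within 3 sigma: 99.73%
```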
E.g. mean = 6.84, SD = 1.55
a) What percentage of scores are less than 3.74?
• Draw the graph and put the mean in the middle of the normal curve.
• Mark the values that lie 1, 2 and 3 standard deviations on each side of the
mean: 2.19, 3.74, 5.29, 6.84, 8.39, 9.94, 11.49
(values from left to right)
• Solve for what you are being asked using the Empirical Rule: 3.74 is two standard
deviations below the mean, and 95.44% of scores lie within two standard deviations,
so (100% − 95.44%) / 2 = 2.28% of scores are less than 3.74.
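As a minimal sketch of the steps in this example, the marks on the curve and the requested percentage can be computed directly. The variable names are illustrative, and the calculation relies on the symmetry of the bell curve: whatever lies outside the two-sigma interval is split equally between the two tails.

```python
mean, sd = 6.84, 1.55  # values given in the example

# Marks at -3, -2, -1, 0, +1, +2, +3 standard deviations from the mean.
marks = [round(mean + k * sd, 2) for k in range(-3, 4)]
print(marks)  # [2.19, 3.74, 5.29, 6.84, 8.39, 9.94, 11.49]

# 3.74 sits two standard deviations below the mean. By the Empirical Rule,
# 95.44% of scores lie within two standard deviations, and by symmetry the
# remaining share is divided equally between the lower and upper tails.
pct_within_2sd = 95.44
pct_below = round((100 - pct_within_2sd) / 2, 2)
print(pct_below)  # 2.28
```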
1. An interval that contains a specified percentage of the individual measurements in
a population is called a tolerance interval
2. Often we interpret the three-sigma interval [μ ± 3σ] to be a tolerance interval that
contains almost all of the measurements in a normally distributed population
3. Of course, we usually do not know the true values of μ and σ.
• We must estimate the tolerance intervals by replacing μ and σ in these
intervals with the sample mean x-bar and the sample standard deviation s
Back to Example 2.1
We saw that the distribution of cars going through the drive-through lane for Fast Food
Chain #1 was approximately normal. We also calculated
x-bar = 42.23 cars
s = 8.937 cars
We expect the interval [x-bar − s, x-bar + s] to contain about 68.26% of all the observations.
Going one standard deviation to each side of the mean gives 33.29 and 51.17 cars.
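The same arithmetic extends to the two- and three-sigma intervals. The sketch below is only an illustration using the Chain #1 sample statistics quoted above; `tolerance_interval` is a hypothetical helper name, not something defined in these notes.

```python
def tolerance_interval(x_bar, s, k):
    """Estimated k-sigma tolerance interval (x_bar - k*s, x_bar + k*s)."""
    return (x_bar - k * s, x_bar + k * s)

x_bar, s = 42.23, 8.937  # Fast Food Chain #1 sample mean and standard deviation

for k in (1, 2, 3):
    low, high = tolerance_interval(x_bar, s, k)
    print(f"{k}-sigma interval: [{low:.2f}, {high:.2f}]")
# 1-sigma interval: [33.29, 51.17]
# 2-sigma interval: [24.36, 60.10]
# 3-sigma interval: [15.42, 69.04]
```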
If the share of observations in this interval is close to 68.26%, the data agree with the
Empirical Rule, which is what we expect from a relatively bell-shaped, approximately normal distribution.
What about the sample data for Fast Food Chain #2 (where the stem plot was skewed to
the right)? What are the tolerance intervals? How well does the data fit what is expected
by the empirical rule?
(x-bar = 34.83 cars, s = 12.88 cars, n = 36)
1 | 3
1 | 5 6
2 | 2 4 4
2 | 5 6 7 7 8 8 8 9 9
3 | 0 0 1 2 3 4
3 | 5 5 8 9
4 | 0 0 3
4 | 5 7 8
5 | 1 3
5 | 7
6 | 4
6 | 8
a) Test one standard deviation on each side of the mean: (21.95, 47.71). In the
stem plot above, every observation from 22 up to and including 47 falls inside
this interval.
That is 27 of the 36 observations: 27/36 = 75%. This suggests the distribution is not
normal, as we expect something close to 68.26% but get a noticeably higher number.
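The one-sigma count can be reproduced, and repeated for two and three sigma, with a short script. This is a sketch under one assumption: the list `cars` is read off the stem plot with the stem as the tens digit and each leaf as the units digit. The helper name `percent_within` is illustrative.

```python
import statistics

# Observations read off the stem plot (stem = tens digit, leaf = units digit).
cars = [13, 15, 16, 22, 24, 24,
        25, 26, 27, 27, 28, 28, 28, 29, 29,
        30, 30, 31, 32, 33, 34, 35, 35, 38, 39,
        40, 40, 43, 45, 47, 48, 51, 53, 57, 64, 68]

x_bar = statistics.mean(cars)
s = statistics.stdev(cars)  # sample standard deviation (n - 1 divisor)
print(f"n = {len(cars)}, x-bar = {x_bar:.2f}, s = {s:.2f}")
# n = 36, x-bar = 34.83, s = 12.88

def percent_within(data, center, spread, k):
    """Percentage of observations inside [center - k*spread, center + k*spread]."""
    low, high = center - k * spread, center + k * spread
    inside = sum(1 for x in data if low <= x <= high)
    return 100 * inside / len(data)

for k in (1, 2, 3):
    print(f"{k}-sigma: {percent_within(cars, x_bar, s, k):.1f}%")
# 1-sigma: 75.0%
# 2-sigma: 94.4%
# 3-sigma: 100.0%
```

For this sample the two- and three-sigma shares (94.4% and 100%) sit close to the Empirical Rule values, while the one-sigma share (75%) does not.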
If we test the 2-sigma and 3-sigma intervals, the percentages may be relatively close
to the values given by the Empirical Rule; discrepancies usually show up in
the one-sigma interval.
Application of Tolerance Intervals
Tolerance intervals are often used to determine whether customer requirements, or
manufacturing specifications, are being met.
• If a process is consistently able to produce output that meets customer requirements,
we say the process is capable
• It is common practice to conclude that a process (that is in statistical control) is
capable if the 3-sigma tolerance interval estimate, [x-bar ± 3s], is within the specifications
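The capability rule above amounts to a simple comparison of two intervals. The sketch below is illustrative only: `is_capable` is a hypothetical helper, and it assumes the specifications are given as a lower and an upper limit. It is applied to the figures from the iron-bar example that follows.

```python
def is_capable(x_bar, s, lower_spec, upper_spec):
    """True if the estimated 3-sigma tolerance interval lies inside the spec limits."""
    low, high = x_bar - 3 * s, x_bar + 3 * s
    return lower_spec <= low and high <= upper_spec

# Iron-bar example: x-bar = 110.8 cm, s = 0.4 cm, specifications 109.5 to 112.5 cm.
# The 3-sigma interval estimate is 110.8 ± 1.2, i.e. roughly [109.6, 112.0].
print(is_capable(110.8, 0.4, 109.5, 112.5))  # True
```

Note that this check says nothing about whether the process is in statistical control, which the rule above takes as a precondition.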
Factory XYZ has a machine that produces iron bars. A random sample of 35 iron bars
gave a mean of 110.8 cm and a standard deviation of 0.4 cm. Customers who buy iron
bars from this factory require them to be neither too long nor too short. They are satisfied if
the bars have a length between 109.5 and 112.5 cm.
Is the factory capable of meeting its customers' specifications/requirements? What