# OMIS 2010 Chapter Notes - Chapter 04: Squared Deviations From The Mean, Standard Deviation, Frequency Distribution


Published on 6 Oct 2011

Department: Operations Management and Information System

Course: OMIS 2010


Chapter 4: Numerical Descriptive Measures

4.1 Introduction

This chapter discussed numerical descriptive measures used to summarize and describe sets of data.

At the completion of this chapter, you are expected to know the following:

1. How to calculate the basic numerical measures of central location and dispersion.

2. How to use the Empirical Rule and Chebyshev’s theorem to interpret standard deviation.

3. How to calculate quartiles, and use them to construct a box plot.

4. How to approximate the mean and standard deviation of a set of grouped data.

5. How to calculate and interpret covariance and the coefficient of correlation.

6. How to calculate the coefficients b0 and b1 for the least squares (regression) line.

4.2 Measures of Central Location

This section discussed three commonly used numerical measures of the central, or average, value of

a data set: the mean, the median, and the mode. You are expected to know how to compute each of these

measures for a given data set. Moreover, you are expected to know the advantages and disadvantages of

each of these measures, as well as the type of data for which each is an appropriate measure.

Question: How do I determine which measure of central location should be used—the

mean, the median, or the mode?

Answer: If the data are qualitative, the only appropriate measure of central location is the mode. If the data are ranked, the most appropriate measure of central location is the median.

For quantitative data, however, it is possible to compute all three measures. Which measure you should use depends on your objective. The mean is most popular because it is easy to compute and to interpret. (In particular, the mean is generally the best measure of central location for purposes of statistical inference, as you’ll see in later chapters.) It has the disadvantage, however, of being unduly influenced by a few very small or very large measurements.

To avoid this influence, you might choose to use the median. This could well be the case if the data consisted, for example, of salaries or of house prices. The mode, representing the value occurring most frequently (or the midpoint of the class with the largest frequency), should be used when the objective is to indicate the value (such as shirt size or house price) that is most popular with consumers.


Example 4.1

Find the mean, mode, and median of the following sample of measurements:

8, 12, 6, 6, 10, 8, 4, 6

Solution

The mean value is

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum_{i=1}^{8} x_i}{8} = \frac{8 + 12 + 6 + 6 + 10 + 8 + 4 + 6}{8} = \frac{60}{8} = 7.5$$

The mode is 6, because that is the value that occurs most frequently. To find the median, we must first

arrange the measurements in ascending order:

4, 6, 6, 6, 8, 8, 10, 12

Since the number of measurements is even, the median is the midpoint between the two middle values, 6

and 8. Thus, the median is 7.
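The three results in Example 4.1 can be checked with a few lines of Python. This is only a minimal sketch using the standard-library `statistics` module; the variable names are illustrative and not part of the original notes.

```python
from statistics import mean, median, mode

# Sample of measurements from Example 4.1
sample = [8, 12, 6, 6, 10, 8, 4, 6]

sample_mean = mean(sample)      # sum of the values divided by n = 8
sample_median = median(sample)  # n is even, so the two middle values are averaged
sample_mode = mode(sample)      # value that occurs most frequently

print("mean:", sample_mean)
print("median:", sample_median)
print("mode:", sample_mode)
```

Note that `median` sorts the data internally, so the manual step of arranging the measurements in ascending order is not needed here.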

Example 4.2

Consider the following sample of measurements, which is obtained from the sample in Example 4.1

by adding one extreme value, 21:

8, 12, 6, 6, 10, 8, 4, 6, 21

Which measure of central location is most affected by the addition of the single value?

Solution

The mean value is now

$$\bar{x} = \frac{\sum_{i=1}^{9} x_i}{9} = \frac{60 + 21}{9} = \frac{81}{9} = 9$$

The mode is still 6.

We arrange the new sample of measurements in ascending order:

4, 6, 6, 6, 8, 8, 10, 12, 21

The median is now equal to 8, the middle value. Thus, the mean is the measure that is most affected by

the addition of one extreme value.
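To see the effect of the extreme value side by side, the comparison can be sketched in Python as follows. The names `original`, `extended`, and `changes` are illustrative assumptions, and the code relies only on the standard-library `statistics` module.

```python
from statistics import mean, median, mode

original = [8, 12, 6, 6, 10, 8, 4, 6]  # sample from Example 4.1
extended = original + [21]             # same sample plus the extreme value 21

# Record how far each measure moves when the extreme value is added.
changes = {}
for name, measure in (("mean", mean), ("median", median), ("mode", mode)):
    before, after = measure(original), measure(extended)
    changes[name] = after - before
    print(f"{name}: {before} -> {after} (change {changes[name]})")

# The measure with the largest change is the one most affected.
most_affected = max(changes, key=changes.get)
print("most affected:", most_affected)
```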


Example 4.3

In Example 2.2, we considered the following weights, in pounds, of a group of workers:

173 165 171 175 188

183 177 160 151 169

162 179 145 171 175

168 158 186 182 162

154 180 164 166 157

a) Find the mean of the weights of the sample of 25 workers.

b) Find the median of the weights.

c) Find the modal class of the frequency distribution of weights that was constructed in the solution to part b) of Example 2.2.

Solution

a) The mean of the 25 weights is

$$\bar{x} = \frac{\sum_{i=1}^{25} x_i}{25} = \frac{173 + 183 + 162 + \cdots + 175 + 162 + 177}{25} = 168.8 \text{ pounds}$$

b) The middle value of the 25 weights is most easily found by referring to the stem and leaf display constructed in the solution to part a) of Example 2.2. We find that the median, or middle value, is 169 pounds.

c) The modal class is the class with the largest frequency, which is “160 up to 170.” The mode

may be taken to be the midpoint of this class, which is 165 pounds.
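The three parts of this solution can be reproduced with a short Python sketch. The frequency distribution from Example 2.2 is not shown in these notes, so the code assumes classes of width 10 pounds with lower limits at 140, 150, and so on, chosen to match the "160 up to 170" class named above. All variable names are illustrative.

```python
from collections import Counter
from statistics import mean, median

# Weights (in pounds) of the 25 workers from Example 4.3
weights = [
    173, 165, 171, 175, 188,
    183, 177, 160, 151, 169,
    162, 179, 145, 171, 175,
    168, 158, 186, 182, 162,
    154, 180, 164, 166, 157,
]

weights_mean = mean(weights)
weights_median = median(weights)  # 13th value of the 25 sorted weights

# Assumption: classes are 10 pounds wide ("140 up to 150", "150 up to 160", ...).
CLASS_WIDTH = 10
class_counts = Counter((w // CLASS_WIDTH) * CLASS_WIDTH for w in weights)

# The modal class is the class with the largest frequency.
modal_lower, modal_freq = class_counts.most_common(1)[0]
modal_midpoint = modal_lower + CLASS_WIDTH / 2

print(f"mean = {weights_mean:.1f} pounds")
print(f"median = {weights_median} pounds")
print(f"modal class = {modal_lower} up to {modal_lower + CLASS_WIDTH}, "
      f"frequency {modal_freq}, midpoint {modal_midpoint}")
```

Using a `Counter` keyed by each class's lower limit keeps the grouping step explicit; a different choice of class limits would generally give a different modal class.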

Geometric Mean (Optional)

The arithmetic mean is the most popular measure of the central location of the distribution of a

set of observations. But the arithmetic mean is not a good measure of the average rate at which

a quantity grows over time. That quantity, whose growth rate (or rate of change) we wish to

measure, might be the total annual sales of a firm or the market value of an investment. The

geometric mean should be used to measure the average growth rate of the values of a variable

over time.
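A small numerical illustration may help show why the arithmetic mean can mislead for growth rates. The sketch below uses hypothetical data that is not from the notes: an investment that gains 100% in one period and loses 50% in the next, ending where it started. It assumes the usual definition of the geometric mean growth rate, the n-th root of the product of the growth factors (1 + rate), minus 1. The function name `average_growth_rate` is illustrative.

```python
import math

def average_growth_rate(rates):
    """Geometric-mean growth rate for per-period rates given as decimals."""
    growth_factors = [1 + r for r in rates]
    return math.prod(growth_factors) ** (1 / len(rates)) - 1

# Hypothetical rates of return: +100% in period 1, then -50% in period 2.
rates = [1.00, -0.50]

arithmetic = sum(rates) / len(rates)
geometric = average_growth_rate(rates)

print(f"arithmetic mean of the rates: {arithmetic:.2%}")
print(f"geometric mean growth rate:   {geometric:.2%}")
```

Here the arithmetic mean suggests growth of 25% per period even though the investment's value did not change, while the geometric mean reports an average growth rate of zero.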
