University of Guelph

Sociology and Anthropology

SOAN 3120

Michelle Dumas

Fall

Chapter 2: Describing Distributions with Numbers
Measuring Center: The Mean
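The most common measure of center is the average, or the mean
To find the mean of a set of observations, add their values and divide by the
number of observations: x̄ = (x1 + x2 + ... + xn)/n, or in more compact
notation, x̄ = (1/n) Σ xi
The capital Greek sigma (Σ) is short for "add them all up"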
o The subscripts on the observations are just a way of keeping the n
observations distinct
o The bar over the x indicates the mean of all the x-values
An important fact about the mean as a measure of center: it is sensitive to the
influence of a few extreme observations
o Because the mean cannot resist the influence of extreme observations,
we say that it is not a resistant measure of center
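The point about extreme observations can be checked with a few lines of Python. This is a minimal sketch: the helper name `mean` and the data values are made up for illustration and do not come from the textbook.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    return sum(values) / len(values)


# Hypothetical observations, chosen only for illustration.
typical = [10, 12, 11, 13, 14]
with_outlier = typical + [100]  # same data plus one extreme observation

mean_typical = mean(typical)            # 60 / 5
mean_with_outlier = mean(with_outlier)  # 160 / 6

print(mean_typical, mean_with_outlier)
```

With the single extreme value added, the mean ends up larger than every one of the original five observations, which is what "not resistant" means in practice.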
Measuring Center: The Median
The median is the formal version of the midpoint
o The median M is the midpoint of a distribution, the number such that
half the observations are smaller and the other half are larger
o To find the median:
1. Arrange all observations in order of size, from smallest to largest
2. If the number of observations n is odd, the median M is the center
observation in the ordered list. If the number of observations n is
even, the median M is midway between the two center observations in
the ordered list.
3. You can always locate the median in the ordered list of observations
by counting up (n+1)/2 observations from the start of the list.
Note that the formula (n+1)/2 does not give the median, just the location of
the median in the ordered list
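The steps for finding the median can be sketched in Python as follows. The function names `median` and `median_location` and the sample lists are illustrative assumptions, not part of the notes.

```python
def median(values):
    """Return the median M by the rule in the notes (odd n versus even n)."""
    ordered = sorted(values)  # step 1: arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd n: the center observation
    # even n: midway between the two center observations
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_location(n):
    """Return the 1-based position (n+1)/2 of the median in the ordered list."""
    return (n + 1) / 2


odd_data = [5, 1, 9, 3, 7]        # hypothetical, n = 5
even_data = [5, 1, 9, 3, 7, 11]   # hypothetical, n = 6

print(median_location(len(odd_data)), median(odd_data))
print(median_location(len(even_data)), median(even_data))
```

For the even-sized list the location is 3.5, meaning "halfway between the 3rd and 4th ordered observations". The location is a position in the list, not the median itself.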
Comparing the Mean and the Median
The median, unlike the mean, is resistant
The outlier just counts as one observation above the center, no matter how
far above the center it lies
The mean uses the actual value of each observation and so will chase a single
large observation upward
The mean and the median of a roughly symmetric distribution are close
together
If the distribution is exactly symmetric, the mean and median are exactly the
same
In a skewed distribution, the mean is usually farther out in the long tail than
is the median
Many economic variables have distributions that are skewed to the right
Reports about incomes and other strongly skewed distributions usually give
the median rather than the mean
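A small right-skewed example shows why income reports usually prefer the median. This sketch uses Python's standard `statistics` module. The income figures are invented for illustration and are not real data.

```python
from statistics import mean, median

# Hypothetical annual incomes in thousands of dollars (right-skewed:
# most values are moderate, one is very large).
incomes = [28, 32, 35, 38, 41, 45, 52, 60, 250]

income_mean = mean(incomes)
income_median = median(incomes)

# Make the largest income ten times larger and compare again.
more_extreme = incomes[:-1] + [2500]
extreme_mean = mean(more_extreme)
extreme_median = median(more_extreme)

print(income_mean, income_median)
print(extreme_mean, extreme_median)
```

The mean sits above the median in both cases and climbs further when the top income grows. The median does not move, because the largest value still counts as just one observation above the center.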
The mean and the median measure the center in different ways, and both are
useful
Measuring Spread: The Quartiles
The mean and the median do not tell the whole story
We are interested in the spread or variability
The simplest useful numerical description of a distribution requires both a
measure of center and a measure of spread
We can improve our descriptions of spread by also looking at the spread of
the middle half of the data
The quartiles mark out the middle half
To calculate the quartiles:
1. Arrange observations in increasing order and locate the median M in the
ordered list of observations
2. The first quartile Q1 is the median of the observations whose position in
the ordered list is to the left of the location of the overall median
3. The third quartile Q3 is the median of the observations whose position in
the ordered list is to the right of the location of the overall median
When there is an odd number of observations, leave out the overall median
when you locate the quartiles in the ordered list
The quartiles are resistant because they are not affected by a few extreme
observations
When the number of observations is even, include all the observations when
you locate the quartiles
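The quartile rule above, including the odd and even cases, can be sketched in Python. The names `median_of_sorted` and `quartiles` and the sample data are assumptions made for this example. Other software may use slightly different quartile rules and give slightly different answers.

```python
def median_of_sorted(ordered):
    """Return the median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, M, Q3) using the rule described in the notes."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]  # observations to the left of the median's location
    if n % 2 == 1:
        upper = ordered[half + 1:]  # odd n: leave out the overall median
    else:
        upper = ordered[half:]      # even n: use all the observations
    return median_of_sorted(lower), median_of_sorted(ordered), median_of_sorted(upper)


odd_result = quartiles([1, 3, 5, 7, 9, 11, 13])   # hypothetical, n = 7
even_result = quartiles([1, 3, 5, 7, 9, 11])      # hypothetical, n = 6

print(odd_result)
print(even_result)
```

In the odd case the middle value 7 is excluded from both halves. In the even case the two halves together contain every observation.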
The Five-Number Summary and Boxplots
The smallest and largest observations tell us little about the distribution as a
whole, but they give us information about the tails of the distribution that is
missing if we know only the median and the quartiles
To get a quick summary of both center and spread, combine all five numbers
The five-number summary of a distribution consists of the smallest
observation, the first quartile, the median, the third quartile, and the
largest observation, written in order from smallest to largest
o Minimum Q1 M Q3 Maximum
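Putting the pieces together, a five-number summary might be computed as in the sketch below. It applies the quartile rule from these notes. The function names and the data are hypothetical.

```python
def median_of_sorted(ordered):
    """Return the median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (Minimum, Q1, M, Q3, Maximum) for a list of observations."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # Odd n: skip the overall median; even n: keep every observation.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (
        ordered[0],
        median_of_sorted(lower),
        median_of_sorted(ordered),
        median_of_sorted(upper),
        ordered[-1],
    )


summary = five_number_summary([13, 1, 7, 3, 11, 5, 9])  # hypothetical data
print(summary)
```

The five values come out already ordered from smallest to largest, matching the Minimum, Q1, M, Q3, Maximum layout.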
The five-number summary of a distribution leads to a new graph, the boxplot
The boxplot is a graph of the five-number summary
o A central box spans the quartiles Q1 and Q3
o A line in the box marks the median M
o Lines extend from the box to the smallest and largest
observations
Boxplots show less detail than histograms or stemplots, so they are best used
for side-by-side comparison of more than one distribution
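To make the parts of a boxplot concrete, here is a rough text-only sketch that draws one line per distribution on a shared axis. The function `text_boxplot`, its parameters, and the two summaries are all invented for illustration. A real plot would normally come from a graphing library.

```python
def text_boxplot(summary, axis_min, axis_max, width=41):
    """Draw a one-line text boxplot from (Minimum, Q1, M, Q3, Maximum).

    axis_min and axis_max define a shared scale so that several plots
    line up for side-by-side comparison.
    """
    if axis_max <= axis_min:
        raise ValueError("axis_max must be greater than axis_min")
    if summary[0] < axis_min or summary[-1] > axis_max:
        raise ValueError("summary values must lie within the axis range")

    def column(value):
        # Map a data value to a character column on the shared axis.
        return round((value - axis_min) * (width - 1) / (axis_max - axis_min))

    lo, q1, m, q3, hi = (column(v) for v in summary)
    line = [" "] * width
    for i in range(lo, hi + 1):
        line[i] = "-"            # lines out to the smallest and largest observations
    for i in range(q1, q3 + 1):
        line[i] = "="            # central box spanning Q1 to Q3
    line[lo] = line[hi] = "|"    # ends of the two lines
    line[q1], line[q3] = "[", "]"
    line[m] = "M"                # median marker inside the box
    return "".join(line)


# Hypothetical five-number summaries for two groups.
group_a = (1, 3, 7, 11, 13)
group_b = (2, 6, 8, 10, 20)

plot_a = text_boxplot(group_a, axis_min=0, axis_max=20)
plot_b = text_boxplot(group_b, axis_min=0, axis_max=20)
print("A", plot_a)
print("B", plot_b)
```

Because both lines use the same axis, the positions of the `M` markers and the lengths of the boxes can be compared directly, which is the side-by-side use the notes recommend.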
When you first look at a boxplot locate the median, then look at the sp
