3P15, Jan 22, The Normal Curve (5) and Measures of Central Tendency and Dispersion (6)
Measures of Central Tendency
• Another univariate (one variable) descriptive statistic.
• Summarizes information about the most typical, central, middle, or common scores of a
variable.
• Often used to generalize or compare values across populations.
o Might compare the mean income of Canada to the mean income of the U.S.A.
o Might compare the mean income of men to the mean income of women.
o Mode: most common score
o Median: middle score
o Mean: average score
o Each measure reports a different type of information and is used in different situations.
Three measures of central tendency
• Mode: The most common score.
• Median: The middle score.
• Mean: The average score.
What are modes, medians, and means?
• Mode, median, and mean are three different statistics.
• They report three different kinds of information and will have the same value only in
certain specific situations (which we’ll learn about later with normal distributions).
The Mode
• The most common or frequent score.
• Not an arithmetic measure.
• Can be used with variables at all levels of measurement.
• Most often used with nominal level variables.
How to Find the Mode
• Option 1:
o Count the number of times each value occurs.
o The value with the highest count is the mode.
o Graphically, the mode will be the biggest pie slice, the tallest bar, or the highest point of a line.
• Option 2: use a computer
o To find the mode, just look for the highest frequency in a frequency table.
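The counting procedure above can be sketched in a few lines of Python. This is only an illustration: the `transport` list is made-up data and `frequency_table` is a hypothetical helper name, not something from the course software.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count) pairs, ordered from highest count to lowest."""
    return Counter(values).most_common()


# Hypothetical nominal-level data, invented for illustration only.
transport = ["bus", "car", "car", "bike", "car", "bus"]

table = frequency_table(transport)
for value, count in table:
    print(value, count)

# The first row of the table has the highest frequency, so it is the mode.
mode_value, mode_count = table[0]
print("Mode:", mode_value)
```

Because the data are only counted, never added or averaged, the same approach works at every level of measurement.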
How to Find the Mode Graphically
• The biggest slice of the pie is the mode. Sometimes it will be obvious, but sometimes it
is easier to look at the frequency chart.
Bimodality and Multimodality
• What happens when there is more than one most frequent value?
• When there are two modes, the variable is said to be bimodal.
• Usually, when there are more than two modes, the variable is described as multimodal.
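A minimal sketch of checking for more than one mode, assuming Python 3.8 or later (for `statistics.multimode`). The function name `describe_modality` and the sample scores are made up for illustration.

```python
from statistics import multimode


def describe_modality(values):
    """Return the list of modes and a label for how many there are."""
    modes = multimode(values)  # every value tied for the highest count
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label


# Hypothetical scores: 1 and 4 each occur three times.
print(describe_modality([1, 1, 1, 2, 3, 4, 4, 4]))
# Hypothetical scores: only 2 occurs most often.
print(describe_modality([1, 2, 2, 3]))
```

Note that this sketch only counts exact ties; if every value occurs once, it would label the variable multimodal, which is rarely a useful description.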
Bimodality, an example (almost)
The Median
• The middle score of any variable.
• Cannot be used with nominal data.
• Can be used with ordinal data, though not always usefully.
• Can be used without restriction on interval/ratio data.
o The median cannot be found for nominal data because the categories have no inherent
order (they can be listed in any order).
o The median is not always a good representation of the data. Ex.: 1, 1, 1, 1, 3, 17, 19, 21, 23 (median = 3)
o Sometimes it is. Ex.: 2, 2, 2, 2, 3, 4, 4, 4, 4 (median = 3)
Finding the Median (for interval/ratio data)
• Arrange the cases from high to low (or low to high).
• Locate the middle case.
o If N is odd, the median is the value for the middle case.
o If N is even, the median is the average of the scores of the two middle cases.
N= number of data values.
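The two rules (odd N and even N) can be sketched as a small Python function. The function name `median` is just a label for this sketch; the test lists are example lists that appear in these notes.

```python
def median(values):
    """Middle score of a list of interval/ratio values."""
    ordered = sorted(values)          # arrange the cases from low to high
    n = len(ordered)                  # N = number of data values
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd N: value of the middle case
    # even N: average of the scores of the two middle cases
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 1, 1, 1, 3, 17, 19, 21, 23]))    # odd N (9 cases)
print(median([2, 2, 2, 2, 3, 4, 4, 4, 4, 4]))     # even N (10 cases)
```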
Ex.: 2, 2, 2, 2, 3, 4, 4, 4, 4, 4
Median = 3.5
The Median, cont.
Age (G)
                            Frequency   Percent   Valid Percent   Cumulative Percent
Valid   12 TO 14 YEARS           7410       5.5             5.5                  5.5
        15 TO 19 YEARS          11114       8.3             8.3                 13.8
        20 TO 24 YEARS           6687       5.0             5.0                 18.8
        25 TO 29 YEARS           8672       6.5             6.5                 25.3
        30 TO 34 YEARS          10155       7.6             7.6                 32.8
        35 TO 39 YEARS          10617       7.9             7.9                 40.8
        40 TO 44 YEARS          10378       7.7             7.7                 48.5
        45 TO 49 YEARS          10144       7.6             7.6                 56.1
        50 TO 54 YEARS          10936       8.2             8.2                 64.2
        55 TO 59 YEARS          10548       7.9             7.9                 72.1
        60 TO 64 YEARS           8845       6.6             6.6                 78.7
        65 TO 69 YEARS           8043       6.0             6.0                 84.7
        70 TO 74 YEARS           7479       5.6             5.6                 90.3
        75 TO 79 YEARS           6143       4.6             4.6                 94.9
        80 YEARS OR MORE         6901       5.1             5.1                100.0
        Total                  134072     100.0           100.0
• In large samples, the median (like the mode and the mean) can be very difficult to
calculate without a computer.
Statistics
Age (G)
N        Valid     134072
         Missing        0
Median               8.00
• The reported median is the code for the group, not an age: the median falls in the 8th
category down (45 TO 49 YEARS).
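A rough sketch of how the median category can be located from the frequency table: add up the frequencies from the top until at least half of the cases are covered. The frequencies are copied from the Age (G) table; numbering the categories 1 to 15 in table order is an assumption made here to match the reported median of 8.00, and `median_category` is a hypothetical helper name.

```python
# Frequencies copied from the Age (G) table, in table order.
age_groups = [
    ("12 TO 14 YEARS", 7410), ("15 TO 19 YEARS", 11114),
    ("20 TO 24 YEARS", 6687), ("25 TO 29 YEARS", 8672),
    ("30 TO 34 YEARS", 10155), ("35 TO 39 YEARS", 10617),
    ("40 TO 44 YEARS", 10378), ("45 TO 49 YEARS", 10144),
    ("50 TO 54 YEARS", 10936), ("55 TO 59 YEARS", 10548),
    ("60 TO 64 YEARS", 8845), ("65 TO 69 YEARS", 8043),
    ("70 TO 74 YEARS", 7479), ("75 TO 79 YEARS", 6143),
    ("80 YEARS OR MORE", 6901),
]


def median_category(table):
    """Return (code, label) of the first category whose cumulative
    frequency reaches at least half of all cases.

    Assumption: categories are coded 1, 2, 3, ... in table order.
    """
    total = sum(freq for _, freq in table)
    running = 0
    for code, (label, freq) in enumerate(table, start=1):
        running += freq
        if running * 2 >= total:      # cumulative percent has reached 50%
            return code, label
    raise ValueError("frequency table is empty")


print(median_category(age_groups))
```

This matches reading the Cumulative Percent column: 48.5% of cases fall at or below the 7th category and 56.1% at or below the 8th, so the middle case sits in the 8th.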
The Mean
• The most frequently used measure of central tendency.
• Should only be used with interval/ratio data.
• Also known as the arithmetic average.
$\bar{X} = \dfrac{\sum X_i}{N}$
• The sum of a set of scores divided by the total number of observations.
o Mean = sum of all the scores / number of scores
o No individual case needs to have the mean value; the mean is somewhat abstracted from
the raw data.
• An example: usual number of hours worked per week.
Statistics
Total usual hrs. worked per week (D)
N        Valid      78585
         Missing    55487
Mean                39.75
Minimum                 1
Maximum               169
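A minimal sketch of the mean as "sum of the scores divided by the number of scores". The output above reports Valid and Missing counts separately, so this sketch assumes the mean is taken over valid cases only. The `hours` list is made up for illustration, with `None` standing in for a missing answer.

```python
def mean(scores):
    """Arithmetic average of the valid (non-missing) scores."""
    valid = [x for x in scores if x is not None]   # drop missing cases
    if not valid:
        raise ValueError("no valid scores to average")
    return sum(valid) / len(valid)                 # sum of X_i divided by N


# Hypothetical weekly hours; None marks a missing response.
hours = [40, 35, None, 50, None, 45]
print(mean(hours))  # 42.5
```

In this made-up example nobody actually works 42.5 hours, which illustrates how the mean abstracts from the raw data.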
Symbols to describe theoretical distributions
• “SD” is sometimes used for standard deviation.
Histograms as a graphing technique.
• Bar graphs are used with categorical data (nominal or ordinal).
• Histograms should only be used with continuous data, but they are often used with
ordinal, or sometimes even nominal or interval, data.
• Histograms often have score intervals given as opposed to raw data. (Ex.: 10-15 years of
age)
• There should be no gaps between the category bars, unlike in bar graphs.
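As a rough illustration of score intervals, the sketch below groups raw scores into equal-width intervals and counts the cases in each. The ages, the starting point of 10, and the width of 5 are all made up; `histogram_bins` is a hypothetical helper, and it assumes each interval includes its lower bound but not its upper bound.

```python
from collections import Counter


def histogram_bins(scores, start, width):
    """Count scores in equal-width intervals [low, high).

    Empty intervals between the lowest and highest occupied ones are kept
    with a count of 0, so the bars stay adjacent with no gaps.
    """
    counts = Counter(int((x - start) // width) for x in scores)
    first, last = min(counts), max(counts)
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(first, last + 1)
    ]


# Hypothetical ages, invented for illustration only.
ages = [10, 12, 14, 15, 17, 21, 22, 33]
for low, high, count in histogram_bins(ages, start=10, width=5):
    print(f"{low} to under {high}: {count}")
```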
The Normal Curve: An introduction
• Much like theoretical probabilities, there are theoretical distributions to describe what the
histogram of a variable ‘should’ look like.
• The theoretical distribution is the same for many variables.
• It is therefore called the normal distribution.
• Understanding normal distributions requires most of the material that we’ve learned so far
in this course.
What are the characteristics of a theoretical normal curve?
• Bell-shaped.
• Unimodal.
• Symmetrical.
• Unskewed.
• Mode, Median, and Mean have the same value.
• A distribution of sample means resembles the theoretical normal curve as sample size
increases.
o It is ‘asymptotically normal’.
o Most people have a value that lies in the middle; fewer people have extreme scores.
o The same holds for means. Ex.: if you divided the class into smaller groups and took
the mean of each of those groups, most of the means would be in the middle too.
o Unimodal: only one value has the most scores.
o Unskewed: normal shape; not a lot of scattered extreme scores.
o As sample size increases, sample means are less affected by individual scores.
Ex.: if a very tall person were in a group of 4 and you took the mean of only that
group, that mean is more likely to be affected.
o Asymptotically normal: the variable would have a normal distribution if you took an
infinite number of samples over an infinite amount of time.
What is asymptotic normality?
[Figure: histogram. Samples=1000, Population=Normal, N=200. Lines drawn at -2sd, -1sd, mean, +1sd, +2sd. Vertical axis: Fraction; horizontal axis from -4 to 4.]
• Each sample is 200 people; the mean of each sample of 200 is taken, and this is repeated 1000 times.
[Figure: histogram. Samples=5000, Population=Normal, N=200. Lines drawn at -2sd, -1sd, mean, +1sd, +2sd. Vertical axis: Fraction; horizontal axis from -4 to 4.]
• The same procedure, repeated 5000 times.
[Figure: histogram. Samples=10000, Population=Normal, N=200. Lines drawn at -2sd, -1sd, mean, +1sd, +2sd. Vertical axis: Fraction.]
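The idea behind these figures can be tried out with a small simulation: draw many samples of N=200 from a normal population, take the mean of each sample, and see how the means are distributed. This is only a sketch under stated assumptions, not the procedure used to make the figures: the population mean of 0 and standard deviation of 1, the seed, and the function names are all choices made here.

```python
import random
import statistics


def simulate_sample_means(n_samples, sample_size, seed=0):
    """Draw n_samples samples from an assumed Normal(0, 1) population
    and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


def share_within(means, k):
    """Fraction of sample means within k standard deviations of their centre."""
    centre = statistics.fmean(means)
    sd = statistics.pstdev(means)
    inside = sum(1 for m in means if abs(m - centre) <= k * sd)
    return inside / len(means)


means = simulate_sample_means(n_samples=1000, sample_size=200)
print("centre of the sample means:", round(statistics.fmean(means), 3))
print("SD of the sample means:", round(statistics.pstdev(means), 3))
print("share within 1 SD:", round(share_within(means, 1), 3))
print("share within 2 SD:", round(share_within(means, 2), 3))
```

Changing `n_samples` to 5000 or 10000 repeats the exercise with more samples. For a normal curve, roughly 68% of values lie within 1 SD of the centre and roughly 95% within 2 SD, so the printed shares should land near those figures.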