Statistics – Chapter 3
Introduction:
- frequency distributions, graphs, and charts summarize the overall shape of a distribution of
scores in a comprehensive way
- also need to show the average or typical case in the distribution – measures of central
tendency
- and how much variety or heterogeneity there is in the distribution – measures of
dispersion
- three common measures of central tendency – mean (average score), mode (most
common score), median (middle case)
- they reduce data: each one summarizes an entire distribution of scores in a single value
- measures of dispersion provide information about variety, diversity, or heterogeneity
Nominal level measures:
- mode is the score that appears the most
- most useful when you want a “quick and easy” indicator of central tendency or when
you are working with nominal level variables
- some distributions have no mode at all, or so many modes that the statistic loses meaning
- with ordinal and interval-ratio data, the mode may not be central to, or typical of, the
distribution as a whole
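A minimal sketch of the points about the mode, using Python's standard library. The colour data are hypothetical and made up for illustration; they do not come from Table 3.2.

```python
from statistics import multimode

# Hypothetical nominal-level data (made up for illustration)
one_mode = ["red", "blue", "red", "green", "red"]
two_modes = ["red", "blue", "red", "blue", "green"]
no_repeat = ["red", "blue", "green"]

# multimode returns every score that ties for the highest frequency
print(multimode(one_mode))    # a single most common score
print(multimode(two_modes))   # two scores tie for most common
print(multimode(no_repeat))   # every score ties, so the mode says very little
```

When the returned list is as long as the list of distinct scores, the mode has lost its meaning as a summary.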
Table 3.2
- to conveniently measure dispersion of variable we can use a statistic based on the
mode called the variation ratio (v)
- the larger the proportion of cases in the mode of a variable, the less the dispersion among
cases of that variable
v = 1 – (fm / n)
fm = # of cases in mode
n = total # of cases
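The variation ratio is usually written v = 1 – (fm / n). A rough sketch of that calculation follows; the function name and the sample data are assumptions made for illustration, and the list is assumed to be non-empty.

```python
from collections import Counter

def variation_ratio(scores):
    """Return v = 1 - (fm / n) for a non-empty list of category labels."""
    n = len(scores)                 # total number of cases
    counts = Counter(scores)        # frequency of each category
    fm = max(counts.values())       # number of cases in the modal category
    return 1 - fm / n

# Hypothetical data: 10 cases, 6 of them in the modal category "A"
cases = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
print(variation_ratio(cases))       # 1 - 6/10
```

If every case falls in the mode, v is 0 (no dispersion); v grows as the cases spread across more categories.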
Ordinal Level Measures:
- median is a measure of central tendency that represents the exact centre of a
distribution of scores
- cases must be placed in order from highest to lowest first
- then find centre
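- when “n” is odd, find the middle case by adding 1 to “n” and then dividing the sum by 2
- when “n” is even, the median is the average of the two middle cases
Example: 1 (added in), 2, 4, 6, 6, 8, 10, 10
n = 8 (even), so the median is the average of the 4th and 5th cases: (6 + 6)/2 = 6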
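The steps for finding the median can be sketched as below. The function is a hand-written illustration, not a library routine, and it assumes a non-empty list of numeric scores. It uses the example list with and without the added 1.

```python
def median(scores):
    """Return the middle score of a non-empty list of numbers."""
    ordered = sorted(scores)        # put the cases in order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # odd n: the middle case is case number (n + 1) / 2, which is index n // 2
        return ordered[middle]
    # even n: average the two middle cases
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([2, 4, 6, 6, 8, 10, 10]))       # odd n = 7: the 4th case
print(median([1, 2, 4, 6, 6, 8, 10, 10]))    # even n = 8: (6 + 6) / 2
```

Sorting lowest to highest or highest to lowest gives the same middle case, so the sketch simply uses ascending order.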
Table 3.4
- the range (R) is the difference or interval between the highest score (H) and lowest
score (L) in a distribution; it provides a quick and general notion of the variability of variables
measured at either the ordinal or interval-ratio level
- R will be misleading as a measure of dispersion if just one of these scores is either
exceptionally high or low; such scores are called outliers
R = H – L
- the interquartile range (Q) is a type of range that avoids this problem
Q = Q3 – Q1
Example: 2, 3, 8, 9
R = 9 – 2 = 7
Median = (3 + 8)/2 = 5.5
Q1 (median of the lower half) = (2 + 3)/2 = 2.5
Q3 (median of the upper half) = (8 + 9)/2 = 8.5
Q = Q3 – Q1 = 8.5 – 2.5 = 6
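The worked example can be reproduced with a short sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the ordered scores, as in the example; other conventions exist and can give slightly different quartiles. The function names are made up for illustration, and at least two scores are assumed.

```python
def median(ordered):
    """Median of an already ordered, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def score_range(scores):
    """R = H - L."""
    return max(scores) - min(scores)

def interquartile_range(scores):
    """Q = Q3 - Q1, with Q1 and Q3 as medians of the lower and upper halves."""
    ordered = sorted(scores)
    half = len(ordered) // 2        # needs at least two scores
    lower = ordered[:half]          # cases below the median
    upper = ordered[-half:]         # cases above the median (middle case left out when n is odd)
    return median(upper) - median(lower)

scores = [2, 3, 8, 9]
print(score_range(scores))          # 9 - 2
print(interquartile_range(scores))  # 8.5 - 2.5
```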
Table 3.5
- the range and interquartile range provide a measure of dispersion based on just two
scores from a set of data
Interval-Ratio-Level Measures:
- mean reports the average score of a distribution
X̄ = Σ(Xi) / n
X̄ = mean
Σ(Xi) = sum of all scores
n = number of cases
- this gives the mean of a sample
- mean of a population:
μ = Σ(Xi) / N
Σ(Xi) = sum of all scores
N = number of cases in the population
Some Characteristics of the Mean:
- the mean is the center of any distribution of scores in the sense that the deviations of the
scores around it cancel out
- if you take each score, subtract the mean from it, and add all of the differences, the sum will
always be zero
- this indicates that the mean is a good descriptive measure of the centrality of scores
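A quick check of this property, sketched with the scores 2, 3, 8, 9 from the range example. The variable names are made up for illustration. With other data, floating-point rounding can leave a sum that is very close to zero rather than exactly zero.

```python
def mean(scores):
    """Sum of all scores divided by the number of cases."""
    return sum(scores) / len(scores)

scores = [2, 3, 8, 9]
m = mean(scores)                            # 22 / 4
deviations = [x - m for x in scores]        # each score minus the mean

print(m)
print(deviations)
print(sum(deviations))                      # negative and positive deviations cancel
```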
Least Squares Principle:
- if you square and sum these differences, the resulting sum will be less than the sum of
the squared differences between the scores and any other point in the distribution
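The least squares principle can be illustrated by comparing the sum of squared differences around the mean with the same sum around a few other points. This is a sketch only: the scores and the comparison points are chosen for illustration, and checking a handful of points demonstrates the principle without proving it.

```python
def sum_of_squares(scores, point):
    """Sum of squared differences between each score and a given point."""
    return sum((x - point) ** 2 for x in scores)

scores = [2, 3, 8, 9]
m = sum(scores) / len(scores)               # mean = 5.5
around_mean = sum_of_squares(scores, m)

print(around_mean)
for point in [4, 5, 6, 7]:                  # arbitrary other points
    # each of these sums is larger than the sum around the mean
    print(point, sum_of_squares(scores, point))
```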
Table 3.7
- the mean will have a greater value than the median when the distribution has some really
high scores (positive skew)
- vice versa when it has some really low scores (negative skew)
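A small sketch of how one extreme score pulls the mean away from the median. All three lists are hypothetical, chosen so that only one score changes between them.

```python
from statistics import mean, median

# Hypothetical scores (made up for illustration)
symmetric = [2, 4, 6, 8, 10]
high_outlier = [2, 4, 6, 8, 100]     # one really high score
low_outlier = [-90, 4, 6, 8, 10]     # one really low score

for label, scores in [("symmetric", symmetric),
                      ("high outlier", high_outlier),
                      ("low outlier", low_outlier)]:
    print(label, "mean:", mean(scores), "median:", median(scores))
```

The median stays at the middle case in all three lists, while the mean moves toward the extreme score.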
Figure 3.2 – 3.4
Variance and Standard Deviation:
- deviations are the distances between the scores and the mean
- this quantity will increase in value as the scores increase in their variety or
heterogeneity
- the sum of the deviations would be a logical basis for a statistic that measures the amount of
variety in a set of scores, but the positive and negative deviations always cancel out to zero
- to eliminate the negative
