Lecture 3

SOC 222 -- MEASURING the SOCIAL WORLD
Session #3 -- MEANS & VARIATION
Sep 2013
Agenda:
Announcements
Where we are
Today’s Objectives: Know …
Terms to Know
Ratio (Quantitative) Variables- also known as quantitative measures
Frequency Distributions for Rat Vars
Frequency Distribution Histograms in SPSS
1. Skewed Distributions
2. Outliers
Central Tendency
Mode
Median
Mean
Comparing Measures of Central Tendency
SPSS: Central Tendency
Variation
1. Range
2. Variance
Standard Deviation (SD)
3. Inter-quartile range
Individual Uniqueness
Z scores
Normal Curve Distributions
CatRat Relationships
Comparing Means
Other Stuff in the Text
Tutorial Vote
Today’s Objectives: Know
1. How to get a histogram frequency distribution
2. Skewness and outliers
3. Three measures of central tendency, and pros and cons of each
4. How to get central tendency and variation measures with SPSS
5. Variation as distance from the mean
6. Three measures of variation
7. How to assess the uniqueness of a specific case with Z scores and percentiles
8. Difference between experimental and non-experimental designs
9. Comparing means as effect size for cat rat relationships
Terms to Know
quantitative variable
valid percent
normal curve
histogram
skewness
positive and negative skewness
outlier
mode, median, mean
bimodal distribution
variance, range, standard deviation, inter-quartile range
percentile
Z score
true experiment, quasi-experiment
RATIO (QUANTITATIVE) VARIABLES
• Counts – ex. University classes
• Amounts- ex. Cost of texts for each class
FREQUENCY DISTRIBUTIONS for RAT VARS
• Hard to use to find modes and medians
Reminder: note the difference between “Percent” and “Valid Percent”
• Need it because we have to get at the shape of the distribution
• Distribution shape is important because
• normal curve
• This is a bell-shaped curve
Frequency Distribution Histograms in SPSS
The graphic of a frequency distribution is called a histogram
- Go to Analyze and click on Frequencies, put the variable we want in the box (e.g. average percent mark), go to Charts and click on Histogram, check “show the normal curve”, and uncheck “Display frequency tables”
Revised SPSS Guide is posted:
SPSS Frequency Distributions
Open data set
• Click on “Analyze” on menu bar
• Choose “Descriptive statistics” from dropdown menu
• Choose “Frequencies”
This opens a box called “Frequencies”
Expand the box to read the variable labels
• List of variables on the left
• In the middle: an empty working area called “Variable(s)”
• Option buttons on the right
• Action buttons on the bottom
Click on the variable you want
• Click on the arrow to move it to the Variables area
For a Frequency Distribution Table
• Click on OK
• This opens your output window with a frequency distribution table.
For a Frequency Distribution Histogram
• Click on “Charts…” option button
• Select Histogram, Normal curve
• Continue
• In main Frequencies box
• Uncheck “Display frequency tables”
• OK
• This opens your output window with a Histogram
The variable has approximately a normal curve shape
• Missing some cases in the middle range
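Outside SPSS, the idea behind a histogram (group the values into equal-width bins and count the cases in each) can be sketched in a few lines of Python. This is only an illustration: the marks and the bin width of 10 are made-up assumptions, not data from the course.

```python
from collections import Counter

# Hypothetical marks (percent); NOT data from the lecture.
marks = [52, 58, 61, 63, 65, 67, 68, 70, 71, 72, 74, 75, 77, 81, 88]
BIN_WIDTH = 10  # assumed bin width


def bin_counts(values, width):
    """Count how many values fall into each bin [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


histogram = bin_counts(marks, BIN_WIDTH)

# Print a simple text histogram: one star per case in the bin.
for start, count in histogram.items():
    print(f"{start:3d}-{start + BIN_WIDTH - 1:3d} | {'*' * count}")
```

The tallest rows sit in the middle and the counts fall off on both sides, which is the rough shape to look for when judging whether a variable is close to a normal curve.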
1. Skewed Distributions
These are distributions which stretch out in one direction.
We say this distribution is skewed.
• Positively skewed: not a normal distribution
2. Outliers
• Outliers are extreme values, noticeably different from the rest, that can distort the conclusions you come up with
• Shows an outlier at around 5.
CENTRAL TENDENCY
The underlying question:
• What’s the “typical value” of a variable?
There are 3 common measures: mode, median, and mean
Texts:
• Linneman: 76-84
• Kranzler: 43-47
Mode
Mode: The value with the most cases
EG: Ages: 21, 21, 24, 25, 27, 28, 29
• The value “21” occurs twice
• All other ages occur just once
• The mode is “21”
NOTE:
• A distribution can have two modes
• Such a distribution is called bimodal
EG: 21, 21, 24, 25, 27, 27, 29
• This distribution is bimodal
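As a quick check of the two examples above, here is a minimal Python sketch using `statistics.multimode` from the standard library (Python 3.8+), which returns every value tied for the most cases. The variable names are illustrative only.

```python
from statistics import multimode

# The two age lists from the examples above.
ages_one_mode = [21, 21, 24, 25, 27, 28, 29]
ages_two_modes = [21, 21, 24, 25, 27, 27, 29]

# multimode returns a list of all values tied for the highest count.
modes_a = multimode(ages_one_mode)
modes_b = multimode(ages_two_modes)

print(modes_a)  # one mode
print(modes_b)  # two modes: a bimodal distribution
```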
Median
• Median: The value of the “middle” case
• When all cases are sorted in rank order
EG: 21, 24, 24, 26, 27
• median is “24”
• it’s in the middle
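The "sort, then take the middle case" rule can be sketched in Python as below. This is a hand-written illustration, not SPSS's procedure; the helper name `median` is our own, and the even-count branch takes the mid-point of the two middle values.

```python
def median(values):
    """Return the value of the middle case after sorting in rank order."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of cases: there is a single middle case.
        return ordered[mid]
    # Even number of cases: mid-point between the two middle cases.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_median = median([21, 24, 24, 26, 27])
even_median = median([21, 21, 24, 25, 27, 28])
print(odd_median, even_median)
```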
NOTES:
1. Only applies to ordinal and ratio variables
2. What if you have an even # of cases?
EG: 21, 21, 24, 25, 27, 28
• median is the mid-point between the two middle values (24 and 25)
• median here is 24.5
Mean
• Mean: The average value
EG: the distribution (of ages in a grad student seminar) is :
21, 21, 24, 25, 27, 27, 29
$x_i$ means the value of the $i$th case.
EG: $x_3$ is 24.
$\sum$ (“sigma”) means sum
EG: $\sum x_i$ is 174
$$\bar{x} = \frac{\sum x_i}{n}$$
Interpretation:
• The x with the bar over it means the mean
• The equal sign tells us how to get the mean
• The sigma sign tells us to add something up
• The numerator (top of the fraction)
• x with the subscript i tells us to sum up all the values of x
• The denominator (bottom)
• n tells us to divide that sum by the number of cases
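The formula (sum of all the values, divided by the number of cases) can be checked on the seminar ages with a few lines of Python. The variable names are illustrative only.

```python
# Ages in the grad student seminar example.
ages = [21, 21, 24, 25, 27, 27, 29]

total = sum(ages)  # numerator: the sum of all x_i
n = len(ages)      # denominator: the number of cases

mean_age = total / n
print(total, n, round(mean_age, 2))
```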
EG: mean of this set of values: 3, 5, 7, 9
$$\bar{x} = \frac{\sum x_i}{n} = \frac{3 + 5 + 7 + 9}{4} = \frac{24}{4} = 6$$
NOTES:
Comparing Measures of Central Tendency
Nominal category: can only use the mode
Ordinal category: can use the mode and the median. The best measure depends on the kind of question you are asking: if asking about what happens most often, report the mode; if asking about what’s in the middle, report the median
Quantitative: can use any of the 3
• Mode: not very useful; questions typically imply we want the central value, not the most frequent one
• Median
• Advantage of the median: not affected by outliers
• outliers (Linneman, p. 77)
EG: 21, 21, 24, 25, 27, 28, 29
• median is “25”
EG: 21, 21, 24, 25, 27, 28, 92
• median is still “25”
• Disadvantage of the median:
• Doesn’t take each score into account equally, scores at the ends
don’t count as much
• Mean
• Advantages of the mean:
1. It takes all values into account
2. It is used in statistical formulas
3. Mathematically, it has some useful properties
• Disadvantage of the mean:
1. Sensitive to outliers
2. Sensitive to skewed distributions
How to get central tendency on SPSS
- Go to Analyze, then Descriptive Statistics, hit the Statistics button, and click on mean, median and mode if that’s what y
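As a cross-check outside SPSS, the same three measures can be computed with Python's standard `statistics` module. The sketch below reuses the two age lists from the median comparison earlier (with and without the outlier 92); the helper `summarize` is our own name, not an SPSS or library function.

```python
from statistics import mean, median, mode

# Age lists from the outlier comparison: identical except for the last case.
ages = [21, 21, 24, 25, 27, 28, 29]
ages_with_outlier = [21, 21, 24, 25, 27, 28, 92]


def summarize(values):
    """Return the three measures of central tendency for a list of values."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
    }


plain = summarize(ages)
with_outlier = summarize(ages_with_outlier)

print(plain)
print(with_outlier)
```

The single outlier pulls the mean from 25 up to 34, while the median stays at 25 and the mode stays at 21. That is why the median is the safer summary when outliers or skew are present.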

