Lec#3 (Jan23rd).docx

John Kervin

SOC 222 -- MEASURING the SOCIAL WORLD
Session #3 -- MEANS & VARIATION
Sep 2013

Agenda:
Announcements
Where we are
Today’s Objectives: Know …
Terms to Know
Ratio (Quantitative) Variables
Frequency Distributions for Rat Vars
Frequency Distribution Histograms in SPSS
   1. Skewed Distributions
   2. Outliers
Central Tendency
   Mode
   Median
   Mean
   Comparing Measures of Central Tendency
   SPSS: Central Tendency
Variation
   1. Range
   2. Variance
      Standard Deviation (SD)
   3. Inter-quartile range
Individual Uniqueness
   Z scores
   Normal Curve Distributions
Cat → Rat Relationships
   Comparing Means
Other Stuff in the Text
Tutorial Vote

Today’s Objectives: Know
1. How to get a histogram frequency distribution
2. Skewness and outliers
3. Three measures of central tendency, and pros and cons of each
4. How to get central tendency and variation measures with SPSS
5. Variation as distance from the mean
6. Three measures of variation
7. How to assess the uniqueness of a specific case with Z scores and percentiles
8. Difference between experimental and non-experimental designs
9. Comparing means as effect size for cat → rat relationships

Terms to Know
quantitative variable
valid percent
normal curve
histogram
skewness
positive and negative skewness
outlier
mode, median, mean
bimodal distribution
variance, range, standard deviation, inter-quartile range
percentile
Z score
true experiment, quasi-experiment

RATIO (QUANTITATIVE) VARIABLES
• Counts
   o Could be the total number of students in the class
• Amounts
   o For university classes, the amount could be the total cost of the texts for each class

FREQUENCY DISTRIBUTIONS for RAT VARS
- Frequency distributions for quantitative variables are not so useful. They end up as huge tables, which make it hard to find the median and mode.
- “Valid Percent” is based on the number of cases that actually answered. “Percent” is based on all cases, including the missing ones.
- Reminder: note the difference between “Percent” and “Valid Percent”
- When we have quantitative variables we don’t use frequency tables; they are too long. We are concerned with the shape of the distribution. Shape matters because it affects what we do with the data. Data analysis is easier with a normal distribution [bell-shaped curve].
- We don’t want the tables, we want the graphics. This graphic is called a histogram: it gives us the shape of the distribution.
• Distribution shape is important because it
   - Lets us know what we are analyzing
   - Gives us an idea of the data
   - Makes data analysis easier
• normal curve
   • This is a bell-shaped curve

Frequency Distribution Histograms in SPSS
The graphic of a frequency distribution is called a histogram.
Revised SPSS Guide is posted: SPSS Frequency Distributions

Open data set
• Click on “Analyze” on menu bar
• Choose “Descriptive statistics” from dropdown menu
• Choose “Frequencies”
   - Uncheck “Display frequency tables” if you do not want the table to show

This opens a box called “Frequencies”
Expand the box to read the variable labels
• List of variables on the left
• In the middle: an empty working area called “Variable(s)”
• Option buttons on the right
• Action buttons on the bottom

Click on the variable you want
• Click on the arrow to move it to the Variables area

For a Frequency Distribution Table
• Click on OK
• This opens your output window with a frequency distribution table.

For a Frequency Distribution Histogram
• Click on “Charts…” option button
• Select Histogram and the normal curve option
• Continue
• In main Frequencies box
   • Uncheck “Display frequency tables”
• OK
• This opens your output window with a Histogram

The variable in the example histogram has approximately a normal curve shape
• Missing some cases in the middle range
• Big gap at 80; if these cases were moved over, the distribution would be normal

1. Skewed Distributions
These are distributions which stretch out in one direction. The example shown is a skewed distribution, positively skewed.
We say this distribution is skewed.
• Positively skewed: most of the data is on the left (the tail stretches out to the right)
• Negatively skewed: most of the data is on the right (the tail stretches out to the left)

2. Outliers
• Outliers are extreme values
• They are noticeably different from the rest of the values in the distribution
• They can distort the conclusions you come up with.
• The example histogram shows an outlier at around 5.
• This distribution is clearly skewed, but it also has one outlier: only one case had a homicide rate of about 5.

CENTRAL TENDENCY
The underlying question:
• What’s the “typical value” of a variable?
• Three common measures: mode, median, mean
• Today we focus on the mean; this is the one that matters (K: 43-47)
Texts:
• Linneman: 76-84
• Kranzler: 43-47

Mode
• Mode: The value with the most cases
EG: Ages: 21, 21, 24, 25, 27, 28, 29
• The value “21” occurs twice
• All other ages occur just once
• The mode is “21”, since it occurs most often
NOTE:
• A distribution can have two modes
• bimodal: when we have two modes
EG: 21, 21, 24, 25, 27, 27, 29
• This distribution is bimodal
• It has 2 modes: 21 and 27

Median
• Median: The value of the “middle” case when all cases are sorted in rank order
• If you put all your cases in rank order, the one that’s in the middle is the median.
EG: 21, 24, 24, 26, 27
• median is “24”
• it’s in the middle
NOTES:
1. Only applies to ordinal and ratio variables; you cannot have a median for a nominal category variable
2. What if you have an even # of cases?
EG: 21, 21, 24, 25, 27, 28
• the two middle values are 24 and 25
• median is the mid-point between these two
• median here is 24.5
• when you have two in the middle you take the midpoint: the sum of those 2 divided by 2

Mean
• Mean: The average value
EG: the distribution (of ages in a grad student seminar) is: 21, 21, 24, 25, 27, 27, 29
$x_i$ means the value of the $i$-th case. EG: $x_3$ is 24.
$\sum$ (“sigma”) means sum. EG: $\sum x_i$ is 174

$$\bar{x} = \frac{\sum x_i}{n}$$

Interpretation:
• $\bar{x}$ (the x with the bar over it) means the mean
• The equal sign tells us how to get the mean
• The sigma sign tells us to add something up
• The numerator (top of the fraction)
   • x with the subscript i tells us to sum up all the values of x
• The denominator (bottom)
   • n tells us to divide that sum by the number of cases
EG: mean of this set of values: 3, 5, 7, 9

$$\bar{x} = \frac{3 + 5 + 7 + 9}{4} = \frac{24}{4} = 6$$

NOTES:
- add up all the numbers and divide by the number of cases.

Comparing Measures of Central Tendency
Nominal category:
- can only use the mode
Ordinal category:
- can use the mode and the median
Quantitative:
- can use any one of the three, but the mode is not very useful.
- We want the central value, not the most frequent value.
- That leaves us with the mean and the median
• Mode:
   o What happens most often
• Median
   o When you need the typical case, the one in the middle
• Advantage of the median:
   • Not affected by outliers (Linneman, p. 77)
   EG: 21, 21, 24, 25, 27, 28, 29
   • median is “25”
   EG: 21, 21, 24, 25, 27, 28, 92
   • median is still “25”
• Disadvantage of the median:
   • Doesn’t take each score into account equally
   • The scores at the ends of the distribution don’t count as much
• Mean
• Advantages of the mean:
   1. It considers all the values
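
The worked examples in these notes can also be checked outside SPSS. What follows is a minimal sketch in Python, which is an assumption here since the course itself uses SPSS. It relies only on the standard `statistics` module; the helper name `central_tendency` and the variable names are hypothetical, chosen for illustration. It recomputes the mode, median and mean for the age lists used above and shows what one outlier does to the median and the mean.

```python
from statistics import mean, median, multimode

# Age lists copied from the examples in the notes
seminar_ages = [21, 21, 24, 25, 27, 27, 29]       # bimodal example
ages_no_outlier = [21, 21, 24, 25, 27, 28, 29]    # median example
ages_with_outlier = [21, 21, 24, 25, 27, 28, 92]  # same list, last value is an outlier


def central_tendency(values):
    """Return the mode(s), median and mean of a list of numbers."""
    return {
        "modes": multimode(values),  # a list, so a bimodal distribution gives two values
        "median": median(values),    # midpoint of the two middle cases when n is even
        "mean": mean(values),        # sum of all values divided by the number of cases
    }


seminar = central_tendency(seminar_ages)
without = central_tendency(ages_no_outlier)
with_out = central_tendency(ages_with_outlier)
even_case_median = median([21, 21, 24, 25, 27, 28])  # even number of cases
small_mean = mean([3, 5, 7, 9])                      # the 3, 5, 7, 9 example

print("seminar:", seminar)
print("no outlier:", without)
print("with outlier:", with_out)
print("median, even n:", even_case_median)
print("mean of 3, 5, 7, 9:", small_mean)
```

In this sketch the median stays at 25 when 29 is replaced by 92, while the mean moves from 25 to 34. That is the sense in which the median is not affected by outliers and the mean takes every value into account.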