SOC 222 -- MEASURING the SOCIAL WORLD
Session #7 -- INF STAT: TABLES October 24, 2013
1. Know the difference between confidence intervals and statistical significance
2. Know the difference between a Type I and a Type II error
3. Know the meaning of “expected frequencies” and their role in chi-square
4. Know the commonly used levels of statistical significance
5. Know how to run a chi-square test on SPSS
6. Know the meaning of “degrees of freedom”
Terms to Know
type I error
type II error
observed frequencies (counts)
degrees of freedom
• 2 category variables: either nominal (such as religion) or ordinal (logically ordered)
• Percentage Difference: measuring effect sizes.
• Problem: after drawing a sample, we still do not know what is true in the population.
REFRESHER: THE TWO INFERENTIAL STATISTICS PROCEDURES
Procedure #1: Confidence Intervals
• Question: How much confidence do we have in our estimate of the effect size
in the population?
→ Estimate the population correlation. For example: 0.35.
• Answer: a confidence interval around a population estimate. But so far we have only
looked at one variable.
• Typically 95%
• We’re 95% sure that the population estimate falls within this interval
• If we drew all possible samples of size N, 95% of them would
give a population estimate within this interval
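NOTE: The "all possible samples" idea can be checked with a small simulation. This is only a sketch, not something from the readings. It assumes a made-up population in which 35% of people have some trait, samples of size 400, and the usual normal-approximation interval for a proportion. The numbers and the function names (proportion_ci, coverage) are illustrative assumptions.

```python
import math
import random


def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(true_p=0.35, n=400, n_samples=2000, seed=222):
    """Share of simulated samples whose interval contains the true population value.

    true_p, n, n_samples and seed are made-up values chosen for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw one sample of size n and count how many cases have the trait
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = proportion_ci(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits / n_samples


coverage_rate = coverage()
print(f"Intervals containing the true value: {coverage_rate:.1%}")
```

The printed share should land close to 95%, which is what the "95% of all possible samples" statement is describing.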
Procedure #2: Statistical Significance
• Question: Is there a relationship in the population?
• The question here is different from the first procedure: we are asking whether the relationship exists at all.
• Does what we found in the sample exist in the population?
• Answer to this question: statistical significance of the relationship
WHAT IS STATISTICAL SIGNIFICANCE?
Linneman: chi-squared used as statistical test.
See Kranzler: 108-109
The Logic of Hypothesis Testing
1. Two groups are different
2. Two variables are related
→ Either of those two theories can lead us to a research hypothesis.
With statistics, you can’t show something is true
• It is easier to disprove a statement; show that something is false.
• We create an opposite hypothesis since we cannot prove anything. We can only disprove.
• And we try to disprove that!
• NULL HYPOTHESIS
NOTE: If we keep doing research attempting to disprove the null hypothesis, it gives us
a little more confidence that our research hypothesis is right. KNOW ABOUT IT BUT
SET IT ASIDE; NOT ON THE TEST!
Linneman, p. 140
Kranzler, p. 105-106
• Null hypotheses state that nothing is happening
• In particular, any difference or relationship you find in your sample is
purely by chance. That’s just sampling error.
• If we can disprove the null hypothesis, then we have support for our research hypothesis.
Type I Error
• We find in our sample that X → Y
• Could the sample relationship occur just because it’s a non-representative sample?
• Ex: female students are more likely to have jobs than male students. Could it be
just a sampling error?
We conclude there’s a relationship in the population when there really isn’t.
TYPE 1 ERROR
→ Sample is misleading us.
→ We are saying something is happening in the population when in fact it isn’t.
→ It is almost always the MOST IMPORTANT error to avoid!
HOW DO WE DO IT?
• We estimate the probability of making a type I error
If this probability is LOW,
• We say the relationship has statistical significance because the probability of a type I
error is low. Hence, we conclude that it exists in the population.
In most research, we want to minimize type I error.
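NOTE: One way to see what "the probability of a type I error" means is to simulate a population where the null hypothesis is true and count how often a sample fools us. The sketch below is an illustration, not the course's procedure. It assumes that women and men are equally likely to have jobs (50% each, a made-up value), draws 100 of each per sample, and uses the standard table value 3.841 as the chi-square cutoff for the .05 level in a 2x2 table. The function names are hypothetical.

```python
import random

# Standard chi-square table value for the .05 level with 1 degree of freedom (2x2 table)
CRITICAL_05_DF1 = 3.841


def chi_square_2x2(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] (shortcut formula, no correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty row or column: treat as no evidence of a relationship
    return n * (a * d - b * c) ** 2 / denom


def type_one_error_rate(p_job=0.5, n_per_group=100, n_samples=2000, seed=7):
    """Share of samples that look significant even though the null hypothesis is true.

    p_job, n_per_group, n_samples and seed are made-up values for illustration.
    """
    rng = random.Random(seed)
    false_alarms = 0
    for _ in range(n_samples):
        # Same job probability for both groups, so there is no real relationship
        women = sum(rng.random() < p_job for _ in range(n_per_group))
        men = sum(rng.random() < p_job for _ in range(n_per_group))
        chi2 = chi_square_2x2(women, n_per_group - women, men, n_per_group - men)
        if chi2 > CRITICAL_05_DF1:
            false_alarms += 1
    return false_alarms / n_samples


rate = type_one_error_rate()
print(f"Samples that wrongly look significant: {rate:.1%}")
```

The share of misleading samples should come out near 5%, which is the type I error risk we accept when we use the .05 level.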
Type II Error
• Concluding no relationship in the population when there really is one. It is the
opposite of type I error.
• Normally this is less important.
• The cost of missing a relationship is less than the cost of claiming a relationship
when really there isn’t one.
INTERESTING FACT: the smaller the chance of a type I error, the larger the chance of
a type II error
→ You run the risk of missing something important if you consider only type I error and
ignore type II error. Find a balance between the two.
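NOTE: The trade-off can be illustrated with a simulation in which a relationship really exists. The sketch below assumes a made-up population where 60% of women and 45% of men have jobs, draws 100 of each per sample, and checks how often the relationship is missed at three significance levels. The cutoffs are the standard chi-square table values for a 2x2 table. Everything else (percentages, sample sizes, function names) is an illustrative assumption.

```python
import random

# Standard chi-square table values for 1 degree of freedom (2x2 table)
CRITICAL_VALUES = {0.05: 3.841, 0.01: 6.635, 0.001: 10.828}


def chi_square_2x2(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] (shortcut formula, no correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # an empty row or column: treat as no evidence of a relationship
    return n * (a * d - b * c) ** 2 / denom


def miss_rates(p_women=0.60, p_men=0.45, n_per_group=100, n_samples=2000, seed=11):
    """Type II error rate at each significance level when a real relationship exists.

    All default values are made up for illustration.
    """
    rng = random.Random(seed)
    misses = {level: 0 for level in CRITICAL_VALUES}
    for _ in range(n_samples):
        women = sum(rng.random() < p_women for _ in range(n_per_group))
        men = sum(rng.random() < p_men for _ in range(n_per_group))
        chi2 = chi_square_2x2(women, n_per_group - women, men, n_per_group - men)
        for level, critical in CRITICAL_VALUES.items():
            if chi2 <= critical:
                misses[level] += 1  # not significant at this level: relationship missed
    return {level: count / n_samples for level, count in misses.items()}


rates = miss_rates()
for level, miss in rates.items():
    print(f"Level {level}: relationship missed in {miss:.1%} of samples")
```

The stricter the level (smaller chance of a type I error), the higher the cutoff, so more samples fall short of it and the real relationship is missed more often.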
STATISTICAL SIGNIFICANCE FROM CHI-SQUARE: THE CHI-SQUARED TEST!
What is Chi-Square?
1. Chi-square is a statistic
• Calculated from your sample
• Remember: anything you calculate from your sample is a STATISTIC. (Ex: mean,
hours you study, etc)
2. Chi-square is calculated from crosstabs
• The chi-square statistic compares the expected frequencies with the
observed frequencies.
• Comes up with a chi-squared number that tells us how similar they are.
• The less similar they are, the bigger the value of chi-squared.
3. We know the shape of the sampling distribution for chi-square
• Not symmetrical; CHI-SQUARED CURVE!
• Changes with the size of the crosstab
• Number of rows and columns gives you the size of the crosstab
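NOTE: The three points above can be put together in a short calculation. This is a sketch in Python rather than SPSS. It uses the standard formulas (expected frequency = row total times column total divided by N, chi-square = sum of (observed - expected) squared divided by expected, degrees of freedom = (rows - 1) times (columns - 1)). The crosstab counts are made up for illustration and the function names are hypothetical.

```python
def expected_frequencies(observed):
    """Expected count for each cell if the two variables were unrelated."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over every cell of the crosstab."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(rows - 1) * (columns - 1): this is how the crosstab size enters the test."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Made-up crosstab: rows = female, male; columns = has a job, has no job
crosstab = [[30, 20],
            [20, 30]]

expected = expected_frequencies(crosstab)
chi2 = chi_square(crosstab)
df = degrees_of_freedom(crosstab)

print("Expected frequencies:", expected)
print("Chi-square:", chi2)
print("Degrees of freedom:", df)
```

With these made-up counts, every expected frequency is 25, chi-square is 4.0, and there is 1 degree of freedom. The standard table cutoff for the .05 level with 1 degree of freedom is 3.841, so this hypothetical relationship would count as statistically significant at that level.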