Chapter 13: Understanding research results: statistical inference
How to use inferential statistics to evaluate sample data.
Null hypothesis vs. research hypothesis.
Probability in statistical inference, meaning of statistical significance.
T-test and the difference between one-tailed vs. two-tailed tests.
F-test, systematic variance and error variance.
Confidence interval and your data.
Type I and II errors.
Factors influencing probability of a type II error.
Reasons you may obtain non-significant results.
Power of statistical test.
Criteria for selecting appropriate statistical test.
Ways of describing results of study: descriptive statistics, graphing techniques. Can also use inferential
statistics to draw general conclusions.
Inferential statistics let researchers:
a) assess how confident they are that the results reflect what is true in the larger population.
b) assess likelihood that the findings will occur if study was repeated over and over.
Samples and populations
Research findings are often based on sample data, but we want to make statements about
populations. Want to see if the difference in sample means reflects a true difference in population
means. Make conclusions based on sample data.
Ex: in a survey, 57% prefer A; 43% prefer B. Results are accurate within 3 percentage points at a 95%
confidence level. This means: we are very (95%) confident that, if the entire population rather than a
sample were surveyed, 54-60% would prefer A and 40-46% would prefer B. *But there is still a 5% chance of being wrong.
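As a rough sketch of where a "within 3 percentage points" figure can come from, the usual normal-approximation formula for a proportion can be applied to the 57% result. The sample size of 1000 is an assumption made for illustration only; the notes do not say how many people were surveyed.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p with n respondents.

    z = 1.96 corresponds to a 95% confidence level (normal approximation).
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample size: not given in the notes.
n = 1000
p_a = 0.57  # proportion preferring A, from the survey example

moe = margin_of_error(p_a, n)
low, high = p_a - moe, p_a + moe

print(f"margin of error: {moe:.3f}")
print(f"95% interval for A: {low:.2f} to {high:.2f}")
```

With a sample of about this size the margin comes out close to 3 percentage points, which is why the interval for A runs from roughly 54% to 60%.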
In experimental designs, we know it is very important to ensure that the groups are equivalent.
However, samples will always have some difference in sample means: random/chance error.
So, the difference in sample means reflects the true difference in population means + random error.
Inferential statistics give the probability that the difference between means reflects random error
rather than a real difference.
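A small simulation can make the idea of random error concrete. This is a minimal sketch, assuming a made-up population (mean 50, SD 10) and 10 participants per group: both groups are drawn from the same population, so any difference between the sample means is random error only.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

def sample_mean_difference(n_per_group=10, mu=50, sigma=10):
    """Draw two samples from the SAME population; return the difference in means."""
    g1 = [random.gauss(mu, sigma) for _ in range(n_per_group)]
    g2 = [random.gauss(mu, sigma) for _ in range(n_per_group)]
    return statistics.mean(g1) - statistics.mean(g2)

# Repeat the "study" many times with no true difference in the population.
diffs = [sample_mean_difference() for _ in range(2000)]

mean_diff = statistics.mean(diffs)                           # close to 0 on average
share_large = sum(abs(d) > 5 for d in diffs) / len(diffs)    # share of "big" chance differences

print(f"average difference: {mean_diff:.2f}")
print(f"share of studies with |difference| > 5: {share_large:.2f}")
```

Individual studies rarely show a difference of exactly 0 even though the population means are equal, which is why an observed difference has to be judged against what random error alone can produce.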
Null and research hypotheses
Begin statistical inference with statement of null and research/alternative hypothesis.
-Null (H0): population means are equal; observed difference is due to random error. X had no effect.
-Research (H1): population means are different. X had an effect.
-If we can determine that H0 is wrong, then we accept H1 as correct. H0 is a precise statement
(population means exactly equal); such precision is not found in H1. So we can reject H0 when we find a very
low probability that the results are due to random error.
Statistical significance: a significant result has a low probability of occurring if population means are equal,
i.e. a low probability that the difference between obtained sample means was due to random error. A matter of probability.
Probability and sampling distributions
Probability: likelihood of occurrence of an event or outcome.
-Here: the probability that an event (a difference between means in the sample) will occur if there's no difference in
the population, i.e. the probability of getting this result if only random error is operating. If this probability is
low, we can reject the possibility that only random or chance error is responsible for the difference in means.
Probability: The case of ESP
ESP: extrasensory perception ability. Ex: a friend says he can guess which card (1 of 5) you're thinking of.
H0: only random error is operating. Chance of a right answer is 1/5 = 20%.
H1: number of right answers shows more than random or chance guessing. *Accepting H1 could
mean your friend has ESP ability, or that the cards were marked, or that you cued him.
Hence, getting 2 right in 10 trials, or even 3, is still consistent with H0.
A significant result is one that is very unlikely if the null hypothesis is correct. The decision rule is
determined prior to collecting the data: the “alpha level.”
Alpha level: probability required for significance. The significance level used to decide if we can reject H0.
Usually 0.05: 5/100 or less chance that the results were due to random error. Very unlikely, hence H0 can be rejected.
Binomial distribution: probability distribution of the number of right answers in 10 trials.
Sampling distribution is based on the assumption that H0 is true.
Getting 2 right is the outcome most consistent with H0, hence it has the highest probability. If a result is highly
unlikely under H0, you can reject H0.
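The sampling distribution for the ESP example can be worked out directly. This is a sketch under the notes' setup (10 trials, 1/5 chance per trial); treating the test as "at least k right" with alpha = 0.05 is an assumption about how the decision rule is applied.

```python
from math import comb

def binomial_pmf(k, n=10, p=0.2):
    """Probability of exactly k right answers in n trials if only chance operates."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_at_least(k, n=10, p=0.2):
    """Probability of k or more right answers under H0."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

# Full sampling distribution under H0.
pmf = {k: binomial_pmf(k) for k in range(11)}
most_likely = max(pmf, key=pmf.get)

# Smallest number of right answers whose tail probability is at or below alpha.
alpha = 0.05
smallest_significant = min(k for k in range(11) if prob_at_least(k) <= alpha)

for k, prob in pmf.items():
    print(f"{k:2d} right: {prob:.4f}")
print("most likely outcome under H0:", most_likely)
print("right answers needed to reject H0:", smallest_significant)
```

Under these assumptions 2 right is the single most likely outcome, and the friend would need at least 5 right out of 10 before chance guessing becomes an unlikely explanation.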
Sample size: total number of observations.
Larger sample, more confidence that the outcome is different from the H0 expectation. More accurate estimation of
true population value. Ex: 30/100 right has a lower probability under H0 than 3/10 right.
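The 3/10 versus 30/100 comparison can be checked with the same binomial reasoning. A minimal sketch, assuming the ESP setup (chance of a right answer = 0.2) and reading each result as "at least this many right":

```python
from math import comb

def prob_at_least(k, n, p=0.2):
    """Probability of k or more right answers in n trials if only chance operates."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p_small = prob_at_least(3, 10)     # 30% right with 10 trials
p_large = prob_at_least(30, 100)   # 30% right with 100 trials

print(f"P(at least 3 of 10 right)    = {p_small:.4f}")
print(f"P(at least 30 of 100 right)  = {p_large:.4f}")
```

The proportion right is the same in both cases, but with 100 trials the result is far less likely under H0, so the larger sample gives much stronger grounds for rejecting it.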
T and F test: first need to specify H0,H1, and alpha level (usually 0.05).
T-test: examines if 2 groups are significantly different. Ex: effect of a model on aggression.
F-test: examines if a difference exists among 3 or more groups, or evaluates the results of a factorial design.
T-Test
Need to calculate t from obtained data, and evaluate t based on H0.
If obtained t has low probability of occurrence (0.05 or less), can reject H0.
T is a ratio of 2 aspects of the data: difference between group means and within-group variability.
Group difference: difference between obtained means. Under H0, should be 0.
T increases as the difference between obtained sample means increases.
Within-group variability: amount of variability of scores about the mean. Indicates the amount of random
error in your sample. Recall, s = standard deviation and s^2 = variance, which indicate how much
scores deviate from the group mean.
Formula for t-test for 2 groups with the same number of participants in each:
t = (M1 - M2) / sqrt(s1^2/n1 + s2^2/n2)     (M = group mean, s^2 = group variance, n = group size)
Numerator: difference between means of the 2 groups; denominator: first divide each group's variance
by the number of subjects in that group, then add the two results. Taking the square root converts the number from
squared units (variance) to SD units. Finally, get t as mean difference / SD.
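The verbal recipe above (mean difference divided by the square root of the summed variance/n terms) can be sketched in a few lines. The scores below are invented for illustration and are not the data behind the chapter's example.

```python
import math
import statistics

def t_statistic(group1, group2):
    """t for two independent groups: mean difference over its standard error."""
    m1, m2 = statistics.mean(group1), statistics.mean(group2)
    # statistics.variance is the sample variance s^2 (divides by n - 1).
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    standard_error = math.sqrt(v1 / len(group1) + v2 / len(group2))
    return (m1 - m2) / standard_error

# Hypothetical aggression scores, 5 participants per group.
model = [3, 4, 5, 6, 7]
no_model = [1, 2, 3, 4, 5]

t = t_statistic(model, no_model)
print(f"t = {t:.3f}")
```

A larger gap between the group means raises t, while more spread inside the groups (larger variances) lowers it.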
Ex: total sample size 20, each group has 10 participants. Calculated t:
Can refer to a table of critical values of t: using a significance level of 0.05 (alpha), the critical value from the
sampling distribution of t is 2.101. T values greater than or equal to 2.101 have a 0.05 or less probability of
occurring under H0. Our obtained t is larger than the critical value, hence we can reject H0 and conclude the
difference in means reflects a true difference in the population.
Degrees of freedom (df): needed to select critical value for the test.
When comparing two means, df = n1 + n2 - 2, or total number of participants – number of groups.
In our example: df = 10 + 10 - 2 = 18. It is the number of scores free to vary once the means are known.
Ex: mean is 6, and there are 5 scores in the group, df = 4. Meaning, once you have any 4 scores, the 5th is
known because the mean must remain 6 (5 scores – 1 group = 4).
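Both ideas in this passage, the df rule and "scores free to vary," are simple arithmetic. A minimal sketch follows; the four known scores are made up for illustration.

```python
def degrees_of_freedom(*group_sizes):
    """Total number of participants minus number of groups."""
    return sum(group_sizes) - len(group_sizes)

def last_score(known_scores, mean):
    """The one remaining score is fixed once the mean and all other scores are known."""
    n = len(known_scores) + 1
    return mean * n - sum(known_scores)

# Two groups of 10 participants: 10 + 10 - 2.
df = degrees_of_freedom(10, 10)

# One group of 5 scores with mean 6: any 4 scores determine the 5th.
fifth = last_score([4, 7, 5, 8], mean=6)  # hypothetical scores

print("df for two groups of 10:", df)
print("df for one group of 5:", degrees_of_freedom(5))
print("fifth score must be:", fifth)
```

Because the fifth score has no freedom left, only 4 of the 5 scores count toward df.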
One-tailed vs Two-tailed tests: need to choose the critical t for the situation, either:
1) specified a direction of difference between the groups (group 1 greater than 2). *one tailed.
2) did not specify a predicted direction of difference (group 1 will differ from group 2). *2 tailed.
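The two situations lead to different decision rules. The sketch below uses the standard t-table critical values for df = 18 at alpha = .05 (2.101 two-tailed, 1.734 one-tailed); the obtained t of 1.9 is hypothetical, and the one-tailed branch assumes the predicted direction is group 1 greater than group 2.

```python
# Critical values of t for df = 18, alpha = .05 (from a standard t table).
CRITICAL_T_DF18 = {"two-tailed": 2.101, "one-tailed": 1.734}

def is_significant(t_obtained, tails="two-tailed"):
    """Compare an obtained t with the critical value for the chosen kind of test."""
    critical = CRITICAL_T_DF18[tails]
    if tails == "two-tailed":
        # No predicted direction: extreme values in either tail count.
        return abs(t_obtained) >= critical
    # Predicted direction (group 1 > group 2): only the positive tail counts.
    return t_obtained >= critical

t_example = 1.9  # hypothetical obtained t
print("two-tailed:", is_significant(t_example, "two-tailed"))
print("one-tailed:", is_significant(t_example, "one-tailed"))
```

The same obtained t can be significant with a one-tailed test and non-significant with a two-tailed test, which is why the direction has to be specified before looking at the data.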
Ex: 2-tailed: expect t = 0.00 most frequently; values greater or less than 0 are less likely. Pick critical value of t = 2.101
at the .05 sig level because the direction of difference was not predicted. This critical value marks the points beyond which
2.5% of + values and 2.5% of – values of t lie (totaling 0.05 from the two tails). If a direction of
difference had been predicted, the critical value would have been 1.734: 5% of values lie in only 1 tail of the distribution.
F-Test: the analysis of variance (more general and an extension of t-test).
Can be used for the simplest design with one independent variable with two groups (then F = t^2).
Also used for complex designs: more than 2 levels of an independent variable, or factorial
designs with 2+ independent variables.
F test is a ratio of 2 types of variance: systematic and error variance.
Systematic variance: deviation of group means from grand mean (mean of all individuals)
-Small when the diff between group means is small, increases with group mean diff.
Error variance: deviation of individual scores in each group from respective group means.
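The ratio of systematic to error variance can be sketched with the standard one-way analysis of variance arithmetic. Dividing each sum of squared deviations by its degrees of freedom is the usual textbook step and is assumed here; the scores are invented for illustration.

```python
import statistics

def f_ratio(*groups):
    """F = systematic variance / error variance for a one-way design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    k, n_total = len(groups), len(all_scores)

    # Systematic: deviation of each group mean from the grand mean.
    ss_systematic = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Error: deviation of each score from its own group mean.
    ss_error = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    ms_systematic = ss_systematic / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_systematic / ms_error

# Two hypothetical groups: F equals t squared for the same data.
f_two = f_ratio([3, 4, 5, 6, 7], [1, 2, 3, 4, 5])
# Three hypothetical groups: a case the two-group t-test cannot handle.
f_three = f_ratio([1, 2, 3], [2, 3, 4], [6, 7, 8])

print(f"F (two groups)   = {f_two:.2f}")
print(f"F (three groups) = {f_three:.2f}")
```

F grows when the group means move further apart and shrinks when scores vary more around their own group means.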
**Other terms for the two above may be “between group” and “within-group” vari