Sampling (Part 2): Sample Size & Sampling Error
All of Today’s Discussion
Involves statistics - (W&M ch 15, p. 387-389)
Assumes we have probability (random) samples, generated through simple random sampling
When dealing with non-random samples, it is difficult to talk about sample size or sampling error
Part 1: Sampling Error (and a “thought experiment”)
A. Estimating mean midterm grade from random sample of 10 midterm grades.
120 grades in a hat; draw 10 with replacement; add and divide by ten for mean; suppose
sample mean = 75%
Is this an accurate estimate of population (class) mean?
No! (class mean was 70%).
Lesson: Sample means contain “random” error.
Another sample might be 5 percentage points lower, etc.
B. How much random error is associated with our sample mean?
Generate more random samples of 10 (with replacement: select 10, put back; select 10…)
S1 = 75%; S2 = 65%... S1000 = 73%
If we plot these means, we’ll get a “sampling distribution” (distribution of sample means that are
randomly selected from the same population).
Things to note about sampling distribution:
The means form a “normal” distribution (bell-shaped curve)
With a large number of samples, the mean of the sampling distribution is the same as the mean of
the population
Standard deviation of sampling distribution (how “spread out” means are) is a measure of how
much error is in the sample means (more spread, more error)
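The thought experiment above can be sketched as a small simulation. This is only an illustration: the population is 120 made-up grades generated with a fixed seed (not the actual class data), and names such as `population` and `draw_sample_mean` are hypothetical.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: 120 made-up grades (NOT the real class data).
population = [min(100.0, max(0.0, random.gauss(70, 10))) for _ in range(120)]
population_mean = statistics.mean(population)


def draw_sample_mean(grades, n):
    """Draw n grades with replacement and return their mean."""
    return statistics.mean(random.choices(grades, k=n))


# 1000 samples of 10, one mean per sample: this list is the sampling distribution.
sample_means = [draw_sample_mean(population, 10) for _ in range(1000)]

mean_of_means = statistics.mean(sample_means)
spread_of_means = statistics.pstdev(sample_means)

print(f"population mean:           {population_mean:.1f}")
print(f"mean of 1000 sample means: {mean_of_means:.1f}")
print(f"std. dev. of sample means: {spread_of_means:.1f}")
```

Under these assumptions, the mean of the sample means should land very close to the population mean, while individual sample means scatter around it by a few percentage points.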
Q: What would happen if we chose 1000 samples of 20 (instead of 1000 samples of 10)?
We’d get another sampling distribution
Mean would be the same, but
Because the size of each sample is larger, the distribution will have a smaller standard deviation
Each sample mean will tend to be closer to the population mean (indicating each mean has less
error)
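A rough way to check the answer to this question is to build both sampling distributions side by side. The sketch below assumes a hypothetical population of 120 generated grades; `sampling_distribution` is an illustrative helper, not something from the course materials.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

# Hypothetical population of 120 grades (illustrative only).
population = [min(100.0, max(0.0, random.gauss(70, 10))) for _ in range(120)]


def sampling_distribution(grades, n, num_samples=1000):
    """Return the means of num_samples samples of size n, drawn with replacement."""
    return [statistics.mean(random.choices(grades, k=n)) for _ in range(num_samples)]


means_10 = sampling_distribution(population, 10)
means_20 = sampling_distribution(population, 20)

sd_10 = statistics.pstdev(means_10)
sd_20 = statistics.pstdev(means_20)

print(f"n=10: mean {statistics.mean(means_10):.1f}, std. dev. {sd_10:.2f}")
print(f"n=20: mean {statistics.mean(means_20):.1f}, std. dev. {sd_20:.2f}")
```

The two means should come out nearly equal, while the standard deviation for samples of 20 should be noticeably smaller than for samples of 10.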
Can use a formula to estimate how much sampling error is associated with samples of any size:
Standard error of the mean (SEM) = standard deviation of population, divided by the square
root of the sample size (see p. 387)
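As a quick worked example of the formula, the sketch below computes the SEM for a few sample sizes. The population standard deviation of 10 percentage points is an assumed value chosen for illustration, and `standard_error_of_mean` is a hypothetical helper name.

```python
import math


def standard_error_of_mean(population_sd, sample_size):
    """SEM = population standard deviation / sqrt(sample size)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return population_sd / math.sqrt(sample_size)


# Assumed (hypothetical) population standard deviation: 10 percentage points.
assumed_sd = 10.0
for n in (10, 20, 40):
    print(f"n={n}: SEM = {standard_error_of_mean(assumed_sd, n):.2f}")
# prints:
# n=10: SEM = 3.16
# n=20: SEM = 2.24
# n=40: SEM = 1.58
```

Because the sample size sits under a square root, the sample must be four times larger to cut the SEM in half (compare n=10 with n=40).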
Next 2 slides are on handout (#8 & 9)
Part 2: Back to Reality
We don’t choose 1000’s of samples to study!
Instead, we choose one sample with one mean (and some “random error” – sample mean won’t
necessarily be the same as the population mean)
Can estimate t