STAT1008 Lecture Notes - Lecture 3: Simple Random Sample, Statistical Inference
Sampling a population 1.2
● Sample Vs Population:
○ Population includes all individuals or objects of interest
○ Sample is all the cases that we have collected data on (subset of
population)
○ Statistical Inference is the process of using data from a sample to gain
information about the population
○ Population -> sampling -> sample -> statistical inference -> population
○ Sampling Bias occurs when the method of selecting a sample causes the
sample to differ from the population in some relevant way, e.g. the
Dewey vs Truman election, where a telephone poll predicted that Dewey
would sweep Truman. Truman won, because people reachable by telephone
were not representative of all voters
○ If sampling bias exists, we cannot trust generalisations from the sample to
the population
● Can you avoid sampling bias?
○ Random samples: imagine putting the names of all the units of the
population into a hat, and drawing out names at random to be in the
sample
○ More often, we use technology to draw the random sample
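The "names in a hat" idea can be sketched with software. The following is a minimal illustration only, assuming a made-up population of 1,000 labelled units and Python's standard `random` module; the unit names, sample size, and seed are hypothetical and not from the lecture.

```python
import random

# Hypothetical population: 1,000 labelled units (stand-ins for names in the hat).
population = [f"unit_{i}" for i in range(1000)]

# Fixed seed only so the draw can be repeated; it is an arbitrary choice.
rng = random.Random(42)

# Draw 10 units without replacement, so no unit can be picked twice.
sample = rng.sample(population, k=10)
print(sample)
```

Because the draw is left to chance, no unit is favoured over any other, which is what protects the sample from sampling bias.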
● Random Sampling:
○ Before the 2008 election, the Gallup Poll took a random sample of 2,847
Americans. 52% of those sampled supported Obama.
○ In the actual election, 53% voted for Obama
● Random vs Non-Random Sampling:
○ Random samples have averages that are centered around the correct
number
○ Non-random samples may suffer from sampling bias, and averages may
not be centered around the correct number
○ Only random samples can truly be trusted when making generalisations to
the population!
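The contrast between random and non-random samples can be checked with a small simulation. This is a sketch under stated assumptions, not a result from the lecture: the population values are randomly generated, and the "non-random" method is a hypothetical one that can only reach the upper half of the population.

```python
import random
import statistics

rng = random.Random(0)  # arbitrary seed so the simulation is repeatable

# Hypothetical population of 10,000 numeric values (made up for illustration).
population = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Hypothetical biased method: only the upper half of the population is reachable.
reachable = sorted(population)[len(population) // 2:]

def sample_mean(units, n):
    """Mean of n units drawn at random, without replacement, from `units`."""
    return statistics.mean(rng.sample(units, n))

# Repeat each sampling method 500 times with samples of size 100.
random_means = [sample_mean(population, 100) for _ in range(500)]
biased_means = [sample_mean(reachable, 100) for _ in range(500)]

print("population mean:         ", round(true_mean, 2))
print("average of random means: ", round(statistics.mean(random_means), 2))
print("average of biased means: ", round(statistics.mean(biased_means), 2))
```

Under these assumptions the random-sample averages should sit close to the population mean, while the biased-sample averages settle around a different number no matter how many samples are taken.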
● Simple Random sample: