Chapter 8 – Producing Data: Sampling
Population Vs Sample
• Population – in a statistical study, the entire group of individuals about which we want information
• Sample – the part of the population from which we actually collect information. We use the sample to draw
conclusions about the entire population
• Sampling design – describes how to choose a sample from the population
• Sample Survey
◦ first step: say what population we want to describe
◦ second step: say what we want to measure. Give exact definitions of our variables
◦ last step: sampling design
How to Sample Badly
• The easiest (but not the best) design chooses individuals close at hand
◦ ex) going to a mall and asking passers-by whether they are employed
◦ Convenience Sample – a sample selected by taking the members of the population that are easiest to
reach
▪ produces unrepresentative data
• Bias – a study design is biased if it systematically favours certain outcomes
◦ ex) interviews at a mall tend to overrepresent middle-class people and underrepresent the
poor; inexperienced interviewers are also likely to choose people who are well dressed and look friendly
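• Voluntary response sample – consists of people who choose themselves by responding to a broad appeal.
These samples are biased because people with strong opinions are the most likely to respond
◦ ex) a poll in which people decide for themselves whether to respond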
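
To make the bias concrete, here is a small simulation sketch in Python. Everything in it is hypothetical: the population of 10,000, the 50% true support rate, and the assumed response rates (30% of opponents respond to the appeal, 5% of supporters) are invented for illustration and are not data from any real poll.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: exactly half support a proposal (True), half oppose (False).
population = [True] * 5000 + [False] * 5000
true_rate = sum(population) / len(population)

def responds(supports):
    """Assumed behaviour: opponents hold stronger opinions, so they respond more often."""
    return rng.random() < (0.05 if supports else 0.30)

# Voluntary response sample: people select themselves by answering the appeal.
voluntary = [person for person in population if responds(person)]
voluntary_rate = sum(voluntary) / len(voluntary)

# For comparison: 500 individuals chosen by chance instead.
chance_sample = rng.sample(population, 500)
chance_rate = sum(chance_sample) / len(chance_sample)

print(f"population: {true_rate:.2f}")
print(f"voluntary response: {voluntary_rate:.2f}")
print(f"chosen by chance: {chance_rate:.2f}")
```

Under these assumptions the voluntary response estimate falls well below the true 50%, because the people who choose to respond are mostly opponents, while the sample chosen by chance stays close to it.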
Simple Random Samples (SRS)
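• In a convenience sample the interviewer makes the choice; in a voluntary response sample people choose
themselves. Either kind of personal choice produces bias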
• A sample chosen by chance rules out both favouritism by the sampler and self-selection by
respondents
◦ gives every individual an equal chance to be chosen
• Simple Random Sample – an SRS of size n consists of n individuals from the population chosen in such a
way that every set of n individuals has an equal chance to be the sample actually selected
◦ ex) choosing a sample of size 4 from a population of size 28
◦ like drawing slips of paper from a hat
◦ drawing slips is slow and inconvenient for big populations
▪ instead, use software such as the Simple Random Sample applet, or randomize by using a table of random
digits
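
As a rough software stand-in for drawing slips from a hat, the example of 4 individuals from a population of 28 can be sketched in Python. The helper name simple_random_sample and the labels 1 to 28 are assumptions made for illustration; random.sample draws without replacement, so every set of 4 individuals is equally likely to be chosen.

```python
import math
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals; every set of n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population of 28 individuals, labelled 1..28.
population = list(range(1, 29))
sample = simple_random_sample(population, 4, seed=1)
print(sample)

# Number of different samples of size 4 that could have been selected.
print(math.comb(28, 4))  # 20475
```

Each of the 20475 possible sets of 4 has the same chance of being the sample, which is what the definition of an SRS requires.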
• Table of Random Digits (Table B) – a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these
two properties
◦ each entry in the table is equally likely to be any of the 10 digits 0 – 9
◦ entries are independent of each other. Knowledge of one part of the table gives no information
about any other part
• Two steps in using the table to choose an SRS:
◦ Label – give each member of the population a numerical label of the same length
◦ Table – to choose an SRS, read from Table B successive groups of digits of the length you
used as labels. Your sample contains the individuals whose labels you find in the table
▪ read two digits along, and ignore groups that a
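
The Label and Table steps can be sketched in Python as follows. This is a minimal illustration, not the textbook's own procedure: the function name srs_from_digits is invented, the row of digits is a hypothetical stand-in for a line of Table B, and the code assumes that groups which are not valid labels, or which repeat a label already chosen, are skipped.

```python
def srs_from_digits(digits, labels, n):
    """Read successive groups of digits; keep the first n distinct valid labels."""
    width = len(labels[0])   # every label has the same length
    valid = set(labels)
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        group = digits[i:i + width]
        # Assumption: skip groups that are not labels or were already chosen.
        if group in valid and group not in chosen:
            chosen.append(group)
            if len(chosen) == n:
                return chosen
    raise ValueError("ran out of digits before the sample was complete")

# Label: 28 individuals get two-digit labels 01..28.
labels = [f"{k:02d}" for k in range(1, 29)]

# Table: a hypothetical row of random digits, read two at a time.
row = "69051 64817 87174 09517 84534 06489 87201 97245".replace(" ", "")

print(srs_from_digits(row, labels, 4))  # ['05', '16', '17', '20']
```

Reading the row in pairs gives 69, 05, 16, 48, 17, 87, 17, 40, and so on. Only 05, 16, 17, and 20 are labels in use among the first groups, and the repeated 17s are counted once.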

