# MSCI 1000 Chapter Notes - Chapter 3: Sample Size Determination, Response Bias, Statistical Parameter

CHAPTER 3 – Surveys and Sampling

Three Ideas of Sampling:

1. Examine a part of the whole

- Sample – a selected representation of the population

- Sample survey – designed to ask questions of a small group in the hope of learning something about the entire population

- Biased samples – summary characteristics of the sample differ from the corresponding characteristics of the population it is trying to represent

- Bias – any systematic failure of a sampling method to represent its population

- Most representative sample – obtained by choosing participants at random
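
To make the idea of bias concrete, here is a minimal Python sketch. Everything in it is assumed for illustration: the customer population, the 30% share of frequent buyers and the spending figures are hypothetical. It compares a sampling method that systematically leaves out part of the population with one that chooses participants at random.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 customers, about 30% of whom are
# "frequent" buyers who spend more on average than the rest.
population = []
for _ in range(10_000):
    frequent = rng.random() < 0.30
    spend = rng.gauss(80, 10) if frequent else rng.gauss(40, 10)
    population.append((frequent, spend))

true_mean = statistics.mean(spend for _, spend in population)

# Biased method: only frequent buyers can ever be selected.
frequent_only = [spend for frequent, spend in population if frequent]
biased_sample = rng.sample(frequent_only, 200)

# Randomized method: every customer has the same chance of selection.
random_sample = [spend for _, spend in rng.sample(population, 200)]

biased_error = abs(statistics.mean(biased_sample) - true_mean)
random_error = abs(statistics.mean(random_sample) - true_mean)

print(f"population mean spend:   {true_mean:.1f}")
print(f"biased sample is off by: {biased_error:.1f}")
print(f"random sample is off by: {random_error:.1f}")
```

Both samples contain 200 people, so the gap between them comes from how the participants were chosen, not from how many were chosen.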

2. Randomize

- Randomizing protects us by giving us a representative sample even for effects we were unaware of; it makes sure that, on average, the sample looks like the rest of the population

- Randomization – a defence against bias in the sample selection process, in which each individual is given a fair, random chance of selection

- Fairness of randomization:

- Nobody can guess the outcome before it happens

- Some underlying set of outcomes will be equally likely

- Pseudorandom – numbers generated by a computer that appear random but are produced by a program, so they are not completely random

- Sampling error (sampling variability) – the natural tendency of randomly drawn samples to differ from one another

- error = difference, deviation (does not mean “mistake”)
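
The two ideas above, pseudorandom generation and sampling variability, can be seen in a short Python sketch. The population (the values 1 to 1000), the seeds and the helper `sample_means` are hypothetical choices made for illustration only.

```python
import random
import statistics

population = list(range(1, 1001))  # hypothetical population: the values 1..1000

def sample_means(seed, n_samples=5, sample_size=50):
    """Draw several simple random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

run_a = sample_means(seed=42)
run_b = sample_means(seed=42)  # same seed: the program repeats itself exactly
run_c = sample_means(seed=7)   # different seed: different samples

print("sample means (seed 42):", run_a)
print("same seed gives same samples:", run_a == run_b)
print("different seed gives different samples:", run_a != run_c)
```

Within one run the five sample means differ from one another even though nothing went wrong: that difference is the sampling error. Rerunning with the same seed reproduces the same numbers, which is why computer-generated values are called pseudorandom.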

3. The Sample Size is what matters

- Size of the sample – the number of individuals in a sample

- determines what we can conclude from the data, regardless of the size of the population

- Note: the size of the population and the fraction of the population sampled are irrelevant

- Sample size – determines the balance between how well the survey can measure the population and how much the survey costs

- The variance (or number of categories) within the population determines the sample size needed for a representative sample
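
A rough simulation can illustrate why the sample size, not the population size, drives what a sample can tell us. This is only a sketch under assumptions: the 40% "yes" rate, the population sizes, the sample sizes and the helper `spread_of_proportions` are all hypothetical, and the pattern holds when the sample is a small fraction of the population.

```python
import random
import statistics

def spread_of_proportions(pop_size, sample_size, rng, repeats=300, p=0.40):
    """Standard deviation of the sample proportion over repeated random samples."""
    yes_count = round(pop_size * p)  # individuals 0..yes_count-1 answer "yes"
    proportions = []
    for _ in range(repeats):
        picked = rng.sample(range(pop_size), sample_size)  # without replacement
        proportions.append(sum(i < yes_count for i in picked) / sample_size)
    return statistics.stdev(proportions)

rng = random.Random(1)  # fixed seed so the run is repeatable
results = {
    (pop_size, n): spread_of_proportions(pop_size, n, rng)
    for pop_size in (100_000, 10_000_000)
    for n in (100, 1_000)
}

for (pop_size, n), spread in results.items():
    print(f"population {pop_size:>10,}  sample {n:>5,}  spread {spread:.4f}")
```

In a run like this, increasing the sample from 100 to 1,000 shrinks the spread of the estimates noticeably, while making the population 100 times larger leaves the spread at about the same level.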

Census – an attempt to collect data on the entire population of interest

- might not provide best information about the population, because:

i. It can be difficult, impractical and costly to complete a census

ii. The population we are studying might change (a sample generated in a shorter time period may generate more accurate information)

iii. Taking a census can be cumbersome – ex. correct addresses (multiple addresses, “primary residence”, avoiding double-counting)

Populations and Parameters

- Models use mathematics to represent reality. The numbers in these models are parameters

- Parameter – a numerically valued attribute of a model for a population. We rarely expect to know the value of a parameter, but we do hope to estimate it from sampled data

- Population parameter – a numerically valued attribute of a model for a population
