Statistical Sciences 2244A/B Lecture Notes - Lecture 1: Statistic, Statistical Parameter, Selection Bias
Stats 2244
Lec 2
Sampling designs
What is statistics?
- Statistics is the science of data
- Methods for planning experiments, obtaining data and then organizing, summarizing, analyzing,
interpreting, presenting and drawing conclusions based on the data
All… or some?
- Population: complete collection of all objects/individuals to be studied
- Sample: subset of units from the population from which data are collected
- Population of interest is the population that we are interested in
o The population and the population of interest are different
- Why would we want to sample?
o Sometimes it's not practical: the population of interest could be very large
o There may not be enough resources to collect data from the entire population
o Maybe there's not enough time
What makes a good sample?
- we want a sample to be representative
o its characteristics have to be true or reflective of the population of interest
o we want to avoid bias (favouritism toward some segments of the population)
o the way to do this is to use a random sampling design
- we want a large sample size
o it needs to be large enough to account for variability in the population
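To make the sample-size point concrete, here is a minimal sketch, not taken from the lecture. It assumes a made-up population of 1,000 values, and the helper name `spread_of_sample_means` is hypothetical. It repeatedly draws random samples of two sizes and compares how much the sample means vary.

```python
import random
import statistics

random.seed(2244)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 values between 1 and 12 (e.g. word lengths).
population = [random.randint(1, 12) for _ in range(1000)]
mu = statistics.mean(population)  # population mean

def spread_of_sample_means(n, repeats=1000):
    """Draw `repeats` random samples of size n; return the std dev of their means."""
    means = [statistics.mean(random.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(5)   # small samples
spread_large = spread_of_sample_means(50)  # larger samples

print(f"population mean: {mu:.2f}")
print(f"spread of sample means, n=5:  {spread_small:.2f}")
print(f"spread of sample means, n=50: {spread_large:.2f}")
```

Under these assumptions the means of the larger samples should cluster more tightly around the population mean, which is one way to read "large enough to account for variability".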
Population of word lengths
- parameter is a characteristic of a population
- the population mean is denoted by μ (mu)
- N = population size (total number of subjects of interest)
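As a small illustration of these symbols, the sketch below treats the words of one short sentence as the whole population and computes N and μ from their lengths. The sentence is a hypothetical stand-in, not the data set used in the lecture.

```python
# Hypothetical population: every word in one short sentence.
words = "the quick brown fox jumps over the lazy dog".split()
lengths = [len(w) for w in words]

N = len(lengths)       # population size
mu = sum(lengths) / N  # population mean, a parameter of this population

print(f"N = {N}, mu = {mu:.2f}")  # prints "N = 9, mu = 3.89"
```

Because every unit in the population is measured here, μ is an exact characteristic of the population and not an estimate.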
Document Summary
Methods for planning experiments, obtaining data and then organizing, summarizing, analyzing, interpreting, presenting and drawing conclusions based on the data. Population: complete collection of all objects/individuals to be studied. Population of interest is the population that we are interested in. Sample: subset of units from the population from which data are collected. We want a large sample size: it needs to be large enough to account for variability in the population. Parameter is a characteristic of a population. The population mean is denoted by μ (mu). N = population size (total number of subjects of interest). Based on 192 sample means collected; impossible values and missing data removed. Selection bias: a type of bias in which systematic favouritism in the data selection process leads to misleading results; we have to use sampling strategies that avoid selection bias. All possible combinations (i.e. samples) of size n from the population are equally likely.
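The last two points can be sketched with a tiny, made-up population of six word lengths (an assumption for illustration only). The code lists every possible sample of size n, which is what "equally likely combinations" refers to in the usual definition of a simple random sample. It then contrasts this with a selection rule that always favours the longest words.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of N = 6 word lengths.
population = [1, 2, 3, 4, 5, 9]
mu = mean(population)  # population mean
n = 2                  # sample size

# Every possible sample of size n; with random selection each is equally likely.
all_samples = list(combinations(population, n))
sample_means = [mean(s) for s in all_samples]

# Averaging the means of all equally likely samples recovers mu.
average_of_sample_means = mean(sample_means)

# Selection bias: a rule that always picks the n longest words.
biased_sample = sorted(population)[-n:]
biased_mean = mean(biased_sample)

print(f"number of possible samples: {len(all_samples)}")
print(f"mu = {mu}, average of sample means = {average_of_sample_means}")
print(f"biased sample {biased_sample} has mean {biased_mean}")
```

In this sketch the biased rule always overestimates μ, while sample means from equally likely samples average out to μ.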