September 27th 2017

WEEK 3

• Sampling error does not mean making mistakes; it is the variation that comes with taking a sample

Key Ideas:

- Statistical inference among populations:

o Defining a population

o Selecting a sample from the population

o Taking measurements on the individual units

o Carrying out statistical analysis, including descriptive statistics and inferential statistics, which we use to make some statement about the population itself or to compare among different populations

Sampling

- The aim of an ideal sampling process is to select a group of units that is a good representation of the statistical population

- An ideal sampling process can be broken down into 4 components:

o Units have a known and non-zero probability of being included in your sample

o Unbiased

▪ Bias is when your sample has some systematic difference from the true statistical population that you’re trying to find out about

▪ Some of your units are less likely to be included than others

o Independent

▪ Selecting one sampling unit to be in your sample should not influence whether other sampling units are included as well

o Each possible sample has equal chance of being selected

▪ Making sure the individuals are mixed properly and representatively and

that each possible sample has an equal chance of being selected

DEFINITIONS:

1. Selection unit: every unit in the statistical population must have a chance of being in

your sample

2. Bias: selection of units cannot inadvertently favour one outcome over another on

average

3. Independence: selection of one unit cannot influence the probability that another unit

is selected

4. Equal chance for all samples: every combination of units must be possible in your sample
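A minimal sketch of how a simple random sample meets these requirements, assuming a hypothetical population of ten labelled units (the function names and the population are illustrative, not from the lecture). Python's `random.sample` draws without replacement so that every subset of the requested size is equally likely; repeating the draw many times shows each unit being included at a rate close to n/N.

```python
import random
from collections import Counter

def simple_random_sample(population, n, rng):
    """Draw n distinct units; every size-n subset is equally likely."""
    return rng.sample(population, n)

def inclusion_frequencies(population, n, draws, seed=0):
    """Estimate how often each unit ends up in the sample over many draws."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(draws):
        counts.update(simple_random_sample(population, n, rng))
    return {unit: counts[unit] / draws for unit in population}

# Hypothetical population of 10 labelled units (an assumption for illustration).
population = list(range(10))
freqs = inclusion_frequencies(population, n=3, draws=20000)

for unit, freq in freqs.items():
    # Each frequency should sit near n/N = 3/10.
    print(unit, round(freq, 3))
```

Every unit has a known, non-zero inclusion probability (here 3/10), and no unit is favoured over another.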

A volunteer-based sample is different from a simple random sample because:

- Some people may have no chance of being included

- Selection of people may be biased

- samples might not be independent
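A rough simulation of the contrast, under loudly stated assumptions: the population is a made-up list of weekly exercise hours, and the chance that a person volunteers is assumed to rise with the hours they exercise. Neither assumption comes from the lecture; they only serve to show how self-selection can shift an estimate.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: weekly exercise hours for 10,000 people.
population = [rng.uniform(0, 10) for _ in range(10000)]
true_mean = statistics.mean(population)

# Simple random sample: every person has the same chance of inclusion.
random_sample = rng.sample(population, 500)

# Volunteer pool (assumption): people who exercise more are more likely to sign up.
volunteers = [hours for hours in population if rng.random() < hours / 10]
volunteer_sample = rng.sample(volunteers, 500)

random_mean = statistics.mean(random_sample)
volunteer_mean = statistics.mean(volunteer_sample)

print("population mean:", round(true_mean, 2))
print("random sample mean:", round(random_mean, 2))
print("volunteer sample mean:", round(volunteer_mean, 2))
```

In this setup the random sample lands near the population mean, while the volunteer sample is pulled upward because low-exercise people rarely enter it.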

Sampling error

- not a mistake

- it is the variation from one random sample to another

- depending on the people in your subset, your answer will be a little different, and that difference is sampling error

- important for being able to do statistical inference

- the variation that comes about from sampling is the very tool that we use to make statements about the things we haven’t seen yet, to make an inference about a larger population
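The idea that each random sample gives a slightly different answer can be sketched with a small simulation. The population below (heights in cm drawn from a normal distribution) and the sample size are assumptions chosen for illustration only.

```python
import random
import statistics

def sample_means(population, n, draws, seed=1):
    """Mean of each of `draws` simple random samples of size n."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]

# Hypothetical population: 5,000 heights in cm (assumed values, not real data).
pop_rng = random.Random(0)
population = [pop_rng.gauss(170, 10) for _ in range(5000)]
population_mean = statistics.mean(population)

means = sample_means(population, n=10, draws=500)

# The spread of the sample means is the sampling error in action.
spread = statistics.stdev(means)

print("population mean:", round(population_mean, 2))
print("first five sample means:", [round(m, 2) for m in means[:5]])
print("spread of sample means:", round(spread, 2))
```

No single sample mean is "wrong"; they scatter around the population mean, and the size of that scatter is what inference relies on.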

Observational studies

- the main goal of observational studies is to characterize something about a population

that already exists

- the major drawback to observational studies is that when you look at the characteristics of a population, the relationships that emerge are correlative but they aren’t causal

o you can see how things trend with each other but most often you can’t say anything about causation

- confounding variables are variables that you haven’t observed but are likely driving the relationship between the variables that you have observed

- are these studies retrospective or prospective?

o retrospective means they are looking back in time

▪ ex) take all the people who have a certain type of disease and ask what’s common amongst them that isn’t common among other people who don’t have the disease

▪ advantage to these studies is that they can be done in a snapshot of time

▪ disadvantage is that they are really prone to confounding; they can show correlation, not causation

o prospective means they are looking forward in time

▪ start with an initial group of people who you think is a good

representation of the population

▪ as that group (cohort) ages you start to look at which ones start to get

diseases and ask if there is a relationship between their disease and some

of their features

▪ because you know what’s happened over time you have an idea of what comes first, and that helps you with causation

▪ not better than every study but much better than retrospective

▪ the disadvantage is that this can take a long time

o for many big issues people are using a combination of retrospective and

prospective studies
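The confounding problem described above can be illustrated with a toy simulation. Everything here is an assumption for illustration: a hidden variable drives two observed variables, `x` and `y`, which have no effect on each other, yet they still come out correlated.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)

# Hidden (unobserved) confounder, assumed standard normal.
confounder = [rng.gauss(0, 1) for _ in range(2000)]

# x and y each depend on the confounder plus their own noise, not on each other.
x = [z + rng.gauss(0, 1) for z in confounder]
y = [z + rng.gauss(0, 1) for z in confounder]

r_xy = pearson(x, y)

# Removing the confounder's contribution leaves only the independent noise.
x_resid = [a - z for a, z in zip(x, confounder)]
y_resid = [b - z for b, z in zip(y, confounder)]
r_resid = pearson(x_resid, y_resid)

print("correlation of x and y:", round(r_xy, 2))
print("correlation after removing confounder:", round(r_resid, 2))
```

In a real observational study the confounder is not measured, so the second step is not available; that is why the observed correlation alone cannot support a causal claim.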

Designs for observational studies

