The case-control study is implemented using samples based on the DEPENDENT variable (disease status).
It has Backward directionality: arguing from knowing the dependent variable
(disease) to assessing the independent variable (exposure status).
The design consists of a sample of those with the disease (Cases), and at least one
other sample of those without the disease (Controls).
Eg illness study).
The design fixes, as known quantities, the total sample size and the marginal
distribution for Disease: a+c and b+d.
Timing is retrospective, since both disease and exposure occurred in the past, prior to
The design is observational in that no interventions or randomization to treatments are used.
The design allows us to estimate:
the Marginal distribution of Exposure
the two Conditional distributions of Exposure based on disease status.
The design measures the prevalence of exposure in each group: Cases, and Controls.
Case Control study
The prevalence of disease cannot be estimated as the sampling design pre-determined
how many diseased (Cases) and non-diseased (Controls) were to be studied.
The quantities known at the start are: a+c, b+d and the total sample size n.
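To see why the prevalence of disease cannot be estimated, the following minimal sketch computes the fraction of the sample that is diseased under two hypothetical designs. The function name and the case and control counts are illustrative assumptions, not values from the study.

```python
def sample_fraction_diseased(n_cases, n_controls):
    """Fraction of the study sample that is diseased; set entirely by the design."""
    return n_cases / (n_cases + n_controls)

# Hypothetical designs: the same 175 cases with one or four controls per case.
one_to_one = sample_fraction_diseased(175, 175)
one_to_four = sample_fraction_diseased(175, 700)

print(f"1:1 design: {one_to_one:.2f}")
print(f"1:4 design: {one_to_four:.2f}")
```

The fraction changes with the investigator's choice of how many controls to recruit, so it says nothing about how common the disease is in the population.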
Conditional Distributions of Exposure
Given a Disease Status
Conditional Distribution of Exposure given disease present
P(E|D) = a ÷(a+c)
P(NE|D) = c ÷(a+c)
Conditional Distribution of Exposure given not diseased
P(E|ND) = b ÷(b+d)
P(NE|ND) = d ÷(b+d)
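As a rough sketch, the four conditional proportions can be computed directly from the cells of the 2x2 table. The function name is an illustrative assumption; the counts (a=33, c=142, b=26, d=197) are the ones used in the worked odds calculation later in these notes.

```python
# 2x2 table cells, following the layout used in these notes:
#              Diseased   Not diseased
# Exposed         a            b
# Unexposed       c            d
def exposure_given_disease(a, b, c, d):
    """Return the conditional distributions of exposure within cases and controls."""
    cases = a + c      # number of cases, fixed by design
    controls = b + d   # number of controls, fixed by design
    return {
        "P(E|D)": a / cases,
        "P(NE|D)": c / cases,
        "P(E|ND)": b / controls,
        "P(NE|ND)": d / controls,
    }

dist = exposure_given_disease(a=33, b=26, c=142, d=197)
for name, value in dist.items():
    print(f"{name} = {value:.4f}")
```

Within each disease group the two proportions sum to one, which is a quick check that the cells were entered in the right positions.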
Measures of Association
Exposure Difference = P(E|D) – P(E|ND)
If disease and exposure are not associated the difference will be close to zero. If the
difference is positive the association is deemed positive (diseased have a higher
proportion exposed) and if it is negative the association is negative: diseased have a
lower prevalence of exposure.
Again, a difference measure is mostly useful when the prevalence estimates of exposure
are both, say, greater than 0.20.
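The exposure difference can be sketched in a few lines. The function name is an illustrative assumption, and the counts are those from the worked odds calculation later in these notes.

```python
def exposure_difference(a, b, c, d):
    """P(E|D) - P(E|ND): exposure prevalence in cases minus that in controls."""
    return a / (a + c) - b / (b + d)

diff = exposure_difference(a=33, b=26, c=142, d=197)
print(f"Exposure difference = {diff:.4f}")

# Rule of thumb from the notes: the difference is mostly useful
# when both exposure prevalences exceed roughly 0.20.
both_above = (33 / (33 + 142) > 0.20) and (26 / (26 + 197) > 0.20)
print("Both prevalences above 0.20:", both_above)
```

With these counts the difference is positive, but neither prevalence reaches 0.20, which is one reason a ratio measure may be more informative here.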
Measure of Association
Odds Ratio of Exposure given disease
OR(E|D) = (a:c) ÷ (b:d)
= (a÷c) ÷ (b÷d)
a:c => odds of exposure in diseased group.
b:d => odds of exposure in non-diseased group.
This is the preferred measure of association for this design.
The odds of a case having the exposure is estimated as
Odds(Exp | D) = 33÷142 = 0.2324
The odds for a control having the exposure is estimated as
Odds(Exp | ND) = 26÷197 = 0.1320
The odds ratio is simply the ratio of the two odds just calculated.
In this case: the odds of exposure if diseased divided by the odds of exposure if not diseased.
Odds ratio = (33/142)÷(26/197)
= 1.76
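The calculation above can be reproduced with a short sketch. The function names are illustrative assumptions; the counts are the ones given in the notes. The cross-product form (a×d)÷(b×c) is algebraically the same ratio and serves as a check.

```python
def odds(exposed, unexposed):
    """Odds of exposure within one group: exposed ÷ unexposed."""
    return exposed / unexposed

def odds_ratio(a, b, c, d):
    """Odds of exposure in cases (a:c) divided by odds of exposure in controls (b:d)."""
    return odds(a, c) / odds(b, d)

a, b, c, d = 33, 26, 142, 197

odds_cases = odds(a, c)
odds_controls = odds(b, d)
or_exposure = odds_ratio(a, b, c, d)
cross_product = (a * d) / (b * c)  # equivalent cross-product form

print(f"Odds(Exp | D)  = {odds_cases:.4f}")
print(f"Odds(Exp | ND) = {odds_controls:.4f}")
print(f"Odds ratio     = {or_exposure:.2f}")
print(f"Cross-product  = {cross_product:.2f}")
```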
We interpret this by saying that the odds of exposur