Inference for Two-Way Tables

Contingency Table Test for Independence:

H0: The two criteria are independent of each other.

H1: The two criteria are not independent of each other.

**OR**

H0: p1 = p2 = p3 = … = pi (The populations are homogeneous).

H1: Not all p’s are equal (The populations are not homogeneous).

Notation:

RTi = Row total for row i.

CTj = Column total for column j.

NR = Number of rows.

NC = Number of columns.

Oij = Observed frequency for (i, j) cell.

Eij = Expected frequency for (i, j) cell.

n = Sample size.


Smoking Example:

A teacher at McGill University claims that the amount of smoking done by undergraduate students depends on the year of study. A random sample of 200 students registered in the BCom program was surveyed with the following results:

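| Smoking status | Year 1 | Year 2 | Year 3 | Total |
|---|---|---|---|---|
| HEAVY | 24 | 27 | 32 | 83 |
| LIGHT | 14 | 47 | 17 | 78 |
| NON-SMOKER | 7 | 22 | 10 | 39 |
| Total | 45 | 96 | 59 | 200 |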

Test the teacher’s claim at the 5% level of significance.

Procedure:

Assuming that the null hypothesis is true, calculate the expected frequencies:

Eij = P(an observation falls in the (i, j) cell)*n = P(Joint)*n.

Recall that the following property of statistical independence should be true under H0:

P(A ∩ B) = P(A)*P(B).

From this property:

Eij = P(Row)*P(Column)*n = (RTi / n)*(CTj / n)*n = (RTi * CTj) / n.
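As an illustration, the expected frequencies for the smoking example can be computed directly from the row and column totals. This is a minimal sketch in plain Python; the variable names (`observed`, `row_totals`, `col_totals`, `expected`) are chosen here for illustration and do not come from the notes.

```python
# Observed counts Oij from the smoking example.
# Rows: HEAVY, LIGHT, NON-SMOKER; columns: Year 1, 2, 3.
observed = [
    [24, 27, 32],
    [14, 47, 17],
    [7, 22, 10],
]

row_totals = [sum(row) for row in observed]        # RTi
col_totals = [sum(col) for col in zip(*observed)]  # CTj
n = sum(row_totals)                                # sample size

# Expected frequency under H0: Eij = RTi * CTj / n
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

for row in expected:
    print([round(e, 3) for e in row])
```

For instance, the expected count of heavy smokers in Year 1 is 83*45/200 = 18.675. The expected frequencies add up to n, just as the observed frequencies do.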

TS: χ² = Σ (Oij - Eij)² / Eij, where the sum runs over all cells (i, j).

AL: χ²(α; df) where df = (NR - 1)(NC - 1).

DR: Conclude H0 if TS ≤ AL and conclude H1 if TS > AL.
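The whole procedure can be sketched for the smoking example as follows. This is an illustrative plain-Python sketch, not part of the original notes: the function name `chi_square_independence` is hypothetical, and the value 9.488 is assumed to be the chi-square table entry for α = 0.05 with 4 degrees of freedom.

```python
def chi_square_independence(observed):
    """Return (TS, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in observed]        # RTi
    col_totals = [sum(col) for col in zip(*observed)]  # CTj
    n = sum(row_totals)

    ts = 0.0
    for i, rt in enumerate(row_totals):
        for j, ct in enumerate(col_totals):
            e = rt * ct / n                            # Eij under H0
            ts += (observed[i][j] - e) ** 2 / e        # (Oij - Eij)^2 / Eij

    df = (len(row_totals) - 1) * (len(col_totals) - 1)  # (NR - 1)(NC - 1)
    return ts, df


# Rows: HEAVY, LIGHT, NON-SMOKER; columns: Year 1, 2, 3.
observed = [
    [24, 27, 32],
    [14, 47, 17],
    [7, 22, 10],
]

ts, df = chi_square_independence(observed)

# Assumption: critical value read from a chi-square table (alpha = 0.05, df = 4).
CRITICAL_VALUE = 9.488

decision = "H1" if ts > CRITICAL_VALUE else "H0"
print(f"TS = {ts:.2f}, df = {df}, conclude {decision}")
```

With these data the test statistic is roughly 13.8 on 4 degrees of freedom, which exceeds the assumed action limit of 9.488, so the decision rule concludes H1: the sample supports the claim that the amount of smoking depends on the year of study.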

