CHAPTER 22 - COMPARING TWO PROPORTIONS

WHERE ARE WE GOING?

- comparing 2 prop's

- want to see whether 2 groups truly differ, or whether the observed diff. is just chance variation

main text, p585

[1]

(EX- MASSA Drivers)

WHO

- 6971 male drivers

WHAT

- seatbelt use

WHY

- highway safety

WHEN

- 2007

WHERE

- Massachusetts

FINDINGS

- n = 161 loc's in Massachusetts, using SRS

- F drivers wore belt >70% of the time, regardless of gender of passenger(s)

- out of 4,208 M drivers w/ F passengers, 2777 (66%) wore belts

- out of 2,763 M drivers w/ M passengers, only 1363 (49.3%) wore belts
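
A quick check of the arithmetic in the FINDINGS, as a minimal Python sketch. The counts are copied from the two lines above; the variable names are made up for illustration.

```python
# Counts copied from the FINDINGS above (male drivers only).
belted_f_pass, n_f_pass = 2777, 4208   # M drivers w/ F passengers
belted_m_pass, n_m_pass = 1363, 2763   # M drivers w/ M passengers

# Sample proportion = number belted / number observed in that group.
p_f_pass = belted_f_pass / n_f_pass
p_m_pass = belted_m_pass / n_m_pass

print(f"{p_f_pass:.3f}")     # 0.660
print(f"{p_m_pass:.3f}")     # 0.493
print(n_f_pass + n_m_pass)   # 6971, matching the WHO line
```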

p586

[2]

WHY COMPARE TWO PROPORTIONS (aka PERCENTAGES)?

- interested in finding out how 2 groups differ

(ex) is exptl treatment better than placebo?

ANOTHER RULER

[1]

(EX- MASSA Drivers)

- know: diff. in prop's of men wearing seatbelts from sample

- 66% - 49.3% = 16.7 percentage points

- more interested in: true difference for ALL men?

- it is not likely that the diff. we obtained is the truth, b/c prop's will vary from sample to sample

- to do this, req. a new ruler: SD of sampling distribution model for diff. in prop's

[2]

The variance of sum or diff. of 2 indep. random var's is sum of their variances

=> aka for indep. random var's, variances always add (regardless of whether you are adding or subtracting the 2 random var's)
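
The rule in [2] written in symbols, as a sketch. X and Y are placeholder names (not from the notes) for any 2 indep. random var's; the SD line follows by taking the square root.

```latex
\operatorname{Var}(X \pm Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)
\qquad\Longrightarrow\qquad
SD(X \pm Y) = \sqrt{SD^2(X) + SD^2(Y)}
```

Note that the SD's themselves do not add; only the variances do.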

[3]

WHY DOES VARIATION INCREASE, DESPITE SUBTRACTING TWO RANDOM QUANTITIES?

(An.) - bowl of cereal

- cereal box claims that there is 16oz of cereal in it

- this is not exact: b/c there is small variation from box to box

- when portion of cereal is poured into bowl (we want 2oz serving), we know that it will not be exact; there is variation assoc. w/ this too

- qn: how much cereal is left in the box?

- best guess is 16oz - 2oz = 14oz, but are we as certain of this guess as of our guess for the full box?

- AFTER cereal is poured into bowl, amt of cereal in box still remains a random quantity (but smaller mean now), BUT it is even more variable b/c of additional variation in amt that was poured out

[4]

(An. cereal)

- variance in amt of cereal remaining in box = sum of 2 variances

- becomes more variable, now that it has been distributed into two containers
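
The cereal analogy can be checked with a small simulation, sketched below. The 16oz and 2oz means come from the notes; the two SD values (0.2oz and 0.3oz), the Normal model, and all names are assumptions chosen only for illustration.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
N = 100_000              # number of simulated boxes

SD_BOX, SD_POUR = 0.2, 0.3   # assumed SD's in oz (not from the notes)

# Amount in a full box and amount poured out, drawn independently.
box = [rng.gauss(16, SD_BOX) for _ in range(N)]
poured = [rng.gauss(2, SD_POUR) for _ in range(N)]
remaining = [b - p for b, p in zip(box, poured)]

var_box = statistics.variance(box)
var_poured = statistics.variance(poured)
var_remaining = statistics.variance(remaining)

print(f"mean remaining: {statistics.mean(remaining):.2f} oz")
print(f"Var(box) + Var(poured): {var_box + var_poured:.4f}")
print(f"Var(remaining):         {var_remaining:.4f}")
```

Even though the amounts are subtracted, the variance of what remains should come out close to the sum of the two variances (about 0.04 + 0.09 = 0.13 under these assumed SD's), and larger than the variance of the full box alone.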

[5]

- this formula for SD (square root of the sum of the variances) ONLY works for INDEPENDENT RANDOM VARIABLES

=> must check for independence b4 using it
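
A sketch of how the "new ruler" could be computed for the Massachusetts example, applying the variances-add rule to 2 sample prop's. It assumes the 2 groups of drivers are independent (which must be checked first) and uses the usual variance of a sample proportion, p(1 - p)/n, which these notes have not stated yet. The function name `se_diff_props` is hypothetical.

```python
import math

def se_diff_props(p1, n1, p2, n2):
    """SD estimate for (p1 - p2), assuming the 2 groups are independent."""
    var1 = p1 * (1 - p1) / n1   # variance of 1st sample proportion
    var2 = p2 * (1 - p2) / n2   # variance of 2nd sample proportion
    return math.sqrt(var1 + var2)   # variances add, then take the root

# Counts from the FINDINGS: M drivers w/ F passengers vs. w/ M passengers.
p_f_pass = 2777 / 4208
p_m_pass = 1363 / 2763

diff = p_f_pass - p_m_pass
se = se_diff_props(p_f_pass, 4208, p_m_pass, 2763)

print(f"diff = {diff:.3f}")   # 0.167
print(f"SE   = {se:.3f}")     # 0.012
```

Under these assumptions the observed diff. of about 0.167 is many multiples of the roughly 0.012 ruler, which is the kind of comparison the new SD is meant to allow.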

p587

