
University of Toronto Scarborough
Ken Butler

STAB22 LEC03
(Covers:
 - missed portions of chapter 3
 - remaining portion of chapter 4)

---------------[CHAPTER3]-----------------

[11] CONTINGENCY TABLES
- a table made up of 2 or more categorical variables (cvar's)
(ex) year, opinion

[12] READING A CONTINGENCY TABLE
- 2 cvar's
  - gender
  - status of offer

               Status of Offer:
Gender:     ACCEPTED   REJECTED   TOTAL
MALE           490        210       700
FEMALE         280        220       500
TOTAL          770        430      1200

- the 4 inner cells represent the 4 combinations of gender and offer status
  1) male, accepted
  2) male, rejected
  3) female, accepted
  4) female, rejected
- the totals are included so that we know what each count is out of
- each TOTAL row or column totals up ONE categorical variable
  - ex. the TOTAL row totals up "Status of Offer"
  - ex. the TOTAL column totals up "Gender"

Questions
- How many of the applicants were males who were rejected?
  - total: applicants = 1200
  - value of interest: 2) male, rejected = 210
  - relative frequency: 210/1200 * 100% = 17.5%

[13] PERCENTAGE OF TOTAL

               Status of Offer:
Gender:     ACCEPTED   REJECTED   TOTAL
MALE           490        210       700
FEMALE         280        220       500
TOTAL          770        430      1200

Total Percent
- in this case, we divide EVERY cell by the bottom-right cell (the grand total), then multiply by 100%
(ex) output from MyStatCrunch: table of total percents
- Percentage of total = Joint Distribution
- questions about this will be worded like "out of all people who applied…"
- can recognize total percents because the bottom-right cell is 100%

[14] JOINT DISTRIBUTION (aka TOTAL PERCENT)
- every cell is divided by the grand total, ie. the bottom-right cell, which sits in the TOTAL row and TOTAL column
- cannot be used to answer a question like "are more males accepted than females?"

CONDITIONAL DISTRIBUTION
Row percent
- dividing each row by its corresponding TOTAL value
(ex)
- this can answer the question: out of all males, what % were accepted?
- found that a higher % of males than of females were accepted
(ex) 70% of males were accepted
     56% of females were accepted
- can recognize row percents because the TOTAL column is all 100%

[15] Column percent
- dividing each column by its corresponding TOTAL value
- can recognize column percents because the TOTAL row is all 100%
(ex) this much % of the people accepted were males
- this does not tell us whether the % of males accepted (out of all males) is higher than the % of females accepted (out of all females); that is answered by row percent

We want: is a higher % of males accepted than of females?
(so we want ROW percent, which takes all applicants of one gender as the base and shows for which gender a higher % were accepted)

[16] DECIDING BETWEEN ROW AND COLUMN PERCENTS
OUTCOME
- is retrieved for those values that are not fixed
- ex. "50% of people were males" => gender is fixed, so this is not an outcome
- ex. "50% of male applicants were rejected" => this is an outcome
=> accepted/rejected are outcomes, which were, for this table, retrieved from the Column Percents

[17] Another example: AIRLINE PUNCTUALITY
- the outcome is either on-time or delayed, which is retrieved from column percents
- after the "were _____", we want something that is NON-FIXED
(ex) do NOT want: "66.2% of the on-time flights were America West"
     (what follows "were" here is the airline, which is fixed, so it is not an outcome)
- if a flight belongs to a certain airline, then we are stuck with that, ie. it is fixed; it can only be that airline's flight
- however, if the result can differ, then that variable gives outcomes (ex. Flight Status: "On-time" or "Delayed")

Note about Outcomes
- apparently, data about things that are fixed (unchanging) does NOT count as an outcome
(ex) of a question about row percent
- out of all flights on time, what % belonged to America West?
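
Going back to the admissions table in [12], the three kinds of percentages from [13] to [15] differ only in what each cell is divided by. A rough Python sketch of that idea follows. The function names total_percent, row_percent and column_percent are made up for these notes (they are not StatCrunch functions), and the table is stored with gender as rows and offer status as columns, as it is laid out in [12].

```python
# Inner cells of the admissions table in [12] (no TOTAL row or column).
# Layout assumption: rows = gender, columns = status of offer.
counts = {
    "MALE":   {"ACCEPTED": 490, "REJECTED": 210},
    "FEMALE": {"ACCEPTED": 280, "REJECTED": 220},
}


def total_percent(table):
    """Joint distribution: every cell divided by the grand total."""
    grand = sum(sum(row.values()) for row in table.values())
    return {r: {c: 100 * v / grand for c, v in row.items()}
            for r, row in table.items()}


def row_percent(table):
    """Conditional distribution: every cell divided by its row total."""
    return {r: {c: 100 * v / sum(row.values()) for c, v in row.items()}
            for r, row in table.items()}


def column_percent(table):
    """Conditional distribution: every cell divided by its column total."""
    col_totals = {}
    for row in table.values():
        for c, v in row.items():
            col_totals[c] = col_totals.get(c, 0) + v
    return {r: {c: 100 * v / col_totals[c] for c, v in row.items()}
            for r, row in table.items()}


joint = total_percent(counts)
by_row = row_percent(counts)
by_col = column_percent(counts)

print(round(joint["MALE"]["REJECTED"], 1))     # 17.5 (out of all applicants)
print(round(by_row["MALE"]["ACCEPTED"], 1))    # 70.0 (out of all males)
print(round(by_row["FEMALE"]["ACCEPTED"], 1))  # 56.0 (out of all females)
print(round(by_col["MALE"]["ACCEPTED"], 1))    # 63.6 (out of all accepted)
```

With this layout, the row percents answer "out of all males, what % were accepted?", while the column percents answer "out of all accepted, what % were male?". If the table were stored the other way around (status as rows), the roles of the two functions would swap.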
Observations from Column Percentage
- America West: 87% on-time, 13% delayed
- Alaska: 56.7% on-time, 18.27% delayed
=> Alaska is less punctual than America West
- on-time, delayed are outcomes

[18] THREE CATEGORICAL VARIABLES AND SIMPSON'S PARADOX
- although a contingency table can fit only 2 cvar's at a time, we have two contingency tables that share two variables (offer status, gender) and are split by a third variable: school
=> 3 categorical var's here
  - school
  - offer status
  - gender

[19] PROFESSIONAL SCHOOLS
Overall
- a higher % of males (70%) accepted than females (56%)
Law school
- a higher % of females (33%) accepted than males (10%)
Business school
- a higher % of females (90%) accepted than males (80%)
Implication
- even though males are accepted at a higher % overall, if you look at the two schools separately (ie. the law school contingency table and the business school contingency table), females are accepted at a higher % in each

[20] Why? Why do we get this answer?
Observe the third variable: School
Observe: Law school
- 100 males applied, 10 got in
- but a lot more females (300) applied to this, and 100 got in
- females tend to apply to law school, where it is harder to get in
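
The reversal in [19] and [20] can be checked with a few lines of Python. This is only a sketch: the law school counts come from [20] and the overall counts from the table in [12], but the business school counts are not given in the notes, so they are assumed here to be "overall minus law school" (that is, assuming these are the only two schools). The name acceptance_rate is made up for this example.

```python
# (applied, accepted) per gender.
# overall: TOTAL and ACCEPTED columns of the table in [12].
# law: counts stated in [20].
overall = {"MALE": (700, 490), "FEMALE": (500, 280)}
law = {"MALE": (100, 10), "FEMALE": (300, 100)}

# ASSUMPTION: business school = overall minus law school (only two schools).
business = {
    g: (overall[g][0] - law[g][0], overall[g][1] - law[g][1])
    for g in overall
}


def acceptance_rate(applied_accepted):
    """Percent accepted out of those who applied (a row percent)."""
    applied, accepted = applied_accepted
    return 100 * accepted / applied


for name, table in [("overall", overall), ("law", law), ("business", business)]:
    rates = {g: round(acceptance_rate(table[g])) for g in table}
    print(name, rates)
# overall {'MALE': 70, 'FEMALE': 56}
# law {'MALE': 10, 'FEMALE': 33}
# business {'MALE': 80, 'FEMALE': 90}
```

Under that assumption the derived business school rates come out to the 80% and 90% quoted in [19]. It also shows where the applicants went: 600 of the 700 males applied to business school, where acceptance is high, while 300 of the 500 females applied to law school, where acceptance is low, which is what pulls the overall female rate down.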