SOC350H5 Lecture 6: Lecture 6

David Pettinicchio

Lecture 6: Analysis of Categorical Data and Logistic Regression

NOTE: perform scatter plot

- Non-interval-ratio data: how do you describe categorical data?
- How do you build a model when the dependent variable is not interval-ratio?
- Logistic regression
- Working with categorical data shapes what you can do with that data

Bivariate table/Cross tabulation
- You can describe a single variable using the mean/median
- You can also describe the relationship between categorical (x/y) variables
- Cross tabulation: tells you how the attributes of one variable intersect with the attributes of another
- All bivariate tables have rows (y variable) and columns (x variable)
- E.g. homicide and society of joiners
- Cross tab: the independent variable is always the column
- The table shows some relationship between the two
- Both are categorical data
- 24 countries are low in joiners and low in homicide: 24 countries intersect in that cell
- If you produce a cross tab, be careful about where the percentages come from: column percent = (cell count / column total) × 100

Interpreting percent
- A cross tab is like a scatter plot: you can eyeball the distribution in a scatter plot
- When you look at a cross tab, you can get some sense of variation
- How do you know the independent variable matters for the dependent variable? If the values don't change across a row (from one column to the next), it is not important
- How is the data clustered?
- Where is the data clustered: intersecting in the same ordered pairs?
- Low and low: same ordered pairs
- Where is the data situated?
- Eyeballing the table is problematic
- If every cell were 50%, that would be no relationship
- How strong is the relationship? You can't tell from the cross tab alone
- Lambda, gamma, chi-square: all helpful
- These are measures of association
- Lambda: nominal
- Gamma: ordinal
- Lambda and gamma are related to r² because they are based on proportional reduction in error (PRE)
- They all refer to the same idea of error: when we add a variable, are we reducing error in explaining variation in the outcome?
- Like the Pearson correlation coefficient, gamma ranges from -1 to +1 (lambda, for nominal data, runs from 0 to 1)
- No relationship = 0
- +1 = perfect positive
- -1 = perfect negative
- E1: error when the independent variable is ignored
- PRE: total error when we don't have x (E1), minus error when x is present (E2), relative to E1: PRE = (E1 - E2) / E1
- This formula looks at the errors in relation to each other
- Want the error with x to be as small as possible

Abortion
- Opposite of error is the mode
- What doesn't fit the hypothesis is error: not in the mode
- E1, when you don't account for x, is 84
- E2, when you account for x, is smaller: 72
- The smaller E2 is, the greater PRE will be
- A value of .14 says something important about the relationship
- Positive: the more children they have, the more likely to support abortion
- But the number is close to 0: it's a fairly weak relationship
- .5 is moderate
- A smaller E2 would make the relationship stronger

Protest Participation
- A lot of variables are yes/no: binary
- Cannot use OLS
- Can't interpret a binary outcome as "every one-unit increase in x changes y by ___"
- There is a way around it – tal
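
The column-percentage rule from the cross tabulation section (cell count divided by column total, times 100) can be sketched in a few lines of Python. This is only an illustration: the function name `column_percentages` and all counts are hypothetical, except the 24 countries that are low on both variables, which comes from the notes.

```python
# Hypothetical cross tab: rows are the dependent variable (homicide),
# columns are the independent variable (society of joiners).
# Only the 24 in the low/low cell comes from the notes; the rest is made up.
counts = {
    "low homicide": {"low joiners": 24, "high joiners": 5},
    "high homicide": {"low joiners": 16, "high joiners": 15},
}


def column_percentages(table):
    """Convert each cell to a percentage of its column total."""
    columns = list(next(iter(table.values())))
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: 100 * row[c] / totals[c] for c in columns}
        for r, row in table.items()
    }


percents = column_percentages(counts)
for row_label, row in percents.items():
    print(row_label, {c: round(p, 1) for c, p in row.items()})
```

Reading across a row, the percentages differ between the two columns, which is the pattern the notes describe as the independent variable mattering. If both columns showed the same percentages, there would be no relationship.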
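
The PRE arithmetic from the abortion example can be checked with a short sketch. It assumes PRE = (E1 - E2) / E1, which matches the notes' values (E1 = 84, E2 = 72, PRE of about .14). The helper `lambda_from_table` is a hypothetical illustration of lambda that counts errors as cases outside the modal category, and the table passed to it is made-up data, not the lecture's.

```python
def pre(e1, e2):
    """Proportional reduction in error: (E1 - E2) / E1."""
    if e1 == 0:
        raise ValueError("E1 is zero: there is no error to reduce")
    return (e1 - e2) / e1


# Values from the notes: E1 = 84 (x ignored), E2 = 72 (x accounted for).
abortion_pre = pre(84, 72)
print(round(abortion_pre, 2))


def lambda_from_table(table):
    """Lambda for a cross tab given as {row label: {column label: count}}.

    Assumes errors are the cases that fall outside the modal category.
    """
    # E1: cases outside the overall modal category of y (x ignored).
    row_totals = [sum(row.values()) for row in table.values()]
    e1 = sum(row_totals) - max(row_totals)
    # E2: cases outside the modal category of y within each column of x.
    columns = list(next(iter(table.values())))
    e2 = 0
    for c in columns:
        column_counts = [row[c] for row in table.values()]
        e2 += sum(column_counts) - max(column_counts)
    return pre(e1, e2)


# Hypothetical counts, for illustration only.
example_table = {
    "low homicide": {"low joiners": 24, "high joiners": 5},
    "high homicide": {"low joiners": 16, "high joiners": 15},
}
lam = lambda_from_table(example_table)
print(round(lam, 2))
```

Shrinking E2 while holding E1 fixed raises the result, which is the point the notes make about a smaller E2 giving a stronger relationship.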