
# SOC350H5 Lecture 6


Department: Sociology
Course: SOC350H5
Professor: David Pettinicchio
Semester: Winter

Description
Lecture 6: Analysis of Categorical Data and Logistic Regression

Note: perform a scatter plot.

- Non-interval-ratio data: how do you describe categorical data?
- How do you build a model when the dependent variable is not interval-ratio?
- Logistic regression
- Working with categorical data shapes what you can do with that data.

Bivariate table / cross tabulation

- You can describe a single variable using the mean or median.
- You can also describe the relationship between categorical (x/y) variables.
- Cross tabulation tells you how the attributes of one variable intersect with the attributes of another.
- All bivariate tables have rows (y variable) and columns (x variable).
- E.g., homicide and society of joiners
- In a cross tab, the independent variable is always the column variable.
- The table shows some relationship between the two.
- Both are categorical data.
- 24 countries are low in joiners and low in homicide: 24 countries intersect in that cell.
- If you produce a cross tab, be careful about where the percentages come from: percent = (cell count / column total) × 100.

Interpreting percent

- A cross tab is like a scatter plot: you can eyeball the distribution in a scatter plot.
- When you look at a cross tab, you can get some sense of the variation.
- How do you know the independent variable matters for the dependent variable? If the percentages do not change as you move across a row (from column to column), it is not important.
- How is the data clustered?
- Where is the data clustered? Is it intersecting in same-order pairs?
- Low and low are same-order pairs.
- Where is the data situated?
- Eyeballing a table is problematic.
- If everything were 50% in every cell, there would be no relationship.
- How strong is the relationship? You cannot tell from the cross tab alone.
- Lambda, gamma, and chi-square are all helpful.
- These are measures of association.
- Lambda is for nominal variables.
- Gamma is for ordinal variables.
- They are related to R² because they are based on proportional reduction in error (PRE).
- They all refer to the same error: when we add a variable, are we reducing error in explaining variation in the outcome?
- Like the Pearson correlation coefficient, gamma ranges from -1 to +1 (lambda, for nominal data, ranges from 0 to 1).
- No relationship = 0
- +1 = perfect positive
- -1 = perfect negative
- E1 is the total amount of error when the independent variable is ignored (we do not have x); E2 is the error when x is present.
- PRE = (E1 - E2) / E1
- This formula looks at the errors in relation to each other.
- We want the error with x to be as small as possible.

Abortion

- The opposite of error is the mode.
- Whatever does not fit the hypothesis is error: the cases not in the mode.
- E1, when you do not account for x, is 84.
- E2, when you account for x, is smaller: 72.
- The smaller E2 is, the greater PRE will be.
- A value of .14 says something important about the relationship.
- It is positive: the more children they have, the more likely they are to support abortion.
- But the number is close to 0, so it is a fairly weak relationship.
- .5 is moderate.
- A smaller E2 would make the relationship stronger.

Protest Participation

- A lot of variables are yes/no, i.e., binary.
- You cannot use OLS.
- You cannot interpret a binary outcome as "every one-unit increase in x changes y by ___".
- There is a way around it – tal
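
The percentage rule and the PRE arithmetic in the notes can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names `column_percentages` and `pre` are made up for illustration, and the 2x2 table is hypothetical except for the 24 in the low/low cell, which is the only count the notes give. E1 = 84 and E2 = 72 are the values from the abortion example.

```python
def column_percentages(table):
    """Convert cell counts to column percentages.

    `table` is a list of rows (categories of y); each row holds one
    count per column (categories of x, the independent variable).
    """
    n_cols = len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(n_cols)]
    # percent = cell count / column total * 100
    return [[100.0 * row[j] / col_totals[j] for j in range(n_cols)]
            for row in table]


def pre(e1, e2):
    """Proportional reduction in error: (E1 - E2) / E1."""
    return (e1 - e2) / e1


# Hypothetical cross tab: rows = homicide (low, high),
# columns = joiners (low, high). Only the 24 is from the notes;
# the other three counts are invented for illustration.
counts = [[24, 6],
          [16, 14]]
percents = column_percentages(counts)
for row in percents:
    print(row)

# E1 and E2 from the abortion example in the notes.
abortion_pre = pre(84, 72)
print(round(abortion_pre, 2))
```

Each column of `percents` sums to 100, which is a quick check that the percentages were taken within columns. The rounded PRE matches the .14 reported in the notes.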