SOC350H5 Lecture 6: Lecture 6
Department: Sociology
Course: SOC350H5
Professor: David Pettinicchio
Semester: Winter

Description
Lecture 6: Analysis of Categorical Data and Logistic Regression
NOTE – perform scatter plot
• Non-interval-ratio data – how do you describe categorical data?
• How do you build a model when the dependent variable is not interval-ratio?
• Logistic regression
• When you deal with categorical data, it shapes what you can do with that data

Bivariate table/Cross tabulation
• You can describe a single variable using the mean/median
• You can also describe the relationship between categorical (x/y) variables
• Cross tabulation – tells you how attributes of one variable intersect with attributes of another
• All bivariate tables have rows (y variable) and columns (x variable)
• E.g. homicide and society of joiners
• Cross tab – the independent variable is always the column
• The table shows some relationship between the two
• Both are categorical data
• 24 countries are low in joiners and low in homicide – 24 countries intersect there
• If you produce a cross tab, be careful where the percentages come from: cell count / column total × 100

Interpreting percent
• A cross tab is like a scatter plot – you can eyeball the distribution in a scatter plot
• When you look at a cross tab you can get some sense of variation
• How do you know the independent variable matters for the dependent? If the values don't change across rows, it's not important
• How are the data clustered?
• Where are the data clustered – intersecting in the same ordered pairs?
• Low and low – same ordered pairs
• Where is my data situated?
• Eyeballing the table is problematic
• If everything was 50% in all cells – that's no relationship
• How strong is the relationship? Can't tell from the cross tab alone
• Lambda, gamma, chi-square – all helpful
• These are measures of association
• Lambda – nominal
• Gamma – ordinal
• They are related to r² because they are based on proportional reduction in error (PRE)
• They all refer to the same error – when we add a variable, are we reducing error in explaining variation (outcome)?
• Like the Pearson correlation coefficient – PRE ranges from -1 to +1
• No relationship = 0
• +1 = perfect positive
• -1 = perfect negative
• PRE: total error when the independent variable is ignored (E1, no x) minus error when x is present (E2), relative to the error without x: (E1 - E2) / E1
• This formula looks at the errors in relation to each other
• Want error with x to be as small as possible

Abortion
• Opposite of error is the mode
• What doesn't fit the hypothesis is error – not in the mode
• E1, when you don't account for x, is 84
• E2, when you account for x, is smaller – 72
• The smaller E2 is, the greater PRE will be
• Value of .14 is saying something important about the relationship
• Positive – the more children they have, the more likely to support abortion
• But the number is close to 0 – it's a fairly weak relationship
• .5 is moderate
• A smaller E2 would make the relationship stronger

Protest Participation
• A lot of variables are yes and no – binary
• Cannot use OLS
• Can't interpret binary as "every one-unit increase in x changes y by ___"
• There is a way around it – tal
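
The two calculations described in these notes, column percentages in a cross tab and PRE in the abortion example, can be sketched in a few lines of Python. This is only an illustration, not the course's own method: the function names `column_percentages` and `pre` are made up here, and the cross-tab counts are hypothetical except for the 24 in the low/low cell. The PRE formula (E1 - E2) / E1 is assumed because it reproduces the .14 reported for E1 = 84 and E2 = 72.

```python
def column_percentages(table):
    """Convert cell counts to column percentages: cell / column total * 100.

    `table` is a list of rows; each row holds one count per column.
    """
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100.0 * cell / total for cell, total in zip(row, col_totals)]
        for row in table
    ]


def pre(e1, e2):
    """Proportional reduction in error: (E1 - E2) / E1."""
    return (e1 - e2) / e1


# Hypothetical 2x2 counts: rows = homicide (low, high), columns = joiners (low, high).
# Only the 24 in the low/low cell comes from the notes; the other cells are invented.
counts = [
    [24, 10],
    [6, 20],
]
percents = column_percentages(counts)
print(percents[0][0])  # 24 / (24 + 6) * 100 = 80.0

# Abortion example from the notes: E1 = 84 (x ignored), E2 = 72 (x included).
print(round(pre(84, 72), 2))  # 0.14
```

Each column of `percents` sums to 100, which is a quick check that the percentages were taken within the independent variable's categories rather than across rows.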