Lecture

March 24

Sociology
SOC 325
Elizabeth Quinlan
Winter

Soc 325 March 24 2014

Lab next Monday in Arts 40, basement computer lab!
-----------------------------------------------------------

• Correlation (Pearson's r): a measure of the strength of a relationship between 2 interval-ratio variables
• Regression: a technique that gives us the form of the relationship between 2 interval-ratio variables, X and Y, that best predicts values of Y based on values of X
   o Regression is not symmetrical in X and Y, so if you reverse the independent and the dependent variables you will get a different picture
   o General rule for sorting out the form: which comes first in time?
      ▪ First: independent
      ▪ Second: dependent

Assumptions of correlation and regression
1. Both variables are normally distributed
2. The relationship is linear and homoscedastic
   o Homoscedastic: the general shape of the scatterplot forms a cigar
   o The further the scatterplot is from that shape, the more tentative we have to be in the language about conclusions
      ▪ "we see the following with caution ….. "

• Creating scatterplots in SPSS
   o https://www.youtube.com/watch?v=H7Fz1dCiLZk

SPSS
• Correlation:
   o Analyze --> Correlate --> choose Bivariate --> put in variables --> click Pearson r (which is the default) --> produces a correlation matrix (upper right mirrored in bottom left) --> Pearson's r = .389, so there is a strong relationship (larger than .3) --> r-squared = 0.151
      ▪ Variables: job tenure and usual hourly wages
• Regression:
   o Analyze --> Regression --> Linear --> enter independent and dependent separately (hourly wages = dependent, job tenure = independent) --> constant = a
      ▪ Y = a + bX
      ▪ Y = constant + (second line)X
      ▪ Y = 18.22 + 0.051X
      ▪ a = rate of pay for someone with no job tenure
      ▪ Average starting wage
      ▪ Value of Y when X = 0
      ▪ b = increase in the dependent variable for every one-unit increase in the independent variable
      ▪ About 5 cents for every additional month on the job
      ▪ We can use the regression equation to predict the hourly wage for someone at any number of months
      ▪ Example, someone who worked for 30 months
      ▪ Y = 18.22 + 0.051(30) = 19.75
      ▪ We don't need to have a person in the data set to predict what their hourly wage will be
   o Model Summary: shows r and r-squared --> we can use r-squared as a PRE measure
      ▪ The adjusted r-squared sometimes differs when we work with multiple regression (more than one independent variable)
   o Std. error of the estimate: the standard deviation of the residuals, i.e. how far observed values of Y typically fall from the values predicted by the regression line
   o ANOVA: reports the F-statistic (test statistic for regression) to determine if the coefficients are significant or not
      ▪ Significance: less than 0.05
      ▪ Means that the coefficients of the regression model are statistically significantly different from zero
      ▪ The F-statistic is the ratio of the 2 numbers from the column before (mean square top / mean square bottom, i.e. regression / residual)
      ▪ Sum of squares total = error one
      ▪ Sum of squares residual = error two
      ▪ Error one - error two = regression (explained) sum of squares
      ▪ r-squared = explained/total

Chapter 14:
• Three and more variables
   o Multivariate analysis
1. Partial correlation
      ▪ Types of outcomes we can expect
      ▪ Will the relationship between X and Y retain its strength and direction after introducing another variable, Z?
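
To check the arithmetic in the wage example from the regression notes above, here is a minimal Python sketch. The coefficients (18.22 and 0.051) and r = .389 are the values quoted from the SPSS output in these notes. The function names `predict_hourly_wage`, `r_squared` and `anova_summary` are made up for illustration and are not part of SPSS.

```python
# Coefficients quoted in the notes: Y = 18.22 + 0.051X
INTERCEPT = 18.22  # a: predicted hourly wage at 0 months of job tenure
SLOPE = 0.051      # b: change in hourly wage per additional month of tenure


def predict_hourly_wage(months_of_tenure):
    """Hypothetical helper: predicted hourly wage from Y = a + bX."""
    return INTERCEPT + SLOPE * months_of_tenure


def r_squared(r):
    """Proportion of variation in Y explained by X (a PRE measure)."""
    return r ** 2


def anova_summary(ss_total, ss_residual, df_regression, df_residual):
    """Hypothetical helper that rebuilds the ANOVA quantities from the notes.

    ss_total    = "error one" (total sum of squares)
    ss_residual = "error two" (residual sum of squares)
    """
    ss_regression = ss_total - ss_residual        # explained sum of squares
    ms_regression = ss_regression / df_regression  # mean square, top row
    ms_residual = ss_residual / df_residual        # mean square, bottom row
    return {
        "ss_regression": ss_regression,
        "r_squared": ss_regression / ss_total,     # explained / total
        "f_statistic": ms_regression / ms_residual,
    }


if __name__ == "__main__":
    # Someone with 30 months of job tenure, as in the lecture example
    print(round(predict_hourly_wage(30), 2))
    # r-squared for the Pearson's r reported in the correlation matrix
    print(round(r_squared(0.389), 3))
```

To use `anova_summary`, pass in the sums of squares and degrees of freedom read off the ANOVA table in the SPSS output. The function only repeats the subtraction and the two ratios described in the notes.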
