# SOCY211 Week 11, Lecture 1


Queen's University, Sociology, SOCY 211, Carl Keane, Winter

Bivariate Regression
(Continuation/Review from last week)
- With bivariate regression, you first construct a scatter plot of the data
- Computer programs will fit a line to the data on the scatter plot
- A positive association can be strong or weak, and a negative association can be strong or weak
- No relationship: the dots are spread out in no particular pattern
- The linear regression line (least squares line) fits the data; the technique is referred to as OLS (Ordinary Least Squares) regression
- We predict scores on Y according to the line that is drawn
- Residuals (errors) are the cases that do not fall on the line; they are the vertical distances between the observed points and the line
- Regression equation: Y = a + bx
- It is possible to use variables at the ordinal level of measurement
- You cannot use a dichotomous variable as the dependent variable with OLS
- Review of example from last week: see last week's notes
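The bivariate regression review above can be sketched numerically. This is a minimal illustration with made-up numbers (the schooling/income variables are hypothetical), computing the OLS slope and intercept directly from their defining formulas:

```python
import numpy as np

# Hypothetical data: years of schooling (x) and income in $1000s (y).
x = np.array([8, 10, 12, 12, 14, 16, 16, 18], dtype=float)
y = np.array([22, 25, 30, 28, 35, 40, 38, 45], dtype=float)

# OLS slope: b = covariance(x, y) / variance(x)
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
# Intercept: a = mean(y) - b * mean(x), so the line passes through the means.
a = y.mean() - b * x.mean()

predicted = a + b * x        # points on the fitted line: Y = a + bx
residuals = y - predicted    # errors: vertical distance of each case from the line

print(round(a, 3), round(b, 3))
```

With an intercept in the model, the residuals always sum to zero; that is what "least squares" buys you.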
MULTIPLE REGRESSION
- Best fitting straight line to summarize the relationship between a dependent variable and in this case,
two independent variables
- Y = a + b1x1 + b2x2 + …
- Y = dependent variable score that we want to predict
- a = Y Intercept
- b1 = the partial slope of the linear relationship between the first independent variable and Y
- b2 = the partial slope of the linear relationship between the second independent variable and Y
- x1 = the first independent variable score
- x2 = the second independent variable score
- b1 is associated with x1, therefore b1 goes with x1
- The subscript for the slope (b1, b2, b3, etc.) identifies the independent variable it is associated with. (The independent variables are sometimes called "regressors" or "predictors")
- PARTIAL SLOPES show the amount of change in the dependent variable (Y) for a one unit change in
the independent variable (x), while controlling for the effects of the other independent variables in the
equation.
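The partial slopes above can be computed with a short numpy sketch. The data are made up for illustration; the design matrix gets a column of ones so the intercept a is estimated along with b1 and b2:

```python
import numpy as np

# Hypothetical data: predict income (y) from years of schooling (x1)
# and hours worked per week (x2).
x1 = np.array([8, 10, 12, 14, 16, 12, 18, 14], dtype=float)
x2 = np.array([20, 35, 40, 38, 45, 30, 50, 42], dtype=float)
y  = np.array([20, 28, 33, 35, 42, 29, 48, 37], dtype=float)

# Design matrix: a column of 1s for the intercept a, then x1 and x2.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Ordinary least squares: solve for (a, b1, b2) minimizing squared residuals.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef

# b1 is the partial slope of x1: the change in y for a one-unit change
# in x1 while controlling for x2 (and symmetrically for b2).
predicted = X @ coef
```

Note that b1 here generally differs from the slope you would get by regressing y on x1 alone, precisely because it controls for x2.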
- Multiple regression allows us to control for virtually any number of independent variables
- Multiple regression is useful for isolating the separate effects and predicting scores on the dependent
variable
- In many situations it is difficult to determine the relative importance of the different independent variables from this equation
- When the independent variables are measured in different units and scales, it is hard to know which variables are the important ones
o It can be difficult to know which variable is having the greater impact
- We can make all of our independent variables comparable by converting them to a common scale
- We can do this by converting all the variable scores to Z SCORES
- Each distribution of scores for each variable will then have a mean of zero and a standard deviation of one
- Now we are sure we are comparing the same thing, which makes comparison much easier: we have standardized our variables
- STANDARDIZED PARTIAL SLOPES = BETA WEIGHTS = BETA COEFFICIENTS
- They describe the effect of the independent variable on the dependent variable
- The standardized beta coefficient reports the amount of standard deviation change in the dependent
variable given one standard deviation change in the independent variable.
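The z-score conversion described above is a one-liner. A minimal sketch with a hypothetical variable, confirming the standardized scores have mean 0 and standard deviation 1:

```python
import numpy as np

def z_scores(v):
    # Standardize: subtract the mean, divide by the standard deviation.
    return (v - v.mean()) / v.std()

# Hypothetical variable measured in years.
age = np.array([20, 25, 30, 35, 40], dtype=float)
z = z_scores(age)

# After standardizing, every variable is on the same scale, so slopes
# computed on z-scores (beta weights) are directly comparable.
print(round(z.mean(), 10), round(z.std(), 10))
```

Running a regression on z-scored variables yields the standardized partial slopes (beta weights) directly.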
- MULTIPLE CORRELATION COEFFICIENT (R): this statistic allows us to say how much impact the independent variables are having together
- The value of R² represents the proportion of the variance in the dependent variable (Y) that is explained by all the independent variables in our model
- Adjusted R²: it's considered to be a slightly better estimate of the combined effects
o It takes into consideration the number of independent variables that are being used
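The R² and adjusted R² definitions above can be verified with a few lines of numpy. The observed and predicted values are hypothetical, standing in for any fitted model with k = 2 independent variables:

```python
import numpy as np

# Hypothetical fit: observed y and the values predicted by a regression
# with k = 2 independent variables on n = 8 cases.
y = np.array([20, 28, 33, 35, 42, 29, 48, 37], dtype=float)
predicted = np.array([21, 27, 32, 36, 43, 30, 46, 37], dtype=float)
n, k = len(y), 2

ss_residual = np.sum((y - predicted) ** 2)   # unexplained variation
ss_total = np.sum((y - y.mean()) ** 2)       # total variation in y

r_squared = 1 - ss_residual / ss_total       # proportion of variance explained
# Adjusted R^2 penalizes for the number of independent variables used,
# so adding a useless predictor does not inflate the fit.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
```

Adjusted R² is always at most R², and the gap grows as more predictors are added relative to the sample size.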
ASSUMPTIONS FOR MULTIPLE REGRESSION
- Must be met before it can be used
1. Each independent variable is assumed to be related in a linear fashion to the dependent variable.
2. The variables are normally distributed.
3. The sample size is large enough (minimum of 50 cases).
4. The independent variables are not highly correlated with each other (high correlation among them is called COLLINEARITY or MULTICOLLINEARITY).
5. The effects of the independent variables are additive, with no interaction between the variables.
Y = a + b1x1 + b2x2 + ....
An interaction means something extra happens when we combine two variables, such as age combined with gender; additivity assumes this does not occur
6. The variables are measured at the interval/ratio level, but we often use ordinal-level variables as well (dummy variables).
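Assumption 4 (no multicollinearity) is easy to check by correlating the predictors with each other. A sketch with simulated data, where two predictors are deliberately near-duplicates:

```python
import numpy as np

# Simulated predictors: x1 and x2 are nearly duplicates of each other,
# while x3 is unrelated noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)   # almost identical to x1
x3 = rng.normal(size=200)

# Pairwise Pearson correlations among the independent variables
# (np.corrcoef treats each row as one variable).
r = np.corrcoef([x1, x2, x3])
print(np.round(r, 2))
# A very high correlation between two predictors (here x1 and x2) signals
# collinearity: their separate partial slopes become unstable, because the
# model cannot tell which of the two is doing the work.
```

In practice you would drop or combine one of the offending predictors before interpreting the partial slopes.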
Example for Multiple Regression
Total Income regressed on Age, Sex, Years of Schooling, and Number of Hours Worked Per Week
Correlation Matrix (Pearson correlations, two-tailed significance in parentheses, N = 1500; the matrix is truncated in the source)

| | age in years | hrs work for pay or self-emp in ref wk | sex | total income in $ | total years of schooling |
|---|---|---|---|---|---|
| age in years | 1 | -.189 (.000) | -.020 (.437) | .170 (.000) | -.255 (.000) |
| hrs work for pay or self-emp in ref wk | -.189 (.000) | 1 | .229 (.000) | .441 (.000) | .329 (.000) |
| sex | -.020 (.437) | .229 (.000) | 1 | .255 (.000) | -.027 (.291) |
