PSY248 Lecture Notes - Lecture 4: Error Bar, Multicollinearity, Linear Regression

purpledinosaur123

2 Jun 2018

PSY248: Week 4 – multiple regression

→ What we have done so far is simple linear regression, where we fit a straight line to describe the relationship between a predictor variable and an outcome variable

→ Y = a + bX + e

• It is possible to have multiple predictor variables (so multiple X’s)

• Y = dependent variable

• You will still go through the 3 stages: univariate, bivariate, perform regression

& check assumptions

• Only difference is we have more than one predictor
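
Written out for the two-predictor case used in these notes, the simple regression equation above extends as sketched below. The subscripted coefficients b_1 and b_2 and the count p are notation assumed here for illustration, not taken from the lecture slides.

```latex
% Two predictors (X_1 = life events, X_2 = SES)
Y = a + b_1 X_1 + b_2 X_2 + e

% General case with p predictors
Y = a + \sum_{j=1}^{p} b_j X_j + e
```

Each b_j is read as the expected change in Y for a one-unit change in X_j with the other predictors held constant.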

Multiple regression for mental impairment

• Outcome (DV) is a measure of Mental Impairment, general psychiatric

symptoms

• Possible IVs are the two predictor variables:

o Life events score

o Socioeconomic status

• We start to recognize that our variable of primary interest may not be the only

relevant IV

• DV is a measure of general mental impairment (psychiatric symptoms

including depression and anxiety)

• The two IVs used here are X1 = life events and X2 = socioeconomic status

• Life events refers to score on a life events index, including both number of life

events and severity of events experienced in the past 3 years

• Life events is our IV of primary interest. The research question is whether more frequent (and severe) life events predict higher mental impairment

Steps in doing this multiple regression

• 1. Recognise problem as a multiple regression

• 2. Remember RQ

• 3. Univariate data description – graphical and numerical

• 4. Bivariate Graphical data description

• 5. Produce correlation matrix (Pearson’s r)

• 6. Fit full model (if appropriate to the RQ from step 2)

• 7. Reduce Full Model (if appropriate)

• 8. Fit Final Model and Report

Steps 1-2:

• Recognize problem as a multiple regression

o 1 numeric DV, 2 numeric IVs

• Consider theory: more frequent and severe life events should generate more

psychiatric symptoms

• Write (draw) RQ: do life events and socioeconomic status together predict

mental impairment? If so, are both predictors required?

• Y = mental impairment

• X1 = life events

• X2 = SES
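
As a rough illustration of what fitting a model with Y, X1 and X2 involves, the sketch below solves the least-squares normal equations in plain Python. The function name `fit_two_predictor_model` and all of the scores are made up for this example (they are not the lecture's data); in the course itself the fit is done in SPSS.

```python
def fit_two_predictor_model(y, x1, x2):
    """Least-squares estimates (a, b1, b2) for Y = a + b1*X1 + b2*X2 + e."""
    # Design matrix rows: intercept column, X1, X2.
    rows = [(1.0, u, v) for u, v in zip(x1, x2)]
    # Normal equations (X'X) b = X'y, stored as an augmented 3x4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(3):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * b for a, b in zip(aug[k], aug[col])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))


# Hypothetical scores, generated without noise from Y = 20 + 0.1*X1 - 0.05*X2
# so that the fit can be checked by eye.
x1 = [10, 40, 35, 60, 80, 30]   # stand-in for life events
x2 = [80, 30, 70, 40, 60, 50]   # stand-in for SES
y = [20 + 0.1 * u - 0.05 * v for u, v in zip(x1, x2)]

a, b1, b2 = fit_two_predictor_model(y, x1, x2)
print(round(a, 3), round(b1, 3), round(b2, 3))
```

Because the made-up outcome contains no error term, the estimates should recover the generating values (about 20, 0.1 and -0.05); real data would not fit this cleanly.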

Understand population and data

• Understand sampling population: Florida adults

• Understand unit of analysis: general community members

• Check all IVs numeric → yes

• Ordinal variables checked before upgrading → no ordinal variable

• Consider the case-to-predictor (N-to-IV) ratio rule of thumb:

o Take the number of IVs (p) and multiply it by a rule-of-thumb factor (5 or 10)

o N > 5*p is the bare minimum

o N > 10*p is more desirable

o Here we have a sample of N = 40 > 10*2 = 20
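
The sample-size rule of thumb can be checked mechanically. This is a minimal sketch; the function name `sample_size_check` and its labels are invented for illustration.

```python
def sample_size_check(n_cases: int, n_predictors: int) -> str:
    """Classify a sample size against the N > 5p and N > 10p rules of thumb."""
    if n_cases > 10 * n_predictors:
        return "desirable"
    if n_cases > 5 * n_predictors:
        return "bare minimum"
    return "too small"


# The example in these notes: N = 40 cases, p = 2 predictors, and 40 > 20.
print(sample_size_check(40, 2))  # prints "desirable"
```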

3. Univariate data description

• Produce graphical summaries (histogram, error bar plot)

• Comment on distributions for BOTH IVs (central tendency, variability, skew,

kurtosis etc.)

• Summarise with appropriate numerical values

• Write global summary statement of what you have found

• SPSS menu: Graphs → Legacy Dialogs → Histogram (tick "Display normal curve")

• Descriptive statistics

o You can see the means of the three variables and the sample sizes (N)

o Listwise N = number of cases that have valid values for all variables in

the table

• The three variables were approximately normally distributed. The dependent variable, Mental Impairment, ranged from 17 to 41 (mean 27, SD 5.5). Of the two predictors, life events ranged from 3 to 97 and parents' years of education ranged from 3 to 96
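
To make the descriptive statistics table concrete, the sketch below computes N, mean, SD and range per variable, plus the listwise N. The five cases are hypothetical values chosen for illustration (with `None` marking a missing score), not the lecture's dataset, and `describe` is an invented helper name.

```python
import math

# Hypothetical cases; None marks a missing value.
cases = [
    {"impairment": 17, "life_events": 46, "ses": 84},
    {"impairment": 25, "life_events": None, "ses": 50},
    {"impairment": 31, "life_events": 70, "ses": 20},
    {"impairment": 41, "life_events": 89, "ses": None},
    {"impairment": 27, "life_events": 33, "ses": 61},
]


def describe(values):
    """Return N, mean, sample SD, minimum and maximum of the non-missing values."""
    valid = [v for v in values if v is not None]
    n = len(valid)
    mean = sum(valid) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in valid) / (n - 1))
    return {"n": n, "mean": mean, "sd": sd, "min": min(valid), "max": max(valid)}


variables = ["impairment", "life_events", "ses"]
summary = {name: describe([c[name] for c in cases]) for name in variables}

# Listwise N: cases with a valid value on every variable in the table.
listwise_n = sum(all(c[name] is not None for name in variables) for c in cases)

for name in variables:
    print(name, summary[name])
print("Listwise N:", listwise_n)
```

Here each variable has its own N (5, 4 and 4), while the listwise N is 3 because two cases are missing a score on one variable.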

4. Bivariate data description

• Plot DV against each IV

• Comment on scatterplots (7 points)

• Consider outliers

• Write global summary statement of what you have found

• The scatterplots for mental impairment against life events and SES show a positive and a negative linear relationship respectively. Both relationships appear to be of low-to-moderate strength, so only low correlations are expected. The graphs show no unusual characteristics, although there may be one outlier in the relationship between MI and SES

5. Produce correlation matrix

• Consider collinearity, multicollinearity

• Consider statistical significance of DV correlation with each IV

• Consider correlations between the two IVs

• Write summary statement on what you found
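
A correlation matrix is just Pearson's r computed for every pair of variables. The sketch below shows the idea in plain Python; the scores are hypothetical and `pearson_r` is a hand-written helper, whereas in the course the matrix comes from SPSS.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for illustration only.
data = {
    "MI": [20, 24, 27, 30, 35, 26],
    "LifeEvents": [10, 40, 35, 60, 80, 30],
    "SES": [80, 60, 70, 40, 30, 50],
}

names = list(data)
matrix = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

The first row and column (DV with each IV) answer the significance question, while the off-diagonal entry between the two IVs is the one to inspect for collinearity.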

Collinearity (multicollinearity) occurs when two or more IVs are so correlated that

one can be predicted (almost) exactly from one or more of the others

• Made possible by a combination of

o Multiple IVs

o Non-orthogonality due to observational design

• Strong correlations between IVs are required for it to occur

• Multicollinearity exists whenever an IV can be exactly/nearly calculated from

a linear combination of other IVs

• Indicators

o Large correlations among IVs

o Large changes in coefficients and/or SE, when a new IV is added to

the model

Scatterplot of IVs to look at correlation/collinearity

• The scatterplot of the two IVs confirms a weak correlation (look at diagram), thus no collinearity

• This plot and the earlier correlation work for 2 IVs, but if there are more than 2 IVs there may be a more complex pattern of IV correlations

Correlation table:

• You can see that mental impairment (DV) is positively correlated with life

events

• But negatively correlated with SES

• Correlation between life events and SES = positive but very small

• Correlation of each X with Y (IV with DV) = both moderate and significant (at the 0.05 level)

• Thus, when the null hypothesis is true, the chance of obtaining a Pearson correlation at least as extreme as .372 from a sample of 40 is 1.8% (p = .018) → ask yourself whether this is sufficiently unlikely

• "Sufficiently unlikely" is defined by whatever significance level you choose (e.g. 5%)
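
The p-value quoted for r = .372 with N = 40 can be reproduced from the standard t-test for a correlation, t = r*sqrt(N - 2)/sqrt(1 - r^2) with N - 2 degrees of freedom. The sketch below uses only the standard library and integrates the t density numerically; the helper names are invented, and in practice a statistics package (for example `scipy.stats.pearsonr`) would be used instead.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def pearson_p_value(r, n, steps=2000):
    """Two-tailed p-value for H0: rho = 0 (requires |r| < 1 and an even `steps`)."""
    df = n - 2
    t = abs(r) * math.sqrt(df) / math.sqrt(1 - r * r)
    # Simpson's rule for the area under the t density between 0 and |t|.
    h = t / steps
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # Both tails together are what remains outside (-|t|, |t|).
    return 1 - 2 * area


p = pearson_p_value(0.372, 40)
print(round(p, 3))  # should be close to the .018 reported in the notes
print("significant at 5%:", p < 0.05)
```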

Rule of thumb – if the correlation between two predictors is above 0.7 (in absolute value), collinearity is possible. If it is above 0.8, we have definite collinearity (we find these statistics in the Pearson Correlation table)

• A change in the sample will help change stat to collinearity (e.g. changing

from 40 to 40,000)
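
The 0.7 / 0.8 rule of thumb can be written as a small check. This sketch assumes the rule applies to the size of the correlation regardless of sign (a strong negative correlation between predictors is just as problematic); the function name and labels are invented.

```python
def collinearity_flag(r: float) -> str:
    """Apply the 0.7 / 0.8 rule of thumb to a correlation between two predictors."""
    size = abs(r)  # assumption: the sign of the correlation does not matter
    if size > 0.8:
        return "definite collinearity"
    if size > 0.7:
        return "possible collinearity"
    return "no collinearity flagged"


# Correlation between life events and SES reported in these notes.
print(collinearity_flag(0.12))  # prints "no collinearity flagged"
```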

But… bivariate correlations may not be sufficient to identify collinearity

• One IV may be a non-obvious linear combination of several other IVs

• SPSS calculates the tolerance and the variance inflation factor (VIF); these are found in the Coefficients table under 'Collinearity Statistics'. They show the degree to which each IV is correlated with the other IVs (how much variance it shares with the other predictors)
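
Tolerance for an IV is 1 minus the R-squared from regressing that IV on the other IVs, and VIF is its reciprocal. With only two IVs, that R-squared is simply the squared correlation between them, so both statistics can be sketched directly from r. The function name below is invented for illustration.

```python
def tolerance_and_vif(r_between_ivs: float):
    """Tolerance and VIF for a model with exactly two predictors."""
    r_squared = r_between_ivs ** 2   # variance the two IVs share
    tolerance = 1 - r_squared        # variance NOT shared with the other IV
    vif = 1 / tolerance              # variance inflation factor
    return tolerance, vif


# Correlation between life events and SES reported in these notes: r = 0.12.
tol, vif = tolerance_and_vif(0.12)
print(round(tol, 4), round(vif, 4))  # prints 0.9856 1.0146
```

A tolerance near 1 (VIF near 1) means the predictors share almost no variance, which agrees with the conclusion that collinearity is not a concern here. With more than two IVs the R-squared has to come from an actual regression of each IV on the rest.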

Summary of step 5 (correlation)

• Both predictors are statistically significantly correlated with the DV. Life

events is positively correlated with Mental impairment (r = 0.37; p = 0.018)

while SES has a correlation of similar degree, but negative, with Mental

impairment (r = -0.40, p = 0.011). The two predictors are very weakly

correlated (r = 0.12, p = 0.45) and thus give us no reason to expect

collinearity.
