17 Apr 2012

March 26th 2012

Crim 320

Research Methods in Criminology (Crim 320)

Bivariate Correlation and Partial correlation

Bivariate Correlation

- Bivariate correlation

- Karl Pearson’s Product-Moment Correlation (r)

Bivariate Correlation

- Topics

o What is a correlation?

o When should it be used, and when should it not?

o What are the underlying assumptions?

o How to conduct a correlation?

o How to test a simple hypothesis?

Context

- Two continuous (ratio/interval) variables

- Association between two variables (rxy)

- Exploration purpose

- Investigate relationship before conducting multivariate analysis

The mean score is central to the calculation, so you can’t use categorical variables.

One hypothesis is that people with a lower IQ are more likely to be convicted, but do not report more delinquency.

Having more delinquent friends may be associated with more self-reported delinquency. But a third variable could be involved: the participants may also have a delinquent family member.

Larger family size -> delinquency. Possible third variable: supervision.

Bivariate Correlation

- Research question

o Is there a relationship between the age of onset of the criminal career and the volume of crime committed?

- Hypothesis:

o H1: The earlier the age of onset of delinquency, the more crimes will be committed

- Null hypothesis:

o H0: The age of onset of delinquency is not related to the number of crimes committed
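
One common way to test H0 is to convert r into a t statistic with n - 2 degrees of freedom. The sketch below assumes that approach (the notes do not say which test the lecture used), and the values of r and n are made up for illustration.

```python
import math

def r_to_t(r, n):
    """Convert a Pearson correlation into a t statistic with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 cases")
    if abs(r) >= 1:
        raise ValueError("t is undefined for a perfect correlation")
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical result: r = -.20 between age of onset and number of convictions, n = 100.
t_value = r_to_t(-0.20, 100)
degrees_of_freedom = 100 - 2
print(f"t({degrees_of_freedom}) = {t_value:.2f}")
```

The t value is then compared with the critical value from a t table for the chosen alpha level. If it is more extreme, H0 is rejected.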

[Figure: Age of onset and delinquency. X axis: age of onset; Y axis: number of convictions.]

The direction of the bivariate correlation, positive or negative?

The strength of the bivariate correlation

- The coefficient varies from -1.00 to 1.00

- Correlation is a standardized measure

o Scores are standardized (divided by the standard deviation)

- Easier to interpret than covariance

o The two variables are often measured in different metrics
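
A minimal sketch of why the standardized coefficient is easier to read than the covariance. The data are made up for illustration, and the function names are my own. Changing the metric of X (years to months) changes the covariance but leaves r untouched.

```python
import math

def covariance(x, y):
    """Sample covariance (n - 1 in the denominator)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(x, x))
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

# Hypothetical data: age of onset (years) and number of convictions.
age_years = [10, 12, 13, 15, 16, 18]
convictions = [9, 7, 8, 4, 3, 2]
age_months = [a * 12 for a in age_years]

cov_years = covariance(age_years, convictions)
cov_months = covariance(age_months, convictions)
r_years = pearson_r(age_years, convictions)
r_months = pearson_r(age_months, convictions)

print(f"covariance: {cov_years:.2f} (years) vs {cov_months:.2f} (months)")
print(f"r:          {r_years:.2f} (years) vs {r_months:.2f} (months)")
```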

Guidelines for interpreting the strength of the relationship:

- .00=none

- .10=weak

- .20=low

- .40=moderate

- .60=substantial

- .80=strong

- .90=very high

- 1.00=perfect relationship <- problematic when between an IV & DV; good when two items are measuring the same thing.
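
The guidelines can be turned into a small lookup. This is only a sketch: it assumes each listed value is the lower bound of its category (the slide lists single values, not ranges), and it uses the absolute value so that negative correlations get the same labels.

```python
def describe_strength(r):
    """Label the strength of a correlation, treating each guideline value as a lower bound."""
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation must lie between -1.00 and 1.00")
    guidelines = [
        (1.00, "perfect"),
        (0.90, "very high"),
        (0.80, "strong"),
        (0.60, "substantial"),
        (0.40, "moderate"),
        (0.20, "low"),
        (0.10, "weak"),
    ]
    for cutoff, label in guidelines:
        if size >= cutoff:
            return label
    return "none"

for r in (-0.15, 0.20, 0.65, 0.95):
    print(r, describe_strength(r))
```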

What is the average correlation between a risk factor and a measure of crime and delinquency? It usually falls between .10 and .20, and .20 is actually very good. Why so low? Because many other factors are involved. Therefore, we do not see a STRONG association between two indicators.

The higher the correlation, the more likely it is that you are basically measuring the same thing. You should be worried if your correlation is TOO strong. If you are measuring psychopathy, you expect a very high correlation between all 10 indicators (items) of psychopathy, because they are tapping into the same thing. But do not say that they predict or explain the other indicators (items).

If your hypothesis is supported and you reject the null hypothesis, you should find something between .10 and .20. If you find something stronger than that, ... When you look at the correlation between the delinquency of friends and your own delinquency, the two should be correlated. Hirschi was a big critic of that explanation: there is a good chance that you committed your crimes with your friends, so the correlation does not explain much as a predictor. It just shows that most people commit crime with others, whether forced or intentional.

When you run a correlation, you cannot determine where the association is coming from. It does not tell you what is really happening, and it does not tell you the direction of the association: whether the friends are producing the high self-reported delinquency (SRD) or the other way around.

Coefficient of determination (r²), i.e. r squared

- Percentage of variation of Y accounted for by X

- The square of Pearson’s r, multiplied by 100

- rxy=.60 (substantial)

o Therefore, 36% of variance of Y accounted for by X

- rxy=.30 (low-moderate)

o Therefore, 9% of variance of Y accounted for by X

- rxy=.10 (weak)

o Therefore, 1% of variance of Y accounted for by X
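
The arithmetic is just r squared times 100. A short sketch (the function name is my own) for the three coefficients listed above:

```python
def variance_explained_pct(r):
    """Coefficient of determination expressed as a percentage: r squared times 100."""
    return (r ** 2) * 100

for r in (0.60, 0.30, 0.10):
    print(f"r = {r:.2f} -> {variance_explained_pct(r):.0f}% of the variance of Y accounted for by X")
```

Note how quickly the percentage drops: a correlation of .10, typical for a single risk factor, accounts for only about 1% of the variance.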

Assumptions:

- Both variables are normally distributed

- Absence of outliers

- Linear relationship between variables

- Measurement error is minimal

- Unrestricted variance

Both variables are normally distributed

- Inspect the level of skewness

- Divide skewness by its standard error (Z score)

- When the Z score is close to 0, the distribution is approximately normal

- When the Z score > 3.00, skewness is significant at roughly p = .001, indicative of a significant departure from normality
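
A sketch of the skewness check, assuming the adjusted sample skewness and the standard-error formula that statistics packages commonly report (the notes do not give the formulas). The conviction counts are made up for illustration.

```python
import math

def skewness_z(values):
    """Return (skewness, standard error, Z) for a list of scores."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 scores")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("all scores are identical; skewness is undefined")
    # Adjusted sample skewness (assumption: the version most packages print).
    skew = (m3 / m2 ** 1.5) * math.sqrt(n * (n - 1)) / (n - 2)
    # Standard error of skewness; it depends only on the sample size.
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew, se, skew / se

# Hypothetical counts of convictions: most people have few, one person has many.
convictions = [0, 0, 0, 1, 1, 1, 2, 2, 3, 15]
skew, se, z = skewness_z(convictions)
print(f"skewness = {skew:.2f}, SE = {se:.2f}, Z = {z:.2f}")
if abs(z) > 3.00:
    print("significant departure from normality")
```

Count variables such as number of convictions are typically skewed to the right, which is why this check matters before running a correlation.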

Absence of outliers

- Extreme scores that may exaggerate/minimize the presence of an association
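
A small illustration of how a single extreme case can exaggerate an association. The data are made up, and the helper is a plain implementation of Pearson's r.

```python
import math

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Six hypothetical cases with a weak-to-low association.
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 2, 5, 3]
r_without = pearson_r(x, y)

# Add one extreme case that is far from the rest on both variables.
r_with = pearson_r(x + [20], y + [20])

print(f"r without the outlier: {r_without:.2f}")
print(f"r with the outlier:    {r_with:.2f}")
```

An outlier placed against the trend would have the opposite effect and could hide a real association, so it is worth inspecting a scatterplot before interpreting r.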