Psychology

PSYC 2030

Krista Phillips

Fall

Chapter 11: Correlating Variables
What Are Different Forms of Correlations?
Researchers view variables not in isolation, but as systematically and meaningfully associated
with, or related to, other variables
Correlation coefficient -> a single number that can be used to indicate the strength of
association between two variables (X and Y) – this chapter elaborates on this
In particular, we describe linearity -> correlations that reflect the degree to which mutual
relations between X and Y resemble a straight line
The Pearson r -> short for Karl Pearson’s product-moment correlation coefficient, is the
correlation coefficient of choice in such situations
Values of r of 1.0 (positive or negative) indicate a perfect linear relation (a fixed change in one
variable is always associated with a fixed change in the other variable), whereas 0 indicates that
neither X nor Y can be predicted from the other by use of a linear equation
A positive r tells us that an increase in X is associated with an increase in Y, whereas a negative r
indicates that an increase in X is associated with a decrease in Y
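To make these three benchmark values concrete, here is a minimal sketch using made-up numbers (not data from the text). The helper name `pearson_r` is hypothetical, and it assumes the standard deviation is computed with N in the denominator.

```python
import math

def pearson_r(x, y):
    """Pearson r as the mean product of z scores (SDs computed with N)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x) / n)
    sy = math.sqrt(sum((v - my) ** 2 for v in y) / n)
    return sum((a - mx) / sx * (b - my) / sy for a, b in zip(x, y)) / n

x = [1, 2, 3, 4, 5]                                  # made-up X scores
r_up = pearson_r(x, [2 * v + 3 for v in x])          # Y rises by a fixed 2 per unit of X
r_down = pearson_r(x, [-2 * v + 3 for v in x])       # Y falls by a fixed 2 per unit of X
r_curve = pearson_r(x, [(v - 3) ** 2 for v in x])    # symmetric U shape, no linear trend

print(r_up, r_down, r_curve)
```

Up to floating-point rounding, the three results should come out at 1, -1, and 0. The last case is worth noting: Y is completely determined by X there, yet r is 0 because the relation is not linear.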
We begin by examining what different values of r might look like.
Then we go through the steps in computing the correlation coefficient when raw data have
different characteristics
The common names “Pearson r,” “point-biserial r,” and “phi” listed in the table (Table 11.1)
communicate whether the values of X and Y are continuous or dichotomous, although the name
Pearson r also is often used in a general way to refer to any correlation computed as a
product-moment r
Continuous variable -> means that it is possible to imagine another value falling between any
two adjacent scores
Dichotomous variable -> the variable is divided into two distinct or separate parts
o Ex. Someone who studies the discrimination of pitch (highness or lowness of a tone)
might be interested in correlating the changes in the frequency of sound waves (X) with
the differing ability of individuals to discriminate those changes (Y). Both variables are
continuous, in that we can imagine a score of 1.5 between 1 and 2 or 1.55 between 1.5
and 1.6.
o Suppose a researcher was interested in correlating participants’ gender with the ability
to discriminate pitch
o Pitch discrimination is a continuous variable, whereas gender is dichotomously coded as
male and female
Box 11.1 Galton, Pearson and r
In chapter 1 where we first discussed the idea of how empirical reasoning is used in behavioral
research, we mentioned Francis Galton’s fascinating relational study using longevity data to test
the efficacy of certain prayers
Galton was also very intuitive about statistics and he instinctively came up with a way of
measuring the “co-relation” between two variables
At the time, another of his many interesting projects concerned the relationship between the
traits of fathers and their adult sons
One day, while he was strolling around the grounds of a castle, it started to rain and Galton
sought refuge in the recess of a rock by the side of the pathway
It was there, he later recalled, that while thinking about his research, the notion of statistical
correlation initially flashed across his mind
Though the word correlation was already in widespread use in physics, it is believed that
Galton’s initial spelling of “co-relation” might have been a way of distancing his creation from
the commonly used concept
Though the statistical concept for which he is best known is correlation, Galton did not develop
the idea beyond its use in some of his relational studies
The reason that r is called the Pearson r is that it was Karl Pearson who perfected Galton’s
“index of co-relation” in a more mathematically sophisticated way
Correlation (r-type) indices have other useful applications besides those that are mentioned in
this chapter
In the case of dichotomous variables, we might create dichotomies in what is called a median
split, by dividing variables at the median point.
o Researcher might report “r-type effect sizes” on more than two conditions, which we
discuss in chapter 14
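As a rough illustration of the median split mentioned above, the sketch below turns a continuous variable into a dichotomous one. The scores are made up, and the rule for scores that fall exactly at the median (coded 0 here) is an assumption, since the notes do not say how ties are handled.

```python
import statistics

scores = [12, 15, 9, 20, 17, 11, 14, 18]   # hypothetical continuous scores
median = statistics.median(scores)          # midpoint of the sorted scores

# 1 = above the median, 0 = at or below the median (tie rule is an assumption)
split = [1 if s > median else 0 for s in scores]

print(median, split)
```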
Other important applications are beyond the scope of this book but are illustrated in our
advanced text
For example, in a partial correlation, a researcher can measure the correlation between two
variables when the influence of other variables on their relation has been eliminated statistically
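The notes do not give a formula for partial correlation. As a sketch only, the standard first-order formula for the correlation between X and Y with one third variable Z held constant is shown below. The function name `partial_r` and the three input correlations are made up for illustration.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y with Z statistically removed."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical correlations: X with Y, X with Z, Y with Z
print(round(partial_r(0.60, 0.50, 0.40), 3))
```

With these illustrative inputs the partial correlation is a little over .50, lower than the original .60, because part of the X-Y association was shared with Z. When Z is uncorrelated with both X and Y, the formula returns the original r unchanged.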
As correlations usually shrink in magnitude when the variability of either of the two samples
being correlated shrinks, there is a statistical solution (proposed by Karl Pearson) to correct for
this “restriction of variability”
The main purpose of this chapter is to give you a working knowledge of the basics of computing and
interpreting correlations in the situations that you are most likely to encounter
Table 11.1 Four Forms of Correlations and Their Common Names
1) Pearson r -> Two continuous variables, such as the correlation of scores on the Scholastic
Assessment Test (SAT) with grade point average (GPA) after four years of college
2) Point-biserial r (r_pb) -> one continuous and one dichotomous variable, such as the correlation of
subjects’ gender with their performance on the SAT-Verbal
3) Phi coefficient (φ) -> two dichotomous variables, such as the correlation
of subjects’ gender with their “yes” or “no” response to a specific question
4) Spearman rho (r_s) -> two ranked variables, such as the correlation of the ranking of the top 25
college basketball teams by sports writers (Associated Press ranking) with the ranking of the
same teams by college coaches (USA Today ranking)
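Because all four forms are product-moment correlations, one Pearson computation can serve for each once the data are expressed as numbers. The sketch below illustrates this for Spearman rho (Pearson r applied to ranks, assuming no tied ranks) and phi (Pearson r applied to 0/1 codes). The helper `pearson_r` and all of the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of deviation cross-products and squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Spearman rho: two sets of ranks for the same five teams (made-up, no ties)
writers_rank = [1, 2, 3, 4, 5]
coaches_rank = [2, 1, 3, 5, 4]
rho = pearson_r(writers_rank, coaches_rank)

# Phi: two dichotomous variables, each coded 0/1 (made-up codes)
gender = [0, 0, 0, 0, 1, 1, 1, 1]
said_yes = [1, 1, 1, 0, 0, 0, 0, 1]
phi = pearson_r(gender, said_yes)

print(round(rho, 2), round(phi, 2))
```

For these made-up values rho works out to .80 and phi to -.50. The sign of phi depends only on which category was arbitrarily coded 1.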
How Are Correlations Visualized in Scatter Plots?
In addition to the graphics described in the preceding chapter, another informative visual
display is called a scatter plot (or scatter diagram)
It takes its name from looking like a cloud of scattered dots
Each dot represents the intersection of a line extended from a point on the X axis (the horizontal
axis, or abscissa) and a line extended from a point on the Y axis (the vertical axis, or ordinate)
For now, we will concentrate on the raw scores (the X1 and the X2 scores) of these 10 students on
the two exams
Figure 11.1 displays the scores shown in Table 11.2 in a scatter plot
Imagine a straight line through the dots
The higher the correlation is, the more tightly clustered along the line are the dots in a scatter
plot (and therefore, the better is the linear predictability)
The cloud of dots slopes up for positive correlations and slopes down for negative correlations,
and the linearity becomes clearer as the correlation becomes higher
From this information, what would you guess is the value of the Pearson r represented by the
data?
How Is a Product-Moment Correlation Calculated?
There are many useful formulas for calculating different forms of the product-moment
correlation coefficient (r)
The following formula (which defines Pearson r conceptually) can be used in most situations:
o $r_{xy} = \frac{\sum z_x z_y}{N}$
This formula indicates that the linear correlation between two variables (X and Y) is equal to the
sum of the products of the z scores (the standard scores) of X and Y divided by the number (N)
of pairs of X and Y scores
The name product-moment correlation came from the idea that the z scores (in the numerator)
are distances from the mean (also called moments) that are multiplied by each other ($z_x z_y$) to
form “products”
To use this formula we begin by transforming raw scores (X and Y scores, in some cases X1 and
X2) to z scores by following the procedure described in the previous chapter
o This is to say we calculate the mean (M) and the standard deviation of each column of X
and Y scores and then substitute the calculated values in the (X-M)/ standard deviation
formula where X is any student’s score
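The steps just described can be sketched in a few lines. The scores below are made up and are not the Table 11.2 data, and the variable names are hypothetical. The sketch assumes the standard deviation is computed with N in the denominator (`statistics.pstdev`), which is what makes dividing the sum of products by N give the Pearson r.

```python
import statistics

x = [1, 2, 3, 4, 5]   # made-up scores on the first variable
y = [2, 1, 4, 3, 5]   # made-up scores on the second variable
n = len(x)

# Step 1: mean (M) and standard deviation of each column (SD uses N, an assumption)
mx, my = statistics.mean(x), statistics.mean(y)
sx, sy = statistics.pstdev(x), statistics.pstdev(y)

# Step 2: convert every raw score to a z score, (X - M) / SD
zx = [(v - mx) / sx for v in x]
zy = [(v - my) / sy for v in y]

# Step 3: sum the products of paired z scores and divide by the number of pairs
r_xy = sum(a * b for a, b in zip(zx, zy)) / n

print(round(r_xy, 2))
```

For these made-up scores the result rounds to .80. If the standard deviation were computed with N - 1 instead, the sum of products would need to be divided by N - 1 to give the same r.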
For these exam scores, $r_{xy}$ rounded to two decimal places is .90
It is easier to just use a computer program or a calculator to calculate r
We can also calculate the Pearson r from raw scores rather than z scores using another formula,
which is easier than the conceptual formula
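The notes do not reproduce the raw-score formula. The sketch below uses the standard computational form, r = (NΣXY - ΣXΣY) / sqrt([NΣX² - (ΣX)²][NΣY² - (ΣY)²]), on the assumption that this is the kind of formula the text has in mind. The function name and the data are made up.

```python
import math

def pearson_r_raw(x, y):
    """Pearson r from raw-score sums, with no z-score conversion."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

x = [2, 4, 6, 8]   # made-up X scores
y = [1, 3, 2, 4]   # made-up Y scores
r = pearson_r_raw(x, y)
print(round(r, 2))
```

Only five running totals are needed (ΣX, ΣY, ΣXY, ΣX², ΣY²), which is why this form is convenient for hand calculation.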
How Is Dummy Coding Used in Correlation?
Point-biserial correlation (r_pb) -> another case of the product-moment r. The “point” means that
the scores for one variable are points on a continuum, and the “biserial” means that scores for the
other variable are dichotomous
In many cases, the dichotomous scores may be arbitrarily applied numerical values, such as 0
and 1, or -1 and +1
The quantification of two levels of a dichotomous variable is called dummy coding when
numerical values such as 0 and 1 are used to indicate the two distinct parts
Dummy coding is a tremendously useful method because it allows us to quantify any variable
that can be represented as dichotomous (also called binary, meaning there are two parts or two
categories)
For example, suppose you have performed an experiment in which there were two groups (an
experimental and a control group) and you want to correlate group membership with scores on
the dependent variable
To indicate each participant’s group membership, you code 1 for experimental group and 0 for
control group
Another dichotomous independent variable that is typically recast into 1s and 0s is gender
Not only dichotomous independent variables can be dummy-coded this way, but also
dichotomous dependent variables can be recast into 1s and 0s such as success rate (1 = succeed
vs. 0 = fail)
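A minimal sketch of dummy coding in the two-group example above: group membership is coded 1 (experimental) or 0 (control) and then correlated with the dependent variable using the ordinary Pearson computation. The scores and the helper `pearson_r` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson r from sums of deviation cross-products and squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Dummy codes: 1 = experimental group, 0 = control group
group_01 = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [8, 9, 7, 10, 5, 6, 4, 7]        # made-up dependent-variable scores

r_pb = pearson_r(group_01, scores)

# Recoding the same groups as +1 / -1 leaves the correlation unchanged
group_pm = [1 if g == 1 else -1 for g in group_01]
r_pb_pm = pearson_r(group_pm, scores)

# Swapping which group gets the 1 only flips the sign
r_pb_swapped = pearson_r([1 - g for g in group_01], scores)

print(round(r_pb, 2), round(r_pb_pm, 2), round(r_pb_swapped, 2))
```

With these made-up scores r_pb is about .80 under either coding, which illustrates why the particular numbers chosen for the two categories are arbitrary: only the sign depends on which group receives the higher code.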
Box 11.2 Linearity and Nonlinearity
Pearson r is a measure of linearity
though r is clos
