Chapter 5 – Measurement Concepts
The most common measurement strategy is to ask people to tell you about themselves
E.g., rate your overall happiness
You can also directly observe behaviours
E.g., how many mistakes did someone make on a task
Physiological and neurological responses can be measured as well.
E.g., heart rate, muscle tension
In this chapter we need to consider the technical aspects of measurement: reliability, validity, and
reactivity of measures
Reliability of measures
Reliability refers to the consistency or stability of a measure of behaviour
A reliable test would yield the same results each time when given to the same person
A reliable measurement doesn’t fluctuate from one reading to the next
If there are fluctuations in the readings, then there is error in the measurement device
Every measurement has two components:
1. A true score: the real score on the variable
2. A measurement error
An unreliable measure contains considerable measurement error; a reliable measure contains little
measurement error (it’ll yield an identical, or nearly identical, score each time the same person is measured)
Example: If you administer a highly reliable test multiple times, the scores might range from 97 to 103
with an average of 100; if you administer an unreliable test multiple times, the scores might range from
85 to 115 with an average of 100. The measurement error in the unreliable test is revealed in the greater
variability of its scores (see figure 5.1, pg 92)
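The example above can be sketched as a small simulation. This is only an illustration under stated assumptions: the true score of 100, the two error spreads, and the idea that error is normally distributed are all hypothetical choices, not values from the chapter.

```python
import random
import statistics


def administer(true_score, error_sd, times, seed):
    """Return observed scores: the true score plus random measurement error."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [true_score + rng.gauss(0, error_sd) for _ in range(times)]


# Hypothetical values: same true score, small vs. large measurement error.
reliable = administer(true_score=100, error_sd=1.0, times=1000, seed=1)
unreliable = administer(true_score=100, error_sd=5.0, times=1000, seed=2)

for name, scores in (("reliable", reliable), ("unreliable", unreliable)):
    print(name,
          "mean:", round(statistics.mean(scores), 1),
          "sd:", round(statistics.stdev(scores), 1))
```

Both sets of scores average close to the true score, but the unreliable test's scores are spread much more widely around it.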
When doing actual research, you can’t administer a test several times and then take the average;
therefore, it’s essential that the test be reliable so that it could closely reflect the person’s true score
Trying to study behaviour using unreliable measures is a waste of time because the results will be unstable
and cannot be replicated
Reliability is most likely to be achieved if researchers use careful measurement procedures
We can assess the stability of measures using correlation coefficients
The most common correlation coefficient when discussing reliability is the "Pearson product-moment
correlation coefficient"
Symbolized as "r"
Ranges from -1.00 to +1.00
•0.00 means that the variables are not related at all
•A + means that there is a positive relationship, whereas a – means that there is a negative relationship
•The closer the value is to +1.00 or -1.00, the stronger the relationship (e.g., +0.80 is stronger than +0.20, and -0.80 is just as strong as +0.80)
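A minimal sketch of how r can be computed by hand, to show where the sign and the size of the coefficient come from. The function name pearson_r and the data are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean (sets the sign of r).
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable (scale r to -1..+1).
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return cross / math.sqrt(ss_x * ss_y)


hours = [1, 2, 3, 4, 5]    # hypothetical data
rises = [2, 4, 6, 8, 10]   # goes up with hours: positive relationship
falls = [10, 8, 6, 4, 2]   # goes down with hours: negative relationship

print(pearson_r(hours, rises))
print(pearson_r(hours, falls))
```

The two relationships here are equally strong; only the direction differs.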
To assess reliability, we need to obtain at least 2 scores on the measure from many individuals. If the
measure is reliable, the two scores should be very similar, so r should be highly positive. This correlation
is called a reliability coefficient.
Test-retest reliability is assessed by measuring the same individuals at two points in time
If many people have similar scores we can say that the measure reflects true scores rather than measurement error
In most cases the reliability coefficient should be greater than 0.80 before we accept the measure as reliable
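A rough sketch of a test-retest check, assuming six people took the same measure twice. The scores, the helper pearson_r, and the variable names are hypothetical; only the 0.80 guideline comes from the notes.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Compact Pearson correlation for two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


# Hypothetical scores for the same six people at two points in time.
time_1 = [12, 18, 25, 31, 40, 22]
time_2 = [14, 17, 27, 30, 38, 24]

reliability = pearson_r(time_1, time_2)  # the test-retest reliability coefficient
print(round(reliability, 2))
print("acceptable" if reliability > 0.80 else "not acceptable")
```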
Internal Consistency Reliability
It is possible to measure reliability by measuring individuals at only one point in time. We can do this
because most psychological variables are made up of a number of different questions, called items.
Reliability increases with increasing numbers of items.
Internal consistency reliability is the assessment of reliability using responses at only one point in time.
Because all items measure the same variable, they should yield similar or consistent results.
One indicator of internal consistency is split-half reliability, another being Cronbach’s alpha
Split-half reliability is the correlation of an individual's total score on one half of the test with the total
score on the other half
The two halves are created by randomly dividing the items into 2 parts
The final measure will include items from both halves
The combined measure will have more items and will be more reliable than either half by itself
Split-half reliability is relatively straightforward and easy to calculate
One drawback of this is that it does not take into account each individual item's role in a measure's reliability
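The split-half steps above can be sketched as follows. The response data (six people, six items), the function names, and the fixed seed for the random split are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Compact Pearson correlation for two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def split_half_reliability(responses, seed=0):
    """Randomly split the items in two and correlate the two half totals."""
    items = list(range(len(responses[0])))
    random.Random(seed).shuffle(items)  # random division of the items
    half = len(items) // 2
    first, second = items[:half], items[half:]
    totals_a = [sum(person[i] for i in first) for person in responses]
    totals_b = [sum(person[i] for i in second) for person in responses]
    return pearson_r(totals_a, totals_b)


# Hypothetical data: one row per person, one column per item (1-5 ratings).
responses = [
    [5, 4, 5, 4, 5, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [4, 5, 4, 4, 5, 5],
    [1, 2, 1, 1, 2, 1],
    [3, 2, 3, 3, 2, 3],
]

print(round(split_half_reliability(responses), 2))
```

Note that the result describes the two halves as wholes; it says nothing about which individual items help or hurt, which is the drawback mentioned above.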
Another internal consistency indicator is Cronbach’s alpha
Cronbach's alpha is based on the individual items
Here the researcher calculates the correlation of each item with every other item
A large number of correlation coefficients are produced
The value of alpha is based on the average of all the inter-item correlation coefficients and the number
of items in the measure
Again, note that more items are associated with higher reliability
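A sketch of alpha computed the way the notes describe it, from the average inter-item correlation and the number of items. This is the standardized form of alpha; statistical packages commonly report a variance-based form instead, so treat this as an illustration rather than the textbook's exact procedure. The data and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Compact Pearson correlation for two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def alpha_from_mean_r(mean_r, k):
    """Standardized alpha from the mean inter-item correlation and k items."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)


def standardized_alpha(responses):
    """Correlate every item with every other item, average, then apply the formula."""
    k = len(responses[0])
    columns = [[person[i] for person in responses] for i in range(k)]
    pair_rs = [pearson_r(columns[i], columns[j])
               for i, j in combinations(range(k), 2)]
    return alpha_from_mean_r(mean(pair_rs), k)


# Hypothetical data: one row per person, one column per item.
responses = [
    [5, 4, 5, 4],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [4, 5, 4, 4],
    [1, 2, 1, 1],
    [3, 2, 3, 3],
]

print(round(standardized_alpha(responses), 2))
# Same average correlation, more items: alpha goes up.
print(alpha_from_mean_r(0.5, 4), alpha_from_mean_r(0.5, 10))
```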
It’s also possible to examine the correlation of each item score with the total score based on all items.
Such item-total correlations and Cronbach’s alpha are very informative because they provide
information about each individual item.
Items that do not correlate with the other items are removed to increase reliability
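A minimal sketch of item-total correlations used to spot a weak item. The data are made up so that the last item does not track the others, and the 0.30 cut-off is an arbitrary value chosen for this illustration, not a rule from the chapter.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Compact Pearson correlation for two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def item_total_correlations(responses):
    """Correlate each item's scores with the total score based on all items."""
    totals = [sum(person) for person in responses]
    k = len(responses[0])
    return [pearson_r([person[i] for person in responses], totals)
            for i in range(k)]


# Hypothetical data: items 0-3 move together, item 4 does not.
responses = [
    [5, 4, 5, 4, 2],
    [2, 1, 2, 2, 4],
    [3, 3, 4, 3, 3],
    [4, 5, 4, 4, 3],
    [1, 2, 1, 1, 2],
    [3, 2, 3, 3, 4],
]

CUTOFF = 0.30  # assumed threshold for this sketch only
correlations = item_total_correlations(responses)
flagged = [i for i, r in enumerate(correlations) if r < CUTOFF]

for i, r in enumerate(correlations):
    print("item", i, round(r, 2))
print("candidates for removal:", flagged)
```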