# Chapter 5-7 Textbook Notes

Chapter 5
The most common measurement strategy is to ask people to tell you about
themselves (self-report).
You can also directly observe behaviours.
Example: counting how many mistakes someone makes
Physiological and neurological responses can be measured as well.
Example: heart rate, muscle tension
Reliability of measures
Reliability refers to the consistency or stability of a measure of behaviour
A reliable test would yield the same result each time
The results should not fluctuate from one reading to the next
If there is fluctuation, there is error in the measurement device
Every measurement has two components:
1. True score: the real score on the variable
2. Measurement error
Example: If you administer a highly reliable test to the same person multiple times,
the scores might range from 97 to 103; with an unreliable test the scores might range
from 85 to 115. The measurement error in the unreliable test is revealed in the greater
variability of the scores.
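The two-component idea above (observed score = true score + measurement error) can be sketched as a small simulation. This is only an illustration: the true score of 100, the error sizes, and the helper name `administer` are assumptions chosen to resemble the example, not values from the textbook.

```python
import random
import statistics

def administer(true_score, error_sd, n_times, rng):
    """Return observed scores: the true score plus random measurement error."""
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_times)]

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_SCORE = 100         # hypothetical true score

# Same person, same true score; only the size of the error differs (assumed values).
reliable = administer(TRUE_SCORE, error_sd=1, n_times=1000, rng=rng)
unreliable = administer(TRUE_SCORE, error_sd=5, n_times=1000, rng=rng)

spread_reliable = statistics.pstdev(reliable)
spread_unreliable = statistics.pstdev(unreliable)
print(round(spread_reliable, 2), round(spread_unreliable, 2))
```

The unreliable test shows the larger spread even though the true score never changed, which is the point of the 97-103 versus 85-115 example.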
Using unreliable measures is a waste of time because the results will be
unstable and cannot be replicated
We can assess the stability of measures using correlation coefficients
The most common correlation coefficient when discussing reliability is the
"Pearson product-moment correlation coefficient"
Symbolized as "r"
Ranges from -1.00 to +1.00
0.00 means that the variables are not related at all
+1.00 means that there is a perfect positive relationship
-1.00 means that there is a perfect negative relationship
The closer r is to +1.00 or -1.00, the stronger the relationship
Test-retest reliability: assessed by measuring the same individuals at two points in
time and correlating the two sets of scores
If many people have very similar scores at both times, we can say that the measure
reflects true scores rather than measurement error
The correlation should be at least 0.80 before we accept the measure as
reliable
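As a rough sketch of how r and test-retest reliability fit together: r can be computed directly from two lists of scores, and test-retest reliability is that r between time 1 and time 2. The scores below are made up for illustration, and `pearson_r` is a hand-written helper, not a function from any particular library.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores for the same six people at two points in time.
time_1 = [98, 105, 110, 92, 101, 115]
time_2 = [97, 107, 108, 95, 100, 116]

r = pearson_r(time_1, time_2)
is_reliable = r >= 0.80   # the 0.80 guideline from the notes
print(round(r, 2), is_reliable)
```

People who scored high at time 1 also scored high at time 2, so r comes out well above the 0.80 guideline for this made-up data.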
Internal consistency reliability
The assessment of reliability using responses at only one point in time. Because all
items measure the same variable, they should yield similar or consistent results
An indicator of internal consistency is "split-half reliability"
Split-half reliability: the correlation of an individual's total score on one
half of the test with their total score on the other half
The final measure will include items from both halves
The combined measure will have more items and will be more reliable
than either half by itself
A drawback is that it does not take into account each individual item's
role in a measure's reliability (each question on a test is called an "item")
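A minimal sketch of split-half reliability, assuming an odd/even split of the items (the notes do not say how the halves are chosen) and made-up responses on a 1-5 scale. The last step uses the Spearman-Brown formula, which is not in these notes; it is included only to illustrate why the combined measure is expected to be more reliable than either half.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: each row is one person, each column is one item.
responses = [
    [4, 5, 4, 5, 4, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 1, 2, 1],
    [3, 4, 3, 4, 3, 4],
]

# Total score on each half of the test (items 1, 3, 5 versus items 2, 4, 6).
half_a = [sum(person[0::2]) for person in responses]
half_b = [sum(person[1::2]) for person in responses]

r_half = pearson_r(half_a, half_b)

# Spearman-Brown estimate for the full-length test (an addition to the notes).
r_full = (2 * r_half) / (1 + r_half)
print(round(r_half, 2), round(r_full, 2))
```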
Cronbach's alpha: is based on individual items and is another indicator of internal
consistency reliability
Correlates each item with every other item
The value of alpha is based on the average of all the inter-item correlation
coefficients and the number of items in the measure
Item-total correlations: examine the correlation between each item and the total
score
Because Cronbach's alpha and item-total correlations look at the individual items, items
that do not correlate with the other items can be removed to increase reliability
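The item-level indicators above can be sketched as follows. The data are made up, with the fourth item deliberately unrelated to the others. The alpha computed here is the standardized form, k * mean_r / (1 + (k - 1) * mean_r), where k is the number of items and mean_r is the average inter-item correlation; statistical packages often report a variance-based version instead, so treat this as one possible reading of the description rather than the textbook's exact procedure.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def standardized_alpha(items):
    """Alpha from the average inter-item correlation and the number of items."""
    k = len(items)
    pair_rs = [pearson_r(a, b) for a, b in combinations(items, 2)]
    mean_r = sum(pair_rs) / len(pair_rs)
    return (k * mean_r) / (1 + (k - 1) * mean_r)

def item_total_correlations(items):
    """Correlate each item with each person's total score across all items."""
    totals = [sum(person) for person in zip(*items)]
    return [pearson_r(item, totals) for item in items]

# Hypothetical data: one list per item, six people each.
items = [
    [4, 2, 3, 5, 1, 3],
    [5, 1, 3, 5, 2, 4],
    [4, 2, 4, 5, 1, 3],
    [2, 4, 1, 3, 3, 5],   # item 4: unrelated to the other three
]

item_total = item_total_correlations(items)
alpha_all = standardized_alpha(items)
alpha_trimmed = standardized_alpha(items[:3])   # drop the weak item
print([round(r, 2) for r in item_total])
print(round(alpha_all, 2), round(alpha_trimmed, 2))
```

Item 4 has an item-total correlation near zero, and alpha rises once it is dropped, which illustrates why poorly correlated items are candidates for removal.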
Interrater reliability
A single rater might be unreliable, but using more than one rater will increase reliability
The degree to which raters agree in their observations is interrater reliability
A commonly used indicator of interrater reliability is Cohen's kappa
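Cohen's kappa compares the agreement two raters actually reach with the agreement expected by chance: kappa = (observed - expected) / (1 - expected). A minimal sketch, using made-up codes from two hypothetical raters who each classified the same ten observations:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    n = len(rater_a)
    # Proportion of cases where the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected by chance, from each rater's category proportions.
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical codes: "A" = aggressive, "C" = calm.
rater_1 = ["A", "C", "C", "A", "C", "C", "A", "C", "C", "C"]
rater_2 = ["A", "C", "C", "A", "C", "C", "C", "C", "C", "C"]

kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 2))
```

Here the raters agree on 9 of 10 cases (0.90), chance agreement is 0.62, and kappa is about 0.74. Kappa is lower than raw agreement because some agreement would occur by chance alone.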
Reliability and accuracy of measures
Reliability and accuracy are not the same thing: a measure can be reliable without
being accurate
Example: A gas station pump puts the same amount of gas in your car every time,
so the gas pump gauge is reliable. However, the issue of accuracy is still open.
The only way you can know the accuracy is to compare how much the pump gives
you to a standard measure of a litre.
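The gas pump example can be put into numbers. In this sketch the readings are invented: they are the amounts a hypothetical pump actually dispenses each time its gauge shows one litre. Spread across readings stands in for reliability, and the gap between the average reading and the standard litre stands in for accuracy.

```python
import statistics

STANDARD_LITRE = 1.000   # the reference amount

# Hypothetical litres actually dispensed each time the gauge reads 1 litre.
dispensed = [0.951, 0.949, 0.950, 0.950, 0.951, 0.949]

spread = statistics.pstdev(dispensed)                # small spread: consistent
bias = statistics.mean(dispensed) - STANDARD_LITRE   # nonzero bias: inaccurate
print(round(spread, 4), round(bias, 3))
```

The pump is very consistent (tiny spread) yet gives about 0.05 litres too little every time, so it is reliable but not accurate.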
Construct Validity of measures
Construct validity: the adequacy of the operational definition of variables
To what extent does the operational definition reflect the true theoretical meaning
of the variable?
Construct validity is a question of whether the measure employed actually
measures the construct it is intended to measure
Indicators of construct validity
Face validity: the evidence for validity is that the measure appears "on the face of it" to
measure what it is supposed to measure.
Do the procedures used to measure the variable appear to be an accurate
operational definition of the theoretical variable?
Criterion-oriented validity: the relationship between scores on the measure and some
criterion
There are four types of criterion-related research approaches that differ in the type of
criterion that is employed
1. Predictive validity: scores on the measure predict behaviour on a criterion
measured at a time in the future
Example: the LSAT has predictive validity if its scores predict how well people do in
law school
2. Concurrent validity: scores on the measure are related to a criterion
measured at the same time
One approach is to see whether two or more groups of people differ on the measure
in expected ways
3. Convergent validity: scores on the measure are related to other measures of
the same construct
Example: one measure of shyness should correlate with another shyness
measure or a measure of a similar construct such as social anxiety
4. Discriminant validity: scores on the measure are NOT related to other
measures that are theoretically different
Example: scores on a shyness measure should not correlate with scores on a measure
of aggressiveness or forcefulness
Research on personality and individual differences
Systematic and detailed research on validity is most often carried out on measures of
personality and individual differences
NEO Personality Inventory (NEO-PI)
Measures the 5 major dimensions of personality: neuroticism, extraversion,
openness to experience, agreeableness and conscientiousness
Reactivity of measures
Reactivity is a potential problem when measuring behaviour
A measure is said to be reactive if awareness of being measured changes an
individual's behaviour
Reactive measures don't tell you how the subject behaves in natural
settings
You can minimise reactivity by letting the subjects get used to the
recording equipment or to the presence of the observer
Nonreactive or unobtrusive measures involve clever ways of indirectly recording a
variable
Variables and measurement scales
Each variable that is studied must be operationally defined
An operational definition is the specific method used to manipulate or measure the variable
There must be at least two values or levels of the variable
