Measurement, reliability and validity
Measurement in psychology
- What is psychology
- What should we measure
• Types of measures
• Reliability: consistency in measurements
• Every time you take a measure, you should get a similar result
• Validity: the measure captures what it is supposed to measure
• If the theory is completely wrong, you won’t have validity
What is psychology?
- Psychology is the study of the mind and behavior. The discipline embraces all aspects of
the human experience — from the functions of the brain to the actions of nations, from
child development to care for the aged. In every conceivable setting from scientific
research centers to mental health care services, "the understanding of behavior" is the
enterprise of psychologists.
What should we measure to explain mind and behavior?
• Can look at accuracy, confidence judgment, and imitation
• What leads to the behavior can also be observed
• Behavior can happen naturally, or the experimenter can control the environment to see
how behavior responds to it
- Behaviorism
• Watson, Skinner, etc. used to believe that all you had to do was study behavior to
understand psychology. It crashed and burned because…
Behavior can’t explain intrinsic knowledge or instinctual knowledge
The cognitive revolution was about to happen
There was too much you couldn’t explain (language, animal instincts, etc…)
• Constructs: states of mind that have causal consequences on your behavior
• Look at behaviors and decide whether they point to the construct
- Physiological measures
• Are there certain genes that predispose some people to overeating? Why do some
people do things and others don’t?
• People are the way they are now. Why is that?
• Why did we become this way?
• Ontogeny: when you can study something for what it is in the moment you are
• Phylogeny: trying to understand how the being came to be in the first place
Harder to come by than ontogeny
How do researchers choose their measures?
- Area of interest / area of expertise
- Type of problem to be addressed (basic vs. applied research)
• Basic: understand a phenomenon for what it is
- Maturity of the research area
• Sometimes the area is new and you have to start from scratch, which is difficult. Or
the area is mature and you can ask very specific questions
- Available funding
• Basic research: building the body of literature, important
- The complexity of psychology as a problem makes it unlikely that the field will develop a
A research example: dyslexia
Class experiment on pronunciation
- Point: the words progressively increase in difficulty, and the task looks at the ability
to read single words. The words are representative of the ability to read any single word.
If you can pronounce one of the words, you can probably pronounce other similar words
- The phonological hypothesis postulates that dyslexics have a specific impairment in the
representation, storage and/or retrieval of speech sounds.
- The faster kids understand that words have sounds associated with them, the faster they’re able to
- Teach alphabet and teach the sound at the same time
- This explains developmental dyslexia. Have trouble putting sounds together to put words
- Good readers (children and adults) rely on individual words rather than on context when
reading (West & Stanovich, 1978)
- Eye tracking studies show that adults process phonology while reading (Rayner,
Sereno, Lesch, & Pollatsek, 1995)
- Phonemic awareness is the main predictor of reading success (or failure) in children
(Torgesen, Wagner, & Rashotte, 1994)
- Interventions that focus on phonological processing (Vellutino & Scanlon, 2002) are
successful for 98.5% of children
• Tutoring for children who read poorly, several times a week. Taught phonology. Within
6-9 months 98.5% caught up to the average!
• With cash and resources you can teach all of Canada to read (25% of Canadians are
illiterate)
- If kids haven’t reached the appropriate level by the fourth grade, they are unlikely to catch up
Other predictors of reading
- Basic research. What could other causes of dyslexia be?
- Applied research. How can we make reading interventions better (with a better
understanding of dyslexia)?
- Phonological processing is the number one predictor. Depending on the spin (basic/applied)
you will ask different questions.
The magnocellular theory
- The magnocellular pathway is a component of the visual system that processes
- Hypothesis: Individuals with dyslexia may have a magnocellular deficit.
- Magnocellular deficit: problems with processing movement
- Idea is that some readers have an atypically wide pathway, which makes it difficult to process
Davis, Castles, McAnally, and Gray (2001)
- Goal. To generate evidence for the magnocellular theory of reading disability.
- Hypothesis. If poor readers’ performances on the Ternus task are worse than good
readers’ performances, then there will be support for the magnocellular theory.
Method
- Ternus task. Two “three squares” displays appear for 50 ms each. The inter-stimulus
intervals (ISIs) increase in multiples of 16.67 ms.
- The participant must say if the squares are “dancing” (group movement) or “jumping”
(element movement)
- Test DOES measure magnocellular processing
- People with dyslexia will see shift from dancing to jumping in a different spot than
- For the test, they found two groups of readers (good and poor)
• Good readers are systematically better than poor readers, and the difference is highly
significant
• Measured several factors (word identification, nonword reading, irregular word
reading, regular word reading)
• They showed that the groups were similar in age
- The results show that good readers can see a change from element movement to group
movement at earlier ISIs than poor readers.
- The results support the magnocellular theory of dyslexia.
- Multiple studies show, however, that when analyzed in combination with other
well-known reading predictors (such as phonological processing), magnocellular processing
explains little variance (Hulslander et al., 2004).
- Thus, at the present time, the magnocellular theory cannot be applied to help those with
- Not worth the investment
- Basic researchers are happy because they learned something. However, for applied
research overall, it doesn’t say much. Phonological processing is still important.
Important considerations in developing measurements
- Operationalization (of theoretical constructs): To establish a clear relationship between
the theoretical construct and its empirical basis in the operations producing the data
(between the theoretical construct and its measurement)
- Measurement: the assignment of numerals to objects or events according to rules
(Stevens, 1946, p. 677).
- Four important questions:
• Does the measurement instrument have the potential to measure a given object?
• How precise is the measurement? (scale)
• How consistent is the measurement? (reliability)
• How strong is the relation between the measurement and the theoretical construct?
- Calibration refers to the process of determining the relation between the output of a
measuring instrument and the value of the input
- Can the measurement technique (or instrument) measure what it is designed to measure?
- Ex: Creating an exam for the first midterm. All questions look like “What is the definition
of X?” and the options are the definition plus random, meaningless facts. The scores
will be very high, around 100%, and there is no variability. OR you can write an exam
calibrated for students at the PhD level and ask open-ended questions with precise
answers. Our class could recognize the material, but we just couldn’t answer questions at
that level. One way is too easy and the other is too hard. You want to calibrate so there is
room to move toward both the top end and the bottom end of the spectrum. You need to
get the full range so that some people can show knowledge and you can see who didn’t
• You want to avoid the “ceiling effect” and the “floor effect”
K-BIT test examples
- Car and truck (go together)
- For dice, the answer is white 6 (using multiplication and division)
Measuring the speed of light
- Galileo (early 17th century) was the first scholar who attempted to measure the speed of
light.
• One person stands on one hill and a second person on another hill. Both have an oil
lamp with a shutter, and the distance between the two points is known. The first person
opens his shutter, and a sidekick starts a stopwatch at that moment. When the second
person opens his shutter in response, the first person closes his and the timing stops.
The speed of light is then calculated from the distance and the elapsed time.
• Problems: Measurement error will overwhelm any result you can obtain. Light travels
very fast, so you need a much greater distance or a much more precise timing
measurement. The idea works in principle but it is poorly calibrated.
- Ole Roemer (1670’s) found a sufficiently sensitive measure to estimate the speed of light.
• He obtained 300,000 kilometers per second. This value is impressively close to the
Post-scriptum on Galileo’s idea
- In the early 20th century, technology became sufficiently advanced to measure distance
using electromagnetic energy.
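A rough calculation shows why Galileo's two-hills setup was poorly calibrated, and why the same idea works once timing is precise enough. This is only a sketch: the 1.5 km separation and the 0.2 s human reaction time are illustrative assumptions, not values from the lecture, and the function names are made up for this example.

```python
# Back-of-the-envelope check of the two-hills setup.
# ASSUMPTIONS (not from the lecture): hills 1.5 km apart, reaction time of 0.2 s.

SPEED_OF_LIGHT_KM_S = 299_792.458  # modern defined value, in km/s


def round_trip_time_s(distance_km: float) -> float:
    """Time for light to travel to the far hill and back."""
    return 2 * distance_km / SPEED_OF_LIGHT_KM_S


def distance_from_echo_km(round_trip_s: float) -> float:
    """Same idea in reverse: known speed, measured time, unknown distance."""
    return SPEED_OF_LIGHT_KM_S * round_trip_s / 2


hill_distance_km = 1.5  # assumed separation of the two hills
reaction_time_s = 0.2   # assumed human reaction time (the measurement error)

signal_s = round_trip_time_s(hill_distance_km)
error_to_signal = reaction_time_s / signal_s
distance_matching_error_km = distance_from_echo_km(reaction_time_s)

print(f"Light round trip: {signal_s * 1e6:.1f} microseconds")
print(f"Reaction time is about {error_to_signal:,.0f} times larger than the signal")
print(f"Separation at which signal equals error: {distance_matching_error_km:,.0f} km")
```

Under these assumptions the quantity being measured is thousands of times smaller than the error of the instrument (a human with a stopwatch), so the result is dominated by error. With a known speed of light and precise timing, `distance_from_echo_km` reverses the logic to recover a distance from a measured round-trip time.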
Floor and ceiling effect
- Floor effect: The clustering of scores at the low end of a measurement scale.
- Ceiling effect: The clustering of scores at the high end of a measurement scale.
- A good test allows for the widest range of scores between the minimum score and the
maximum score.
- Ceiling effect: easiest example = car and truck
- Floor effect: hardest example = dice
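One simple way to check a test for floor and ceiling effects is to count how many scores sit exactly at the minimum and maximum of the scale. The sketch below assumes a 0-10 exam; the three score lists are invented for illustration and are not data from the class, and `floor_ceiling_proportions` is a hypothetical helper name.

```python
def floor_ceiling_proportions(scores, min_score, max_score):
    """Return (proportion at the floor, proportion at the ceiling)."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_score) / n
    at_ceiling = sum(1 for s in scores if s == max_score) / n
    return at_floor, at_ceiling


# Hypothetical scores out of 10 (made up for illustration).
too_easy = [10, 10, 9, 10, 10, 10, 10, 9, 10, 10]  # scores cluster at the top
too_hard = [0, 0, 1, 0, 0, 0, 2, 0, 0, 0]          # scores cluster at the bottom
calibrated = [2, 4, 5, 5, 6, 6, 7, 8, 9, 3]        # scores spread across the range

for name, scores in [("too easy", too_easy),
                     ("too hard", too_hard),
                     ("calibrated", calibrated)]:
    floor, ceiling = floor_ceiling_proportions(scores, min_score=0, max_score=10)
    print(f"{name}: {floor:.0%} at floor, {ceiling:.0%} at ceiling")
```

A large share of scores at either end means the test cannot distinguish between the people clustered there, which is the loss of variability described in the exam example.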
2- Scales of measurement
- Measurement exists in a variety of forms that can be categorized into “scales”.
- The scales are determined both by the empirical operations invoked in the process of
"measuring" and by the formal (mathematical) properties of the scale
- A scale of measurement in which the categories represent a qualitative difference in the
variable being measured. Examples:
• Gender: Men and women
• Country: Canada, USA, Mexico…
• Color: brown, blue, green…
- A scale of measurement in which the categories have different names and are organized
- In arithmetic you can show that 4 is bigger than 3 in the same way that 2 is bigger
than 1. You CAN’T do this on an ordinal scale. You can’t say the distance between each
step in your ordinal scale is equal
- Two summary descriptives you can use are the mode (the most frequently occurring
value) and the median (you can know what the middle