PSY 2116 Lecture Notes - Lecture 4: Classical Test Theory, Item Response Theory, Abraham De Moivre

Chapter 4 – Reliability
Conceptualization of Error
- Instead of rigid yardsticks, researchers use “rubber yardsticks” to measure
characteristics… these may stretch and shrink over time
- Serious measurement error occurs in most physical, social, and biological sciences
Spearman’s Early Study
- Abraham De Moivre introduced the basic notion of sampling error
- Karl Pearson developed the product moment correlation
- Reliability = these 2 concepts put together
- Cronbach developed methods for evaluating many sources of error in behavioural research
- Item response theory has taken advantage of computer tech to advance
psychological measurement
o Built on ideas of Spearman
Basics of Test Score Theory
- Difference between true score and observed score results from measurement error
- In symbolic representation, the observed score (X) has 2 components: a true score (T)
and an error component (E)
o X = T + E, or X - T = E
- Major assumption in classical test theory = errors are random
- Might not want to rely on single observation b/c it might fall far from the true score
- Dispersion around the true score tells us how much error there is in the measure
- Classical test theory uses the standard deviation of errors as the basic measure of error
o Standard error of measurement
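
A minimal Python sketch of X = T + E for one person measured many times. The true score of 100, the error SD of 5, and the normally distributed errors are illustrative assumptions (the notes only say errors are random), and the helper name simulate_observed_scores is hypothetical. Under those assumptions, the mean of the observed scores should land near T, and the standard deviation of the errors plays the role of the standard error of measurement.

```python
import random
import statistics

def simulate_observed_scores(true_score, error_sd, n_observations, seed=0):
    """Return observed scores X = T + E, with E drawn at random around 0.

    Assumption: errors are normally distributed with mean 0.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_observations)]

TRUE_SCORE = 100.0  # hypothetical true score T
ERROR_SD = 5.0      # hypothetical standard deviation of the errors

observed = simulate_observed_scores(TRUE_SCORE, ERROR_SD, n_observations=10_000)
errors = [x - TRUE_SCORE for x in observed]  # E = X - T

mean_observed = statistics.fmean(observed)   # should be close to T
sem_estimate = statistics.pstdev(errors)     # SD of errors = standard error of measurement

print(f"mean observed score: {mean_observed:.2f}")
print(f"estimated standard error of measurement: {sem_estimate:.2f}")
print(f"single observation furthest from T: {max(errors, key=abs):+.2f}")
```

The last line shows why a single observation is risky: even with purely random error, one score can fall several points away from the true score.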
Domain Sampling Model
- Considers problems created by using a limited # of items to represent a larger and
more complicated construct
- The greater the # of items, the higher the reliability
- Reliability can be estimated from the correlation of the observed test score with the
true score
o True score is rarely possible to find
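
A rough Python simulation of the domain sampling idea. In a simulation the true score is known, which is rarely the case with real data, so the correlation between observed and true scores can be computed directly. All numbers here are assumptions for illustration (2000 simulated people, true scores with SD 1, item errors with SD 2, test score = mean of the items), and the function names are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Product moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def observed_vs_true_correlation(n_items, n_people=2000, item_error_sd=2.0, seed=0):
    """Correlate observed test scores with true scores for a test of n_items items.

    Assumption: each item score is the person's true score plus random error,
    and the observed test score is the mean of the item scores.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
    observed = []
    for t in true_scores:
        item_scores = [t + rng.gauss(0.0, item_error_sd) for _ in range(n_items)]
        observed.append(sum(item_scores) / n_items)
    return pearson_r(observed, true_scores)

# Compare a short, medium, and long test drawn from the same domain
correlations = {k: observed_vs_true_correlation(k) for k in (5, 20, 80)}
for k, r in correlations.items():
    print(f"{k:>3} items: r(observed, true) = {r:.3f}")
```

With more items sampled from the domain, the random item errors average out, so the observed score tracks the true score more closely.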
Item Response Theory
- The field is turning away from classical test theory…
o Classical test theory requires that exactly the same test items be administered to each person
o If intelligence were being tested, some items may be too easy or too hard for a given person
- One downside: IRT needs items that have been systematically evaluated for level of difficulty
Models of Reliability
- Most reliability coefficients are correlations, but sometimes it is more useful to use
mathematically equivalent ratios
o Ratio of the variance of the true scores to the variance of the observed scores
- Must use the Greek letter σ instead of S because the equation describes values in a population
rather than in a sample
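
The variance ratio can be written out as follows. The symbol r for the reliability coefficient is an assumption here, since these notes do not name it. The second equality is not stated in the notes; it follows from X = T + E when errors are random and uncorrelated with true scores.

```latex
r = \frac{\sigma_T^2}{\sigma_X^2},
\qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2
```

Worked example with made-up numbers: if the true-score variance is 80 and the error variance is 20, the observed-score variance is 100 and r = 80 / 100 = .80, meaning 80% of the variation in observed scores reflects true differences and 20% reflects random error.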
Sources of Error
- Test reliability is estimated in 3 ways
o Test-retest method: consistency of the test results when the test is
administered on different occasions
o Parallel forms: evaluate the test across different forms of the test
