
Psychological Assessment - Chapter 4 Book Notes

Discrepancies between true ability and the measurement of ability constitute errors of measurement
Error means that there will always be some inaccuracy in our measurements
Tests that are relatively free of measurement error are said to be reliable
History and Theory of Reliability
Spearman's Early Studies
Charles Spearman is responsible for the advanced development of reliability assessment
De Moivre introduced the basic notion of sampling error
Pearson developed the product moment correlation
Spearman worked out most of the basics of contemporary reliability theory
Spearman's article attracted the attention of Thorndike
Item response theory (IRT): uses computer technology to advance psychological measurement
IRT is built on many of the ideas that Spearman introduced
Basics of Test Score Theory
The observed score for each person always differs from the person's true ability
The difference is the measurement error
A major assumption of the classical test theory is that errors of measurement are random
Basic sampling theory: tells us that the distribution of random errors is bell-shaped
The center of the distribution represents the true score

The dispersion around the mean of the distribution reflects the distribution of sampling errors
The true score for an individual will not change with repeated applications of the same test
The standard deviation of the errors is called the standard error of measurement
The standard error of measurement tells us, on average, how much a score varies from the true score
The standard deviation of the observed scores and the reliability of the test are used to estimate
the standard error of measurement
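
The notes do not spell out how the estimate is computed. A common textbook formula multiplies the standard deviation of the observed scores by the square root of one minus the reliability. The sketch below assumes that formula; the function name and the example numbers (a standard deviation of 10 and a reliability of 0.91) are illustrative, not taken from the chapter.

```python
import math


def standard_error_of_measurement(sd_observed: float, reliability: float) -> float:
    """Estimate the SEM as SD * sqrt(1 - reliability)."""
    if sd_observed < 0:
        raise ValueError("standard deviation cannot be negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd_observed * math.sqrt(1.0 - reliability)


# Hypothetical test: observed SD of 10 points, reliability of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)
print(round(sem, 2))  # 3.0
```

A perfectly reliable test (reliability of 1.0) gives a standard error of 0, and a test with no reliability gives a standard error equal to the full observed standard deviation.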
The Domain Sampling Model
Domain sampling model: considers the problem created by using a limited number of items to represent a
larger and more complicated construct
The error is due to the sample of items
As the sample gets larger, it represents the domain more and more accurately
The greater the number of items, the higher the reliability
Each item on the test should represent the studied ability
Reliability can be estimated from the correlation of the observed test score with the true score, but
true scores are not available
Our only alternative is to estimate what the true score would be
Different random samples of items might give different estimates of the true score due to
sampling error
To estimate reliability, we create many randomly parallel tests by drawing repeated random
samples of items from the same domain
If we create many tests from sampling, we should get a normal distribution of unbiased estimates
of the true score
Then we would find the correlation between each test and each of the other tests, and the
correlations would then be averaged
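
The procedure above can be illustrated with a small simulation. This is only a sketch under invented assumptions: a hypothetical domain of 500 items, 300 simulated people whose chance of answering any item correctly is drawn uniformly between 0.3 and 0.9, and six randomly parallel tests per run. All function names and parameters are made up for the illustration.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson product moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_parallel_correlation(n_items, n_tests=6, n_people=300,
                                 domain_size=500, seed=0):
    """Average correlation among randomly parallel tests of n_items each."""
    rng = random.Random(seed)
    # Assumed domain: each person answers any given item correctly with a
    # personal probability; responses are fixed once for the whole domain.
    abilities = [rng.uniform(0.3, 0.9) for _ in range(n_people)]
    responses = [[1 if rng.random() < p else 0 for _ in range(domain_size)]
                 for p in abilities]
    # Each test is a random sample of items drawn from the same domain.
    tests = [rng.sample(range(domain_size), n_items) for _ in range(n_tests)]
    scores = [[sum(person[j] for j in items) for person in responses]
              for items in tests]
    # Correlate every test with every other test, then average.
    pairs = list(combinations(scores, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


short = average_parallel_correlation(n_items=5)
long = average_parallel_correlation(n_items=40)
print(f"5 items: {short:.2f}, 40 items: {long:.2f}")
```

Under these assumptions the 40-item tests correlate with one another much more strongly than the 5-item tests, in line with the point that more items give higher reliability. Two tests in the sketch may occasionally share an item, which slightly inflates the correlations.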
Item Response Theory

A growing movement is turning away from classical test theory because:
Classical test theory requires exactly the same items to be administered to each person
For a trait, such as intelligence, a small number of items concentrate on an individual's level of ability
IRT: uses the computer to focus on the range of item difficulty that helps assess an individual's ability
If the person gets a few easy items correct, the computer might move to the more difficult items
Then, this level of ability is intensely sampled
A more reliable estimate of ability is obtained using a shorter test with fewer items
The method requires a bank of items that have been evaluated for level of difficulty
Complex computer software is required
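
The adaptive idea described above can be sketched in a few lines. This is not an actual IRT estimator: it swaps in a simple up-and-down rule with a shrinking step in place of the statistical ability estimation that real testing software would use. The item bank, the difficulty scale, and the simulated test taker are all hypothetical.

```python
def adaptive_test(item_bank, answers_correctly, n_items=8, start=0.0, step=1.0):
    """Administer items near the current ability estimate.

    item_bank: item difficulties that were evaluated in advance.
    answers_correctly: callable taking a difficulty, returning True or False.
    """
    remaining = sorted(item_bank)
    estimate = start
    administered = []
    for _ in range(min(n_items, len(remaining))):
        # Pick the unused item whose difficulty is closest to the estimate.
        item = min(remaining, key=lambda d: abs(d - estimate))
        remaining.remove(item)
        correct = answers_correctly(item)
        administered.append((item, correct))
        # Move toward harder items after a success, easier after a failure,
        # with smaller moves as the test homes in on one ability level.
        estimate += step if correct else -step
        step /= 2
    return estimate, administered


# Hypothetical bank of 25 items with difficulties from -3.0 to 3.0.
bank = [i * 0.25 for i in range(-12, 13)]
# Hypothetical test taker who passes every item at or below difficulty 1.3.
estimate, administered = adaptive_test(bank, lambda d: d <= 1.3)
print(round(estimate, 2), len(administered))
```

Only 8 of the 25 items are administered, and the first few of them cluster around the test taker's level, which is the sense in which a shorter test can sample one ability level intensely.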
Models of Reliability
Reliability coefficient: is the ratio of the variance of the true score to the variance of the observed scores
Describes theoretical values in a population rather than those obtained from a sample
It is the proportion of the observed variance that is attributable to variation in the true score
If we subtract this ratio from 1.0, we have the proportion of variation attributable to random error
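
A simulation can make the ratio concrete, since true scores are only observable when we generate them ourselves. The sketch below assumes a made-up population in which true scores have a standard deviation of 15 and random errors have a standard deviation of 5, so the theoretical reliability is 225 / (225 + 25) = 0.90. The function name and all numbers are illustrative.

```python
import random
import statistics


def reliability_coefficient(true_scores, observed_scores):
    """Ratio of true-score variance to observed-score variance."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed_scores)


rng = random.Random(42)
# Assumed population: true scores with SD 15, random errors with SD 5.
true_scores = [rng.gauss(100, 15) for _ in range(5000)]
# Each observed score is the true score plus a random error.
observed_scores = [t + rng.gauss(0, 5) for t in true_scores]

r = reliability_coefficient(true_scores, observed_scores)
print(f"reliability ~ {r:.2f}, error share ~ {1 - r:.2f}")
```

The value computed from the 5,000 simulated people is a sample estimate, so it should land near the theoretical 0.90 without matching it exactly.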
Sources of Error
An observed score differs from a true score for reasons such as:
Situational factors such as loud noises
The room may be too hot or too cold
The items on the test might not be representative of the domain
Suppose you could spell 96% of the words in the English language correctly, but the 20-item
spelling test you took included 5 items (25%) that you could not spell
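
The size of the error in this example can be worked out directly. The sketch assumes the other 15 words on the test are spelled correctly, which the notes do not state; the function name is illustrative.

```python
def percent_correct(n_items: int, n_missed: int) -> float:
    """Observed test score as a percentage of items answered correctly."""
    return 100.0 * (n_items - n_missed) / n_items


true_percent = 96.0  # share of the whole domain spelled correctly
observed_percent = percent_correct(n_items=20, n_missed=5)
print(observed_percent)                 # 75.0
print(observed_percent - true_percent)  # -21.0
```

The observed score underestimates the true score by 21 percentage points purely because of which items happened to be sampled.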