Test and Measurement: Chapter Four
Reliability
The word error does not imply that a mistake has been made; it implies
that there will always be some inaccuracy in our measurements
Tests that are relatively free of measurement error are deemed to be reliable
Spearman’s Early Studies
In 1733 Abraham De Moivre introduced the basic notion of sampling error
Karl Pearson developed the product-moment correlation
Cronbach and his colleagues made a major advance by developing methods
for evaluating many sources of error in behavioural research
Basics of Test Score Theory
Classical test score theory assumes that each person has a true score that
would be obtained if there were no errors in measurement
The score observed for each person almost always differs from the person's
true ability or characteristic
A major assumption in classical test theory is that errors of measurement are
random
Basic sampling theory tells us that the distribution of random errors is
bell-shaped
The dispersions around the true score tell us how much error there is in the
measure. Classical test theory assumes that the true score for an individual
will not change with repeated applications of the same test
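The assumptions above (a fixed true score plus random, bell-shaped error) can be illustrated with a small simulation. This is only a sketch: the true score of 100, the error standard deviation of 5, and the function name `simulate_observed_scores` are assumptions chosen for illustration and do not come from the notes.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Return observed scores X = T + E, with E drawn from a normal distribution.

    Assumes, as classical test theory does, that the true score T stays fixed
    across administrations and that errors are random with mean zero.
    """
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    return [true_score + rng.gauss(0, error_sd) for _ in range(n_administrations)]


# Hypothetical person: true score 100, error SD 5, tested 10,000 times.
observed = simulate_observed_scores(true_score=100, error_sd=5, n_administrations=10_000)

# The mean of many observed scores approaches the true score, and their
# spread approaches the standard deviation of the errors.
mean_observed = statistics.mean(observed)
sd_observed = statistics.stdev(observed)
print(round(mean_observed, 1), round(sd_observed, 1))
```

In practice nobody can be tested thousands of times without the true score changing, which is why reliability has to be estimated by the indirect methods described later in these notes.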
Standard error of estimate: an index of the accuracy of a regression
equation. It is equivalent to the standard deviation of the residuals from a
regression analysis. Prediction is most accurate when the standard error of
estimate is small. It should not be confused with the standard error of
measurement
The standard error of measurement tells us on the average how much a score
varies from the true score
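A commonly used formula for the standard error of measurement, which is not stated in these notes, is S * sqrt(1 - r), where S is the standard deviation of the observed scores and r is the reliability coefficient. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
import math


def standard_error_of_measurement(sd_observed, reliability):
    """Standard error of measurement: S * sqrt(1 - r)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd_observed * math.sqrt(1 - reliability)


# Made-up example: observed-score SD of 10 and reliability of .84.
sem = standard_error_of_measurement(sd_observed=10, reliability=0.84)
print(round(sem, 2))  # 4.0
```

The more reliable the test, the smaller this value, so a perfectly reliable test (r = 1) would have a standard error of measurement of zero.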
The Domain Sampling Model
This model considers the problems created by using a limited number of
items to represent a larger and more complicated construct
This model conceptualizes reliability as the ratio of the variance of the
observed score on the shorter test to the variance of the long-run true
score
Finding the true scores is not practical and is rarely possible
To estimate reliability we can create many randomly parallel tests by drawing
repeated random samples of items from the same domain
Classical test theory requires that exactly the same test items be
administered to each person
Using Item Response Theory, the computer is used to focus on the range of
item difficulty that helps assess an individual's ability level
The overall result is that a more reliable estimate of ability is obtained using a
shorter test with fewer items
Models of reliability
The reliability coefficient is the ratio of the variance of the true scores on a
test to the variance of the observed scores: r = σ²(T) / σ²(X)
The equation describes theoretical values in a population rather than those
actually obtained from a sample (that is why the Greek symbol σ² is used
rather than S²)
Example: suppose the reliability of a test is .40. When the employer gets the test back
and begins comparing applicants, 40% of the variation or difference among
the people will be explained by real differences among people and 60% must
be ascribed to random or chance factors
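The ratio described above can be written out as a short calculation. This sketch assumes that observed variance is the sum of true variance and error variance (which holds when errors are random and unrelated to true scores); the function name and the variances 4.0 and 6.0 are made up to reproduce the .40 example.

```python
def reliability_coefficient(true_variance, error_variance):
    """Reliability = true-score variance / observed-score variance.

    Assumes observed variance = true variance + error variance.
    """
    observed_variance = true_variance + error_variance
    if observed_variance <= 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance


# Made-up variances chosen to reproduce the .40 example.
r = reliability_coefficient(true_variance=4.0, error_variance=6.0)
print(r)  # 0.4
print(f"{r:.0%} real differences, {1 - r:.0%} random error")
```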
Sources of Error
An observed score may differ from a true score for many reasons: there may be situational
factors such as loud noises in the room, or the items on the test might not be
representative of the domain
Test reliability is usually estimated in one of three ways
In the test-retest method we consider the consistency of the test results
when the test is administered on different occasions
Using the method of parallel forms we evaluate the test across different
forms of the test
With the method of internal consistency we examine how people perform on
similar subsets of items selected from the same form of the measure
Time Sampling: the test-retest method
This method is used to evaluate the error associated with administering a test at two
different times
Test-retest reliability is relatively easy to evaluate: just administer the same
test on two well-specified occasions and find the correlation between scores from
the two administrations
Carryover effect: this effect occurs when the first testing session
influences scores from the second session
In such cases, the test-retest correlation usually overestimates the true reliability
In cases where the changes are systematic, carryover effects do not harm the
reliability
If something affects all the test takers equally then the results are uniformly
affected and no net error occurs
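Two of the points above can be checked with a short calculation: test-retest reliability is the correlation between two administrations, and a change that affects everyone equally leaves that correlation unchanged. This is a sketch only; the scores, the 3-point practice gain, and the helper name `pearson_r` are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up scores for five people tested on two occasions.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]
retest_r = pearson_r(time_1, time_2)

# A uniform practice effect: everyone gains 3 points at the second session.
time_2_shifted = [score + 3 for score in time_2]
shifted_r = pearson_r(time_1, time_2_shifted)

# Both correlations are the same, because adding a constant to every score
# does not change how people are ordered or spread out.
print(round(retest_r, 3), round(shifted_r, 3))
```

A carryover effect only distorts the reliability estimate when it changes some people's scores more than others.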
Practice effects are one important type of carryover effect
A well evaluated test will have many retest correlations associated with
different time intervals between testing sessions
Item Sampling: parallel forms method
Building a reliable test also involves making sure that the test scores do not
represent any one particular set of items or a subset of items from the entire
domain
One form of reliability analysis is to determine the error variance that is
attributed to the selection of one particular set of items
Parallel forms reliability: compares two equivalent forms of a test that
measure the same attribute. The two forms use different items; however the
rules used to select items of a particular difficulty level are the same
When both forms of the test are given on the same day the only sources of
variation are random error and the difference between the forms of the test
Split-Half Method
In split-half reliability, a test is given and divided into halves that are scored
separately
The two halves of the test can be created in a variety of ways
The best method is to divide the items randomly
Odd-even system: one score is obtained for the odd-numbered items in
the test and another for the even-numbered items
An estimate of reliability based on two half-tests would be deflated because
each half would be less reliable than the whole test
o To correct for half length you can apply the Spearman-Brown
formula: corrected r = 2r / (1 + r)
where r is the correlation between the two halves of the test
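The split-half procedure with an odd-even split and the Spearman-Brown correction for half length, 2r / (1 + r), can be sketched as follows. The response data and the function names are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def odd_even_half_scores(item_scores):
    """Split one person's item scores into odd- and even-numbered half totals."""
    odd_total = sum(item_scores[0::2])   # items 1, 3, 5, ...
    even_total = sum(item_scores[1::2])  # items 2, 4, 6, ...
    return odd_total, even_total


def spearman_brown(half_r):
    """Estimate full-length reliability from the correlation between halves."""
    return 2 * half_r / (1 + half_r)


# Made-up right/wrong (1/0) responses of five people to a six-item test.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0],
]
halves = [odd_even_half_scores(person) for person in responses]
odd_totals = [odd for odd, _ in halves]
even_totals = [even for _, even in halves]

half_r = pearson_r(odd_totals, even_totals)
full_r = spearman_brown(half_r)
print(round(half_r, 3), round(full_r, 3))
```

The corrected value is larger than the half-test correlation, which reflects the point that each half is less reliable than the whole test.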
When the two halves of the test have unequal variances, Cronbach’s
coefficient alpha can be used
o One of the problems with this is that skewness can affect the average
correlation among the items
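Coefficient alpha is not written out in these notes. A standard general form is alpha = k / (k - 1) * (1 - sum of part variances / variance of total scores), where k is the number of parts (items or halves). The sketch below applies it to two halves; the function name and the half-test totals are made up for illustration.

```python
import statistics


def cronbach_alpha(parts):
    """Coefficient alpha from a list of parts (items or halves).

    `parts` holds one list of scores per part, with people in the same order.
    alpha = k / (k - 1) * (1 - sum of part variances / variance of totals)
    """
    k = len(parts)
    if k < 2:
        raise ValueError("need at least two parts")
    totals = [sum(scores) for scores in zip(*parts)]
    sum_part_variances = sum(statistics.variance(part) for part in parts)
    return k / (k - 1) * (1 - sum_part_variances / statistics.variance(totals))


# Made-up half-test totals for five people (the halves have unequal variances).
odd_half = [3, 2, 1, 3, 1]
even_half = [2, 1, 0, 3, 1]
alpha = cronbach_alpha([odd_half, even_half])
print(round(alpha, 3))
```

Unlike the Spearman-Brown correction, this calculation does not require the two halves to have equal variances.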
When responses of the items are not normally distributed the alpha
coefficient can be greater t

Only pages 1 and half of page 2 are available for preview. Some parts have been intentionally blurred.