Chapter 4

2080B Chapter 4.docx

Course Code
Psychology 2080A/B

Test and Measurement: Chapter Four

Reliability
• The word error does not imply that a mistake has been made; error implies that there will always be some inaccuracy in our measurements
• Tests that are relatively free of measurement error are deemed to be reliable

Spearman's Early Studies
• In 1733 Abraham De Moivre introduced the basic notion of sampling error
• Karl Pearson developed the product-moment correlation
• Cronbach and his colleagues made a major advance by developing methods for evaluating many sources of error in behavioural research

Basics of Test Score Theory
• Classical test score theory assumes that each person has a true score that would be obtained if there were no errors in measurement
• The score observed for each person almost always differs from the person's true ability or characteristic
• A major assumption in classical test theory is that errors of measurement are random
• Basic sampling theory tells us that the distribution of random errors is bell-shaped
• The dispersion around the true score tells us how much error there is in the measure
• Classical test theory assumes that the true score for an individual will not change with repeated applications of the same test
• Standard error of estimate: an index of the accuracy of a regression equation. It is equivalent to the standard deviation of the residuals from a regression analysis.
• Prediction is most accurate when the standard error of estimate is small
• The standard error of measurement tells us, on average, how much a score varies from the true score

The Domain Sampling Model
• This model considers the problems created by using a limited number of items to represent a larger and more complicated construct
• This model conceptualizes reliability as the ratio of the variance of the observed score on the shorter test to the variance of the long-run true score
• Finding the true scores is not practical and is rarely possible
• To estimate reliability we can create many randomly parallel tests by drawing repeated random samples of items from the same domain
• Classical test theory requires that exactly the same test items be administered to each person
• Using Item Response Theory, the computer is used to focus on the range of item difficulty that helps assess an individual's ability level
• The overall result is that a more reliable estimate of ability is obtained using a shorter test with fewer items

Models of Reliability
• The reliability coefficient is the ratio of the variance of the true scores on a test to the variance of the observed scores: r = σ²(T) / σ²(X)
• The equation describes theoretical values in a population rather than those actually obtained from a sample (which is why the Greek symbol sigma, σ², is used rather than S²)
• Example: the reliability of a test is .40. When the employer gets the test back and begins comparing applicants, 40% of the variation among the people will be explained by real differences among people, and 60% must be ascribed to random or chance factors

Sources of Error
• An observed score may differ from a true score for many reasons: there may be situational factors such as loud noises in the room, or the items on the test might not be representative of the domain
• Test reliability is usually estimated in one of three ways
o In the test-retest method we consider the consistency of the test results when the test is administered on different occasions
o Using the method of parallel forms we evaluate the test across different forms of the test
o With the method of internal consistency we examine how people perform on similar subsets of items selected from the same form of the measure

Time Sampling: The Test-Retest Method
• The test-retest method is used to evaluate the error associated with administering a test at two different times
• Test-retest reliability is relatively easy to evaluate: administer the same test on two well-specified occasions and find the correlation between scores from the two administrations
• The carryover effect: this effect occurs when the first testing session influences scores from the second session
• When carryover occurs, the test-retest correlation usually overestimates the true reliability
• In cases where the changes are systematic, carryover effects do not harm the reliability
• If something affects all the test takers equally, then the results are uniformly affected and no net error occurs
• Practice effects are one important type of carryover effect
• A well-evaluated test will have many retest correlations associated with different time intervals between testing sessions

Item Sampling: Parallel Forms Method
• Building a reliable test also involves making sure that the test scores do not represent any one particular set of items or a subset of items from the entire domain
• One form of reliability analysis is to determine the error variance that is attributable to the selection of one particular set of items
• Parallel forms reliability: compares two equivalent forms of a test that measure the same attribute.
• The two forms use different items; however, the rules used to select items of a particular difficulty level are the same
• When both forms of the test are given on the same day, the only sources of variation are random error and the difference between the forms of the test

Split-Half Method
• In split-half reliability, a test is given and divided into halves that are scored separately
• The two halves of the test can be created in a variety of ways
• The best method is to divide the items randomly
• Odd-even system: one score is obtained for the odd-numbered items in the test and another for the even-numbered items
• An estimate of reliability based on two half-tests would be deflated because each half would be less reliable than the whole test
o To correct for half length you can apply the Spearman-Brown formula: corrected r = 2r / (1 + r)
o Here r is the correlation between the two halves of the test
• When the two halves of the test have unequal variances, Cronbach's coefficient alpha can be used
o One of the problems with this is that skewness can affect the average correlation among the items
• When responses of the items are not normally distributed the alpha coefficient can be greater t
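
Worked sketch of the split-half steps: the Python below is a minimal illustration, not part of the original notes. It scores the odd and even items separately, correlates the two halves, and applies the Spearman-Brown correction 2r / (1 + r). It also computes coefficient alpha using the standard general form k/(k-1) * (1 - sum of part variances / total variance), which is an assumption here because the notes name alpha but do not give its formula. The function names and the small `scores` table are hypothetical and exist only for illustration.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def spearman_brown(r_half):
    """Correct a half-test correlation to full test length: 2r / (1 + r)."""
    return 2 * r_half / (1 + r_half)


def odd_even_halves(item_scores):
    """Return (odd_totals, even_totals) for rows of per-person item scores."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return odd, even


def split_half_reliability(item_scores):
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    odd, even = odd_even_halves(item_scores)
    return spearman_brown(pearson_r(odd, even))


def cronbach_alpha(part_scores):
    """Coefficient alpha over rows of part scores (items or test halves).

    Assumed general form: k/(k-1) * (1 - sum(part variances) / total variance).
    """
    k = len(part_scores[0])
    part_vars = [pvariance([row[i] for row in part_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in part_scores])
    return (k / (k - 1)) * (1 - sum(part_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical data: 5 people x 6 items, 1 = correct, 0 = incorrect.
    scores = [
        [1, 0, 1, 1, 0, 1],
        [1, 1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 1, 0],
        [0, 0, 0, 0, 1, 0],
    ]
    odd, even = odd_even_halves(scores)
    print("half-test r:", pearson_r(odd, even))
    print("Spearman-Brown corrected:", split_half_reliability(scores))
    # Alpha on the two halves does not assume the halves have equal variances.
    print("alpha on halves:", cronbach_alpha(list(zip(odd, even))))
```

The corrected value is always at least as large as the half-test correlation when r is positive, which matches the point that the uncorrected half-test estimate is deflated.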