Sept 24 PSYC37

University of Toronto St. George
Psychology
PSY100H1
Faye Mishna
Fall

Sept 24: Psychological Assessment (PSYC37)

Reliability and Validity:

Reliability:
• Examines which sources are responsible for measurement error.
• Refers to the degree to which test scores are consistent, i.e., how little they differ because of measurement error.

Classical Test Theory:
• The observed score is assumed to equal some true score + error.
• Reliability: some tests get closer to measuring true scores than others; the more reliable a test is, the closer its scores are to the true score (the error is minimized).
• Assumption: errors of measurement are random.
• This is opposed to systematic error, which affects everybody the same way (e.g., everybody is affected by a noisy environment).
• Rubber yardstick analogy: the yardstick stretches and contracts randomly. If it were off by 2 inches all the time, the error would be systematic. Because the errors are random, over many measurements they pile up into a normal curve. Hence, random measurement error is assumed to follow a normal curve/distribution.
• Sampling theory suggests that the distribution of random errors is bell-shaped. The idea is that if you took many measurements, there would be many instances of error, and their distribution would end up looking like a normal curve.
• Degree of spread: reflects the amount of sampling error.
• IMP: The standard deviation of the errors is the standard error of measurement. A much skinnier curve suggests that the amount of error is smaller; a fatter curve suggests more error.
• Standard deviation: how much scores vary from the mean. Standard error: the error in measurement, not in individual scores.
• The central question: what proportion of the variation in observed test scores can be attributed to true score variation vs. error variation?
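The central question above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the mean of 100, the true-score SD of 15, the error SD of 5, and the function names are invented for illustration and do not come from the lecture. It builds observed scores as true score + random error and then asks what share of the observed variance is true-score variance.

```python
import random
import statistics


def simulate_observed_scores(n_people, true_sd, error_sd, seed=0):
    """Simulate observed = true + error, with random, normally distributed error."""
    rng = random.Random(seed)
    # Assumed true scores: mean 100, spread true_sd (illustrative values only).
    true_scores = [rng.gauss(100, true_sd) for _ in range(n_people)]
    # Random error is centred on 0, so it does not push scores in one direction.
    errors = [rng.gauss(0, error_sd) for _ in range(n_people)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed


def reliability(true_scores, observed):
    """Proportion of observed-score variance attributable to true-score variance."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


true_scores, errors, observed = simulate_observed_scores(
    5000, true_sd=15, error_sd=5
)
r = reliability(true_scores, observed)
print(round(r, 2))
```

With these assumed spreads, the theoretical proportion is 225 / (225 + 25) = 0.90, so the simulated value should land close to that. Raising `error_sd` (a fatter error curve) lowers the proportion.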
• Pie charts: (1) 10% of the variance is accounted for by error variation (reliable); (2) 35% is accounted for by error variation (less reliable).
• True variance: differences among individuals' scores.
• Error variance: variation due to other sources of error (e.g., the environment).

The Domain Sampling Model:
• Neuroticism: a personality trait characterized by negative states, anxiety, instability, etc.
• Problems that test creators face:
  • The BFI captures it with 8 items.
  • The BFI-10 captures it with 2 items.
  • How reliably can these tests capture such complex traits?

Models of Reliability:
• 3 main approaches:
  • Test-retest
  • Parallel forms
  • Internal consistency (most popular, but it has its limitations)

Test-Retest:
• Administer the same test to the same group at 2 different times.
• Examines temporal stability: the scores are expected to be stable at retest.
• Not the right approach to take if you expect fluctuation or change.
• Basic calculation: compute the correlation coefficient between the scores from the two administrations.
• What are some factors to consider?
  • Age
  • Health concerns
  • Settings where people might leave frequently (e.g., someone leaves a job)
  • Time interval between tests (is the test still reliable after a month?)
  • Carryover effects: to what extent does taking the test a second time affect the scores?
  • Characteristics of the participants themselves

Alternate/Parallel Forms:
• The idea is that you have 2 equivalent versions of the test. You have conducted prior research that allows you to create 2 different versions of the same thing.
• Time is no longer a source of variability, since the two versions are administered at the same time.
• However, item sampling differences may occur as a source of error: differences in the items in each version (the way they're worded may affect how hard they are).

Split-Half Reliability:
• One version of the test, finished in 1 sitting; the same test is split into two halves.
• Even items give one score and odd items give the other.
• The risk of splitting into first and second halves: if the second half is harder than the first, the scores would differ and lower the reliability.
• The risk of examining the cor
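The odd/even split described in these notes can be sketched in Python as follows. This is a minimal illustration, not a method taken from the lecture: the data are simulated, the 8-item length and 200 respondents are assumptions, and `pearson_r` and `split_half_odd_even` are hypothetical helper names. Each person's odd-item total is correlated with their even-item total.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def split_half_odd_even(responses):
    """Correlate odd-item totals with even-item totals.

    responses: one list of item scores per person, same item order for all.
    """
    odd_totals = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Simulated (made-up) data: 200 people answering an assumed 8-item scale.
# Each item score is the person's trait level plus random item-level noise.
rng = random.Random(1)
responses = []
for _ in range(200):
    trait = rng.gauss(0, 1)
    responses.append([trait + rng.gauss(0, 1) for _ in range(8)])

r_half = split_half_odd_even(responses)
print(round(r_half, 2))
```

The same `pearson_r` helper would also serve for test-retest reliability: pass it the scores from the first and second administrations instead of the two half-test totals.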
