
PSYC32 Chapter 7 Textbook Notes.docx


Michael Inzlicht

Chapter 7: Tests of Intellectual Abilities

Case Study
• Vera, referred by school counsellor to be eval'd for ADHD, learning disorder, etc.
• Had a TBI (closed head injury to left hemi) at age 4; currently 14
• Difficulty walking on the right side, mem skills somewhat impaired, more easily upset
• Starting in kindergarten, she was put in special edu and given an IEP
• She did so well in special edu, though, that she was bored, so an npsy eval was done to see if she should continue
• ADHD seems to be more her problem; though she is a bit below avg, she is not truly mentally impaired (i.e. her test scores fell in the grey area between "normal" and "brain impaired"), so she was integrated back into regular classes (which is what Vera wanted)
• This follows the intentions of PL 94-142, which says a child should be placed in the least restrictive learning enviro

Psychometric Theory
• Fxn of psychometrics has traditionally been either:
o 1 - to measure diffs between individuals
o 2 - to measure diffs btwn abilities of the same person under diff circumstances
• Main purposes in clinical npsy:
o Assess strengths and weaknesses of a client after accident/illness
o Develop a diagnosis
o Formulate a rehab treatment plan
• The following principles apply regardless of the use of the particular test:

Test Construction
• A psy test = objective + standardized measure of a sample of behvs
• Purpose of the test must be clearly articulated so that item choice can be related to purpose
• Test construction is a long and detailed procedure, more fully described by Anastasi + Urbina
• Step 1 = define the domain (the type of behv the researcher is looking at)
o i.e. domain of an IQ test = all behvs included within the researcher's def'n of intelligence
• Step 2 = generate test items to measure the concept (i.e. intelligence)
• Step 3 = give the test to many people.
o Add or delete test items as needed, based on usefulness
o A factor analysis is often helpful to determine the degree to which each item is related to the construct the test hopes to measure
• Step 4 = standardization: prep a manual w/ specific admin instructions (to rid inadvertent error that could be introduced by the examiner or the conditions under which the test is admin'd + scored)
o Norms are also established at this stage, by admin'ing the test to a group that fits the rationale for the test's goal (i.e. if you want to generalize to everyone, you need a large sample stratified by age, gender, ethnicity, etc.)
• Step 5 = after developing norms, define scoring based on them (a person's score has no intrinsic value; it must be evaluated relative to other ppl's scores)
o Raw score → scaled score (so that it can be used in stat calc'ns)
• An example of a conversion is the standard score derived from the WISC-IV, as in the case study
• The conversion yields a mean (x̄) of 100 and a standard dev (SD = measure of the spread of scores) of 15 (done bc the converted scores will then follow the normal curve = normal spread of scores around the mean)
• The APA, in conjunction w/ the American Edu Research Assoc + National Council on Measurement in Edu, has formulated the Standards for Edu and Psy Testing, which define the info (including stat norms, etc.) that must be included when developing tests, fairness in testing, and testing applications

Reliability
• 2 key concepts within test construction = reliability + validity
• Reliability = consistency of scores on test/retest for the same individual when nothing has changed
• A measure of reliability allows one to estimate what proportion of total variance (= diffs btwn test scores) is error variance (error variance = not due to errors in taking the test, but to inherent characteristics of the test)
• Reliability is rep'd as a correlation coeff (= extent to which 2 variables vary together)
o A reputable test will be roughly in the +0.80s
• Types of reliability:
• Test-retest = repeating the identical test on two occasions
• Alternate-form = same person taking two equivalent tests on two occasions
o Corr btwn the two forms = reliability coeff
o These parallel forms (= alternate, equivalent forms) must be eval'd to make sure they truly rep the same domain
• Split-half = divide the test into equal portions and corr the two portions together
o Could do odd vs. even, or first half vs. second half (based on a planned split designed to yield equivalent sets of items)
• Interitem consistency = req's only one administration of the test; based on consistency of responses across all items
o Most common method was dev'd by Kuder + Richardson; it is based on the performance of each item
o The Kuder-Richardson reliability coeff is the mean of all possible split-half coeffs (so, unless the items are highly homogeneous, the K-R coeff < the split-half coeff, and split-half − K-R = rough index of the heterogeneity of the test)

Validity
• Validity = how well the test measures what it intends to measure (i.e. the relationship btwn performance on the test and the behv construct it was intended to measure)
• Validity is more critical to test construction than reliability (and a test can only be valid if it is already reliable)
• Also expressed w/ a corr coeff (corr btwn test score and a criterion measure is the standard way)
• Types of validity:
• Content validity = systematic analysis of the actual test items to determine the adequacy of coverage of the behv being measured. Involves two decisions:
o 1 - are the items appropriate for the content area?
o 2 - are there enough of the various types of items to sample the behv?
o Thus, content validity may suffer from subjective factors (bias) in the researcher's choice of items
• Face validity = diff from content validity. Not a corr btwn the test and any other measure; instead, the appearance of the test to a test taker (i.e. what does the examinee think is being measured?)
o Very high face validity could lead a test taker to be less truthful
o i.e. the Beck Depression Inventory II is face valid; if a pt does not want ppl to know they are depressed, they may deliberately answer untruthfully, making the test inaccurate
• Criterion validity = effectiveness of the test in predicting other things (i.e. real-life outcomes, performance on other measures, behv, etc.). The specific criteria will vary depending on the test variable in question; i.e. an IQ test can be validated against GPA
• Construct validity = first discussed by Cronbach + Meehl = extent to which a test measures a theoretical construct/trait (examples of constructs = processing speed, attention, concept formation). Each construct is dev'd to explain and organize observable response consistencies; i.e. corr's btwn new vs. old tests of the same construct. Factor analysis can also be used to determine what variables contribute to the variance btwn scores
• Measures of internal consistency use the total score of the test as the criterion and eval each item against the total.
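The scaled-score conversion (mean 100, SD 15) and the reliability coefficients above are all simple arithmetic on scores. A minimal sketch in Python, using made-up response data (every number and variable name here is hypothetical, not from the text):

```python
from statistics import mean, pstdev

def standard_score(raw, raw_mean, raw_sd):
    """Convert a raw score to a Wechsler-style standard score (mean 100, SD 15)."""
    return 100 + 15 * (raw - raw_mean) / raw_sd

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical responses: rows = examinees, columns = dichotomous items (1 = pass).
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1, 0],
]

# Split-half reliability: correlate odd-item scores with even-item scores.
odd_half = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]
split_half = pearson_r(odd_half, even_half)

# Kuder-Richardson 20, for 0/1 items (the chapter describes the K-R coefficient
# as the mean of all possible split-half coefficients).
k = len(responses[0])
total_var = pstdev([sum(row) for row in responses]) ** 2
pq = sum(p * (1 - p) for p in (mean(col) for col in zip(*responses)))
kr20 = (k / (k - 1)) * (1 - pq / total_var)

print(standard_score(30, raw_mean=25, raw_sd=5))  # raw score 1 SD above the mean -> 115.0
print(round(split_half, 2), round(kr20, 2))
```

One practical detail beyond these notes: a raw split-half correlation describes a half-length test, so in practice it is usually stepped up (e.g. with the Spearman-Brown formula) to estimate full-test reliability.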
• Campbell said that to show construct validity, a test must demonstrate that it corr's highly w/ other variables w/ which it theoretically should = convergent validity
• Discriminant validation = the test should also NOT corr w/ variables w/ which it should not
• Campbell + Fiske devised a systematic expt'l design for the direct eval of convergent and discriminant validation = the multitrait-multimethod matrix
• Since Cronbach + Meehl, testing needs and terminology have changed:
o Though construct validity was first intro'd as a separate type of validity, in some models it has moved to encompass all validity
o In other models, construct validity = a redundant way to just say "validity," bc all types of validity ultimately conform to the construct measured by the test
• Meta-analysis = used more often now, since its intro in the '70s; can substitute for the traditional lit search by test users and developers (bc of the large database of avail research on current tests); weighs the findings of several studies on the basis of methodological and substantive features. Takes effect size (= the magnitude of the effect, i.e. the size of the corr) into account. Allows a researcher to gain info from previous studies w/o doing a new study

Ethics in Testing + Assessment
• Three documents specify guidelines for the construction + dev + use of tests:
• 1 - APA's Ethical Principles of Psy'ists and Code of Conduct (2005) = competence, privacy, confidentiality, assessment.
Collectively, it elaborates on two issues:
o 1) technical standards for the dev of tests (used to eval psychometric properties)
o 2) standards for the professional use of tests (psy'ists are personally + professionally responsible for safeguarding the welfare of consumers of psy tests)
• 2 - APA + American Edu Research Assoc + National Council on Measurement in Edu's Standards for Edu and Psy Testing (1999)
o Discusses technical + professional standards for the construction, eval, interpretation, and application of tests
• 3 - International Testing Committee's Int'l Guidelines for Test Use (2000)
o Overlaps the other two documents to a great extent
o Includes the need for high standards when dev'ing and using tests + the need for professional behv by the test administrator

History of Intellectual Assessment
• The history of intellectual assessment overlaps a lot w/ the history of the brain sciences in general (similar time periods + issues)
• Earliest account of the use of assessment = the Chinese Empire
o For 200 yrs, they used assessment for civil service examinations
• Greeks used testing within their edu sys (physical + intellectual)
• Middle Ages: universities used testing to award degrees + honours
• 19th century = burgeoning period for the dev of tests, bc of world events + devs in the field of psy
o Mental hygiene movement = patients w/ mental illness were separated from criminals
- This was helped by Kraepelin, who dev'd a diagnostic classification sys
- It also became imp to determine the literacy of patients and criminals (to determine who would benefit from treatment)
o Need of school systems to know the lvl of students (began in France, then Europe, then US)
- Bc not enough space in schools, not enough teachers, etc.
- And special edu vs. normal vs. gifted
- Both for separating current skills + predicting future benefit from type of edu
• End of 19th/beginning of 20th century: the types of abilities necessary for soldiers were changing bc of tech
o Literacy, strengths vs. weaknesses for which type of soldier they should be, PTSD
- The Veterans Admin has recently recog'd the large # of soldiers returning from Iraq + Afghanistan w/ brain injuries due to the type of warfare, and has given $ to help
• Esquirol (1838) was the first to distinguish ppl based on tests; classified types of retardation in over 100 pgs of a 2-vol text. Made many attempts to dev a system and eventually said use of lang was the most dependable criterion of intellectual lvl
• Seguin (like Esquirol, French) = tried to educate intellectually impaired ppl
o Focused on sense discrim + motor control
o Major contribution = dev'd nonverbal intelligence tests, like the Seguin Form Board
• Sir Galton = related to Darwin; main interest = heredity
o Began the testing movement; measured many sensorimotor fxns (i.e. muscular strength, rxn time, visual + hearing abilities) and dev'd many tests for these
o Thought intelligence = best sensory abilities
o Pioneered questionnaire + rating scale methods
o Credited, w/ Pearson, for statistical methods for individ diffs (i.e. Pearson's product-moment corr coeff)
• Cattell = American who worked under Wundt in Leipzig
o Established psy labs in the USA when he came back; also increased the use of testing materials and was the first to use the term "mental test"
o Admin'd tests to college students, w/ items similar to those of Galton
• Binet + Henri (1895) = criticized available tests as too sensorimotor-focused
o Proposed intelligence = mem, attn, imagination, aesthetics, and others
• 1904: the French Minister of Public Instruction appointed Binet to the Commission for the Retarded to study the edu of retarded children.
Collab'd w/ Simon to dev the Binet-Simon Scale
o Dev'd in 1905 w/ 30 items, arranged in order of difficulty
o First intelligence test to cover a multitude of abilities
• They made a second Binet-Simon scale in 1908 w/ some additions and deletions; also, all the tests were grouped by age
• Third Binet-Simon in 1911 (the year of Binet's death); afterward, many ppl tried to improve his test, the most famous revision being the Stanford-Binet (1916, by Terman)
o Current version = the 5th
• Wechsler = American; viewed the Binet scales as too restrictive and unrepresentative
o 1939: intelligence = the abilities to think rationally, act purposefully, and deal effectively w/ the enviro (thus he wanted to include performance components, not just verbal ones). His scale is now the most frequently used intelligence scale.
• Individual intelligence test = one