PSYC 1000 Chapter Notes - Lewis Terman, David Wechsler, Normal Distribution
Course: PSYC*1000 (DE)
Professor: Harvey Marmurek
Schedule: Summer, 2012
Textbook: Psychology – Tenth Edition in Modules authored by David G. Myers
Textbook ISBN: 9781464102615
Module 30: Assessing Intelligence
The Origins of Intelligence Testing
When and why were intelligence tests created?
Plato – no two persons are born exactly alike; but each differs from the other in natural endowments, one being
suited for one occupation and the other for another. Francis Galton had a fascination with measuring human traits.
Charles Darwin proposed that nature selects successful traits through the survival of the fittest. Galton wondered if
it might be possible to measure “natural ability” and to encourage those of high ability to mate with one another.
Although Galton’s quest for a simple intelligence measure failed, he gave us some statistical techniques that we still
use. His book Hereditary Genius illustrates an important lesson from both the history of intelligence research and the
history of science: Although science itself strives for objectivity, individual scientists are affected by their own assumptions.
Alfred Binet: Predicting School Achievement
[Binet: Some recent philosophers have given their moral approval to the deplorable verdict that an
individual’s intelligence is a fixed quantity, one which cannot be augmented. We must protest and act against this…]
France passed a law requiring that all children attend school. The French government hesitated to trust
teachers’ subjective judgments of children’s learning potential. To minimize bias, France’s minister of public
education in 1904 commissioned Alfred Binet and others to study the problem.
Binet and Theodore Simon began by assuming that all children follow the same course of intellectual
development but that some develop more rapidly. Their goal became measuring each child’s mental age, the level
of performance typically associated with a certain chronological age. They theorized that mental aptitude is a general
capacity that shows up in various ways. Binet and Simon made no assumptions concerning why a particular child
was slow, average, or precocious. Binet leaned toward an environmental explanation, believing his intelligence test
did not measure inborn intelligence as a meter stick measures height.
What did Binet hope to achieve by establishing a child’s mental age?
Binet hoped that by determining a child’s mental age, or the age that typically corresponds to his or her level of
performance, he could help that child to be placed appropriately in school classrooms with others of similar abilities.
Lewis Terman: The Innate IQ
Lewis Terman found that the Paris-developed questions and age norms worked poorly with California
schoolchildren. Adapting some of Binet’s original items, adding others, and establishing new age norms, Terman
extended the upper end of the test’s range from teenagers to “superior adults.” William Stern derived the famous
intelligence quotient, or IQ: IQ = (mental age / chronological age) × 100. The IQ formula works fairly well for children but not
for adults. Terman promoted the widespread use of intelligence testing. His motive was to take account of the
inequalities of children in original endowment by assessing their vocational fitness. In sympathy with eugenics – a
much-criticized 19th-century movement that proposed measuring human traits and using the results to encourage
only smart and fit people to reproduce – Terman envisioned that the use of intelligence tests would ultimately result
in curtailing the reproduction of feeble-mindedness and in the elimination of an enormous amount of crime,
pauperism, and industrial inefficiency. Terman came to appreciate that test scores reflected not only people’s innate
mental abilities but also their education, native language, and familiarity with the culture assumed by the test.
What is the IQ of a 4-year-old with a mental age of 5? 125 (5 / 4 x 100 = 125)
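The ratio calculation above can be sketched in a few lines of Python. This is only an illustration of Stern's formula as stated in these notes; the function name `ratio_iq` and the adult example values are assumptions chosen for the sketch, not part of any test's scoring procedure.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Stern's ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# The 4-year-old with a mental age of 5 from the question above.
print(ratio_iq(5, 4))    # 125.0

# Hypothetical adult case: a 40-year-old performing like an average 20-year-old.
print(ratio_iq(20, 40))  # 50.0
```

The second call hints at why the ratio works for children but not adults: mental age stops rising in step with chronological age, so the quotient would fall with every birthday.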
Modern Tests of Mental Abilities
What’s the difference between achievement and aptitude tests?
Basic reading and math skills, course exams, intelligence tests, driver’s exams… psychologists classify such tests as either
achievement tests, intended to reflect what you have learned, or aptitude tests, intended to predict your ability to
learn a new skill. Meredith Frey and Douglas Detterman found that total scores on the US SAT correlated +.82 with general
intelligence scores in a national sample of 14- to 21-year-olds. A course exam is an achievement test; a college entrance
exam, which predicts your ability to do the work, is an aptitude test.
David Wechsler created the Wechsler Adult Intelligence Scale (WAIS), with a version for school-aged children
(WISC). Its subtests include:
• Similarities – reasoning the commonality of two objects or concepts, such as “In what ways are
wool and cotton alike?”
• Vocabulary – naming pictured objects, or defining words, such as “What is a guitar?”
• Block design – visual abstract processing, such as “Using the four blocks, make one just like this”
• Letter-number sequencing – on hearing a series of numbers and letters, repeat the numbers in
ascending order, and then the letters in alphabetical order: “R-2-C-1-M-3”
The WAIS yields not only an overall intelligence score, as does the Stanford-Binet, but also separate scores for verbal
comprehension, perceptual organization, working memory, and processing speed. Striking differences among these
scores can provide clues to cognitive strengths or weaknesses that teachers or therapists can build upon.
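To make the letter-number sequencing item listed above concrete, here is a minimal sketch of what a correct response looks like. The function name and the hyphen-separated input format are assumptions for illustration; the real subtest is presented aloud and scored by an examiner.

```python
def letter_number_sequence(item: str) -> str:
    """Return the expected response: numbers ascending, then letters A to Z."""
    tokens = item.split("-")
    numbers = sorted((t for t in tokens if t.isdigit()), key=int)
    letters = sorted(t for t in tokens if t.isalpha())
    return "-".join(numbers + letters)


print(letter_number_sequence("R-2-C-1-M-3"))  # 1-2-3-C-M-R
```

The task loads on working memory because the test-taker has to hold the whole series in mind while reordering it.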
An employer with a pool of applicants for a single available position is interested in testing each applicant’s potential
as part of her selection process. She should use an aptitude test. That same employer wishing to test the
effectiveness of a new, on-the-job training program would be wise to use an achievement test.
Principles of Test Construction
What are standardization and the normal curve?
To be widely accepted, psychological tests must meet three criteria: They must be standardized, reliable, and valid.
The Stanford-Binet and Wechsler tests meet these requirements.
Standardization – comparing your score with a pretested sample’s scores to determine your position relative to others.
Group members’ scores typically are distributed in a bell-shaped pattern that forms the normal curve. No
matter what we measure, people’s scores tend to form this roughly symmetrical shape. On an intelligence test, we
call the midpoint, the average score, 100.
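A short sketch can show how a score's position on the normal curve translates into a share of the population. It assumes an average of 100 and a standard deviation of 15; the 15-point figure is the convention used by the Wechsler scales and is not stated in these notes, so treat it as an assumption.

```python
from statistics import NormalDist

# Assumed scale: mean 100, standard deviation 15 (the SD is an assumption here).
iq = NormalDist(mu=100, sigma=15)


def share_between(low: float, high: float) -> float:
    """Fraction of people expected to score between low and high."""
    return iq.cdf(high) - iq.cdf(low)


print(round(share_between(85, 115), 3))  # 0.683, within one SD of the mean
print(round(share_between(70, 130), 3))  # 0.954, within two SDs of the mean
print(round(1 - iq.cdf(130), 3))         # 0.023, scoring above 130
```

Under these assumptions about two-thirds of scores fall between 85 and 115, which is why scores far from 100 are rare.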
To keep the average score near 100, the Stanford-Binet and Wechsler scales are periodically
restandardized. While college entrance aptitude scores were dropping during the 1960s and 1970s, intelligence test
performance was improving. This is the Flynn effect, named for James Flynn, who first calculated its magnitude. By
today’s standard, the average person’s intelligence test score in 1920 was only 76. The cause is a mystery. The rise
runs counter to the worry that higher 20th-century birthrates among those with lower scores would shove human
intelligence scores downward (cross-breeding?).
What are reliability and validity?
Reliability – the extent to which a test yields dependably consistent scores. To check it, retest people with the same
test or split the test in half and compare the two sets of scores. If the two scores generally
agree, or correlate, the test is reliable. The higher the correlation between the test-retest or the split-half scores, the
higher the test’s reliability. The WAIS and WISC have reliabilities of about +.9.
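The split-half check described above can be sketched as follows. The response data are made up for illustration, and the helper names `pearson_r` and `split_half_r` are hypothetical; real reliability studies use far larger samples and published scoring rules.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(ss_x * ss_y)


def split_half_r(item_scores):
    """Correlate each person's total on odd-numbered items with even-numbered items."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)


# Hypothetical data: 5 test-takers, 8 items each (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
print(round(split_half_r(responses), 2))  # 0.78
```

A value near +.9, as reported for the WAIS and WISC, would mean the two halves rank people almost identically.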
High reliability does not ensure a test’s validity – the extent to which the test actually measures or predicts what it
promises. (Measuring people’s heights with an inaccurate tape measure is reliable (consistent) but has low validity.)
Content validity – the test taps the pertinent behaviour (or criterion), as in a driver’s test or course exams. We expect
intelligence tests to have predictive validity: they should predict the criterion of future performance.
General aptitude tests are not as predictive as they are reliable. The predictive power of aptitude tests is fairly
strong in the early school years, but later weakens. Intelligence scores correlate even more closely with scores on
achievement tests (+.81). The correlation with graduate school performance is an even more modest but still
[Diminishing predictive power – Let’s imagine a correlation between football linemen’s body weight and
their success on the field. Note how insignificant the relationship becomes when we narrow the range of weight to
280 to 320 pounds. As the range of data under consideration narrows, its predictive power diminishes.]
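The range-restriction point in the football example can be checked with a small simulation. Everything here is hypothetical: the weights, the "success" scores, and the noise level are invented so that weight predicts success across the full range, and the fixed seed only makes the run repeatable.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


rng = random.Random(0)  # fixed seed so the simulated data are repeatable

# Hypothetical linemen: weight in pounds, success = weight-driven score plus noise.
weights = [rng.uniform(200, 400) for _ in range(2000)]
success = [0.5 * w + rng.gauss(0, 20) for w in weights]

r_full = pearson_r(weights, success)

# Keep only players in the narrow 280 to 320 pound band.
narrow = [(w, s) for w, s in zip(weights, success) if 280 <= w <= 320]
r_narrow = pearson_r([w for w, _ in narrow], [s for _, s in narrow])

print(f"full range r = {r_full:.2f}, 280-320 lb r = {r_narrow:.2f}")
```

The same underlying relationship produces a much weaker correlation inside the narrow band, which parallels why aptitude scores predict less well among already-selected groups such as graduate students.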
What are the three criteria that a psychological test must meet in order to be widely accepted? Explain.
A psychological test must be standardized (pretested on a similar group of people), reliable (yielding consistent
results), and valid (measuring what it is supposed to measure).
The Dynamics of Intelligence