# Midterm 1 Review.docx

University of Guelph

Psychology

PSYC 3250

Jeffrey Spence

Winter

Description

Midterm
- all content will be tested
- wording will be simple and familiar
- some conceptual, some concrete questions
- minimal calculations
- 1 or 2 questions – a, b, c, or ab, or ac, etc
- formulas – know what each one is, why we use it, and when we use it; you will not need to compute it or identify variables
o Spearman Brown prophecy
o Pearson correlation
o SE of estimate
o “standard score” (customized)
o SE of measurement
o correction for attenuation
an observed correlation comes from two imperfect (unreliable) measures
boosts the correlation estimate by accounting for measurement error
o KR-20/Alpha
- percentile vs. percentile rank
o percentile rank - % of individual falling below a given point in a distribution
o percentile – score that specifies a certain percentile rank (at or below)
o example
percentile rank – 70%
• 70% of people are below the score of the test taker
score of 123 is at the 70th percentile
• 70% of people score at or below 123
o at, below, or at and below – each definition will have a different % of people falling into the range
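The "below" versus "at or below" distinction can be checked with a small Python sketch. The scores and the function name `percentile_rank` are made up for illustration and are not from the course material.

```python
def percentile_rank(scores, score, include_equal=False):
    """Percent of scores below `score` (or at or below it if include_equal)."""
    if include_equal:
        count = sum(1 for s in scores if s <= score)
    else:
        count = sum(1 for s in scores if s < score)
    return 100.0 * count / len(scores)

# Hypothetical test scores for ten people (illustrative data only).
scores = [95, 101, 108, 110, 115, 118, 120, 123, 130, 140]

below = percentile_rank(scores, 123)                             # strictly below 123
at_or_below = percentile_rank(scores, 123, include_equal=True)   # at or below 123
print(below, at_or_below)
```

With these ten scores, the two definitions give different percentages for the same score of 123, which is why the definition in use has to be stated.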
- generalizability theory
o CTT: X = T + E
systematic test variance and random measurement error
o GT
non-error test variance, random measurement error, AND systematic test variance
o true score and random error
o figure out the contribution of other sources of error, or break error up into multiple components
uses ANOVA to break out the other sources of error
- types of evaluation
o formative
monitor learning, low stakes, development
• e.g., proposal
feedback can help you later on
o summative
evaluate learning, higher stakes, final
• e.g., midterm
feedback will not help you do better on another test
just see how much you know and that’s it
- transformations
o linear
don’t change the shape of the distribution
• i.e., affects all scores equally
o non-linear
change the shape of the distribution
• i.e., affect scores differently
- converting to standard scores
o two-step process
1. transform data to z scores
2. use the standard score formula: NEW SCORE = NEW MEAN + Z (NEW SD)
• CEEB scores = 500 + z(100)
• IQ scores = 100 + z(15)
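The two-step conversion can be sketched in Python as follows. The raw scores are invented, the helper names `to_z` and `rescale` are hypothetical, and using the population SD (`pstdev`) rather than the sample SD is an assumption.

```python
from statistics import mean, pstdev

def to_z(scores):
    """Step 1: convert raw scores to z scores (population SD assumed)."""
    m = mean(scores)
    sd = pstdev(scores)
    return [(x - m) / sd for x in scores]

def rescale(z_scores, new_mean, new_sd):
    """Step 2: NEW SCORE = NEW MEAN + Z * NEW SD."""
    return [new_mean + z * new_sd for z in z_scores]

raw = [10, 20, 30, 40, 50]     # hypothetical raw test scores
z = to_z(raw)
ceeb = rescale(z, 500, 100)    # CEEB-style scores
iq = rescale(z, 100, 15)       # IQ-style scores
print([round(v, 1) for v in ceeb])
print([round(v, 1) for v in iq])
```

Because both steps are linear transformations, the rescaled scores keep the same shape and rank order as the raw scores; only the mean and SD change.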
- Classical Test Theory
o theory X = T + E
o X
actual observation (data that is collected)
o T
represents the test taker’s true score
o E
random measurement error
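A tiny numeric sketch of X = T + E in Python may help. The true scores and errors are invented, and the errors are chosen by hand to average zero and to be uncorrelated with the true scores, which is what the theory assumes. The last line uses the usual CTT definition of reliability as the share of observed variance due to true scores, which is an addition to these notes.

```python
from statistics import mean, pvariance

true_scores = [90, 100, 110, 120]   # T: hypothetical true scores
errors = [5, -5, -5, 5]             # E: mean zero, uncorrelated with T by construction
observed = [t + e for t, e in zip(true_scores, errors)]   # X = T + E

var_t = pvariance(true_scores)
var_e = pvariance(errors)
var_x = pvariance(observed)

print(observed)                          # what we actually record
print(mean(observed), mean(true_scores)) # errors cancel out on average
print(var_x, var_t + var_e)              # observed variance = true + error variance
print(round(var_t / var_x, 3))           # proportion of observed variance that is true-score variance
```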
Section 1: Introduction to Measurement
Some Uses of Tests
- job selection, performance appraisal, vocational interests, career management, clinical diagnosis, educational assessment, and medical and health practice
What is Measurement?
- measurement – assignment of meaningful numbers to phenomena
- psychological measurement – assignment of meaningful numbers to psychological phenomena
o trying to measure a psychological construct
o typical result is that numbers tell us how much of some quality the person being measured has
Psychometrics
- psychometrics – science of psychological measurement
- usually involves measuring things we cannot see
- need to make something that we cannot see observable
Psychological Measurement
- “on a 1 to 5 scale, where 1 represents not at all and 5 represents extremely, how happy are you?”
o using this item, we have quantified the extent to which someone has an attribute: happiness
your response on a scale is the measure of how happy you are
objective – can be verified by multiple people
- could ask about group membership
o example, are you employed?
1 = employed
2 = unemployed
o use numbers as a convenient way to represent group membership
Basic Concept of Measurement
- measurement consists of rules for assigning symbols to objects so as to
o 1. represent quantities of attributes numerically (scaling)
OR
o 2. define whether the objects fall in the same or different categories with respect to a given attribute (classification)
Some Terminology
- what is the difference between
o psychological test, psychological measure, psychological scale
no difference, they are the same thing
Test, Scale, Measure
- an instrument to assign meaningful numbers to phenomena
o standard tool (same for everyone) for recording information in a standard format (same for everyone)
- standardized test – administered, scored, and interpreted in a standardized manner
o consistency
Standardized Measure (not to be confused with standardized data)
- having a standardized measure means
o rules are clear – scoring, administration, interpretation
- measure is practical to apply
- does not demand skill in the tester beyond initial training
- results do not depend on administrator
Example
- assessment of intelligence
o Option A
examining the skull for bumps
• unstandardized
• to standardize
o assign numbers
o map of the brain and what certain bumps might mean
o try to make it as consistent as possible
o Option B
test kit in a briefcase
• standardized
Advantages of Standardized Measurement
- objectivity
o principle of science is that statements of fact need to be independently verified by other scientists
o having standardized measures makes this possible
- quantification
o 1. numerical indices can be reported in finer detail than personal judgments
o 2. quantification allows us to use mathematical analyses
- communication
o much of science is built on communication amongst scientists
o findings can be compared with other results
- practical
o despite the fact that standardized measures can take a lot of work to develop, once they are developed they are much more economical in terms of time and money than subjective evaluations
Observation
- scientists go to great lengths to make things observable
o Large Hadron Collider
Assessment
- process of integrating information from different sources using multiple methods
o example
clinical psychologist doing an assessment
• asking questions, multiple written tests, information from school (e.g., grades)
o integration of information from multiple tests, scales, and measures
Types of Psychological Tests
- tests of performance
o tests of maximal performance – assess the upper limits of the examinee’s knowledge and abilities
examples - intelligence tests, mechanical spatial ability, classroom achievement, tests of professional competence
broken down to – achievement tests and aptitude tests
• achievement tests
o measure mastery of a specific and defined content area
o test takers have received instruction on content area
o tests and exams received in school are an example
• aptitude tests
o more general and are tests of accumulated knowledge, skills, and abilities developed over one’s life
o designed to evaluate potential for learning rather than what one has already learned
o example
Harvard Entrance Exam from 1869 – aptitude-achievement test
both can be thought of as tests of cognitive ability
different in interpretation of results:
• achievement: what has been achieved (past)
• aptitude: what can be achieved (future)
- identify the following tests
o driving test – achievement
o midterm – achievement
o SAT (scholastic assessment test) – aptitude
o self-efficacy – achievement, aptitude, or neither (test of typical behaviour or attitude)
Typical Response Tests
- measure typical behaviour and characteristics of individual
o personality, interests, attitudes, and emotions
- objective and projective tests (personality)
o objective test – items are not influenced by person scoring the test
o select, restricted response options (“strongly agree”, “True/False”)
- closed-ended
- projective tests
o presentation of ambiguous material that can produce a wide range of responses from test takers
o theory – responses are a result of a person’s genuine unconscious motives, desires, etc., which are “projected” onto the ambiguous stimuli
o example
Thematic Apperception Test (TAT)
• “tell a story about what had led up to the event shown or what is happening at the moment”
Introduction to Score Meaning
- norm referenced scores (relative)
o how did other people taking the test score on average, and are you above or below the average (mean)?
- criterion referenced scores (absolute)
o if the test is out of 35 and you score 32, then you demonstrated 32 of the 35
o in order to operate a chainsaw, you need to score 35 and up
Uses of Test
- psychological tests are generally used for two purposes
o 1. research
o 2. make decisions about individuals
- implications and impact of tests in our lives are far reaching
o diagnoses of illness, identification of skills, employment selection, licensing, science, and program evaluation (policy)
Section 2: A Short History of Psychological Measurement
- psychological measurement
o modern psychological measurement
100 years old
- scientific measurement
o hundreds of years old
History
- circumference of the Earth, distance of Earth from Sun, and even the weight of the Earth were determined before the 1800s
- over 100 years before modern psychological tests came into general use
Origins of Psychological Testing
- origins are in Practice/Applied Settings and go back centuries
o 1. civil service examinations
o 2. assessments of academic achievement in schools and universities
o 3. European and American scientists measuring individual differences
Circa 2200 B.C.
- civil service in China
o every 3 years officials were examined to assess if they were still competent to continue serving
o 1115 B.C.
candidates were examined on their proficiency in the “6 arts”
• music, archery, horsemanship, writing, arithmetic, and the rites and ceremonies of public and private life
o 202 B.C. – 200 A.D.
under the Han dynasty written examinations were introduced in the “5 studies”: civil law, military affairs, agriculture, revenue, and the geography of the Empire
Circa 1300 A.D.
- 1370, Chinese Civil Service exams took their “final form”
- emphasis was placed on remembering and interpreting Confucian classics
- candidates had to pass three (very) competitive exams
o round 1 – offered annually, spend a day and night in a small booth; written exam that was judged for “beauty of penmanship” and “grace of diction” (district)
o round 2 – offered every 3 years, 3 sessions of 3 days and nights (9 days and nights), written exam: judged for
depth of scholarship (province)
o round 3 – offered the following Spring, at this stage 3% became eligible for public office (capital)
1700-1900 A.D.
- 1791 – written examinations for entry into public service introduced in France (later eliminated by Napoleon)
- 1833 – first use of open competitive examinations occurred in UK
- 1860s – bills were passed in the US to establish an examination system for determining public appointments
University and School Exams
- oral exams – Ancient Greece and Rome
- 1219 A.D. – earliest formal examinations were in law at the University of Bologna
- 1441 A.D. – Louvain University had competitive examinations; 4 classes: “rigorosi” (honor men), “transbiles” (satisfactory), “gratiosi” (charity passes), and failures
- 1636 A.D. – oral examinations for B.A. and M.A. degrees were introduced in Oxford, England
Circa 1880
- Sir Francis Galton (1822-1911) – the father of the study of individual differences
- 1883 – Galton summarized findings related to the study of individual differences (“Inquiries into Human Faculty and Its Development”)
- 1884 – Galton opened the “Anthropometric Laboratory” at the International Health Exhibition, where people could pay to have a series of measurements taken, including height, sitting height, arm span, and weight
Individual Differences
- Galton’s lab amassed quite a lot of data, leading to the idea of developing a tool to understand relations among variables (i.e., correlation)
- Karl Pearson (1857-1936), close friend of Galton, held a position as professor of applied mathematics and mechanics at University College, London
o contributions
product-moment correlation (r)
multiple correlation
biserial correlation
correction for range restriction
Chi square goodness of fit test
- James McKeen Cattell (1860-1944), introduced Galton style testing to the US
o was an assistant at Galton’s Anthropometric Laboratory
o went on to be a professor at the University of Pennsylvania, where he was the first person to have the title of professor of psychology
o established a battery of tests to measure
rate of movement
sensation-areas
pressure causing pain
least noticeable differences in weight
reaction time for sound
time for naming colours
- Clark Wissler (1901)
o published elaborate statistical analyses of the relationship among the mental tests and grades
o low correlations
o grades showed correlations with other grades but not with the laboratory tests
- Binet and Simon (1905) liked the idea of an examination that was
o “affected neither by the bad humour nor the bad digestion of the examiner”
Meanwhile
- testing continues in laboratories opened at Harvard, Yale, Clark (Chicago), Wisconsin, and others
Circa 1905
- new era
- invention of individual difference scales
- led by discoveries of Binet in France and Spearman in England
- Binet and a collaborator, Simon (1905), were interested in diagnosing retardation to make decisions about whether students should be placed in special schools (such special schools were founded in 1859) so they could get special schooling
- special commission appointed by the minister of public instruction made the recommendation that
o “no child suspected of retardation would be dropped from a regular school and put in special school without having taken a pedagogical and medical examination showing that his mental state was such that he could not profit, at least moderately, from teaching in the regular school”
(Binet & Simon, 1905)
The 1905 Intelligence Scale
- 1905 – first successful intelligence scale was developed
- separate items, chosen systematically in relation to difficulty and was published with careful instructions for
administration and interpretation
- what used to be a “test” became a “subtest”
- 1905 scale went on to be revised many times…
- 1908 – Binet went on to develop the concept of mental age
Circa 1905
- Charles Spearman (1863-1945)
- 1904 – published “The proof and measurement of association between two things” and “General intelligence,
objectively determined and measured”
o correction for attenuation
o two factor theory of intelligence
- Spearman introduced the concept of reliability but did not use the term reliability until 1907
- identified that unreliability can affect correlations - implications for Wissler’s (lack of) findings
- “Suppose, 3 balls to be rolled along a well-kept lawn; then the various distances they will go will be almost perfectly
correlated to the various forces with which they were impelled. But let these balls be cast with the same inequalities of
force down a rough mountainside; then the respective distances eventually attained will have but faint correspondence
to the respective original momenta.”
- Spearman highlighted that many investigations of tests reported in the past yielded inconclusive or erroneous results
because of failure to take errors of observations into account
- 1907 – in an article by Krueger and Spearman, the term reliability coefficient was introduced to denote a correlation
reflecting the consistency of a set of measurements
- the original German term for “reliability coefficient” is a long compound word
- 1910 – Spearman and his student William Brown published a formula for estimating the reliability of a test lengthened n times
o Spearman-Brown Prophecy Formula
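As a rough sketch, the standard form of the Spearman-Brown formula (predicted reliability = n·r / (1 + (n − 1)·r), where r is the current reliability and n is the factor by which the test is lengthened) can be written in Python. The reliability value of .60 is made up for illustration, and the function name is hypothetical.

```python
def spearman_brown(reliability, n):
    """Predicted reliability of a test lengthened n times (n < 1 means shortened)."""
    return (n * reliability) / (1 + (n - 1) * reliability)

current = 0.60                            # hypothetical reliability of the existing test
doubled = spearman_brown(current, 2)      # test made twice as long
halved = spearman_brown(current, 0.5)     # test cut in half
print(round(doubled, 3), round(halved, 3))
```

Lengthening the test raises the predicted reliability and shortening it lowers it, assuming the added items are comparable to the existing ones.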
Section 3: Basic Statistics of Measurement
Background
- measurement – assignment of meaningful numbers to phenomena
- psychological measurement – assignment of meaningful numbers to psychological phenomena
- measurement involves scaling and classification
Scales of Measurement
- measurement – assigning meaningful numbers to qualities
- different ways
o 4 general types
1. nominal
2. ordinal
3. interval
4. ratio
o hierarchy – more information in the scales as we progress downwards
Nominal Scales
- numbers are used to classify into categories
- numbers are arbitrary
- substitute for labels
o employed = 1, unemployed = 2
- qualities of numbers as we know them do not apply
o not ranked, added, etc.
Ordinal Scales
- rank order people according to the amount of a characteristic that they have
- typically, most to least with most = 1
- ranks don’t tell us anything about “how much” taller a person is
o intervals between ranks can be different
Results from NYC Marathon
- find the nominal and ordinal scales
o ordinal scale – tells us who is the fastest
o nominal scale – doesn’t tell us who was first, just who they are
Interval Scale
- gives rank like ordinal scales but on a scale with equal units
- provides more information than nominal and ordinal scales
- difference between adjacent units is the same: the difference between 50 and 51 is the same as between 89 and 90
- most psychological tests are designed to produce interval scales
o personality, attitudes, IQ
- data on this type of scale can be manipulated using common mathematical operations
o addition, subtraction, multiplication, and division
- interval scales do not have a true zero point
o no zero IQ (complete lack of intelligence)
o because there is no true zero, ratios are not meaningful
o can’t say that a score of 100 IQ is twice as smart as a score of 50 IQ
Ratio Scales
- have all the properties of interval scale, plus a true zero point
- zero reflects the complete absence of the characteristic/attribute being measured
o km per hour, length, weight, etc
- we know that 100 km/h is twice as fast as 50 km/h and that 500 m is half as long as 1000m
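One way to see why ratios only make sense with a true zero is to re-express the same scores on a different but equally valid scale and check whether the ratio survives. The Python sketch below is illustrative only: the `rescale` helper is hypothetical, and the IQ example assumes a mean of 100 and SD of 15 re-expressed on a 500/100 scale.

```python
def rescale(score, old_mean, old_sd, new_mean, new_sd):
    """Linear transformation: NEW SCORE = NEW MEAN + z * NEW SD."""
    z = (score - old_mean) / old_sd
    return new_mean + z * new_sd

# Ratio scale: speed has a true zero, so changing units keeps the ratio.
KM_PER_MILE = 1.609344
fast_kmh, slow_kmh = 100.0, 50.0
fast_mph, slow_mph = fast_kmh / KM_PER_MILE, slow_kmh / KM_PER_MILE
speed_ratio_kmh = fast_kmh / slow_kmh
speed_ratio_mph = fast_mph / slow_mph

# Interval scale: IQ has an arbitrary zero, so the ratio depends on the scale used.
iq_a, iq_b = 100.0, 50.0
new_a = rescale(iq_a, 100, 15, 500, 100)
new_b = rescale(iq_b, 100, 15, 500, 100)
iq_ratio = iq_a / iq_b
rescaled_ratio = new_a / new_b

print(speed_ratio_kmh, round(speed_ratio_mph, 3))   # same ratio in both units
print(iq_ratio, round(rescaled_ratio, 3))           # ratio changes with the scale
```

The speed ratio is the same in km/h and mph, while the "twice as smart" ratio disappears as soon as the same two people are scored on another linear scale.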
Scales
- built in hierarchy with respect to information
- you will want to use the scale that provides you the most information and that suits your purpose
Distributions
- distribution – simply a set of scores
- can have distributions of scores on personality, IQ, or physical characteristics such as height, weight, etc
- distributions can be represented in a number of different ways
o e.g., tables, graphs
- number of ways to describe distributions
- distributions have a number of characteristics
Characteristics/Ways to Describe Distributions
- skew or symmetry
o normal, negative skew, positive skew
- central tendency – indicating where the centre of the distribution is
- mean – arithmetic average
o MEAN = SUM OF SCORES/NUMBER OF SCORES
o good for interval and ratio data
o good estimate of the mean of the population from which a sample is drawn (assuming good random sampling)
o essential statistic for the calculation of other statistics that are useful in psychological measurement
- can be sensitive to extreme scores – extreme scores can pull the mean in that direction
o particularly problematic when samples are small
- median – score (or point) that divides a distribution in half
o point at or below which 50% of scores fall when the data are arranged in numerical order
o not susceptible to extreme scores
o when the number of scores is an odd number, the median is the score in the middle (e.g., 9, 8, 7, 6, 5)
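o formula to locate the median
o MEDIAN POSITION = (N + 1)/2
o what if there is an even number of values?
3, 5, 7, 11, 14, 15. What is the median?
• position = (6 + 1)/2 = 3.5, so the median is the average of the 3rd and 4th scores: (7 + 11)/2 = 9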
- mode – most common score (most frequently occurring) in a distribution
o advantage
can be used with nominal, ordinal, interval, and ratio data
o disadvantage
not stable with small samples
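The three measures of central tendency can be checked with Python's standard `statistics` module. The first two data sets come from the notes above; the extreme-score and ratings data are invented for illustration.

```python
from statistics import mean, median, mode

odd = [9, 8, 7, 6, 5]              # odd number of scores (from the notes)
even = [3, 5, 7, 11, 14, 15]       # even number of scores (from the notes)

median_odd = median(odd)           # middle score once sorted
median_even = median(even)         # average of the two middle scores

# Hypothetical data: one extreme score pulls the mean but not the median.
with_extreme = [5, 6, 7, 8, 90]
mean_extreme = mean(with_extreme)
median_extreme = median(with_extreme)

# Hypothetical ratings: the mode is the most frequently occurring score.
ratings = [1, 2, 2, 3, 2, 5]
most_common = mode(ratings)

print(median_odd, median_even)
print(mean_extreme, median_extreme)
print(most_common)
```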
More Info
- use lines to locate mean, median, and mode on distribution
- not all measures of central tendency will give the same results in every situation
- skew will produce different results
Variability
- when central tendency isn’t the whole picture
- example – on average, people have 1.90 arms
Characteristics/Ways to Describe Distributions
- variability - distributions can have the same mean, median, and mode, but different variability
- range – distance between smallest and largest score (range = highest score – smallest score)
- standard deviation (s, SD, σ) – measure of the average distance that scores vary from the mean of the distribution
o reported alongside mean
- variance – SD squared
o SD is more frequently used because it is more readily interpretable
o SD is in the same metric as the original data
o variance is in squared units of measurement
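A short Python sketch shows two distributions with the same centre but very different spread. Both data sets are invented, and the population formulas (`pstdev`, `pvariance`) are an assumption; a sample formula would divide by N − 1 instead.

```python
from statistics import mean, median, pstdev, pvariance

narrow = [48, 49, 50, 51, 52]   # hypothetical scores, tightly clustered
wide = [30, 40, 50, 60, 70]     # hypothetical scores, widely spread

for name, scores in (("narrow", narrow), ("wide", wide)):
    score_range = max(scores) - min(scores)   # highest score minus smallest score
    sd = pstdev(scores)                       # same units as the scores
    var = pvariance(scores)                   # squared units (SD squared)
    print(name, mean(scores), median(scores), score_range, round(sd, 2), var)
```

Both sets have a mean and median of 50, so the range, SD, and variance are what distinguish them.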
Pearson Correlation r
- Pearson correlation estimates the degree of linear association between two continuous variables
- theoretical range -1 to 1
- if r = 0 there is no linear relation, but there could be a curvilinear one
- squared correlation is an indicator of the proportion of variance that is explained (aka coefficient of determination)
o example
if r_xy = .60, then X accounts for .36 or 36% of the variance in Y
- back to the theoretical range: why is it theoretical?
o range could be restricted due to different factors/conditions
factors that affect r
• relation between X and Y is nonlinear
• variance of either variable is narrow (restricted)
• shapes of the frequency distributions are different (e.g., one is positively skewed and the other negatively skewed)
• reliability of X or Y scores is low
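Two of the factors above, range restriction and nonlinearity, can be demonstrated with a hand-written Pearson r in Python. All data are invented for illustration, and `pearson_r` is a hypothetical helper implementing the standard product-moment formula.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Full range of X (hypothetical scores).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]
r_full = pearson_r(x, y)

# Restricted range: keep only the top half of X.
r_restricted = pearson_r(x[4:], y[4:])
r_squared = r_restricted ** 2          # proportion of variance explained

# Perfect curvilinear relation (Y = X squared) with no linear relation.
cx = [-3, -2, -1, 0, 1, 2, 3]
cy = [v ** 2 for v in cx]
r_curve = pearson_r(cx, cy)

print(round(r_full, 3), round(r_restricted, 3), round(r_squared, 2), round(r_curve, 3))
```

Restricting X to its top half drops r from about .90 to .60 (so r squared falls to .36) even though the underlying relation is unchanged, and the curvilinear data give r = 0 despite Y being fully determined by X.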
Correlation
- other types of correlations (when variables aren’t continuous)
o point biserial – when one of the variables is dichotomous (e.g., gender)
o tetrachoric (Phi coefficient) – when both variables are dichotomous
o Spearman rank correlation – when data are rank ordered variables (ordinal data)
- “Pearson r formula can be applied in these cases, will
