242-15 Reliability of a Measure (Ch. 5: p. 129-135)
● Define and describe the concept of reliability of a measure, including its relationship to the term
random measurement error.
Reliability - consistent results when the measurement is repeated under identical conditions.
Smaller random error = more reliable measure. If repeated measurements give consistent
results, the random errors occurring while the measurement is being taken must be
small.
● Describe three types of reliability: test-retest reliability, inter-rater reliability, and internal
reliability. Identify when it is appropriate to use each of these types of reliability.
Test-Retest Reliability - consistent results after every remeasure
Interrater Reliability - consistent results no matter who is observing
Internal Reliability - consistent results no matter how you ask
● Explain how to visually present reliability data with scatterplots, and how to quantify reliability
data with a Pearson r correlation coefficient or percent agreement. Explain how to apply this
general approach to each of the three types of reliability:
● test-retest reliability
■ r ≥ .50 → good test-retest reliability (seen on scatterplot w/ Pearson r)
■ Correlate 1st measurements with 2nd measurements.
● inter-rater reliability
■ r ≥ .70 → good interrater reliability (seen on two scatterplots w/ two
■ Using percent agreement for categorical data → 70-80% is good
■ Correlate one observer’s measurements with other’s.
● internal reliability with Cronbach's alpha
■ Alpha is calculated from the mean of all inter-item correlations (and the
number of items), computed from the questionnaire responses.
■ ⍺ ≥ 0.70 → good internal reliability
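The three calculations above can be sketched in a few lines of Python. This is only an illustration: the scores, rater codes, and item responses are made-up data, the function names are hypothetical, and the alpha shown is the standardized form built from the mean inter-item correlation (statistics software may report a variance-based alpha instead).

```python
from itertools import combinations
from statistics import mean

def pearson_r(x, y):
    """Pearson r between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

def percent_agreement(rater_a, rater_b):
    """Percentage of cases that two raters put in the same category."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

def standardized_alpha(items):
    """Standardized Cronbach's alpha from the mean inter-item correlation.

    `items` holds one list of scores per questionnaire item.
    """
    k = len(items)
    r_bar = mean(pearson_r(a, b) for a, b in combinations(items, 2))
    return k * r_bar / (1 + (k - 1) * r_bar)

# Test-retest: correlate 1st measurements with 2nd measurements (made-up data).
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
retest_r = pearson_r(time1, time2)

# Inter-rater, categorical data: percent agreement (made-up codes).
rater_a = ["agg", "agg", "not", "not", "agg"]
rater_b = ["agg", "not", "not", "not", "agg"]
agreement = percent_agreement(rater_a, rater_b)

# Internal reliability: three items answered by five respondents (made-up data).
items = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
alpha = standardized_alpha(items)

print(round(retest_r, 2), agreement, round(alpha, 2))
```

For quantitative inter-rater data, the same `pearson_r` function would be applied to one observer's measurements and the other's.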
● Explain two potential problems when evaluating test-retest reliability and how they would distort
1. Respondents remembering their earlier responses (inflates r).
2. Respondents changing between their administrations (deflates r).
● Be able to identify and interpret evidence for reliability in journal articles.
**eek, we’ll see about that**
● Define and describe the concept of accuracy of a measure, including its relationship to the term
measurement bias, for which claims accuracy is most important, and how it can be assessed.
Briefly explain how accuracy might apply to psychological measures (e.g., standardizing
measures that do not use standard units).
Accuracy - produces results that agree with a known standard.
Measurement Bias - the average measured value systematically differs from the true
value because of ‘bias’ error.
Smaller measurement bias = more accurate measure. This is because the less bias there
is in the measurement, the closer the average measured value will be to the true value.
Can be assessed by: measuring a known standard with the instrument and comparing
Accuracy is not relevant when measures are NOT using standard units because the
values measured CANNOT be compared to a standardized value. (AKA no reason for
● Differentiate between reliability and accuracy of a measure (e.g., how a reliable instrument can
still be inaccurate due to measurement bias), and explain why reliable measures are essential.
Reliability refers to how close the measured values are to each other, while
accuracy refers to how close the measured values are to a true, known standard.
**A measure should always be reliable. W/o reliability, single measurements will vary
unpredictably from true values even if the measurement is accurate.**
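The difference between random error and measurement bias can be illustrated with a small simulation. This is a sketch under assumed numbers: the true value, the size of the bias, and the noise levels are all hypothetical, and each reading is modeled simply as true value + bias + random error.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)   # fixed seed so the sketch is repeatable
TRUE_VALUE = 100.0       # hypothetical known standard

def simulate(bias, noise_sd, n=1000):
    """Return n readings, each = true value + systematic bias + random error."""
    return [TRUE_VALUE + bias + rng.gauss(0, noise_sd) for _ in range(n)]

# Reliable but inaccurate: tiny random error, large systematic bias.
reliable_but_biased = simulate(bias=5.0, noise_sd=0.1)

# Accurate on average but unreliable: no bias, large random error.
accurate_but_unreliable = simulate(bias=0.0, noise_sd=5.0)

for name, readings in [
    ("reliable but biased", reliable_but_biased),
    ("accurate but unreliable", accurate_but_unreliable),
]:
    avg_error = mean(readings) - TRUE_VALUE   # estimates measurement bias
    spread = stdev(readings)                  # estimates random error
    print(f"{name}: average error {avg_error:.2f}, spread {spread:.2f}")
```

In the second instrument the average is close to the true value, but any single reading can be far off, which is why reliability is needed even when a measure is accurate.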
242-16 Construct Validity (Ch. 5: p. 136-150)
● Define and describe the concept of construct validity and why it is important. Differentiate
construct validity from reliability and accuracy, using examples such as phrenology.
Construct Validity - whether the operational definition is a good measure/manipulation of the
conceptual variable of interest.
Reliability & Accuracy are properties of the measure itself while construct validity
depends on the conceptual variable.
Need to take into consideration both reliability and accuracy before construct validity.
❖ Example: Phrenology ~ studying one’s personalities based off head
■ Valid measure of head size
■ Invalid measure of intelligence
● Define and describe two subjective ways of assessing construct validity: face validity and
content validity.
Face Validity - it looks like a plausible measure of the conceptual variable.
Content Validity - it includes all the parts that the theory says it should contain.
● Define criterion validity and explain how it can be empirically assessed using correlation
coefficient evidence and known-groups evidence for criterion validity. Distinguish between
concurrent and predictive criterion validity.
Criterion Validity - whether the measure is related to a relevant, concrete outcome.
➔ Concurrent Validity - the data collected at the same time predict each other’s
➔ Predictive Validity - the measure predicts a future outcome.
Empirically Assessed by using:
a. Correlation Coefficients - when an outcome is quantitative, scatterplots and
correlation coefficients can be used to tell whether the two variables measured
b. Known-Groups - when the outcome is categorical, tables or bar graphs can be
used to check whether groups known to differ on the outcome also differ on the measure.
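Known-groups evidence can be sketched as a comparison of group averages. The questionnaire, the group labels, and the scores below are all hypothetical; the point is only that the measure should separate groups already known to differ on the outcome.

```python
from statistics import mean

# Hypothetical scores on a new anxiety questionnaire (higher = more anxious),
# split by a categorical outcome that is already known for each person.
scores = {
    "diagnosed": [31, 28, 35, 30, 33],
    "not diagnosed": [18, 22, 15, 20, 19],
}

# The table (or bar graph) of group means is the known-groups evidence.
group_means = {group: mean(values) for group, values in scores.items()}

for group, avg in group_means.items():
    print(f"{group}: mean score {avg}")
```

If the two group means were about the same, that would count against the measure's criterion validity.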
● Explain how convergent validity and discriminant validity can be used together to establish
construct validity of a measure.
Convergent Validity - the measure correlates strongly with other validated measures of
the same construct.
Discriminant Validity - the measure correlates less strongly with validated measures of
different constructs.
● Describe the relationship between reliability and construct validity of a measure. Explain and give
examples of how a measure can be reliable yet invalid for a particular conceptual variable, and
explain why it doesn't make sense to talk about construct validity of an unreliable measure.
A measure can be reliable but not valid.
Ex: Phrenology and measuring head size to determine intelligence level. This is a reliable
measure of head size but an invalid measure of intelligence.
NOTE: A measure cannot be valid without being reliable because if a measure isn’t
correlated with itself (reliable), it cannot be correlated with anything else (validity).
● Be able to identify and interpret evidence for construct validity in empirical journal articles.
**eh, we’ll see**
242-17 Sampling and External Validity (Ch. 7: p.
● Define and describe the concept of external validity and why it is important. Define and
differentiate between two types of external validity: population validity and ecological validity.
External Validity = Generalization
Scientists often wish to generalize findings to other people and contexts (ex. External
Generalizing to other people not included in the study.
Population Validity - do the research findings obtained from a sample apply to the
population of interest?
Generalizing to other contexts and situations beyond those studied.
Ecological Validity - do the research findings reflect what people do in the real world?
May be influenced by subject reactivity, the research setting, or how variables are
● Define the terms population of interest, census, and sample.
Population of Interest - all individuals to whom we desire to generalize the findings of
a research study
May be defined in many ways:
All adult humans
All native English speakers
All children in day care
Children in daycare in Chicago
Census - includes every member of a population
Sample - smaller subgroup of subjects chosen from the population
● Differentiate between a representative sample and a biased (unrepresentative) sample, and
relate these terms to the concept of population validity. Explain why a given sampling technique
would be likely to result in a representative or biased sample.
Representative Sample - closely matches the various characteristics of the population of
interest.
Biased Sample - occurs when sample characteristics don’t match those of the
population.
Usually stems from nonrandom sampling procedures
Attempting to generalize outside a biased sample may be misleading or inaccurate.
● Describe the two major steps in the selection process for acquiring a final sample (for a study with
informed consent). Define the concepts of sampling bias and non-response or volunteer bias
and explain how they can affect the final sample and population validity of a research study.
Two Major Steps:
1. Recruitment (risk for sampling bias)
● Only part of target population will be accessible (sampling frame)
● May not reflect population of interest (convenience sampling)
● Reach potential participants in some way (contacted sample) so they are aware
of possibility to participate.
2. Enrollment (risk for non-response or volunteer bias)
● inclusion/exclusion criteria
● Eligible participants choose to participate or decline to participate
● Self-selection for participation
● Define and differentiate between probability or random sampling and a nonrandom sampling.
Random Sampling - every member of the population of interest has an equal chance
of being chosen.
Usually generates a representative sample, so findings can be generalized to the
population from which it was drawn.
Nonrandom Sampling - not every member of the population of interest has the
same chance of being chosen.
Likely to produce a biased or unrepresentative sample, so must be cautious in generalizing
findings to the intended population.
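The contrast between the two approaches can be sketched in Python. The population list and sample size are hypothetical, and "the first 20 people reached" stands in for a convenience sample.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical list of every member of the population of interest.
population = [f"student_{i:03d}" for i in range(200)]

# Random sampling: every member has the same chance of being chosen.
random_sample = rng.sample(population, k=20)

# Nonrandom (convenience) sampling: whoever is easiest to reach,
# here simply the first 20 names on the list.
convenience_sample = population[:20]

print(sorted(random_sample))
print(convenience_sample)
```

The random sample draws from across the whole list, while the convenience sample can only ever contain the first 20 members, so the other 180 have no chance of being chosen.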
● Explain how to employ various probability sampling techniques (simple random sampling,
proportionate stratified random s