
PSYC 3525 Chapter Notes - Chapter 6: Criterion Validity, Convergent Validity, Concurrent Validity


Department
Psychology
Course Code
PSYC 3525
Professor
Josee Rivest
Chapter
6

PSYC 2520: Introduction to Experimental Psychology
Beginning Behavioral Research: A Conceptual Primer (7th Ed. 2012) Rosnow & Rosenthal
Chapter 6: Reliability and Validity in Measurement and Research
Construct validity-- The degree to which the conceptualization of what is being measured or experimentally
manipulated is what is claimed, such as the constructs that are measured by psychological tests or that
serve as a link between independent and dependent variables
Content validity-- The adequate sampling of the relevant material or content that a test purports to
measure
Convergent or discriminant validity-- The grounds established for a construct based on the convergence of
related tests or behavior (convergent validity) and the distinctiveness of unrelated tests or behavior
(discriminant validity)
Criterion validity-- The degree to which a test or questionnaire is correlated with outcome criteria in the
present (its concurrent validity) or the future (its predictive validity)
External validity-- The degree of generalizability of a relationship over different people, settings,
manipulations (or treatments), and research outcomes
Face validity-- The degree to which a test or other instrument "looks as if" it is measuring something
relevant
Internal validity-- The soundness of statements about whether one variable is the cause of a particular
outcome, especially the ability to rule out plausible rival hypotheses
Threats to internal validity:
Between-subject variables:
Researcher expectancy
Participant expectancy
Participant selection
Loss of participants
Within-subject variables:
Maturation and historical factors
Habituation and fatigue
Statistical regression
Controls for internal validity:
Constancy
Counter-balancing: Latin square
Systematic variation
Random variation
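Counterbalancing with a Latin square gives each condition every ordinal position exactly once across groups of participants. A minimal sketch (the condition labels are hypothetical, not from the chapter):

```python
def latin_square(conditions):
    """Build a cyclic Latin square: row i is the condition list rotated
    by i positions, so each condition appears once in every ordinal
    position across the rows (participant groups)."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]

# Four hypothetical conditions, counterbalanced over four groups.
orders = latin_square(["A", "B", "C", "D"])
for row in orders:
    print(row)
```

Each row is the presentation order for one group; reading down any column shows every condition occupying that position exactly once, which is what makes the design counterbalanced.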
Statistical-conclusion validity-- The accuracy of drawing certain statistical conclusions, such as an estimation
of the magnitude of the relationship between an independent and a dependent variable (a statistical
relationship that is called the effect size) or an estimation of the degree of statistical significance of a
particular statistical test
What is the difference between reliability and validity?
Reliability-- The extent to which observations or measures are consistent or stable
Implies consistency, stability, and dependability
Alternate-form reliability-- The degree of relatedness of different forms of the same test
Internal-consistency reliability-- The overall degree of relatedness of all items in a test or all raters in a
judgment study
Validity-- How well the measure or design does what it's supposed to do
Ch. 6 - Reliability and Validity
Tuesday, October 30, 2012
1:03 AM
Textbook Notes Page 1
Internal-consistency reliability-- The overall degree of relatedness of all items in a test or all raters in a
judgment study (also called reliability of components)
Item-to-item reliability-- The reliability of any single item on average (analogous to judge-to-judge reliability,
which is the reliability of any single judge on average)
Test-retest reliability-- The degree of temporal stability (relatedness) of a measuring instrument or test, or
the characteristic it is designed to evaluate, from one administration to another, also called retest reliability
What are random and systematic errors?
Random error-- the name for chance fluctuations, or haphazard errors
The greater these fluctuations, the less reliable the scores are
Raw scores are true scores plus random errors that push the raw scores up or down around the true scores
Systematic error-- the name for fluctuations that are not random but are slanted in a particular direction (also
called bias)
Random errors are likely to cancel one another, on average, over a great number of repeated measurements
(i.e., they are likely to have an average of about 0)
Systematic errors do not cancel one another and do affect all measurements in roughly the same way
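The cancellation point can be illustrated with a small simulation (the numbers are hypothetical, not from the text): random errors average out to roughly 0, while a constant systematic error (bias) shifts every raw score in the same direction.

```python
import random

random.seed(42)
true_score = 100.0
n = 100_000

# Raw score = true score + random error (mean 0) + systematic error (bias).
bias = 5.0  # constant systematic error pushing every score upward
random_errors = [random.gauss(0, 10) for _ in range(n)]
raw_scores = [true_score + e + bias for e in random_errors]

mean_random_error = sum(random_errors) / n  # close to 0: random errors cancel
mean_raw = sum(raw_scores) / n              # close to 105: bias does not cancel
print(round(mean_random_error, 2), round(mean_raw, 2))
```

Averaging many measurements removes the random component but leaves the full bias in place, which is why systematic error threatens validity even when reliability looks acceptable.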
What is the purpose of retest and alternate-form reliability?
Test-retest reliability-- The degree of temporal stability (relatedness) of a measuring instrument or test, or the
characteristic it is designed to evaluate, from one administration to another, also called retest reliability
Reports of test-retest reliability of an instrument ordinarily indicate not only the interval over which the
retesting was done, but also the nature of the sample on which the test-retest reliability is based
A common concern when people take the same test twice is that the test-retest r may be artificially inflated
because of their familiarity with the test; this can be prevented by creating an alternate form of the test
Alternate-form reliability-- The degree of relatedness of different forms of the same test
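In practice, test-retest reliability is the Pearson correlation between the scores from the two administrations. A minimal sketch with made-up scores for eight test takers:

```python
import statistics

# Hypothetical scores for 8 people on two administrations of the same test.
time1 = [12, 15, 11, 18, 14, 16, 10, 17]
time2 = [13, 14, 12, 19, 13, 17, 11, 16]

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the SDs."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

retest_r = pearson_r(time1, time2)  # high r -> temporally stable scores
print(round(retest_r, 2))
```

A high r indicates that people keep roughly the same rank order across administrations; the same computation applied to an alternate form gives alternate-form reliability.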
What is internal-consistency reliability, and how is it increased?
Internal-consistency reliability (R)-- the overall degree of relatedness of all items in a test or all raters in a
judgment study
Item-to-item reliability-- The reliability of any single item on average (analogous to judge-to-judge reliability,
which is the reliability of any single judge on average)
The mean item-to-item correlation (mean r between all items or judges)
Ways to estimate internal-consistency reliability:
Spearman-Brown formula-- R = (n × mean r) / [1 + (n − 1) × mean r], where n is the number of items in the
test and mean r is the average intercorrelation of the items
K-R 20-- useful if items are scored dichotomously (i.e., 1 for correct and 0 for incorrect)
Cronbach's alpha-- not restricted to dichotomously scored items
The internal-consistency reliability will increase with increased test length as long as the items being added are
relevant and are not less reliable than the items already in the test
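The Spearman-Brown formula can be checked numerically (the item counts and the average intercorrelation of .25 are made-up values): doubling a test's length with comparable items raises R, consistent with the point about test length.

```python
def spearman_brown(n, mean_r):
    """Internal-consistency reliability R of an n-item test whose items
    intercorrelate mean_r on average (Spearman-Brown formula)."""
    return n * mean_r / (1 + (n - 1) * mean_r)

# A 10-item test with an average inter-item r of .25 ...
r10 = spearman_brown(10, 0.25)   # 2.5 / 3.25, about .77
# ... doubled to 20 comparable items:
r20 = spearman_brown(20, 0.25)   # 5.0 / 5.75, about .87
print(round(r10, 2), round(r20, 2))
```

The gain only holds if the added items really do intercorrelate as well as the originals; padding a test with weaker items lowers mean r and can wipe out the benefit of the extra length.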
What are acceptable test-retest and internal-consistency reliabilities?
The optimal number of items on a test or questionnaire depends on the context in which your instrument is to
be used and the objective of the research
Relevant measures may be found in the Directory of Unpublished Experimental Mental Measures (Goldman &
Mitchell, 2003)
Internal-consistency reliability is usually expected to be higher than test-retest reliability, unless the test-retest
intervals are short
Personality tests such as the Rorschach and MMPI have relatively low criterion validity (r = .29 and r = .30,
respectively) for predicting psychopathology, and thus caution should be exercised when using these tests
How is the reliability of judges measured?
Judge-to-judge reliability-- the reliability of any single judge on average
Spearman-Brown formula for internal-consistency reliability-- R = (n × mean r) / [1 + (n − 1) × mean r],
where n is the number of judges and mean r is the average judge-to-judge reliability
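Under the same Spearman-Brown logic, the reliability of a panel of judges can be sketched as follows (the ratings are hypothetical): compute the mean judge-to-judge correlation, then step it up to the effective reliability of the whole panel.

```python
from itertools import combinations
import statistics

# Hypothetical ratings of 6 targets by 3 judges.
ratings = {
    "judge_a": [3, 5, 2, 4, 6, 1],
    "judge_b": [2, 5, 3, 4, 6, 2],
    "judge_c": [3, 4, 2, 5, 6, 1],
}

def pearson_r(x, y):
    """Pearson correlation between two lists of ratings."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sum((a - mx) ** 2 for a in x) ** 0.5
                  * sum((b - my) ** 2 for b in y) ** 0.5)

# Mean judge-to-judge correlation: the reliability of a single judge on average.
pairs = combinations(ratings.values(), 2)
mean_r = statistics.fmean(pearson_r(x, y) for x, y in pairs)

# Spearman-Brown: effective reliability of the composite of all n judges.
n = len(ratings)
R = n * mean_r / (1 + (n - 1) * mean_r)
print(round(mean_r, 2), round(R, 2))
```

Because the composite of several judges is more reliable than any single judge, R always exceeds mean r when the judges agree at all, and adding comparable judges raises it further.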
How is reliability related to replication and external validity?
External validity-- The extent to which the results of research or testing can be generalized beyond the sample
that generated the results to other individuals or situations
Also, the extent to which a causal relationship holds across variations in persons, settings, treatments, and
outcomes
Synonymous with "generalizability" or "causal generalizability"
Replication-- the repeatability of observations
All replications are relative replications because the same experiment can never be "exactly" repeated
The issue is whether the size of the effect (effect size) of an independent variable (X) on a dependent variable (Y)
is similar in the original and the replication study
Replications should be independent of one another (i.e., not all conducted by the same researcher)
Threats to external validity fall into 2 broad categories:
1) Variables that were not in the experiment (variations in persons, settings, and treatments)
2) Variables that were in the experiment (operationalizing the variable of interest too narrowly, using a highly
specialized group of research participants)
Because it would be impossible to rule out every potential threat to external validity, researchers must be
sensitive to the limitations of their study designs and must not make false or imprudent causal generalizations
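The effect-size comparison between an original study and its replication can be sketched with Cohen's d, a common standardized effect-size measure (both data sets here are hypothetical): similar d values across the two studies support the causal generalization.

```python
import statistics

def cohens_d(treat, control):
    """Standardized mean difference: (M1 - M2) / pooled SD."""
    m1, m2 = statistics.fmean(treat), statistics.fmean(control)
    v1, v2 = statistics.variance(treat), statistics.variance(control)
    n1, n2 = len(treat), len(control)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (m1 - m2) / pooled_sd

# Hypothetical original study and an independent replication of the
# same treatment, with different participants.
d_original = cohens_d([8, 9, 7, 10, 9], [6, 7, 5, 7, 6])
d_replication = cohens_d([9, 8, 8, 10, 9], [7, 6, 6, 8, 7])
print(round(d_original, 2), round(d_replication, 2))
```

Because the two samples can never be identical, a replication is judged by whether its effect size lands close to the original, not by whether the raw numbers match exactly.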
How are content and criterion validity defined?
Validity is typically the most important criterion in instrument (e.g., test) construction and typically involves
accumulating evidence in 3 categories:
1) Content validity-- the test or questionnaire items represent the kinds of materials (or content areas) they
are supposed to represent
2) Criterion validity-- the degree to which the test or questionnaire is correlated with one or more outcome
criteria
In assessing criterion validity, researchers select the most sensitive and meaningful criterion in the
present (concurrent validity) or future (predictive validity) and then statistically correlate the
participants' performance on the test or questionnaire with that criterion
Criteria must often be evaluated with respect to other criteria, although there are no firm rules as to
what constitutes an "ultimate criterion"
3) Construct validity
Face validity-- whether the instrument appears on the surface to be measuring something relevant
Test developers are expected to provide this information so that
Test users know the capabilities and limitations of each instrument before using it
Test takers are not misled or their time and effort wasted when they are administered these instruments
How is construct validity assessed in test development?
Construct validity-- has to do with what a test really does assess
Construct validity can be assessed by testing for convergent validity and discriminant validity
Convergent or discriminant validity-- The grounds established for a construct based on the convergence of
related tests or behavior (convergent validity) and the distinctiveness of unrelated tests or behavior
(discriminant validity)
How is construct validity relevant to experimental design?
One of the more common threats to construct validity is vagueness in defining or operationalizing the concepts
or variables of interest