Textbook Notes
University of Guelph – Psychology
PSYC 3380 – Jeffrey Spence – Winter
Week One 1/16/2013 2:24:00 PM
Chapter One pp. 3-13
statistics reform: incorporates effect size estimation, the reporting of
confidence intervals, and synthesis of results from replications in a meta-
analysis
some critical problems with behavioral science research are:
- most articles published in our research literature are never cited by other
authors and thus, by definition, have little or no impact on the field
- there are problems with the quality of many published studies in terms of
their actual scientific contribution, how the data were analyzed, or how the
results were interpreted
- there is a disconnect across many behavioral science disciplines between
the conduct of research on the one hand and the application of those results
on the other
Chapter Two pp. 15-35
there are three healthy aspects of the research tradition in the behavioral
sciences:
anchor to reality
- sometimes students new to the behavioral sciences are surprised at the
prominent role accorded to research in academic programs
- there are some potential advantages to possessing the ability to think
critically about how evidence is collected and evaluated that is afforded by a
research-based education (such as having a skeptical attitude about a
proposed medical treatment, the need for evidence reduces extreme claims
made by practitioners, having realistic beliefs)
rise of meta-analysis and meta-analytic thinking
- meta-analysis: a set of statistical techniques for summarizing results
collected across different studies in the same general area; a type of
secondary analysis where findings from primary studies are the unit of
analysis; the central tendency and variability of effect sizes are more
relevant than the statistical significance of each individual study
- meta-analytic thinking includes: reporting of results should be made so
that they can easily be incorporated into a future meta-analysis (including
the reporting of sufficient summary statistics so that effect sizes can be calculated), a researcher should view their own individual study as making at
best a modest contribution to a research literature, an accurate appreciation
of the results of previous studies is essential (especially in terms of effect
sizes), and retrospective interpretation of new results (once collected) are
called for via direct comparison with previous effect sizes
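The meta-analytic idea above can be sketched numerically. The following is a minimal fixed-effect synthesis that weights each study's effect size by the inverse of its sampling variance; all effect sizes and variances are hypothetical, made up purely for illustration:

```python
import math

# Hypothetical effect sizes (standardized mean differences, d) and their
# sampling variances from five primary studies; all values are illustrative
effects = [0.42, 0.31, 0.55, 0.18, 0.47]
variances = [0.020, 0.035, 0.050, 0.015, 0.040]

# Fixed-effect synthesis: weight each study by the inverse of its variance
weights = [1.0 / v for v in variances]
mean_d = sum(w * d for w, d in zip(weights, effects)) / sum(weights)

# Standard error of the weighted mean and a 95% confidence interval
se = math.sqrt(1.0 / sum(weights))
ci = (mean_d - 1.96 * se, mean_d + 1.96 * se)
print(f"weighted mean d = {mean_d:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

This is also why meta-analytic thinking asks authors to report sufficient summary statistics: without each study's effect size and variance, weights like these cannot be computed by a later reviewer.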
waxed exceeding mighty
- information explosion is fueled by computer technology which has made
possible electronic publication and distribution over the internet
- open-access journals: refereed electronic journals that can be accessed
without cost and are generally free of many copyright and licensing
restrictions
- self-archiving research repositories: electronic databases where works by
researchers in a common area are stored for later access by others
- information fatigue: refers to the problem of managing an exponentially
growing amount of information (for example, the total number of scientific
journals is now so great that most libraries are unable to physically store the
printed versions of them all, much less afford the total cost of institutional
subscriptions)
- impact factor (IF): a descriptive, quantitative measure of overall journal
quality; a bibliometric index published annually by the Institute for
Scientific Information (ISI), which analyzes citations in over 14,000 scholarly
journals – it reflects the number of times the typical article in a journal has
been cited in the scientific literature
- the IF is subject to bias: it is based mainly on English-language scientific
journals (which make up only about one quarter of peer-reviewed journals
worldwide); online availability affects it (articles with full-text availability
are cited more often than those available on a more restricted basis); the IF
is computed for a whole journal, but citations generally refer to articles, not
journals (thus a relatively small number of frequently cited articles may be
responsible for most of the value of the IF for a whole journal); and the
tendency for authors to cite their own publications can inflate the IF
- a more controversial use of the IF is as a measure of the quality of work
of individual scholars or entire academic units
there are four negative aspects to our research literature:
1. skewness and waste
- the rejection rate for journals ranges from 80-90%
- a typical journal article receives few citations and has relatively few
interested readers
- only a small number of published articles are both widely read and cited
(the 80/20 rule: about 20% of published articles generate about 80% of the
citations)
- the uneven distribution of publications and citations in science supports the
elite in the scientific realm through new discoveries or the development of
new paradigms (a shared set of theoretical structures, methods, and
definitions that supports the essential activity of puzzle solving, the posing
and working out of problems under the paradigm)
2. wide gap between research and policy or practice
- example: the formation of education policy is infrequently informed by the
results of education research
- specifically within education, there are cases where education research
follows policies (such as learning disabilities) – in the U.S. a learning
disability is defined in federal law based on an IQ-achievement discrepancy
model, in which children are identified as learning disabled when their IQ
scores are in the normal range but their scores on scholastic achievement
tests are much lower – they are entitled to remedial services under federal
law but children who have a low IQ score and low achievement test score
are considered slow learners and are not entitled to remedial services
because it is believed that they will not benefit from it – there is little
evidence that IQ scores and achievement test scores measure two different
things
3. lack of relevance for practitioners
- researchers can communicate poorly with practitioners when reporting
their findings using language that is pedantic or unnecessarily technical
(using excessively complicated statistical techniques and the description of
results solely in terms of their statistical significance)
- clinical psychology practitioners have said that research topics are
sometimes too narrow or specific to be of much practical value
4. incorrect statistical results
- some of the errors in reported statistical results were due to typographical
errors in printed values of test statistics, and some were due to errors in
the reporting of summary statistics
the next three problems are catastrophic concerning the scientific merit of
our research literature; they are also interrelated in that weakness in one
area negatively affects quality in other areas (these problems afflict “soft”
research areas more than “hard” research areas):
1. little real contribution to knowledge
- only about 10% of all journal articles present new enlightening information
2. lack of cumulative knowledge
- theoretical cumulativeness: empirical and theoretical structures build on
one another in a way that permits results of current studies to extend earlier
work
- the number of true scientific breakthroughs in psychology over the last few
decades is very modest
3. paucity of replication
- replication is paid scant attention in the behavioral science research
literature
there are three possible reasons for the overall poor state of behavioral
science research:
1. soft science is hard
- soft science (non-experimental research) is more difficult than hard science
(experimental) because in some cases, for ethical reasons or even human
limitations, it is not possible to randomly assign individuals to certain groups
- human behavior may be much more subject to idiographic factors (those
specific to individual cases – discrete or unique facts or events, such as
experiences and environments, that vary across both cases and time) than
to nomothetic factors (general laws or principles, such as genetics and
common neural organization, that apply to every case and work the same
way over time), whereas the reverse holds for physical phenomena – if this
is true, then there is less potential for prediction
- context effects tend to be relatively strong for many aspects of human
behavior (how behavior is expressed often depends on the particular familial
or social context) – also known as interaction effects; tend to reduce the
chance that a result will replicate across different situations, samples or
times
- our practices concerning measurement in the behavioral sciences are often
too poor, especially when we try to assess the degree of a hypothetical
construct
- the soft behavioral sciences lack a true paradigm, which is necessary for
theoretical cumulativeness – the use of common tools is only a small part of
a paradigm; there is little agreement in the soft behavioral sciences about
what the main problems are and how to study them
2. overreliance on statistical tests
- not only do we rely on statistical significance tests too much, but we also
misinterpret the outcomes
- it has also been said that research progress is hindered by our
dysfunctional preoccupation with statistical tests
3. economy of publish or perish
- least publishable unit (LPU): refers to the smallest amount of ideas or
data that could generate a journal article – used in a sarcastic way to
describe the pursuit of the greatest quantity of publications at the expense
of quality
- sometimes the publish or perish economy for academicians is rationalized
by the thought that active researchers make better teachers, however the
correlation between these two domains (research productivity and teaching
effectiveness) among professors is zero.
Chapter Four pp. 73-76
comparative studies: at least two different groups or conditions are
compared on an outcome (dependent) variable
quantitative research: there is an emphasis on 1) classification and
counting of behavior, 2) analysis of numerical scores with formal statistical
methods and 3) role of the researcher as an impassive, objective observer
qualitative research: the researcher is often the main data-gathering
instrument through immersion in the subject matter, such as in participant
observation
there are three basic steps involved in connecting your research question
with a possible design and there are three possible research questions:
1. descriptive: involves the simple description of a sample of cases on a set
of variables of interest; it is relatively rare when research questions are
solely descriptive
2. relational: concerns the covariance between variables of interest; more
common; typically about the direction and degree of covariance
3. causal: concerns how one or more independent variables affect one or
more dependent variables – better to evaluate these questions with multiple
groups, some of which are exposed to an intervention but others are not, or
with a single sample that is measured across multiple conditions, such as
before-and-after treatment (both comparative studies)
if assignment to groups or conditions is random, then the design is
experimental; if any other method is used and a treatment effect is
evaluated, the design is quasi-experimental
if one-to-one matching is used to pair cases across treatment and control
groups, then part of the design has a within-subject component; otherwise
the design is purely between-subject if each case in every group is tested
only once
testing each case on multiple occasions also implies a design with a within-
subject component, in this case a repeated measures factor
Chapter Four pp. 92-116
in quasi-experimental designs, cases are assigned to treatment or control
groups using some method other than random assignment – this implies
that the groups may not be equivalent before the start of the treatment
nonequivalent-group designs: the treatment and control groups are
intact, or already formed; these groups may be self-selected; ideally they
should be as similar as possible and the choice of group that receives the
treatment is made at random
- the most basic nonequivalent-group design has two groups measured at
posttest only (posttest only design); the absence of pretests makes it
extremely difficult to separate treatment effects from initial group
differences, therefore, the internal validity of this design is threatened by all
forms of selection-related bias
- pretest-posttest design: tests before and after treatment; still subject to
many selection-related threats even if the tests are identical (e.g., selection-
regression bias if cases in one group were chosen due to extreme scores;
selection-maturation concerns the possibility that the treatment and control
groups are changing naturally at different rates in a way that mimics a
treatment effect; selection-history is the possibility that events occurring
between the pretest and posttest differentially affected the treatment and
control groups); all forms of internal validity threats for multiple-group
studies apply to this design as well
- ANCOVA: a covariate analysis that statistically controls for group
differences on the pretest; its use to adjust group mean differences on the
dependent variable for group differences on pretests in nonequivalent-group
designs is problematic because unless the pretests measure all relevant
dimensions along which intact groups differ that are also confounded with
treatment, then any statistical correction may be inaccurate
- an alternative to running an ANCOVA is using an MR (multiple
regression) – any type of ANOVA is just a restricted form of MR (one of
these restrictions is the homogeneity of regression assumption which can be
relaxed in an MR because it is possible to represent in regression equations
the inequality of slopes of within-group regression lines) – in MR, a standard
ANCOVA is conducted by entering group membership and the covariate as
the two predictors of the outcome variable; the specific form of this equation
is:
Ŷ = B1X + B2Z + A
- where Ŷ is the predicted score on the outcome variable, B1 is the
unstandardized regression coefficient (weight) for the difference between
treatment and control (X), B2 is the unstandardized coefficient for the
covariate (Z), and A is the intercept (constant) term of the equation
- B1 equals the average difference between the treatment and control
groups adjusted for the covariate (the predicted mean difference) –
interpretation of this predicted mean difference assumes homogeneity of
regression
- moderated multiple regression: the term moderated refers to the
inclusion in the equation of terms that represent an interaction effect; for
this equation, add the term B3XZ, where XZ is the product of X (group
membership scores) and Z (the covariate)
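The ANCOVA-as-regression and moderated-regression ideas above can be sketched with simulated data. Everything here (group sizes, covariate distribution, the true treatment effect of 1.5, the seed) is hypothetical; the point is that B1 recovers the adjusted mean difference and the product term's weight B3 indexes homogeneity of regression:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical data: X codes control (0) vs. treatment (1), Z is the covariate
X = np.repeat([0.0, 1.0], n // 2)
Z = rng.normal(50, 10, n)
Y = 2.0 + 1.5 * X + 0.8 * Z + rng.normal(0, 2, n)  # true treatment effect = 1.5

# Standard ANCOVA as a regression: Y-hat = B1*X + B2*Z + A
design = np.column_stack([X, Z, np.ones(n)])
b1, b2, a = np.linalg.lstsq(design, Y, rcond=None)[0]
print(f"adjusted mean difference B1 = {b1:.2f}")  # should be near 1.5

# Moderated multiple regression: add the product term XZ; its weight B3
# represents an interaction, i.e., unequal within-group regression slopes
design_mod = np.column_stack([X, Z, X * Z, np.ones(n)])
b1m, b2m, b3, am = np.linalg.lstsq(design_mod, Y, rcond=None)[0]
print(f"interaction weight B3 = {b3:.2f}")  # near 0 -> homogeneity holds
```

Because these data were generated with a single common slope, B3 comes out near zero; a clearly nonzero B3 would signal that the homogeneity-of-regression assumption behind the ANCOVA interpretation is violated.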
- propensity score analysis (PSA): another, more complex alternative to
an ANCOVA; an important method for statistically matching cases from
nonequivalent groups across multiple pretests; the first phase of a PSA is
to estimate propensity scores (the probability of belonging to the treatment
or control group, given the pattern of scores across the pretests) – these
scores can be estimated using logistic regression, where the pretests predict
the dichotomous variable of group membership (treatment vs. control), so
each case’s scores are reduced to a single propensity score; the second
phase consists of standard matching of treatment cases with non-treatment
cases based on their propensity scores
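A rough sketch of the two PSA phases on simulated data follows. The selection model, pretests, and all numbers are hypothetical, and the logistic regression is fit by plain gradient ascent only to keep the example self-contained; a real analysis would use dedicated statistical software:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300

# Hypothetical pretests; selection into treatment depends on them (nonrandom)
pretests = rng.normal(0, 1, (n, 2))
p_true = 1 / (1 + np.exp(-(0.8 * pretests[:, 0] - 0.5 * pretests[:, 1])))
group = (rng.uniform(size=n) < p_true).astype(float)  # 1 = treatment

# Phase 1: estimate propensity scores with logistic regression, where the
# pretests predict the dichotomous treatment/control membership
X = np.column_stack([np.ones(n), pretests])
beta = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.1 * X.T @ (group - p) / n  # gradient ascent on the likelihood
propensity = 1 / (1 + np.exp(-X @ beta))  # one score per case

# Phase 2: match each treated case to the control case with the nearest
# propensity score
treated = np.where(group == 1)[0]
controls = np.where(group == 0)[0]
matches = {t: controls[np.argmin(np.abs(propensity[controls] - propensity[t]))]
           for t in treated}
print(len(matches), "treated cases matched to controls")
```

Note how the two pretests collapse into a single propensity score per case, which is what makes matching across multiple pretests tractable.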
- hidden bias: the degree of undetected confounding that would be needed
to appreciably change study outcome
- double pretest design: the administration of the same measure on 3
occasions, twice before treatment and once after; accounts for selection-
maturation bias (one can test each group’s rate of change between the two
pretests to see whether both groups change at the same rate and in the
same direction, then compare with the posttest to see if there is a difference
due to treatment); internal validity is still susceptible to selection-history,
selection-testing, and selection-instrumentation bias
regression-discontinuity designs: cases are assigned to conditions based
on a cutoff score from an assignment variable, which can be any variable
measured before treatment – there is no requirement that the assignment
variable should predict the outcome variable; the cutting score is often
established based on merit or need; implies that groups are not equivalent
before treatment begins; because the selection process (how cases wind up
in treatment or control groups) is totally known, the internal validity of this
design is much closer to that of experimental designs than that of
nonequivalent-group designs
- a type of pretest/posttest design where participants are measured before
treatment and after
- the selection process permits statistical control of the assignment variable
in regression analysis
- when looking at a scatterplot of treatment vs. control groups, a treatment
effect would show a “break” in the regression line right near the cutoff
score; the increase in the treatment group would be constant
- the magnitude of the discontinuity between the regression lines at the
cutting score estimates the treatment effect
- there is no selection bias, differential maturation or history bias, regression
artifacts bias, or measurement error in the assignment variable within a
regression-discontinuity design
- assuming linearity and homogeneity of regression, the predictors of the
dependent variable are (1) the dichotomous variable of treatment vs. control
(X) and (2) the difference between the score on the assignment variable
(Oa) and the cutting score (C) for each case – this subtraction forces the
computer to estimate the treatment effect at the cutting score (which is also
the point where the groups are the most similar); the equation for this is:
Ŷ = B1X + B2(Oa – C) + A
- B1, the weight for group membership, estimates the treatment effect at
the cutoff point
- if the assumption of either linearity or homogeneity is not met, the results
may not be correct using this equation
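The regression-discontinuity equation can be illustrated with simulated data. The cutting score, the true treatment effect of 5 points, and the seed are all hypothetical; centering the assignment variable at C makes the group weight the estimated discontinuity at the cutoff:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200

# Hypothetical assignment variable Oa with cutting score C = 50;
# cases at or above the cutoff receive treatment
Oa = rng.uniform(30, 70, n)
C = 50.0
X = (Oa >= C).astype(float)  # 1 = treatment group

# Outcome depends linearly on Oa plus a true treatment effect of 5 points
Y = 10.0 + 0.6 * Oa + 5.0 * X + rng.normal(0, 2, n)

# Y-hat = B1*X + B2*(Oa - C) + A; subtracting C forces the estimate of the
# treatment effect (the "break" in the regression line) at the cutting score
design = np.column_stack([X, Oa - C, np.ones(n)])
b1, b2, a = np.linalg.lstsq(design, Y, rcond=None)[0]
print(f"estimated treatment effect at the cutoff: {b1:.2f}")  # near 5
```

Plotting Y against Oa for these data would show the scatterplot "break" at the cutoff that the notes describe, with the two regression lines offset by about 5 points.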
one-shot case study: a single group is measured once after an
intervention; no control group; most primitive of quasi-experimental designs
one-group pretest-posttest design: also no control group; pretest
before treatment and posttest after treatment
about the only way to strengthen the internal validity of a study without a
control group is to use multiple pretests or posttests
removed-treatment design: an intervention is introduced and then later
removed
repeated-treatment design: an intervention is introduced, removed, and
then re-introduced; threats to the attribution of changes to treatment in a
repeated treatment design would have to come and go on the same
schedule as the introduction and removal of treatment
a time series is a large number of observations made on a single variable
over time
interrupted time-series design: the goal is to determine whether some
discrete event – an interruption – affected subsequent observations in the
series
- the basic aims of a time series analysis are threefold:
1. statistically model the nature of the time series before the intervention,
taking account of seasonal variation
2. determine whether the intervention had any appreciable impact on the
series, and if so then
3. statistically model the intervention effect (concerns whether the
intervention had an immediate or delayed impact, whether this effect was
persistent or decayed over time, and whether it altered the intercept or
slope of the time series)
- autoregressive integrated moving average (ARIMA) model: uses
lags and shifts in a time series to uncover patterns, such as seasonal trends
or various kinds of intervention effects; it is also used to develop forecasting
models for a single time series or even multiple time series; an advanced
statistical technique for time series analysis that may require 50
observations or so
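A full ARIMA analysis needs specialized software, but the basic idea of testing whether an interruption affected a series can be sketched with a crude level-shift regression on simulated data. This ignores the autocorrelation and seasonal variation a real ARIMA model handles, and all values (series length, intervention point, the true impact of +4) are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical monthly series: 60 observations, intervention after month 36
t = np.arange(60)
step = (t >= 36).astype(float)  # 1 once the interruption occurs
series = 20 + 0.1 * t + 4.0 * step + rng.normal(0, 1, 60)  # true impact = +4

# Crude sketch: model the pre-existing trend plus a level shift at the
# interruption (a real ARIMA analysis would also model autocorrelation
# and seasonality, and could test delayed or decaying impacts)
design = np.column_stack([t, step, np.ones(60)])
slope, impact, intercept = np.linalg.lstsq(design, series, rcond=None)[0]
print(f"estimated level shift after the intervention: {impact:.2f}")  # near 4
```

This corresponds to aim 3 in the list above: the step dummy models an immediate, persistent change in the intercept of the series; modeling a delayed or decaying impact would require a different intervention term.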
case-control/case-referent/case-history/retrospective design: a type
of comparative design but not for directly evaluating treatment effects;
cases are selected on the basis of an outcome variable that is typically
dichotomous
- it is common to match cases across the two groups on relevant
background variables
- case-control designs are appropriate in situations where randomized
longitudinal clinical trials are impossible or when infrequent outcomes are
studied
- threats to internal validity include selection bias and differential attrition
of cases across patient and non-patient groups
- the real value of case-control studies comes from review of replicated
studies (from a meta-analytic perspective)
in nonexperimental studies, statistical control to support causal inferences
is possible only when potential confounding variables are included in the
analysis; the accuracy of statistical control depends on the researcher’s
ability to identify and measure potential confounders
there are statistical techniques that estimate direct and indirect causal
effects among variables that were concurrently measured such as:
- path analysis: an older technique in SEM that estimates causal effects
for observed variables; modern versions can also estimate causal effects for
latent variables (constructs); results can be interpreted as evidence for
causality only if we can assume that the model is correct; however,
researchers rarely conduct SEM analyses in a context where the true causal
model is already known
equivalent models: explain the data just as well as the preferred model
but do so with a different configuration of hypothesized effects among the
same variables; an equivalent model offers a competing account of the data
it is only with the accumulation of the following types of evidence that the
results of nonexperimental studies may eventually indicate causality:
1. replication of the study across independent samples
2. elimination of plausible equivalent models
3. the collection of corroborating evidence from experimental studies of
variables in the model that are manipulable
4. the accurate prediction of effects of interventions

Week 2 1/16/2013 2:24:00 PM
Pg. 39-72
Chapter 3
Trinity Overview
Design – internal validity, external validity
Measurement – construct validity
Analysis – conclusion validity
Design
5 Structural Elements of an empirical study
o 1) Samples (groups)
o 2) Conditions (treatment or control)
o 3) Method of assignment to groups or conditions (e.g.,
random)
o 4) Observations
o 5) Time, or the schedule for measurement or when treatment
begins or ends
Random assignment = experimental design
Quasi-experimental design
o 1) Cases are divided into groups that do/do not receive
treatment using any other method OR
o 2) There is no control group but there is a treatment group
o Difficult to reject alternative explanations
Cause-probing designs – inferences about cause-effect relations are
of paramount interest (experimental and quasi-experimental)
Non-experimental designs
o Presumed causes and effects may be identified and measured
o Difficult to make plausible causal inferences
o Presumed causes are not directly manipulated
3 Types of designs are not mutually exclusive
Best Possible Design
o 1) Theory-grounded b/c theoretical expectations are directly
represented in the design
o 2) Situational in that the design reflects the specific setting of
the investigation
o 3) Feasible in that the sequence and timing of events, such as
measurement, is carefully planned
o 4) Redundant b/c the design allows for flexibility to deal with
unanticipated problems without invalidating the entire study
o 5) Efficient in that the overall design is as simple as possible,
given the goals of the study
Hypotheses express questions/statements about the existence,
direction, and degree of the relation/covariance b/w 2 or more
variables
Designs provide context and control extraneous variables
o Nuisance variables – introduce irrelevant or error variance
that reduces measurement precision
Controlled through a measurement plan that specifies
proper testing environments, tests and examiner
qualifications
o Confounding variables (lurking variables, confounders) – two
variables are confounded if their effects on the dependent
variable cannot be distinguished from each other
Design must also generally guarantee the independence of
observations – the score of one case does not influence the score of
another
o Assumed in many statistical techniques that scores are
independent (e.g., analysis of variance, ANOVA)
o Critical assumption b/c the results of the analysis could be
inaccurate if the scores are not independent
o No statistical fix or adjustment for lack of independence
Local molar causal validity (internal validity) – emphasizes that
o 1) Any causal conclusions may be limited to the particular
samples, treatments, outcomes, and settings in a particular
investigation
o 2) Treatment programs are often complex packages of
different elements, all of which are simultaneously tested in
the study
Three general conditions must be met before one can reasonably
infer a cause-effect relation
o 1) Temporal precedence: the presumed cause must occur
before the presumed effect
o 2) Association: There is observed covariation – variation in
the presumed cause must be related to that in the presumed
effect
o 3) Isolation: There are no other plausible alternative
explanations of the covariation b/w the presumed cause and
presumed effect
Temporal precedence is established in experimental or quasi-
experimental designs when treatment begins before outcome is
measured
o Can be ambiguous in non-experimental designs
Measurement
3 Purposes
o 1) The identification and definition of variables of interest
o 2) An operational definition, which specifies a set of methods
or operations that permit the quantification of the construct
o 3) Scores, which are the input for the analysis – these should
be relatively free of random error
Construct validity is the main focus of measurement
o Concerns whether the scores reflect the variables of interest,
or what the researcher intended to measure
Requirement for construct validity is score reliability
Analysis
3 Main goals
o 1) Estimating covariances b/w variables of interest,
controlling for the influence of other relevant, measured
variables
o 2) Estimating the degree of sampling error associated with
this covariance
o 3) Evaluating the hypotheses in light of the results
In experimental or quasi-experimental designs, the covariance to
be estimated is b/w the independent variable of treatment and the
dependent variable, controlling for the effects of other independent
variables
o Point estimation: estimate of a population parameter with a
single numerical value
Interval estimation: estimation of the degree of sampling error
associated with the covariance
o Involves the construction of a confidence interval about a
point estimate
Confidence interval: range of values that may include that of the
population covariance within a specified level of uncertainty
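Constructing a confidence interval about a point estimate can be sketched for the simplest case, a 95% CI for a sample mean. The scores are hypothetical, and the t critical value of 2.262 is the tabled value for df = 9:

```python
import math

# Hypothetical sample of outcome scores; goal: 95% CI for the population mean
scores = [12, 15, 14, 10, 13, 17, 11, 14, 16, 13]
n = len(scores)
mean = sum(scores) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))  # sample SD
se = sd / math.sqrt(n)  # estimated standard error of the mean

# t critical value for df = n - 1 = 9 at the 95% level (from a t table)
t_crit = 2.262
ci = (mean - t_crit * se, mean + t_crit * se)
print(f"mean = {mean:.1f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval brackets the point estimate (the sample mean), expressing the sampling error attached to it; the same logic extends from means to covariances and effect sizes.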
Conclusion Validity is associated mainly with the analysis
o 1) Whether the correct method of analysis was used
o 2) Whether the value of the estimated covariance
approximates that of the corresponding population value
o Might indicate whether a treatment program has been
properly implemented
Five Fundamental Things To Know About ANOVAs
o 1) Don’t just use it to conduct F-tests
o 2) Awkward to include a continuous variable
o 3) Restricted case of multiple regression
o 4) It permits the distinction b/w random effects (randomly
selected by researcher) and fixed effects (intentionally
selected by the researcher)
o 5) The statistical assumptions of ANOVA are critical
(homogeneity of variance)
Analysis of Covariance (ANCOVA)
o A covariate is a variable that predicts outcome but is ideally
unrelated to the independent variable
o Variance explained by a continuous covariate is statistically
removed – reducing error variance
o Works best in experimental designs where groups were
formed by random assignment, and it is critical to meet its
statistical requirements (assumptions)
1) Scores on the covariate are highly reliable
2) The relation b/w the covariate and the outcome
variable is linear for all groups
3) Homogeneity of regression
Internal Validity
The requirement that there should be no other plausible explanation
of the results other than the presumed causes measured in your
study
Addressed through control of extraneous variables
o 1) Direct manipulation
o 2) Random assignment (randomization)
o 3) Elimination or inclusion of extraneous variables
o 4) Statistical control (covariate analysis)
o 5) Through rational argument
o 6) Analyze reliable scores
In behavioural science direct manipulation is usually accomplished
in experimental designs through the random assignment of cases to
groups or levels of independent variables that represent conditions,
such as treatment versus control
Randomization equates groups
o Failure of randomization: when unequal groups are formed
Elimination of an extraneous variable involves converting it to a
constant
o Inclusion of an extraneous variable involves the direct
measurement of such a variable and its addition as a distinct
factor in the design
Statistical control
o An extraneous variable is directly measured, but it is not
explicitly represented as a factor in the design
Rational arguments – made by researchers
