Introduction to Statistical Analysis for Social Work | Midterm Definitions
Chapter 1 – Introduction
Binary Variable: a dichotomous variable whose values are 0 (reflecting absence
of any quantity of the variable) and 1 (reflecting presence of the variable).
Bivariate Analysis: a statistical analysis of the relationship between two
variables.
Causal Relationship: a relationship between two variables for which we can say
that the presence or absence of one variable determines the presence or
absence of the other or that values of one variable result in specific values of the
other.
Conceptualization: the first step in the measurement process, in which the
researcher selects the variables to be measured; delineating the exact meaning
of the independent and dependent variables.
Confounding Variable: variables operating in a specific situation in such a way
that their effects cannot be separated; they occur when the effects of an
extraneous variable cannot be separated from the effects of the dependent
variable; the effects of the extraneous variable thus confound the interpretation of
Constant: a characteristic that has the same value for all individuals in a
Continuous Variable: variable that may theoretically assume any value between
two points on the measurement scale; it can thus have an infinite number of
possible values between those points.
Correlated Variables: variables whose values are associated in a systematic
way with values in the others.
Correlation Analysis: statistical methods that allow us to discover, describe, and
measure the strength and direction of associations between and among
variables; include the various techniques of computing correlation coefficients
and regression analyses.
Criterion Variable: the variable whose values are predicted from measurements
of the predictor variable; another term for outcome variable.
Data: the numbers or scores generated by a research study; the word data is
plural.
Dependent Variable: the variable that we do not directly introduce or
manipulate; after the different levels of the independent variable have been
administered, all research participants are measured, in the same way, on the
same dependent variable; a variable in which the changes are results of the level
or amount of the independent variable(s); also, the variable whose variations are
of most interest to the researcher; when used with correlation or regression, it is
referred to as the outcome variable.
Descriptive (Data) Analysis: methods used for summarizing and describing
data in a clear and precise manner; strictly speaking, descriptive analyses apply
only to the people (or objects) actually observed; methods for data reduction.
Dichotomous Variable: a variable that can take on only one of two values.
Discrete Variable: a variable that can assume only a finite number of values.
Dummy Variable: a variable that is created by converting a qualitative variable
into binary variables.
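As a minimal sketch of dummy coding, assuming a hypothetical qualitative variable (marital status) with three categories, each category can be converted into its own binary variable coded 0 (absent) or 1 (present). The function name dummy_code and the data are illustrative assumptions, not part of the course material.

```python
def dummy_code(values):
    """Convert a list of category labels into one 0/1 variable per category."""
    categories = sorted(set(values))
    return {
        category: [1 if value == category else 0 for value in values]
        for category in categories
    }


# Hypothetical data: marital status recorded for five clients.
marital_status = ["single", "married", "single", "divorced", "married"]
dummies = dummy_code(marital_status)

# Each category is now a binary variable: 1 = present, 0 = absent.
for category, column in dummies.items():
    print(category, column)
```

In regression work one category is usually left out as the reference group; this sketch keeps all of them so the conversion itself is easy to see.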
Extraneous Variable: (see intervening variables) – a variable whose existence is
inferred, but that cannot be manipulated; a variable that may affect just what
influence (if any) an independent variable has on a dependent variable; also
referred to as a confounding variable; when controlled for in a research design, it
is known as a control variable. In its most specific usage, a variable that may
have come between (in time) the introduction of the independent variable and the
dependent variable and may thus have affected the latter.
Frequency: number of observations falling in a cell or value category of a
Independent Variable: the variable we believe to be associated with the different
values of the dependent variable; the variable that is manipulated or introduced
in a research study in order to see what effect differences in it will have on those
variables proposed as being dependent on it.
Inferential Analyses: statistical methods that make it possible to draw tentative
conclusions about the population based on observations of a sample selected
from that population and, furthermore, to make a probability statement about
those conclusions to aid in their evaluation.
Interval: (measurement) a measurement that, in addition to ordering scores, also
establishes an equal unit so that distances between any two scores are of a
known magnitude; a measurement in which objects, events, or processes are
assigned to ordered categories that are separated by equal intervals; any
measuring device that is capable not only of placing people (or objects) in their
rank order on a characteristic but also of measuring the differences between
them in regard to that characteristic.
Multivariate Analysis: a statistical analysis of the simultaneous relationship
among three or more variables.
Nominal: (measurement) a measurement that simply classifies elements into two
or more mutually exclusive categories, indicating that elements are qualitatively
different but not giving order or magnitude; a measurement in which objects,
events, or processes are assigned to categories having no inherent order; the
level of measurement whose only requirement is that each observation falls in
one, and only one, measurement category; also referred to as categorical
measurement; it is the lowest level of measurement.
Null (Research) Hypothesis: a statement concerning one or more parameter(s)
that is subjected to a statistical test; a statement that there is no relationship
between the two variables of interest; the belief that any apparent relationship
between or among variables in one or more research samples has been caused
by sampling error; the hypothesis that is tested when seeking to gain statistical
support for a one-tailed or two-tailed research hypothesis.
One-tailed Research Hypothesis: a form of research hypothesis in which the
researcher predicts that a statistically significant relationship between variables
will be found and also predicts the direction of that relationship.
Ordinal: (measurement) a measurement that classifies and ranks elements or
scores; a procedure that is capable of rank ordering individuals (or objects) on a
particular characteristic but that cannot distinguish how different each is from the
others; a measurement in which objects, events, or processes are assigned to
ordered categories; the level of measurement above nominal but below interval;
the data represent at least ordinal scale measurement if each observation falls
into one, and only one, category and if observation categories can be rank
ordered.
Outcome Variable: the variable whose values can be predicted by values of the
predictor variable; sometimes called the criterion variable.
Parameters: a characteristic of a population determined from observations on
every member of the population; population parameters of interest to us include
the mean, range, median, standard deviation, and many others; also a
characteristic of a mathematical relation whose value must be specified before
the expression can be evaluated; a measure computed from all observations in a
population.
Population (Distribution): a distribution of all the scores in a population; a
collection of all observations identifiable by a set of rules; a designated part of a
universe from which a sample is drawn; the complete group of potential
Predictor Variable: the variable that, it is believed, allows us to improve our
ability to predict values of the outcome variable.
Ratio: (measurement) a measurement that, in addition to containing equal units,
also establishes an absolute zero point within the scale; a measurement in which
objects, events, or processes are assigned to ordered categories that are
separated with equal intervals, and where the zero point is not arbitrary; the
highest level of measurement; it is reached only when each observation falls in one,
and only one, category; when observation categories can be ordered; when there
are equal intervals between adjacent categories on the measurement scale; and
when a value of zero represents a zero quantity of the variable being measured.
Reliability: the consistency of a measurement instrument.
Research Hypothesis: a prediction of the relationship between two or more
variables; when using one-tailed or two-tailed research hypotheses, the
hypothesis to be supported if the null hypothesis is rejected; also called the
Two-tailed Research Hypothesis: a form of research hypothesis in which the
researcher predicts that a statistically significant relationship between variables
will be found but does not predict the direction of that relationship.
Univariate Analyses: statistical analysis of the distribution of values of a single
variable.
Validity: the degree to which a measurement instrument accurately measures
what it is supposed to measure.
Variable: a characteristic that takes on different values; any attribute whose
value, or level, can change; any characteristic (of a person, object, or situation)
that can change value or kind from observation to observation.
Chapter 2 – Frequency Distributions and Graphs
Absolute Frequency Distribution: a table that displays the frequencies for
various measurements of a variable.
Bar Graph: a graphical technique of descriptive statistics that uses the heights of
separated bars to show how often each score occurs; graphical representation of
a frequency distribution table in which each measurement category is
represented by a bar that extends to the appropriate distance in the frequency
dimension; usually has spaces between bars to represent nominal level data.
Cumulative Frequency Distribution: a frequency distribution that gives the
number of scores that occur at or below each value of a variable.
Cumulative Percentage (Frequency) Distribution: a table that shows what
percentage of scores occurs at or below each value of a variable.
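The absolute, cumulative, and cumulative percentage distributions defined above can be built from the same set of scores. The following is a minimal sketch using made-up scores; the function name frequency_table is a hypothetical helper introduced only for illustration.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (value, frequency, cumulative frequency, cumulative %)."""
    counts = Counter(scores)
    total = len(scores)
    rows = []
    running = 0  # number of scores at or below the current value
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value], running, 100.0 * running / total))
    return rows


# Hypothetical scores for ten clients on a 1-to-5 scale.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
table = frequency_table(scores)

print("value freq cum_freq cum_pct")
for value, freq, cum_freq, cum_pct in table:
    print(f"{value:>5} {freq:>4} {cum_freq:>8} {cum_pct:>7.1f}")
```

The last row always shows a cumulative frequency equal to the number of cases and a cumulative percentage of 100, which is a quick check that the table was built correctly.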
Frequency Polygon: a graphic technique of descriptive statistics that uses the
height of connected dots to display the shape of the distribution, in which the
horizontal axis represents different values of a variable and the vertical axis
represents frequencies; in constructing a frequency polygon, a dot is placed over
each value of the variable at a height corresponding to the appropriate frequency;
the dots are then connected with lines to form a polygon.
Grouped Cumulative (Percentage) Distribution: an extension of a grouped
frequency distribution that shows how often scores occur at or below each
Grouped Frequency Distribution: table or graph in which frequencies are not
listed for each possible value of the variable; rather, a frequency is listed for each
of a number of intervals on the measurement scale; each interval is a range of
values; all observations falling within the limits of the interval add to the
frequency count for that interval; grouped frequency distributions are used most
often when data represent observations on a continuous variable.
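A grouped frequency distribution can be sketched as a count of observations per interval. The example below assumes equal-width intervals that include their lower limit and exclude their upper limit; that convention, the function name grouped_frequencies, and the ages are all assumptions made for illustration.

```python
def grouped_frequencies(scores, start, width, n_intervals):
    """Count scores falling in each interval [lower, lower + width)."""
    counts = [0] * n_intervals
    for score in scores:
        index = int((score - start) // width)
        if 0 <= index < n_intervals:  # ignore scores outside the table
            counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_intervals)
    ]


# Hypothetical ages of ten clients, grouped into 10-year intervals.
ages = [21, 25, 29, 30, 34, 38, 41, 45, 47, 52]
grouped = grouped_frequencies(ages, start=20, width=10, n_intervals=4)

for lower, upper, freq in grouped:
    print(f"{lower}-{upper - 1}: {freq}")
```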
Histogram: a graphic representation of a frequency distribution in which the
horizontal line represents values of a variable and the vertical line represents
frequencies with which those values occur; a bar is constructed over each value
of the variable (or the midpoint of each interval, if the data are grouped) and
extended to the appropriate frequency; the term histogram usually refers to such
a graph for interval or ratio data, whereas the term bar graph usually refers to
such a graph for nominal or ordinal data; a graphic technique of descriptive
statistics that uses the height of adjoining bars to show how often each score
occurs.
Ordinate: (ordinal measurement) a measurement that classifies and ranks
elements or scores; a procedure that is capable of rank ordering individuals (or
objects) on a particular characteristic but that cannot distinguish how different
each is from the others; a measurement in which objects, events, or processes
are assigned to ordered categories; the level of measurement if each observation
falls into one, and only one, category and if observation categories can be rank
ordered.
Percentage (Frequency) Distribution: a table that displays percentages of
cases that were found to have each of the respective measurements of a
variable.
Percentile: a point on the measurement scale below which a specified
percentage of the group’s observations fall; the 20th percentile, for instance, is
the value that has 20 percent of the observations below it.
Percentile Rank: a transformed score that tells us the percentage of scores
falling at or below a given score.
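As a worked illustration of a percentile rank, the sketch below counts the scores at or below a given score and expresses that count as a percentage of all scores. The exam scores and the function name percentile_rank are hypothetical.

```python
def percentile_rank(scores, score):
    """Percentage of scores falling at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100.0 * at_or_below / len(scores)


# Hypothetical exam scores for ten students.
exam_scores = [55, 60, 65, 70, 70, 75, 80, 85, 90, 95]

# Five of the ten scores are at or below 70.
rank = percentile_rank(exam_scores, 70)
print(rank)
```

Some textbooks count only half of the scores tied with the given score; this sketch follows the "at or below" wording of the definition above.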
Pie Chart: (pie graph) a graph that displays the frequency distribution of a
variable as portions of a circle reflecting percentages of the whole.
Frequency Distribution: (Simple) a table or graph that presents the number of
times (frequency) with which different values of the variable occur in a group of
observations; a technique of descriptive statistics that shows how often each
Stem-and-Leaf Plot: a graph consisting of numbers that reflect the actual case
values of all cases in a frequency distribution.
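A stem-and-leaf plot can be sketched in a few lines, assuming the case values are two-digit whole numbers so that the tens digit serves as the stem and the ones digit as the leaf. The data and the function name stem_and_leaf are illustrative assumptions.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem) and list ones digits (leaves)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return {stem: stems[stem] for stem in sorted(stems)}


# Hypothetical ages of nine clients.
ages = [23, 27, 31, 34, 34, 38, 42, 45, 51]
plot = stem_and_leaf(ages)

# Each printed row shows a stem followed by its leaves.
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Because every leaf is an actual case value, the original data can be read back from the plot, which is what distinguishes it from a histogram.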
X-variable: the variable plotted on the x-axis of a scattergram and the predictor
variable (used to predict the y variable) in regression; usually the independent
variable in a research study.
Y-variable: the variable plotted on the y-axis of a scattergram and the predicted
variable (predicted from the x variable) in regression; usually the dependent
variable in a research study.
Chapter 3 – Measures of Central Tendency and Variability
Bimodal: a frequency distribution with two modes reflecting equal or nearly equal
Box Plot: a graph that reflects both the central tendency and variability of the
distribution of a variable. In one of its most common variations, lines are used to
indicate the five-number summary that is, the minimum value, th