# PSYC 300B Chapter Notes - Chapter 12: Standard Score, Descriptive Statistics, Contingency Table

by OC757575

•frequency data — when data are tallies or frequency counts:

-they are qualitative & represent a nominal scale of measurement

-statistical tests used are called non-parametric tests because their

descriptive statistics are not used to estimate population parameters

-the choice of the statistical test used to analyze the data depends on:

(i) sample size

(ii) type of research design

(iii) model of hypothesis testing

•statistical tests — for each model of hypothesis testing:

-random sampling — outcomes give a 𝜒² test value & an associated

estimated value for p(obs)

(a) chi-square 𝜒² test

(b) 𝜒² goodness of fit test

(c) 𝜒² contingency table — split into the 𝜒² test of independence &

𝜒² test of homogeneity

(d) z-corrected test

-random assignment — outcomes are an exact value for p(obs)

(a) binomial exact test (simplest case)

(b) Fisher’s exact test

nT = total number of tallies = total number of participants

k = number of “questions” (variables)

c = number of possible “responses” (categories within the variable)

•frequency analyses — general characteristics of all frequency analysis tests:

(i) the independent variable(s) has discrete categories, & is therefore

of a nominal or ordinal scale of measurement

(ii) the dependent variable is qualitative data — measured as a

frequency count or tallies for each category of the IV

(iii) each tally is an independent occurrence — each participant can

contribute only 1 response to only 1 category

-nT = the total number of responses (tallies)

(iv) tests are non-parametric — no population parameters are estimated

& decisions about the null do not require random sampling

(v) statistical analyses are a test of the null hypothesis — tests whether

the data are evenly distributed across the categories/conditions, & the

probability that this distribution is a random occurrence

-compare the observed distribution of frequencies against the

expected distribution of frequencies

•chi square (𝜒²) statistic: the best known & most utilized option for the

analysis of tallies which can be used with all 3 design options

-applied to random sampling model of hypothesis testing

-when E is < 5 for one or more categories, 𝜒²(obs) may be a biased

overestimate, so p(obs) comes out too small, increasing the probability of

making a type I error

•avoid this by applying the Yates correction, or by using a

design-appropriate test from the random assignment model

-𝜒² tests the null hypothesis of “the probability that a distribution of

observed frequencies comes from a population where responses are evenly

(or proportionally) distributed across the categories”

-does not rely on population parameters to test the null hypothesis

-assumes that we have a good estimate of the expected chance

occurrences in the population

-O (fo) = observed # of actual tallies in each category

-E (fe) = expected tallies given the hypothesized population distribution

(may be assumed to be equally or proportionally distributed across categories)

-𝜒²(obs) = the degree to which observed frequencies (O) diverge from

expected frequencies (E)

•the computed value is compared to a table of values for the

theoretical distribution of 𝜒² statistic

•analysis determines the probability that a distribution of tallies is a

random distribution

•if p(obs) is small, then it is more likely that the participants’ responses

reflect a specific preference

•report results as 𝜒²(df, n = ?) = x, p > or < ?
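The computation of 𝜒²(obs) and the reporting format above can be sketched in Python. The notes do not spell out the formula, so this sketch assumes the standard form: the sum over categories of (O - E)² / E. Instead of a printed table, p(obs) is computed from the theoretical 𝜒² distribution. The tallies and the function names `chi_square_obs` and `chi_square_p` are hypothetical, chosen only for illustration.

```python
import math

def chi_square_obs(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_p(chi2, df):
    """Upper-tail probability of the chi-square distribution, i.e. p(obs)."""
    z = chi2 / 2.0
    # Closed-form starting points: df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    # Step up two degrees of freedom at a time until df is reached.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

observed = [30, 18, 12]   # hypothetical tallies (O), nT = 60
expected = [20, 20, 20]   # E under an even distribution across 3 categories

chi2 = chi_square_obs(observed, expected)
df = len(observed) - 1    # df = (c - 1)
p = chi_square_p(chi2, df)

# Report in the form chi2(df, n = nT) = value, p = value
print(f"chi2({df}, n = {sum(observed)}) = {chi2:.2f}, p = {p:.4f}")
```

A small p(obs) here would suggest the tallies are unlikely to be a random distribution across the three categories.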

-assumptions for 𝜒² test:

(i) each observation is independent of every other observation

(ii) the categories are exhaustive, such that adding tallies in each

category must equal nT

(iii) nT ≥ 20

(iv) E ≥ 5 for each category

(v) data are qualitative (frequency counts)

(vi) IV is measured only on a nominal or ordinal scale
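The numeric assumptions in this list (exhaustive categories, nT ≥ 20, E ≥ 5) can be checked before running the test. The sketch below is only an illustration: the function name `check_chi_square_assumptions` and the example tallies are hypothetical. Independence of observations and the scale of measurement depend on the research design and cannot be checked from the tallies themselves.

```python
def check_chi_square_assumptions(observed, expected, min_total=20, min_expected=5):
    """Return a list of violated numeric assumptions (empty list = none found)."""
    problems = []
    n_total = sum(observed)
    # (iii) nT should be 20 or more
    if n_total < min_total:
        problems.append(f"nT = {n_total} is below {min_total}")
    # (ii) categories are exhaustive, so expected tallies must also add up to nT
    if abs(sum(expected) - n_total) > 1e-9:
        problems.append("expected frequencies do not sum to nT")
    # (iv) E should be at least 5 in every category
    small = [e for e in expected if e < min_expected]
    if small:
        problems.append(f"{len(small)} category(ies) with E < {min_expected}")
    return problems

print(check_chi_square_assumptions([30, 18, 12], [20, 20, 20]))
print(check_chi_square_assumptions([18, 2], [16, 4]))
```

When the check reports a small E, the notes suggest the Yates correction or a test from the random assignment model.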

-characteristics of the null distribution for 𝜒² statistics:

(a) numerically, it is a squared z-distribution (for df = 1; in general, the sum of df squared z-scores)

(b) is a theoretical, mathematically defined distribution

(c) X-axis is defined by a standard score (𝜒²)

•X-axis only has positive values ranging from 0 to ∞

(d) the shape of the distribution is a family of curves that vary as a

function of df

•when value for df is small, distribution is positively skewed

•as value for df increases, the distribution is more symmetrical

(e) large values for 𝜒²(obs) indicate a small probability that the

distribution of tallies occurred by chance

•use value for 𝜒²(obs) & df to estimate the probability that your

distribution of frequencies represents a chance occurrence
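These characteristics can be seen in a small simulation. The sketch assumes the standard construction of the 𝜒² distribution as a sum of df squared z-scores (a single squared z-score when df = 1). The function name `simulate_chi_square`, the number of draws, and the seed are arbitrary choices for illustration.

```python
import random
import statistics

def simulate_chi_square(df, n_draws=20000, seed=0):
    """Draw values built as the sum of `df` squared z-scores."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]

for df in (1, 4, 20):
    draws = simulate_chi_square(df)
    mean = statistics.mean(draws)
    median = statistics.median(draws)
    # A median well below the mean signals positive skew;
    # the ratio moves toward 1 as the distribution becomes more symmetrical.
    print(f"df = {df:2d}: min = {min(draws):.3f}, mean = {mean:.2f}, "
          f"median/mean = {median / mean:.2f}")
```

Every simulated value is zero or larger, and the gap between median and mean shrinks as df increases, matching points (c) and (d).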

•𝜒² tests — used for random assignment or sampling research designs:

(a) 1 variable with 2 categories — k = 1, c = 2 & nT ≥ 20

df = (c - 1)
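A two-category case with df = 1 is also where the Yates correction mentioned earlier is usually applied. The notes do not give its formula; the sketch below assumes the standard form, which subtracts 0.5 from each |O - E| before squaring. The tallies are hypothetical, and 3.841 is the tabled critical value for df = 1 at α = .05.

```python
CRIT_DF1_ALPHA_05 = 3.841  # tabled critical value of chi-square, df = 1, alpha = .05

def chi_square_obs(observed, expected, yates=False):
    """Sum of (|O - E| - correction)^2 / E; correction is 0.5 with Yates, else 0."""
    correction = 0.5 if yates else 0.0
    return sum(
        max(abs(o - e) - correction, 0.0) ** 2 / e
        for o, e in zip(observed, expected)
    )

observed = [17, 7]             # hypothetical tallies, nT = 24
n_total = sum(observed)
expected = [n_total / 2] * 2   # null: responses split evenly across c = 2 categories
df = len(observed) - 1         # df = (c - 1) = 1

plain = chi_square_obs(observed, expected)
corrected = chi_square_obs(observed, expected, yates=True)

print(f"uncorrected chi2({df}) = {plain:.3f}, exceeds critical: {plain > CRIT_DF1_ALPHA_05}")
print(f"Yates-corrected chi2({df}) = {corrected:.3f}, exceeds critical: {corrected > CRIT_DF1_ALPHA_05}")
```

With these made-up tallies the uncorrected value exceeds the critical value while the corrected value does not, which shows how the correction makes the test more conservative.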

(b) 𝜒² goodness of fit — k = 1, c ≥ 3, nT ≥ 20 or E ≥ 5

-a single-sample case with discrete data

-𝜒²(crit) is based on df = (c - 1)

-tests the “goodness of fit” — how well the observed distribution

of frequencies corresponds to some theoretical or expected

frequency distribution, or that the obtained data fit with the

expected outcome of equal responses

-2 variations, both of which use the same original formula, but just

calculate the value for E differently

(i) E is equally distributed across categories

•compute E for each category, in which E is distributed

equally (the same value) across categories

(ex. 20, 20, 20 for nT = 60)

(ii) E is proportionally distributed across categories

•compute E for each category, in which E is based on a

proportion of the total nT

(ex. 200, 120, 80 for nT = 400)
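The two ways of computing E for the goodness of fit test can be sketched as follows. The equal split reproduces the 20, 20, 20 example. The proportions .5, .3, .2 are inferred from the 200, 120, 80 example (each value divided by nT = 400). The observed tallies and function names are hypothetical, and 5.991 is the tabled critical value for df = 2 at α = .05.

```python
CRIT_DF2_ALPHA_05 = 5.991  # tabled critical value of chi-square, df = 2, alpha = .05

def expected_equal(n_total, n_categories):
    """E when tallies are expected to be spread equally across categories."""
    return [n_total / n_categories] * n_categories

def expected_proportional(n_total, proportions):
    """E when each category is expected to hold a given proportion of nT."""
    return [n_total * p for p in proportions]

def chi_square_obs(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(expected_equal(60, 3))                          # equal variation, nT = 60
print(expected_proportional(400, [0.5, 0.3, 0.2]))    # proportional variation, nT = 400

observed = [210, 125, 65]                             # hypothetical tallies, nT = 400
expected = expected_proportional(sum(observed), [0.5, 0.3, 0.2])
chi2 = chi_square_obs(observed, expected)
df = len(observed) - 1                                # df = (c - 1)

print(f"chi2({df}, n = {sum(observed)}) = {chi2:.2f}, "
      f"exceeds critical: {chi2 > CRIT_DF2_ALPHA_05}")
```

Both variations feed the same 𝜒² formula; only the list of expected values changes.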

(c) 2 x 2 (or larger) contingency table — k ≥ 2, c ≥ 2, nT ≥ 20 or E ≥ 5

-responses are recorded on 2 categorical variables

-tests whether the distribution of values across one variable is not

contingent on the level of the second variable

-2 variations, both of which use the same original formula, but

calculate E for each category from the marginal values

(i) 𝜒² test of independence: a single-sample test of

contingency between 2 categorical variables

•take 1 sample from your hypothesized population &

examine it on 2 different variables

•do not know marginal values until after frequencies are

tallied & categorized

•if the 2 variables are independent, then the distribution

of the frequencies among the categories in any

particular column should be in proportion to the

distribution of the frequencies among the rows overall

df = (# of rows - 1)(# of columns - 1)
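For a contingency table, E for each cell comes from the marginal values. The notes do not give the formula, so the sketch below assumes the standard one: E = (row total × column total) / nT. The 2 x 2 table of tallies and the function names are hypothetical.

```python
def expected_from_marginals(table):
    """E for each cell = (row total * column total) / nT."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n_total = sum(row_totals)
    return [[r * c / n_total for c in col_totals] for r in row_totals]

def chi_square_contingency(table):
    """Return (chi2, df, expected) for a table of observed tallies."""
    expected = expected_from_marginals(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1)(columns - 1)
    return chi2, df, expected

# Hypothetical tallies: rows = levels of variable 1, columns = levels of variable 2
table = [[30, 10],
         [20, 40]]

chi2, df, expected = chi_square_contingency(table)
n_total = sum(sum(row) for row in table)
print("expected:", expected)
print(f"chi2({df}, n = {n_total}) = {chi2:.2f}")
```

If the two variables were independent, the observed tallies in each cell would sit close to these expected values and 𝜒²(obs) would be small.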

PSYC 300B - Chapter 12: Analysis of Frequency Data
