
# PSYC 3000 Midterm: Cheat Sheet (Midterm #1)

Department
Psychology
Course Code
PSYC 3000
Professor
Bruce Hutcheon
Study Guide
Midterm

Line Plots: values plotted against time or sequence number.
Show all data points in the order collected. Quickly reveal a
single outlier or a group of outliers. Detect sudden changes in
mean and variance. Data must be plotted in the order collected.
Scatter Plots: shows all data points. 2 variables for same
participant. Reveals unusual relationships between variables
rather than unusual values.
Histograms: Do not show individual data points, only processed
(binned) data. Used to visually assess symmetry, center, etc.
Boxplots: The upper quartile is the upper hinge (Q3).
The lower quartile is the lower hinge (Q1).
The median is Q2.
The extremes shown are known as fences.
**Note that fences are not necessarily marked by the ends of
the whiskers.**
Dots are outliers, while * is an extreme outlier.
About 99% of all data lies between the fences.
The width of the box indicates dispersion, while the placement
of Q1, Q2, and Q3 indicates symmetry.
Fences can also indicate symmetry. However, you can never
assume the shape of a distribution based on the shape of a
boxplot.
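The notes do not say how the fences are placed. As a sketch only, the
code below assumes Tukey's common convention (inner fences 1.5 x IQR
beyond the hinges, outer fences 3 x IQR beyond them); the function
name `classify` and the example values are hypothetical.

```python
def classify(x, q1, q3):
    """Label a value relative to boxplot fences (assumed Tukey rule)."""
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed inner fences
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr  # assumed outer fences
    if x < outer_low or x > outer_high:
        return "extreme outlier"   # drawn as * in the notes
    if x < inner_low or x > inner_high:
        return "outlier"           # drawn as a dot in the notes
    return "inside fences"

# Hypothetical hinges Q1 = 4 and Q3 = 9, so IQR = 5.
labels = [classify(x, 4, 9) for x in (7, 18, 40)]
print(labels)
```

With these hinges the inner fences sit at -3.5 and 16.5 and the outer
fences at -11 and 24, so 18 counts as an outlier and 40 as an extreme
outlier.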
Mean: Interval or ratio data. Mathematically easy to use but is
strongly affected by extreme scores.
Median: Ordinal data. Resistant to extreme scores but not easy
to use for analyses.
Mode: Nominal data. Useful even for categorical data but is
strongly affected by sampling fluctuations.
Variance: average squared deviation from mean. Easy to use
for analyses but affected by extreme scores.
Standard Deviation: square root of variance. Intuitive (same
units as the data) but has the same drawbacks as the variance.
Semi-interquartile range: (Q3-Q1)/2. Not affected by extreme
scores but affected by sampling fluctuations.
IQR: Interquartile range. Q3-Q1, space between upper and
lower hinge.
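As a rough illustration of the measures above, here is a small Python
sketch using only the standard library. The `scores` list is made-up
data, and `statistics.quantiles` uses its own default quartile rule,
which may differ slightly from the hinges a package such as SPSS
reports.

```python
import statistics

# Hypothetical scores; 40 is a deliberately extreme value.
scores = [2, 4, 4, 5, 7, 9, 40]

mean = statistics.mean(scores)      # pulled upward by the 40
median = statistics.median(scores)  # resistant to the 40
mode = statistics.mode(scores)      # most frequent value

# "Average squared deviation from the mean" (divides by n).
variance = statistics.pvariance(scores)
sd = statistics.pstdev(scores)      # square root of the variance

q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartiles, default rule
iqr = q3 - q1
semi_iqr = iqr / 2

print(mean, median, mode, variance, sd, iqr, semi_iqr)
```

Note how far the mean sits above the median here because of the one
extreme score. For a sample estimate that divides by n-1, the standard
library offers `statistics.variance` and `statistics.stdev` instead.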
Error Bars: Indicate ±1 SD or SE, unless indicated otherwise.
Can be used to compare distributions.
Hypothesis Testing:
1. Decide what stat you are interested in.
2. Make a hypothesis about the nature of the population the
sample is from.
3. Construct the sampling dist.
4. Find the observed value of the stat.
5. See where the observed value fits into the sampling dist.
Test of the Mean: 1. Hypothesis about the pop. mean µ.
2. “Sampling distribution of the mean”
3. Stat calc. in sample is sample mean, M
**We could do the same for skew, variance, etc. just sub in the
statistic you want to use**
p-values: When the observed value is placed in the sampling
distribution, the p-value is the amount of area cut off in the
tail(s) beyond it. Note that p-values are very different from
alpha levels: alpha is the cutoff chosen in advance.
Normal Distributions: In every Normal distribution, the most
extreme 5% of values (2-tailed) lie 1.96 SD or more from the
mean. The most extreme 1% of values lie more than ±2.58
standard deviations from the mean. For normal population
distributions, the sampling dist. of the mean is also normal,
so the test statistic is z.
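The two cutoffs quoted above can be checked with Python's standard
library. This is only a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Two-tailed cutoffs: split the extreme area evenly between both tails.
cut_05 = std_normal.inv_cdf(1 - 0.05 / 2)  # most extreme 5%
cut_01 = std_normal.inv_cdf(1 - 0.01 / 2)  # most extreme 1%

print(round(cut_05, 2), round(cut_01, 2))
```

Rounded to two decimals these come out to the 1.96 and 2.58 used
throughout these notes.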
Expected Value: average of a stat in its sampling distribution,
written <statistic>.
Standard Error: SD of a stat within its sampling distribution,
written θ with the statistic as a subscript (e.g. θM).
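Generic formula: how many SEs the statistic lies from its
expected value.
$$\text{test statistic} = \frac{\text{statistic} - \langle \text{statistic} \rangle}{\theta_{\text{statistic}}}$$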
**This yields a TEST STATISTIC. Only if the sampling distribution
is normal is it called z. (θ is SD)**
Test of the Sample Mean: If a population is normal, so is the
sampling distribution of the mean. If the population has mean µ
and standard deviation θX, then the sampling distribution of the
mean has mean µ and standard deviation θM = θX/√n.
θX = pop. SD, θM = SE, or SD of the sampling distribution.
Tests of Normality: Normal distributions have a skew AND
kurtosis of zero. If either is off, it is NOT normal!!
Test of sample Skew:
Once you get a z value, you can see if the skew is significantly
different from zero using p-values. (Remember ±1.96.)
Test of Sample Kurtosis:
Same as skew, see if kurtosis is significantly different from 0.
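A minimal sketch of both normality tests, using the large-sample
standard errors √(6/n) for skew and √(24/n) for kurtosis. The notes do
not say how skew and kurtosis themselves are computed, so the simple
moment-based definitions below (with kurtosis reported as excess
kurtosis, which is 0 for a normal distribution) are an assumption;
SPSS applies small-sample corrections and may give slightly different
values. The function name `normality_z` is hypothetical.

```python
import math

def normality_z(data):
    """Return (skew, kurtosis, z_skew, z_kurtosis) for a list of numbers."""
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5          # assumed moment-based skew
    kurtosis = m4 / m2 ** 2 - 3    # assumed excess kurtosis
    # (observed - expected) / standard error, with expected value 0.
    z_skew = (skew - 0) / math.sqrt(6 / n)
    z_kurtosis = (kurtosis - 0) / math.sqrt(24 / n)
    return skew, kurtosis, z_skew, z_kurtosis

# Hypothetical, perfectly symmetric data: skew should be 0.
skew, kurtosis, z_skew, z_kurtosis = normality_z([1, 2, 3, 4, 5])
print(skew, kurtosis, z_skew, z_kurtosis)
```

Either z value beyond ±1.96 would suggest a significant departure
from normality at the 5% level.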
Central Limit Theorem: With a sample size of about 30 or larger,
the sampling distribution of the mean is approximately normal,
so it is safe to assume normality for tests of the mean.
General z-test: tells you how many standard errors separate an
observed sample mean from its expected value. Can only be used
with a normal sampling distribution.
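The z-test can be sketched in a few lines of Python. The function name
`z_test` and the example numbers are hypothetical; the calculation is
the observed mean minus its expected value, divided by the standard
error θX/√n, with a two-tailed p-value from the normal curve.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu, pop_sd, n):
    """Return (z, two-tailed p) for a sample mean with known population SD."""
    se = pop_sd / math.sqrt(n)        # standard error of the mean
    z = (sample_mean - mu) / se       # number of SEs from the expected value
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # area cut off in both tails
    return z, p

# Hypothetical example: M = 105, mu = 100, population SD = 15, n = 36.
z, p = z_test(105, 100, 15, 36)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Here the standard error is 2.5, so z = 2.00, which is just past the
±1.96 cutoff.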
Single sample t-tests: Often we don’t know the value of the
population standard deviation ahead of time. In that case we
move on to t-tests, where the population standard deviation is
estimated from the sample.
**Remember s is sample SD, θ is pop. SD. We can still use the
cut off value of 1.96 as long as n>30. ONLY FOR NORMAL DIST**
Degrees of Freedom: When the sample size is n>30, you can use
the same cut offs as z, but if it is not you must use df. We must
report df with our t-value, e.g. t(17)=0.22. You can look up your
t-values on a table to see if they are significant at a given
alpha.
**For a single sample test df = n-1**
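A minimal sketch of the single sample t-test, with made-up data and a
hypothetical function name `one_sample_t`. The standard library has no
t distribution, so the code stops at t and df; significance would
still be looked up in a t table.

```python
import math
import statistics

def one_sample_t(data, mu):
    """Return (t, df) for a single sample tested against population mean mu."""
    n = len(data)
    m = statistics.mean(data)     # sample mean M
    s = statistics.stdev(data)    # sample SD s (divides by n - 1)
    s_m = s / math.sqrt(n)        # estimated standard error of the mean
    t = (m - mu) / s_m
    df = n - 1
    return t, df

# Hypothetical scores tested against mu = 6.
t, df = one_sample_t([5, 7, 8, 6, 9], 6)
print(f"t({df}) = {t:.2f}")
```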
Independent Samples t-test: There are two samples, and even
if we know the value of an observation in one of the samples, it
tells us nothing about the value of any of the observations in the
other sample.
For this test, the hypothesis is that there are not really 2
populations here; there is only one, and both samples are drawn
from it.
Sometimes, these two samples can be a control and a treatment.
** Remember <M1-M2> is zero
** For samples with n<30, df=(n1-1)+(n2-1)
** Homogeneity of variance is also needed for this test to work
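As a sketch of the independent samples test, the code below uses the
unpooled standard error √(s1²/n1 + s2²/n2) with the df given above.
The data and the function name `independent_t` are hypothetical.

```python
import math
import statistics

def independent_t(sample1, sample2):
    """Return (t, df) for two independent samples (variances not pooled)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    var1 = statistics.variance(sample1)   # sample variance, divides by n - 1
    var2 = statistics.variance(sample2)
    se_diff = math.sqrt(var1 / n1 + var2 / n2)  # SE of the difference in means
    t = ((m1 - m2) - 0) / se_diff         # expected difference <M1-M2> is zero
    df = (n1 - 1) + (n2 - 1)
    return t, df

# Hypothetical treatment and control scores.
t, df = independent_t([5, 7, 8, 6, 9], [3, 4, 6, 5, 2])
print(f"t({df}) = {t:.2f}")
```

In this example both sample variances are 2.5, so the standard error
of the difference is exactly 1 and t equals the difference in means.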
Levene’s Test for Homogeneity of Variance: tests the null
hypothesis that 2 or more samples come from populations
having the same variance. Looks like this when reported:
F(1,19) = 1.33, where 1 = # samples-1 and 19 = df for the
2 sample test.
SPSS gives the p-value (significance) as well as the F value, so
we do not need to calculate this.
Pooling Variance: If Levene’s test is passed, it means the
samples came from populations with the same variance. In a
t-test the sample variances function as estimates of the
population variance. But we have 2 samples and so 2 estimates.
So which estimate are we supposed to believe?
In statistics, whenever you have multiple estimates of exactly the
same thing you should pool them together to make a single,
higher-quality estimate.
S-pooled is THE BEST estimate of population SD. So, we can
use it instead of sx in the t-test.
**If Levene’s test is not passed, do not pool the variance! Just
use the other t-test for independent samples above.**
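A sketch of the pooled version follows. The notes only require that
the pooling weights sum to 1; weighting each sample variance by its
share of the total degrees of freedom, as done here, is the usual
choice but is an assumption on my part. The data and the function name
`pooled_t` are hypothetical.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Return (t, df, pooled_variance) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df = (n1 - 1) + (n2 - 1)
    # Assumed weights: each sample's share of the total df (they sum to 1).
    w1, w2 = (n1 - 1) / df, (n2 - 1) / df
    pooled_var = (w1 * statistics.variance(sample1)
                  + w2 * statistics.variance(sample2))
    s_pooled = math.sqrt(pooled_var)
    se_diff = s_pooled * math.sqrt(1 / n1 + 1 / n2)  # SE of M1 - M2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se_diff
    return t, df, pooled_var

# Hypothetical samples of unequal size.
t, df, pooled_var = pooled_t([5, 7, 8, 6, 9], [3, 4, 6, 5, 2, 4])
print(f"t({df}) = {t:.2f}, pooled variance = {pooled_var:.3f}")
```

With unequal sample sizes the larger sample gets more weight, which is
why the pooled variance here lands closer to the second sample's
variance than to the first's.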
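Pooled variance (the weights sum to 1):
$$s^2_{pooled} = w_1 s_1^2 + w_2 s_2^2 + \dots + w_k s_k^2, \qquad \sum_{i=1}^{k} w_i = 1$$

Sampling distribution of the mean:
$$\theta_M = \frac{\theta_X}{\sqrt{n}}, \qquad \mu_M = \mu$$

Test of sample skew:
$$\theta_{skew} = \sqrt{\frac{6}{n}}, \qquad \langle skew \rangle = 0$$
$$z_{skew} = \frac{\text{observed} - \text{expected}}{\text{standard error}} = \frac{skew - \langle skew \rangle}{\theta_{skew}} = \frac{skew - 0}{\sqrt{6/n}}$$

Test of sample kurtosis:
$$\theta_{kurtosis} = \sqrt{\frac{24}{n}}$$
$$z_{kurtosis} = \frac{\text{observed} - \text{expected}}{\text{standard error}} = \frac{kurtosis - \langle kurtosis \rangle}{\theta_{kurtosis}} = \frac{kurtosis - 0}{\sqrt{24/n}}$$

General z-test (test of the sample mean):
$$z = \frac{M - \mu}{\theta_X / \sqrt{n}}$$

Single sample t-test:
$$t = \frac{M - \langle M \rangle}{s_M} = \frac{M - \mu}{s_X / \sqrt{n}}$$

Independent samples t-test:
$$t = \frac{(M_1 - M_2) - \langle M_1 - M_2 \rangle}{s_{M_1 - M_2}}, \qquad s_{M_1 - M_2} = \sqrt{\frac{s_{X_1}^2}{n_1} + \frac{s_{X_2}^2}{n_2}}$$

Independent samples t-test with pooled variance:
$$t = \frac{(M_1 - M_2) - \langle M_1 - M_2 \rangle}{s_{M_1 - M_2}}, \qquad s_{M_1 - M_2} = \sqrt{\frac{s_{pooled}^2}{n_1} + \frac{s_{pooled}^2}{n_2}} = s_{pooled} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$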