# PSYB07 TUTORIAL NOTES

Population vs. Sample
Population: the entire collection of events or participants that the experiment is about
Sample: a small, representative subset of the population that is actually used in the experiment
Random sampling: every individual in the population should have an equal chance of being chosen, so that there are no systematic differences between the sample and the population (do your best to make the sample random).
*random sampling supports external validity
Random assignment: each individual in the sample is assigned to one of the conditions of the study by chance
*random assignment supports internal validity
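The difference between random sampling and random assignment can be sketched in a few lines of Python. This is only an illustration: the population of 100 participant IDs, the sample size of 20, and the fixed seed are assumptions, and the condition labels are borrowed from the drug example later in these notes.

```python
import random

# Hypothetical population of 100 participant IDs (not from the notes).
population = list(range(1, 101))

# Fixed seed only so that the sketch gives the same result on every run.
rng = random.Random(0)

# Random sampling: every individual has an equal chance of being chosen.
sample = rng.sample(population, k=20)

# Random assignment: shuffle the sample, then deal participants into the
# conditions in turn, so chance alone decides who gets which condition.
conditions = ["No Drug", "Drug A", "Drug B", "Drug C"]
shuffled = sample[:]
rng.shuffle(shuffled)
assignment = {
    condition: shuffled[i::len(conditions)]
    for i, condition in enumerate(conditions)
}

for condition, members in assignment.items():
    print(condition, sorted(members))
```

Sampling decides who enters the study at all; assignment decides which condition each sampled person receives.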
-when conducting an experiment, there are two kinds of variables: independent and dependent
Independent Variable: something in our control that we can manipulate
Dependent Variable: depends on the independent variable (the result of the study)
e.g. effect of drug on memory
Conditions: No Drug, Drug A, Drug B, Drug C
Independent variables: selection of drug, gender
Dependent variable: memory scores
-greater memory in the Drug C group
-in general, males have higher memory scores than females
Discrete Variable: can take on only a limited number of values
-the independent variables in the drug and memory experiment are discrete (only four options for drug and two options for gender)
Continuous Variable: can take on any value within its range (not restricted to particular values)
Categorical Data: observations are divided into categories
-drug and gender in the drug and memory experiment are categorical
Measurement Data: observations are measured against some scale
-therefore, memory scores are continuous (can take on any value) and measurement data
Summation Notation
1. $\sum x = x_1 + x_2 + x_3 + \dots + x_n$
2. $\sum x^2 = x_1^2 + x_2^2 + x_3^2 + \dots + x_n^2$
3. $(\sum x)^2 = (x_1 + x_2 + x_3 + \dots + x_n)^2$
4. $\sum (x - y^2)$: square y first, subtract it from x, and add it all up
5. $\sum (x - y)^2$: subtract first, then square each difference, and add it all up
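The order of operations in these five expressions is easy to mix up, so here is a minimal Python sketch that evaluates each one. The paired scores `x` and `y` are hypothetical values chosen only for illustration.

```python
# Hypothetical paired scores, chosen only for illustration.
x = [1, 2, 3]
y = [2, 1, 0]

# 1. Sum of x: add the scores.
sum_x = sum(x)                                              # 1 + 2 + 3 = 6

# 2. Sum of squared x: square each score, then add.
sum_x_squared = sum(v ** 2 for v in x)                      # 1 + 4 + 9 = 14

# 3. Squared sum of x: add the scores, then square the total.
squared_sum_x = sum(x) ** 2                                 # 6 ** 2 = 36

# 4. Sum of (x - y^2): square y, subtract it from x, then add.
sum_x_minus_y_sq = sum(a - b ** 2 for a, b in zip(x, y))    # -3 + 1 + 3 = 1

# 5. Sum of (x - y)^2: subtract, square each difference, then add.
sum_sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))       # 1 + 1 + 9 = 11

print(sum_x, sum_x_squared, squared_sum_x, sum_x_minus_y_sq, sum_sq_diff)
```

Note that expressions 2 and 3 give different answers (14 versus 36) for the same scores, which is why the position of the square matters.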
Measures of Central Tendency
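Mean: $\bar{y} = \dfrac{\sum y}{n}$
so $\sum y = n \cdot \bar{y}$
Grand mean: $\dfrac{\sum n_j \bar{y}_j}{\sum n_j}$, where $n_j$ is the size and $\bar{y}_j$ is the mean of group j
*this is used when you only have each group's size and mean, not the actual data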
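As a rough illustration of the grand mean, the sketch below weights each group mean by its group size. The group sizes and means are hypothetical numbers, not data from the notes.

```python
# Hypothetical group summaries: sizes n_j and means ybar_j (not from the notes).
group_sizes = [10, 20, 30]
group_means = [4.0, 5.0, 6.0]

# n_j * ybar_j recovers each group's total, because sum(y) = n * ybar.
weighted_total = sum(n * m for n, m in zip(group_sizes, group_means))

# Grand mean: total of all scores divided by total number of scores.
grand_mean = weighted_total / sum(group_sizes)

# Simply averaging the group means ignores the group sizes.
unweighted_mean = sum(group_means) / len(group_means)

print(grand_mean)        # 320 / 60, about 5.33
print(unweighted_mean)   # 5.0
```

The two results differ whenever the groups are of unequal size, because larger groups contribute more scores to the total.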
Set of Scores: 1, 2, 5, 5, 5, 7, 8, 8, 9, 9, 10, 10 (n = 12)
Median location = (n + 1)/2 = (12 + 1)/2 = 6.5th position (the median lies halfway between the 6th and 7th scores, which are 7 and 8)
Median: 7.5
Mode: 5
-Median and mode are resistant to outliers (outliers draw the mean towards themselves).
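The median location rule and the effect of an outlier can be checked with Python's standard `statistics` module. This sketch uses the twelve scores for which the notes compute a median location of 6.5; the extra score of 100 is a hypothetical outlier added only for the comparison.

```python
import statistics

# The twelve scores matching the n = 12 median calculation in the notes.
scores = [1, 2, 5, 5, 5, 7, 8, 8, 9, 9, 10, 10]

n = len(scores)
median_location = (n + 1) / 2            # 6.5: halfway between 6th and 7th
median = statistics.median(scores)       # average of the 6th and 7th scores
mode = statistics.mode(scores)           # most frequent score
mean = statistics.mean(scores)

# Hypothetical outlier, added only to show which measures move.
with_outlier = scores + [100]
median_out = statistics.median(with_outlier)
mode_out = statistics.mode(with_outlier)
mean_out = statistics.mean(with_outlier)

print("without outlier:", mean, median, mode)
print("with outlier:   ", mean_out, median_out, mode_out)
```

With the outlier included, the mean roughly doubles while the median moves by only half a point and the mode does not move at all.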
Measures of Spread
Range: the difference between the lowest score and the highest score
-sensitive to outliers
Interquartile Range: the range of the middle 50% of the scores (the first and last 25% of the scores are removed)
e.g. 1, 2, 5, 5, 5 | 7, 8, 8, 9, 9 | 10, 10, 11, 12, 15 | 16, 16, 16, 17, 20
(the three bars mark Q1, the median, and Q3, in that order)
Range: 20 – 1 = 19
Median: 9.5
Q1: 6, Q3: 15.5
Interquartile Range (IQR): Q3 – Q1 = 15.5 – 6 = 9.5
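The quartile example above can be reproduced with a short Python sketch. It assumes the "median of each half" convention that the example appears to use (Q1 is the median of the lower half, Q3 the median of the upper half); this is an assumption about the method, not something the notes state.

```python
import statistics

scores = [1, 2, 5, 5, 5, 7, 8, 8, 9, 9,
          10, 10, 11, 12, 15, 16, 16, 16, 17, 20]

ordered = sorted(scores)
half = len(ordered) // 2

# Lower and upper halves of the ordered scores
# (for an odd number of scores this leaves out the middle score).
lower, upper = ordered[:half], ordered[-half:]

score_range = ordered[-1] - ordered[0]   # highest minus lowest
median = statistics.median(ordered)
q1 = statistics.median(lower)            # median of the lower half
q3 = statistics.median(upper)            # median of the upper half
iqr = q3 - q1

print(score_range, median, q1, q3, iqr)
```

Library routines such as `statistics.quantiles` may use a different interpolation convention and can return slightly different quartiles for the same scores.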
Variance
Variance: the average of the squared deviations from the mean
*the deviations are squared so that the positive and negative deviations do not cancel out to zero when you add them up
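| $y$ | $y - \bar{y}$ | $(y - \bar{y})^2$ |
|---|---|---|
| 1 | -5 | 25 |
| 1 | -5 | 25 |
| 2 | -4 | 16 |
| 5 | -1 | 1 |
| 5 | -1 | 1 |
| 5 | -1 | 1 |
| 7 | 1 | 1 |
| 8 | 2 | 4 |
| 8 | 2 | 4 |
| 8 | 2 | 4 |
| 9 | 3 | 9 |
| 9 | 3 | 9 |
| 10 | 4 | 16 |
| $\bar{y} = 6$ | $\sum (y - \bar{y}) = 0$ | $\sum (y - \bar{y})^2 = 116$ |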
Population variance: $\sigma^2 = \dfrac{\sum (y - \bar{y})^2}{n}$
-this is used only when you have the whole population or when you don't need to generalize from the sample to the population
Sample variance: $s^2 = \dfrac{\sum (y - \bar{y})^2}{n - 1}$
-this is used when you have a sample and want to generalize to the population
*assume that a question is asking for the sample variance unless it explicitly says population
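To see the two formulas side by side, here is a minimal Python sketch that applies them to the thirteen scores from the deviation table and cross-checks the results against the standard `statistics` module. The variable names are illustrative only.

```python
import statistics

# The thirteen scores from the deviation table.
scores = [1, 1, 2, 5, 5, 5, 7, 8, 8, 8, 9, 9, 10]

n = len(scores)
mean = sum(scores) / n                               # 78 / 13 = 6

deviations = [y - mean for y in scores]
sum_of_deviations = sum(deviations)                  # always 0
sum_of_squares = sum(d ** 2 for d in deviations)     # 116

population_variance = sum_of_squares / n             # divide by n
sample_variance = sum_of_squares / (n - 1)           # divide by n - 1

# Cross-check against the standard library.
assert abs(population_variance - statistics.pvariance(scores)) < 1e-9
assert abs(sample_variance - statistics.variance(scores)) < 1e-9

print(round(population_variance, 2), round(sample_variance, 2))
```

Dividing the same sum of squares by n - 1 instead of n always gives the larger value, here 116/12 rather than 116/13.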
Why is n-1 used instead of n for the sample variance?
The sample variance loses a degree of freedom because the deviations are taken from the sample mean, which is itself calculated from the same scores. If n is used as the denominator, the result tends to underestimate the population variance (it comes out smaller than the population value). Dividing by n-1 compensates for this: the value of the sample variance increases when the denominator is smaller, as it is with n-1.
Histograms
Scores: 9, 9, 5, 8, 7, 2, 2, 8, 5, 10, 6, 9
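The frequency, cumulative frequency, and relative frequency for these scores can be tallied with a short Python sketch. It assumes the raw-score scale starts at 0 and runs to the highest observed score, and the names `rows` and `cumulative` are illustrative only.

```python
from collections import Counter

scores = [9, 9, 5, 8, 7, 2, 2, 8, 5, 10, 6, 9]

counts = Counter(scores)   # frequency of each raw score
n = len(scores)

rows = []
cumulative = 0
for raw in range(0, max(scores) + 1):
    freq = counts[raw]             # Counter gives 0 for scores never observed
    cumulative += freq             # running total of the frequencies so far
    rows.append((raw, freq, cumulative, freq / n))

print("raw  freq  cum  rel")
for raw, freq, cum, rel in rows:
    print(f"{raw:>3}  {freq:>4}  {cum:>3}  {freq}/{n}")
```

The cumulative frequency in the last row always equals n, which is a quick check that no score was missed.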
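| Raw Score | Frequency | Cumulative Frequency | Cumulative Frequency (running sum) | Relative Frequency |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 + 0 | 0 |
| 2 | 2 | 2 | 0 + 0 + 2 | 2/12 |
| 3 | 0 | 2 | 0 + 0 + 2 + 0 | 0 |
| 4 | 0 | 2 | 0 + 0 + 2 + 0 + 0 | 0 |
| 5 | 2 | 4 | 0 + 0 + 2 + 0 + 0 + 2 | 2/12 |
| 6 | 1 | 5 | 0 + 0 + 2 + 0 + 0 + 2 + 1 | 1/12 |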
7 1 6 0 + 0 +

