Measure of center, measure of spread, variance, standard deviation, interquartile range, boxplots

24 August
A parameter is a value that pertains to the population
Denoted by a Greek letter
A statistic is a value that pertains to a sample
Denoted by a Roman letter
Numbers to describe data
Measure of center
For numeric variables:
Mean (μ for parameter, $\bar{x}$ for statistic)
Median
Mode
If histogram is symmetric, mean ≈ median ≈ mode
If histogram is right skewed, typically mean > median > mode
If histogram is left skewed, typically mean < median < mode
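A quick way to see the right-skewed ordering is to compute all three measures on a small data set. The sketch below uses Python's standard `statistics` module; the data values are made up for illustration and are not from the lecture.

```python
import statistics

# Hypothetical right-skewed data: most values are small, a few are large.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mean = statistics.mean(data)      # sum of values divided by how many there are
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)
```

The few large values pull the mean up, while the median and mode stay near the bulk of the data, so here mean > median > mode.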
For categorical variables: proportion of “successes”
π for parameter; $\hat{p}$ for statistic
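As a small illustration, the sample proportion is the number of "successes" divided by the sample size. The responses below are hypothetical, and treating "yes" as the success category is an assumption made only for this sketch.

```python
# Hypothetical categorical sample; "yes" is treated as a success.
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]

successes = sum(1 for r in responses if r == "yes")
p_hat = successes / len(responses)  # sample proportion (a statistic)

print("successes:", successes, "n:", len(responses), "p-hat:", p_hat)
```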
We still need a measure of spread: knowing only where the center of the data is
does not give us all the information we need.
Measures of spread – provide a measure of uncertainty
Range = max – min
Deviation from sample mean
How far a particular observation is from the sample mean
$x_i - \bar{x}$
$x_i$ = the ith value from the data set
A measure of spread would need to include all deviations
Can’t take average of deviations, as $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$
Say $x_1 = 3$, $x_2 = 4$, $x_3 = 6$
$\sum_{i=1}^{3} x_i = 3 + 4 + 6$
Σ indicates a summation
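Using the three values above (3, 4, 6), the claim that the deviations from the sample mean sum to zero can be checked directly. This is a minimal sketch; it uses exact fractions so that rounding does not hide the result, and the variable names are chosen here for illustration.

```python
from fractions import Fraction

x = [3, 4, 6]
n = len(x)

# Sample mean as an exact fraction: (3 + 4 + 6) / 3 = 13/3
x_bar = Fraction(sum(x), n)

# Deviation of each observation from the sample mean
deviations = [xi - x_bar for xi in x]
total = sum(deviations)

print("mean:", x_bar)
print("deviations:", [str(d) for d in deviations])
print("sum of deviations:", total)
```

The negative and positive deviations cancel exactly, which is why a plain average of deviations cannot serve as a measure of spread.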
Variance
Average squared deviation from the mean
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
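Squaring the deviations removes the cancellation, so the sum is no longer zero. The sketch below works through the values 3, 4, 6, assuming the usual sample-variance divisor of n − 1, and compares the result with `statistics.variance` from the standard library, which uses the same divisor.

```python
import statistics
from fractions import Fraction

x = [3, 4, 6]
n = len(x)
x_bar = Fraction(sum(x), n)  # sample mean, 13/3

# Squared deviations from the sample mean
squared_devs = [(xi - x_bar) ** 2 for xi in x]

# Sample variance, assuming the n - 1 divisor
s2 = sum(squared_devs) / (n - 1)

print("squared deviations:", [str(d) for d in squared_devs])
print("sample variance:", s2, "=", float(s2))
print("statistics.variance:", statistics.variance(x))
```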

