STAT 2060 Lecture Notes - Lecture 2: Box Plot, Bias Of An Estimator, Unimodality
STAT*2060: Statistics for Business Decisions
Week 2 Lectures
1 Measures of Central Tendency
The most common measures of central tendency are the mean, median, and mode.
The mean is the average of all observations.
When we are working with a sample of n observations x_1, ..., x_n, the sample mean is denoted x̄, where:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
EXAMPLE: Suppose we have the numbers -2, 4, 8, 13, 19. We can calculate the sample mean as:
$$\bar{x} = \frac{-2 + 4 + 8 + 13 + 19}{5} = \frac{42}{5} = 8.4$$
x̄ is a sample mean, but it is often used to estimate the population mean, µ (pronounced "mew").
The mean uses all data points in its calculations (which is good!), but is therefore sensitive to extreme
values (which is bad!).
EXAMPLE: Suppose we have the numbers -2, 4, 8, 13, 190. The sample mean is:
$$\bar{x} = \frac{-2 + 4 + 8 + 13 + 190}{5} = \frac{213}{5} = 42.6$$
The median is the middle value when the data is ordered smallest to largest.
If n is odd, the median is the middle value. If n is even, the median is the average of the two middle values.
EXAMPLE: Suppose we have the numbers 2, -5, 18, 9, 0. We can determine the median as:
Average of all data values, e.g. (a + b + c)/3 for three values
Check out calculator
x̄ is used to estimate µ
x̄ is a statistic
The value 190 pulled the mean toward it, from 8.4 to 42.6.
Thus, extreme values in a data set can result in a mean that is not reflective of most of the data.
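The effect of a single extreme value on the sample mean can be checked numerically. The following is a minimal sketch in Python; the function name `sample_mean` and the variable names are chosen here for illustration and are not part of the lecture.

```python
def sample_mean(values):
    """Sample mean: the sum of the observations divided by how many there are."""
    return sum(values) / len(values)

original = [-2, 4, 8, 13, 19]
with_outlier = [-2, 4, 8, 13, 190]  # same data, but 19 replaced by 190

mean_original = sample_mean(original)
mean_outlier = sample_mean(with_outlier)

print(f"mean without extreme value: {mean_original:.1f}")
print(f"mean with extreme value:    {mean_outlier:.1f}")
```

Four of the five values are unchanged, yet the mean moves well above all of them.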
Ordered: -5, 0, 2, 9, 18. Therefore, 2 is the median.
EXAMPLE: Now suppose we have the numbers 2, -5, 18, 9, 0, 11. We can now determine the median as:
The median only uses the values of the middle number(s) in its calculations (which is o.k.), but is
therefore less sensitive to extreme values (which is good!).
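The odd/even rule for the median can be written out as a short function. This is a sketch only; the name `median_by_rule` is hypothetical, and Python's standard library already offers `statistics.median` for the same purpose.

```python
def median_by_rule(values):
    """Median via the rule: middle value if n is odd,
    average of the two middle values if n is even."""
    ordered = sorted(values)   # order smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:             # n is odd: single middle value
        return ordered[mid]
    # n is even: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_result = median_by_rule([2, -5, 18, 9, 0])        # n = 5
even_result = median_by_rule([2, -5, 18, 9, 0, 11])   # n = 6

print(odd_result, even_result)
```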
EXAMPLE: Consider the values from above, which were 2, -5, 18, 9, 0. The median was found to be
2, and the mean we can calculate to be 4.8. Suppose the value of 9 is changed to 90. The new mean
and median are now 21 and 2, respectively.
The mode is the most frequently occurring observation. It is possible to have one (unimodal), two
(bimodal), or even more (multimodal) modes in a single set of data.
EXAMPLE: Suppose we have the numbers 3, 4, 5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 9. We can determine the
mode as:
The mode is not necessarily that useful, and can be misleading!
Ordered: -5, 0, 2, 9, 11, 18. Therefore, the median is (2 + 9)/2 = 11/2 = 5.5.
Therefore, the median is not influenced, whereas the mean changes significantly.
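The comparison between the two measures can be reproduced with Python's standard `statistics` module. This is a small sketch using the data from the example (9 changed to 90); the variable names are illustrative.

```python
from statistics import mean, median

before = [2, -5, 18, 9, 0]
after = [2, -5, 18, 90, 0]   # the value 9 changed to 90

mean_before, median_before = mean(before), median(before)
mean_after, median_after = mean(after), median(after)

print("before: mean =", mean_before, " median =", median_before)
print("after:  mean =", mean_after, " median =", median_after)
```

The median stays at the middle ordered value, while the mean follows the changed observation.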
The mode is the value occurring most often; in this case it is 6.
Bimodal: 3, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8. Both 5 and 6 occur equally often (four times each), so this is a bimodal sample.
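Both mode examples can be checked with `statistics.multimode`, which returns every value tied for the highest count. This sketch assumes Python 3.8 or later, where `multimode` was introduced; the variable names are illustrative.

```python
from statistics import multimode

unimodal = [3, 4, 5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 9]
bimodal = [3, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 8]

# multimode returns a list of all most-frequent values,
# in the order they first appear in the data
modes_unimodal = multimode(unimodal)
modes_bimodal = multimode(bimodal)

print(modes_unimodal)
print(modes_bimodal)
```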
We can make general conclusions about the location of the mean and median (and even the mode) by
examining the shape of the distribution of the data:
The mode does not imply a majority; it is just the value that appears most frequently.
Symmetric: mean and median are roughly the same
Median: roughly 5
Mean: roughly 5
Left-skewed (asymmetric): mean less than median
Median: roughly 4.8
Right-skewed (asymmetric): mean greater than median
Mean and median roughly equal
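The link between shape and the relative position of the mean and median can be illustrated on small data sets. The three samples below are hypothetical, made up for this sketch and not taken from the lecture's figures; the pattern shown is the typical one, not a guarantee for every data set.

```python
from statistics import mean, median

# Hypothetical samples chosen only to illustrate each shape
symmetric = [1, 3, 4, 5, 5, 5, 6, 7, 9]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10, 20]   # long tail of large values
left_skewed = [-x for x in right_skewed]        # mirror image: long tail of small values

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name:>12}: mean = {mean(data):.2f}, median = {median(data)}")
```

In the symmetric sample the two measures coincide; in the skewed samples the mean is pulled toward the long tail, while the median stays near the bulk of the data.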