CAS MA 115 Lecture Notes - Lecture 3: Point Estimation, Frequency Distribution, Regional Policy Of The European Union
CHAPTER 3 – NUMERICALLY SUMMARIZING DATA
Section 3.1 – Measures of Central Tendency (what is happening to data on average)
Objective 1: Determine the Arithmetic Mean of a Variable from Raw Data
• Arithmetic Mean (of a variable) – computed by adding all the values of the variable in the
data set and dividing by the number of observations
o Population Arithmetic Mean (μ) – computed by using all the individuals in a
population
▪ Important to note that this is a parameter
▪ If $x_1, x_2, \ldots, x_N$ are the N observations of a variable from a population,
then the population mean is:
▪ $\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum x_i}{N}$
o Sample Arithmetic Mean (x̄) – computed by using sample data
▪ Important to note that this is a statistic
▪ If $x_1, x_2, \ldots, x_n$ are the n observations of a variable from a sample, then
the sample mean is:
▪ $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$
▪ Point Estimate – a single value used to estimate the population arithmetic
mean
• Not entirely accurate but a good base point
• Sample Size and Interval Length are interrelated in determining the
point estimate
• When to Use: When data is quantitative and the frequency distribution is roughly
symmetric
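A minimal Python sketch of the definition above: add the values, then divide by the number of observations. The quiz scores and the helper name arithmetic_mean are made up for illustration and do not come from the lecture.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)


# Hypothetical data: five quiz scores treated as the entire population.
population = [82, 77, 90, 71, 80]
mu = arithmetic_mean(population)   # population mean, a parameter

# A sample of three of those scores; its mean is a point estimate of mu.
sample = [82, 90, 71]
x_bar = arithmetic_mean(sample)    # sample mean, a statistic

print(mu, x_bar)
```

With these numbers the sample mean comes out close to, but not equal to, the population mean, which is what "not entirely accurate but a good base point" describes.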
Objective 2: Determine the Median of a Variable from Raw Data
• Median (of a variable) (M) – the value that lies in the middle of the data when arranged in
ascending order
o Steps to Find the Median:
▪ 1. Arrange the data in ascending order
▪ 2. Determine the number of observations (n)
▪ 3. Determine the observation in the middle of the data set
o If the number of observations is odd, the median is the observation in the
$\frac{n+1}{2}$ position
o If the number of observations is even, the median is the mean of the observations in the
$\frac{n}{2}$ and $\frac{n}{2} + 1$ positions
• When to Use: When data is quantitative and the frequency distribution is skewed
left/right due to outliers
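The three steps for finding the median can be sketched in Python as follows. The function name find_median and the sample data are assumptions for illustration. Positions in the notes count from 1, while Python lists index from 0, so the code shifts by one.

```python
def find_median(values):
    """Median by the three steps: sort, count, pick the middle."""
    data = sorted(values)          # step 1: ascending order
    n = len(data)                  # step 2: number of observations
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return data[mid]
    # Even n: average the observations in 1-based positions n/2 and n/2 + 1,
    # which are 0-based indices mid - 1 and mid.
    return (data[mid - 1] + data[mid]) / 2


odd_case = find_median([7, 3, 9, 1, 5])   # sorted: 1, 3, 5, 7, 9
even_case = find_median([7, 3, 9, 1])     # sorted: 1, 3, 7, 9
print(odd_case, even_case)
```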
Objective 3: Explain What It Means for a Statistic to be Resistant
• Resistant Numerical Summary of Data – a numerical summary is resistant if extreme values
(very large/small) relative to the data do not substantially affect its value (ex: resistant –
median, IQR v. non-resistant – mean, range, standard deviation, variance)
• Relation Between Mean/Median/Distribution Shape
o skewed left (smaller values) = mean is substantially smaller than the median
▪ The data set contains drastically smaller observations
o symmetric bell-shaped = mean is roughly equal to the median
o skewed right (larger values) = mean is substantially larger than the median
▪ The data set contains drastically larger observations
• An extreme value affects the mean more than the median (because the mean uses the size of
every individual value while the median depends only on the position of the middle value)
• When to Use: When data is skewed, use the median. When data is symmetric, use the
mean.
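A short Python sketch of resistance, using the standard library's statistics.mean and statistics.median. The data are hypothetical: the largest observation is replaced with an extreme value to see which summary moves.

```python
from statistics import mean, median

# Hypothetical data set, roughly symmetric.
original = [20, 22, 23, 25, 30]

# Same data with the largest value replaced by an extreme observation,
# which skews the distribution to the right.
with_outlier = [20, 22, 23, 25, 300]

mean_before, median_before = mean(original), median(original)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

The mean is pulled far toward the extreme value and ends up well above the median, while the median does not change.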
Objective 4: Determine the Mode of a Variable from Raw Data
• Mode (of a variable) – the most frequent observation of the variable that occurs in the
data set
o Data can have no/one/more than one mode
▪ ex: none of the numbers occur more than once = no mode
▪ ex: all of the numbers occur three times = all the numbers are the mode
• When to Use: When the most frequent observation is needed or if data is qualitative
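A possible Python sketch of finding the mode by counting frequencies. It follows the convention in these notes that data with no repeated value has no mode. The function name modes and the example data are assumptions for illustration.

```python
from collections import Counter


def modes(values):
    """Return a list of the most frequent observations (empty if none repeat)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Convention used in these notes: nothing repeats, so no mode.
        return []
    return [value for value, count in counts.items() if count == highest]


no_mode = modes([1, 2, 3])                   # no value repeats
one_mode = modes([4, 4, 5, 6])               # 4 occurs most often
two_modes = modes([1, 1, 2, 2, 3])           # 1 and 2 tie
qualitative = modes(["red", "blue", "red"])  # works for qualitative data too
print(no_mode, one_mode, two_modes, qualitative)
```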
Section 3.2 – Measures of Dispersion (how spread out the data are)
Objective 1: Determine the Range of a Variable from Raw Data
• Range (of a variable) (R) – the difference between the largest and smallest data values
o Formula = (largest data value – smallest data value)
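The range formula in Python, as a minimal sketch with hypothetical data. The helper name data_range is chosen for illustration and avoids shadowing Python's built-in range.

```python
def data_range(values):
    """Largest data value minus smallest data value."""
    if not values:
        raise ValueError("need at least one observation")
    return max(values) - min(values)


r_original = data_range([20, 22, 23, 25, 30])
r_with_extreme = data_range([20, 22, 23, 25, 300])
print(r_original, r_with_extreme)
```

Because the range uses only the two most extreme observations, changing a single one of them changes the result directly.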
Objective 2: Determine the Standard Deviation of a Variable from Raw Data
• Population Standard Deviation (of a variable) (σ) – the square root of the sum of squared
deviations about the population mean divided by the number of observations in the
population
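o Formula: $\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$
o Computational Formula: $\sigma = \sqrt{\frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}}$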
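A sketch of the population standard deviation in Python, computed two ways: directly from the verbal definition (squared deviations about the population mean) and with the usual computational shortcut based on the sum of the values and the sum of their squares. The function names and data are hypothetical, and the shortcut is the standard textbook form, assumed here to be the one the notes intended.

```python
import math


def pop_std_definition(values):
    """Square root of the mean squared deviation about the population mean."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def pop_std_computational(values):
    """Same quantity from the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / n)


# Hypothetical population of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sigma_def = pop_std_definition(data)
sigma_comp = pop_std_computational(data)
print(sigma_def, sigma_comp)
```

The two functions agree up to floating-point rounding. The standard library's statistics.pstdev computes the same quantity.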