Professor
Jeff Racine
Semester
Fall

Description
Lecture #5 (Chap 1 & 2 Continued)

Dispersion Measures

• The Variance (Average Squared Deviation)
  - Most common measure of dispersion
  - Measures the typical squared deviation about the center of the data, using the arithmetic mean as the center.
  - Calculated by averaging the squares of the individual deviations from the mean (the sample version divides by n - 1 rather than n).
  - Calculation:
    Population: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
    Sample: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
  - In R we use var( ), which returns the sample variance.
• The Standard Deviation
  o Definition: A measure of how spread out or dispersed the data in a set are relative to the set's mean (from website).
  o The positive square root of the variance
  o Falls in the same range of magnitude (and appears in the same units) as the observations themselves.
  o Calculation:
    Population: $\sigma = \sqrt{\sigma^2}$
    Sample: $s = \sqrt{s^2}$
• Overall Range
  - Range: measures the spread of the data
  - Equals the difference between the largest and the smallest observation in a data set (in R, the range( ) function returns the smallest and largest observations; their difference is the range).
• Interfractile Ranges
  - Definition: Measure the difference between 2 values (called fractiles or percentiles) in the ordered array.
  o Quartiles: divide the array into 4 quarters
  o Interquartile Range: difference between the 3rd and 1st quartiles (these 2 numbers contain the middle 50% of the data)
  o Deciles: divide the array into 10 parts; the 5th decile sits at the center of the array (ie: the median)
  • In R we use the IQR( ) and quantile( ) functions

Shape Measures: Skewness

• Definition: a measure of asymmetry (how far a distribution departs from symmetry)
• If the data has a long right tail, the data is right skewed (ie: seen in income distributions, where most people are clustered around typical incomes and a few very high incomes stretch the right tail)
• A frequency distribution's degree of distortion from horizontal symmetry
• Pearson's (first) coefficient of skewness (its sign, + or -, gives the direction of the skew: + for right skewed, - for left skewed):
  Skewness = $\frac{\text{mean} - \text{mode}}{\text{standard deviation}}$
• An alternative (more popular, default in R) moment-based measure is given by:
  Skewness = $\frac{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^3}{\sigma^3}$
  o Skewness = 0 for symmetric distributions
  o For a right skewed distribution, the mean is greater than the median, which is greater than the mode.
Right skewed: mean > median > mode

Kurtosis

• Definition: How peaked a distribution is
• A value less than 3 means the distribution is less peaked than the normal distribution; a value greater than 3 means it is more peaked.
• Coefficient of kurtosis:
  Kurtosis = $\frac{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^4}{\sigma^4}$
  o Kurtosis = 3 for the normal distribution
  o In R we first install the moments package via: install.packages("moments")
  o After loading it with library(moments), the functions kurtosis( ) and skewness( ) can then be accessed.

The Five-Number Summary

• The five-number summary of a set of observations consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest.
  Minimum, Q1, Median, Q3, Maximum
  (Q1 = a quarter of all observations below and ¾ above; Median = half of all observations below and half above; Q3 = a quarter of all observations above and ¾ below)
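
As a recap of the dispersion measures from the first part of this lecture, here is a minimal R sketch. The data vector x is made up for illustration and is not from the course; the manual formulas are included only to show what var( ) is doing.

```r
# Hypothetical toy data, made up for illustration only
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

# Sample variance: squared deviations from the mean, divided by n - 1
s2_manual <- sum((x - mean(x))^2) / (n - 1)
s2 <- var(x)                 # var() uses the same n - 1 (sample) formula

# Population variance: same sum of squares, divided by n
sigma2 <- sum((x - mean(x))^2) / n

# Standard deviation: positive square root of the (sample) variance
s <- sd(x)

# Overall range: range() returns c(min, max); diff() gives max - min
overall_range <- diff(range(x))

# Quartiles and the interquartile range (Q3 - Q1)
q <- quantile(x, probs = c(0.25, 0.50, 0.75))
iqr <- IQR(x)

print(c(sample_var = s2, pop_var = sigma2, sd = s,
        range = overall_range, IQR = iqr))
print(q)
```

Note that range(x) on its own prints two numbers (the minimum and maximum), so diff( ) is needed to get the single-number range described in the notes.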

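
The shape measures and the five-number summary above can be sketched in R as follows. The data are made up for illustration. The moment-based skewness and kurtosis are computed by hand in base R so the example runs without extra packages; the assumption here is that skewness( ) and kurtosis( ) from the moments package use the same divide-by-n moments, so treat the manual version as a sketch rather than the package's exact code.

```r
# Hypothetical right-skewed toy data, made up for illustration only
x <- c(1, 2, 2, 2, 3, 4, 5, 6, 8, 17)

# Central moments about the mean, each dividing by n
m  <- mean(x)
m2 <- mean((x - m)^2)
m3 <- mean((x - m)^3)
m4 <- mean((x - m)^4)

# Moment-based measures (assumed to match the moments package)
skew <- m3 / m2^(3/2)   # > 0 suggests a right skewed distribution
kurt <- m4 / m2^2       # compare against 3, the value for the normal

# Pearson's first coefficient: (mean - mode) / standard deviation
mode_x <- as.numeric(names(which.max(table(x))))  # most frequent value
pearson1 <- (m - mode_x) / sd(x)

# Right skew ordering from the notes: mean > median > mode
print(c(mean = m, median = median(x), mode = mode_x))
print(c(skewness = skew, kurtosis = kurt, pearson1 = pearson1))

# Five-number summary: minimum, Q1, median, Q3, maximum
five <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
print(five)
```

Base R also provides fivenum( ) and summary( ), which report similar summaries, although their quartile conventions can differ slightly from quantile( ).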