Chapter 5

ECO220Y1 Chapter 5 (5.7-5.14) Notes

ECO220Y1 Textbook Notes Chapter 5: Displaying and Describing Quantitative Data (5.7 – 5.14)

5.7 Grouped Data
• The mean can be calculated from grouped data by multiplying each group's midpoint by the percentage of the sample in that group (e.g. the % of people who chose that option) and adding the results: $\bar{x} = \sum p_i m_i$, where $m_i$ is the midpoint of group $i$ and $p_i$ is the proportion of the sample in that group.
• The midpoints of the ranges can also be used in the regular formula for variance, with each squared deviation multiplied by the proportion (p) of the sample in that group: $s^2 = \sum p_i (m_i - \bar{x})^2$.

5.8 Five-Number Summary and Boxplots
• The five-number summary of a distribution reports its median, quartiles, and extremes (minimum and maximum).
• A five-number summary of a quantitative variable can be displayed in a boxplot.
  o Boxplot: displays the five-number summary as a central box with whiskers that extend to the non-outlying values.
    ▪ Boxplots are particularly effective for comparing groups.
• Steps to create a boxplot:
  o Draw a single vertical axis spanning the extent of the data.
  o Draw short horizontal lines at the lower quartile, the upper quartile, and the median. Then connect them with vertical lines to form a box (the width is not important unless multiple groups are being shown).
  o Put fences (not shown in the final graph) around the main part of the data, placing the upper fence 1.5 IQRs above the upper quartile and the lower fence 1.5 IQRs below the lower quartile, where IQR = Q3 – Q1.
    ▪ I.e. upper fence = Q3 + 1.5*IQR and lower fence = Q1 – 1.5*IQR.
  o Grow "whiskers": draw lines from each end of the box up and down to the most extreme data values found within the fences.
  o Add any outliers by displaying data values that lie beyond the fences with special symbols.
    ▪ Mark outliers that are < 3 IQRs from the quartiles with one symbol and outliers that are > 3 IQRs from the quartiles with another symbol.
• Features:
  o The box in the centre of the boxplot shows the middle half of the data.
  o The height of the box = IQR.
  o If the median is roughly centred between the quartiles, the middle half of the data is roughly symmetric; if not, the distribution is skewed.
  o If the whiskers are not roughly the same length, the distribution is skewed.
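The grouped-data formulas in 5.7 and the fence rules in 5.8 can be sketched in Python as follows. This is only an illustrative sketch: the function names `grouped_mean_variance` and `boxplot_stats` and the sample numbers are made up for this example, the variance is assumed to be the simple proportion-weighted form, and the quartiles use the "inclusive" interpolation rule from Python's `statistics` module, which may differ slightly from the quartile rule used in the textbook.

```python
from statistics import quantiles


def grouped_mean_variance(midpoints, proportions):
    """Approximate the mean and variance of grouped data.

    Each group is represented by its midpoint and weighted by the
    proportion of the sample in that group (proportions sum to 1).
    """
    mean = sum(p * m for m, p in zip(midpoints, proportions))
    # Assumption: proportion-weighted squared deviations, no n/(n-1) correction.
    variance = sum(p * (m - mean) ** 2 for m, p in zip(midpoints, proportions))
    return mean, variance


def boxplot_stats(data):
    """Return the five-number summary, fences, whisker ends, and outliers."""
    values = sorted(data)
    # Assumption: "inclusive" interpolation; other quartile rules exist.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme values that are still inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    # Outliers more than 3 IQRs beyond a quartile get a separate symbol.
    far_outliers = [v for v in outliers if v < q1 - 3 * iqr or v > q3 + 3 * iqr]
    return {
        "min": values[0], "q1": q1, "median": med, "q3": q3, "max": values[-1],
        "iqr": iqr, "lower_fence": lower_fence, "upper_fence": upper_fence,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers, "far_outliers": far_outliers,
    }


# Hypothetical grouped data: three ranges with midpoints 5, 15, 25.
group_mean, group_variance = grouped_mean_variance([5, 15, 25], [0.2, 0.5, 0.3])

# Hypothetical raw data with one unusually large value.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(group_mean, group_variance)
print(stats)
```

In the raw-data example, the value 30 lies above the upper fence, so the upper whisker stops at 9 and 30 is plotted separately as an outlier.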

5.9 Percentiles
• Percentile: a value below which a given percentage of the data lies.
• Q1 can be thought of as the 25th percentile (25% of the data lies below it).
• Q3 can be thought of as the 75th percentile.
• The median is the 50th percentile.

5.10 Comparing Groups
• Histograms are best at displaying one or two distributions.
• Boxplots usually do a better job of displaying more than two distributions.
  o They offer an ideal balance of information and simplicity, hiding the details while displaying the overall summary information.
    ▪ You can see which group has the higher median, which group has the greater IQR, and where the central 50% of each group's data is located.
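The link between percentiles and quartiles, and the group comparison described in 5.10, can be illustrated with a short Python sketch. The helper names `percentile` and `summarize` and the three groups of numbers are hypothetical, and the "inclusive" interpolation rule is an assumption; other percentile conventions give slightly different values on small samples.

```python
from statistics import quantiles


def percentile(data, k):
    """Return the k-th percentile (k from 1 to 99) of the data."""
    # quantiles(..., n=100) returns the 99 cut points between 100 equal parts.
    cuts = quantiles(data, n=100, method="inclusive")
    return cuts[k - 1]


def summarize(data):
    """Return the quartiles, median, and IQR that a boxplot would display."""
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return {"q1": q1, "median": med, "q3": q3, "iqr": q3 - q1}


# Hypothetical data for three groups (made up for illustration).
groups = {
    "A": [2, 4, 4, 5, 7, 9],
    "B": [1, 3, 6, 8, 10, 12],
    "C": [8, 8, 9, 9, 10, 10],
}

summaries = {name: summarize(values) for name, values in groups.items()}

# Which group has the higher median, and which has the greater IQR?
highest_median = max(summaries, key=lambda name: summaries[name]["median"])
greatest_iqr = max(summaries, key=lambda name: summaries[name]["iqr"])

# Q1, the median, and Q3 coincide with the 25th, 50th, and 75th percentiles.
a = groups["A"]
print(percentile(a, 25), percentile(a, 50), percentile(a, 75))
print(summaries["A"])
print(highest_median, greatest_iqr)
```

The interval from `q1` to `q3` in each summary is where the central 50% of that group's data is located, which is what the box in a side-by-side boxplot shows.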
