Department: Statistics

Course Code: STATS 13

Professor: Tsiang, Mike

Lecture: 19

Boxplots

• A boxplot represents the distribution of a quantitative variable with a box.

  ◦ The left edge of the box is at the first quartile (Q1) and the right edge is at the third quartile (Q3).

  ◦ The length of the box corresponds to the interquartile range (IQR = Q3 - Q1).

  ◦ A vertical line inside the box marks the median.

  ◦ Horizontal lines called whiskers extend from each end of the box to the smallest and largest data values that lie within 1.5 × IQR of the box.

  ◦ Any values beyond the whiskers are plotted as individual dots and flagged as potential outliers.
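
The 1.5 × IQR rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own method: the function name `boxplot_stats` and the heights are made up, and the quartiles use the standard library's "inclusive" convention (other conventions, including the textbook's, can give slightly different Q1 and Q3).

```python
from statistics import quantiles

def boxplot_stats(data):
    """Return the quantities a boxplot draws, using the 1.5 x IQR rule."""
    values = sorted(data)
    # Assumption: "inclusive" quartiles; other conventions shift Q1/Q3 slightly.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data values still inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    # Anything outside the fences is flagged as a potential outlier.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Hypothetical heights in inches (made up for illustration only)
heights = [58, 62, 63, 64, 64, 65, 65, 66, 66, 67, 68, 74]
stats = boxplot_stats(heights)
print(stats)
```

With this made-up sample the whiskers end at 62 and 68, and the values 58 and 74 are flagged as potential outliers. Whether a flagged value is really unusual still depends on context.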

Box plot example: heights

• The heights of 50 self-identified female students at Hope College were measured in inches in the 1990s.

• The distribution is fairly symmetric (slightly left skewed); the flagged potential outliers are likely not true outliers in context.

Boxplots at a glance

• Give an idea of the shape of the distribution and flag potential outliers.

• Useful when comparing distributions.

• Built from the five-number summary: minimum, Q1, median, Q3, maximum.
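
As a small sketch, the five-number summary can be computed directly. The helper name `five_number_summary` and the sample values are hypothetical, and the quartile convention ("inclusive") is an assumption; software and textbooks may differ slightly.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # Assumption: "inclusive" quartile convention from the standard library.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

# Hypothetical sample (made up for illustration only)
sample = [2, 4, 4, 5, 7, 8, 9, 11, 15]
summary = five_number_summary(sample)
print(summary)  # (2, 4.0, 7.0, 9.0, 15)
```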

Example 6.1 Geyser Eruptions

• Consider data on inter-eruption times from two different years to see whether the distribution of these times appears to have changed.

Geyser Eruptions: Summary

• The median in 1978 is smaller than the median in 2003.

• 75% of the wait times in 1978 fall below a shorter cutoff than in 2003.

  ◦ That is, the third quartile is smaller in 1978.

• Less variability in the wait times.
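
The kind of comparison summarized above (median, third quartile, spread) might be sketched as follows. The two samples are made up and are NOT the geyser data; `group_a` and `group_b` simply stand in for two years, and the quartile convention is an assumption.

```python
from statistics import quantiles

def summarize(times):
    """Return the median, third quartile, and IQR of a sample."""
    # Assumption: "inclusive" quartile convention from the standard library.
    q1, median, q3 = quantiles(times, n=4, method="inclusive")
    return {"median": median, "q3": q3, "iqr": q3 - q1}

# Made-up inter-eruption times in minutes (NOT the real data)
group_a = [50, 55, 60, 65, 70, 75, 80, 85, 90]
group_b = [80, 84, 86, 88, 90, 92, 94, 96, 100]

a, b = summarize(group_a), summarize(group_b)
print("smaller median:", "A" if a["median"] < b["median"] else "B")
print("smaller Q3:    ", "A" if a["q3"] < b["q3"] else "B")
print("smaller IQR:   ", "A" if a["iqr"] < b["iqr"] else "B")
```

In this made-up case, group A has the smaller median and third quartile, while group B has the smaller IQR, so a lower center and lower variability need not belong to the same group.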

Boxplots: Caveats

• Can lose important details about the shape of the data.

• Cannot show multimodality or clusters.

• Best used in combination with dot plots and histograms.
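
To see why a boxplot can hide clusters, here is a small sketch with two made-up samples that share the same five-number summary, so their boxplots would look identical, even though one is spread out and the other has two clear clusters. The data, the helper names, and the quartile convention are all assumptions for illustration.

```python
from collections import Counter
from statistics import quantiles

def five_number_summary(data):
    # Assumption: "inclusive" quartile convention from the standard library.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

def text_dot_plot(data):
    """Crude dot plot: one row per integer value, one '*' per observation."""
    counts = Counter(data)
    rows = [f"{v:>2} | {'*' * counts[v]}" for v in range(min(data), max(data) + 1)]
    return "\n".join(rows)

# Two hypothetical samples of 17 values each (made up for illustration)
spread_out = [1, 1, 2, 2, 2, 3, 4, 4, 5, 6, 6, 7, 8, 8, 8, 9, 9]
clustered = [1] + [2] * 7 + [5] + [8] * 7 + [9]

same_summary = five_number_summary(spread_out) == five_number_summary(clustered)
print(five_number_summary(spread_out), five_number_summary(clustered))
print(text_dot_plot(spread_out))
print()
print(text_dot_plot(clustered))
```

Both samples give the summary (1, 2, 5, 8, 9), yet the dot plot of `clustered` shows two piles at 2 and 8 that the boxplot cannot reveal.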

Section 6.2: Comparing Two Means: Simulation-Based Approach

