CHAPTER 4 – Displaying and Describing Categorical Data

Make a picture

- display data → helps you see what you are not likely to see in a table → helps plan your approach to analysis

- shows important features, patterns and relationships

- reveals extraordinary (or possibly wrong) data

- best way to report data to others

Frequency tables – show the number of cases (ex. website visits) for each category and record the totals and category names (ex. provinces)

- describe the distribution of a categorical variable – name the possible categories and tell how frequently each occurs

Relative frequency table – displays the percentages, rather than the counts, of the values in each category
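
As a minimal sketch of both tables, the snippet below counts a small made-up list of website visits by province and converts the counts to percentages. The data and the function names `frequency_table` and `relative_frequency_table` are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical data: the province recorded for each website visit.
visits = ["ON", "QC", "ON", "BC", "ON", "QC", "AB", "ON", "BC", "ON"]

def frequency_table(values):
    """Return {category: count} for a categorical variable."""
    return dict(Counter(values))

def relative_frequency_table(values):
    """Return {category: percentage of all cases}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100 * count / total for category, count in counts.items()}

freq = frequency_table(visits)
rel_freq = relative_frequency_table(visits)

# Print category, count and percentage side by side, then the total.
for category in sorted(freq):
    print(f"{category:<6}{freq[category]:>3}{rel_freq[category]:>7.1f}%")
print(f"{'Total':<6}{sum(freq.values()):>3}")
```

The percentages in a relative frequency table should add up to 100 (apart from rounding), which is a quick check that no case was dropped or counted twice.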

Charts:

- The area principle – the area occupied by a part of the graph should correspond to the magnitude of the value it represents

- Bar charts – display the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison

- more accurate visual impression of the distribution

- common base, freestanding, spaces in-between

- horizontal or vertical

- Relative frequency bar chart – replaces counts with percentages, drawing attention to proportions

- Pie charts – severe perceptual problems, hard to interpret – try not to use them!
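
To make the bar chart idea concrete, here is a rough text-only sketch. Every bar starts from a common base and has the same thickness, so its length (and therefore its area) is proportional to the count, in line with the area principle. The counts and the helper name `text_bar_chart` are made up for illustration.

```python
def text_bar_chart(counts, symbol="#"):
    """Draw one horizontal bar per category, all starting from a common base.

    Each bar is one line thick, so its area is proportional to its count.
    """
    label_width = max(len(label) for label in counts)
    lines = []
    for label, count in counts.items():
        lines.append(f"{label:<{label_width}} | {symbol * count} {count}")
    return "\n".join(lines)

# Hypothetical counts of website visits by province.
counts = {"ON": 5, "QC": 2, "BC": 2, "AB": 1}
chart = text_bar_chart(counts)
print(chart)
```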

Categorical Data Condition – the data are counts or percentages of individuals in categories

- make sure categories don't overlap

** best perception – positions on a common scale (ex. plot or bar graph), comparing 2 separate images with the same scale, length

worst perception – volume, colour, angles, area

Contingency tables – show how individuals are distributed along each variable, depending on (contingent on) the value of the other variable

- marginal distribution – in a contingency table, the distribution of either variable alone. The counts or percentages are the totals found in the margins (usually the right-most column or bottom row) of the table.

- each cell – gives the count for a combination of values of the two variables

- a cell can be reported as a total percentage (of the grand total), a row percentage (of its row total), or a column percentage (of its column total)
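
The pieces of a contingency table can be sketched in a few lines. The example below uses a small invented data set (province and device for each visit) and hypothetical helper names; it builds the cell counts, the two marginal distributions, and the three kinds of percentage.

```python
from collections import Counter

# Hypothetical cases: (province, device) for each website visit.
cases = [
    ("ON", "mobile"), ("ON", "mobile"), ("ON", "desktop"), ("ON", "desktop"),
    ("QC", "mobile"), ("QC", "desktop"), ("QC", "desktop"), ("QC", "desktop"),
]

cells = Counter(cases)                          # count for each combination
row_totals = Counter(row for row, _ in cases)   # marginal distribution of province
col_totals = Counter(col for _, col in cases)   # marginal distribution of device
grand_total = len(cases)

def total_percent(row, col):
    """Cell count as a percentage of all cases."""
    return 100 * cells[(row, col)] / grand_total

def row_percent(row, col):
    """Cell count as a percentage of its row total."""
    return 100 * cells[(row, col)] / row_totals[row]

def column_percent(row, col):
    """Cell count as a percentage of its column total."""
    return 100 * cells[(row, col)] / col_totals[col]

print(total_percent("QC", "desktop"))   # share of all visits
print(row_percent("QC", "desktop"))     # share of QC visits
print(column_percent("QC", "desktop"))  # share of desktop visits
```

The same cell gives three different percentages, so it matters to say which total is being used as the denominator.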

Conditional distribution – shows the distribution of one variable for just those cases that satisfy a condition on another
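
A conditional distribution can be sketched by filtering the cases first and then tabulating what is left. The data and the function name `conditional_distribution` below are assumptions made for this example.

```python
from collections import Counter

def conditional_distribution(cases, given_index, given_value):
    """Distribution (in %) of the other variable, using only the cases where
    the variable at position `given_index` equals `given_value`."""
    other_index = 1 - given_index
    selected = [case[other_index] for case in cases
                if case[given_index] == given_value]
    counts = Counter(selected)
    return {value: 100 * n / len(selected) for value, n in counts.items()}

# Hypothetical cases: (province, device) for each website visit.
cases = [
    ("ON", "mobile"), ("ON", "mobile"), ("ON", "desktop"), ("ON", "desktop"),
    ("QC", "mobile"), ("QC", "desktop"), ("QC", "desktop"), ("QC", "desktop"),
]

# Distribution of device for just the QC visits.
device_given_qc = conditional_distribution(cases, 0, "QC")
print(device_given_qc)
```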

Independent variables – in a contingency table, two variables are independent when the distribution of one variable is the same for all categories of the other (no association between the variables)
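
One rough way to look for independence is to compare the conditional distributions across categories. The sketch below does that with an arbitrary tolerance of 0.05, which is an assumption of this example and not a formal statistical test; real data rarely match exactly. The function names and both data sets are hypothetical.

```python
from collections import Counter

def conditional_distributions(cases):
    """For each category of the first variable, the distribution
    (as proportions) of the second variable."""
    cells = Counter(cases)
    row_totals = Counter(row for row, _ in cases)
    cols = sorted({col for _, col in cases})
    return {row: {col: cells[(row, col)] / row_totals[row] for col in cols}
            for row in row_totals}

def looks_independent(cases, tolerance=0.05):
    """True if every conditional distribution is within `tolerance`
    (an arbitrary cut-off) of the first one."""
    dists = list(conditional_distributions(cases).values())
    if not dists:
        return True
    first = dists[0]
    return all(abs(dist[col] - first[col]) <= tolerance
               for dist in dists[1:] for col in first)

# Hypothetical data where device use differs by province (association).
associated = [("ON", "mobile")] * 2 + [("ON", "desktop")] * 2 \
           + [("QC", "mobile")] * 1 + [("QC", "desktop")] * 3

# Hypothetical data where both provinces have the same 25% / 75% split.
same_split = [("ON", "mobile")] * 1 + [("ON", "desktop")] * 3 \
           + [("QC", "mobile")] * 2 + [("QC", "desktop")] * 6

print(looks_independent(associated))
print(looks_independent(same_split))
```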

Segmented bar charts – treat each bar as the “whole” and divide it proportionally into segments corresponding to the percentage in each group
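
As a text-only sketch of a segmented bar chart, the function below gives every bar the same total width (the "whole") and splits it into segments in proportion to each group's share. The name `segmented_bar`, the width of 20 characters, and the counts are all assumptions for illustration.

```python
def segmented_bar(counts, width=20):
    """Return one fixed-width bar, split into segments proportional
    to each group's share of the bar's total."""
    total = sum(counts.values())
    bar = ""
    cumulative = 0
    for group, count in counts.items():
        cumulative += count
        # Rounding the cumulative position keeps every bar exactly
        # `width` characters long.
        end = round(width * cumulative / total)
        bar += group[0] * (end - len(bar))  # first letter marks the segment
    return bar

# Hypothetical device counts within each province.
by_province = {
    "ON": {"mobile": 2, "desktop": 2},
    "QC": {"mobile": 1, "desktop": 3},
}

bars = {province: segmented_bar(counts) for province, counts in by_province.items()}
for province, bar in bars.items():
    print(f"{province} |{bar}|")
```

Because every bar has the same length, the segments compare percentages rather than counts, so groups of different sizes can be set side by side.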

