STA215H5 Lecture Notes - Lecture 2: Categorical Variable, Frequency Distribution, Pie Chart
STA215; Chapter 2
Types of Data:
● Quantitative data - real numbers; for example, temperature, miles per hour, and sprint
time. We also refer to this type of data as interval or numerical data.
○ All calculations are permitted. We often compute averages (means) from interval
data.
● Nominal data - categories; for example, marital status, gender, and political view.
Nominal data are also called qualitative or categorical.
○ Since the values are categories, it does not make sense to perform
arithmetic on this type of data.
● Ordinal data - looks like nominal data, but the order of the
values has meaning; for example, the star rating system on review websites.
○ The only permitted calculations are those involving a ranking process (i.e., ordering
from least to greatest, highest to lowest, etc.).
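As a rough illustration of which operations suit each data type, here is a minimal Python sketch. All sample values and variable names are made up for this example and do not come from the lecture.

```python
from statistics import mean, mode

# Hypothetical sample values, invented for illustration only.
temperatures = [18.5, 21.0, 19.5, 23.0]                         # quantitative
marital_status = ["single", "married", "married", "divorced"]   # nominal
star_ratings = [3, 5, 1, 4, 5]                                  # ordinal (1 to 5 stars)

# Quantitative: arithmetic such as the mean is meaningful.
average_temperature = mean(temperatures)

# Nominal: no arithmetic, but we can count and report the most common category.
most_common_status = mode(marital_status)

# Ordinal: the values can be ranked from least to greatest.
ranked_ratings = sorted(star_ratings)

print(average_temperature, most_common_status, ranked_ratings)
```

Averaging the marital status labels would not even be defined, which mirrors the point that calculations are not meaningful for nominal data.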
Categorical data:
● The distribution of a variable tells us what values it takes and how often it takes these
values.
● The distribution of a categorical variable can be shown using a frequency table.
● The values of the categorical variable are labels for its categories.
● The distribution lists the categories and gives either the count or the percent of
individuals that fall in each category.
● Example of a frequency table: [table not reproduced]
○ A frequency distribution is a tabular summary of data showing the number
(frequency) of items in each of several non-overlapping classes.
■ The classes need to be distinct groups so that each has a separate count.
● We can create two types of distribution tables from a frequency table:
○ Relative Frequency Distribution
■ A relative frequency distribution gives a tabular summary of data
showing the relative frequency for each class.
■ Relative frequency of a class = F / n, where n represents the total
number of observations and F is the frequency of the class.
■ That is, the count for the class out of the total count.
○ Percent Frequency Distribution
■ A percent frequency distribution summarizes the percent frequency of
the data for each class.
■ Percent frequency of a class = relative frequency × 100, i.e., the share out of 100.
● How do we display categorical data?
○ A bar graph: On one axis of the graph, we specify the labels that are used for
the classes (categories). A frequency, relative frequency, or percent
frequency scale can be used for the other axis of the graph.
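The frequency, relative frequency, and percent frequency distributions described above can be sketched in Python as follows. The responses list and all variable names are hypothetical, chosen only to illustrate the calculations; the row of # characters is a crude text stand-in for a bar graph on the frequency scale.

```python
from collections import Counter

# Hypothetical survey responses, invented for illustration only.
responses = ["married", "single", "married", "divorced",
             "single", "married", "single", "married"]

frequency = Counter(responses)     # F: count for each class
n = sum(frequency.values())        # n: total number of observations

# Relative frequency of a class = F / n
relative = {category: f / n for category, f in frequency.items()}

# Percent frequency of a class = relative frequency * 100
percent = {category: 100 * r for category, r in relative.items()}

# Print one row per class, with a text bar whose length is the frequency.
print(f"{'class':<10}{'F':>3}{'rel':>8}{'pct':>8}")
for category, f in frequency.most_common():
    bar = "#" * f
    print(f"{category:<10}{f:>3}{relative[category]:>8.3f}"
          f"{percent[category]:>7.1f}%  {bar}")
```

The relative frequencies sum to 1 and the percent frequencies sum to 100, which is a quick check that every observation was counted in exactly one class.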