Chapter 10

Descriptive Statistics
Statistics has developed a variety of tools for collecting and understanding data
So far we have discussed ways to collect data
So how do we deal with the data we collect?
- Begin by summarizing the data
- Want to describe or summarize in a clear and concise way
- Will first focus on descriptive statistics (graphical and numeric)
Recall…
Interested in something about a population
Population is a collection of individuals
Describe individuals with data
Data sets contain information/facts relating to individuals
Variables are attributes of an individual
- E.g., hair color, pain severity…
The distribution of a variable tells us what values the variable can take and how often it
takes each of these values
Types of Variables
Two main types of variables:
1. Quantitative Variables
- Take on numeric values for which addition and averaging make sense (height,
weight, income…)
2. Categorical Variables
- Each individual falls into a category. E.g., ethnicity, male/female, success/fail
- Ordinal data is a special type of categorical data where the categories are ordered in
a natural way. E.g., rate your preference for this course from 1 (dislike) to 5 (enjoy greatly)
Descriptive Statistics
Numerical: Summary Tables
Graphical: Bar Graphs, Pie Charts
Graphical Descriptions of Data
Pictures (graphics) can be a powerful tool for summarizing data.
A graph (or graphic) is any visual display of numbers
Data visualization is still an emerging field
Many different types of plots are used
The goal of a graph is to
- Summarize information from a data set into a picture that is easy to understand, but
accurate
- Often used to highlight a specific feature of the data
Recall…
Data are values of variables that we observe in a sample
Sample was drawn from a population
We are trying to find out something about the values of the variable in the
population. We want to use the data to make an inference about the population.
Data Distributions
Distribution of a variable gives the values the variable can take and how often it takes
on each value
- A population distribution is the distribution of a variable over an entire population. Also
called a probability distribution.
- An empirical distribution is the distribution of a variable in a sample. E.g., our grades
- We use summaries of an empirical distribution to learn about a population
distribution
Data Distributions
For categorical data, we can summarize the distribution easily:
1. Identify all of the values the variable can take
2. Count the number of times each value is observed
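The two steps above can be sketched in a few lines of Python. This is only an illustration: the hair-color sample is made up, and the names `hair_colors`, `values`, `counts`, and `summary` are hypothetical. Note that the code can only list the values that actually appear in the sample, which may be fewer than the values the variable can take in principle.

```python
from collections import Counter

# Hypothetical sample (made-up data): hair color recorded for ten individuals
hair_colors = ["brown", "black", "brown", "blond", "red",
               "brown", "black", "blond", "brown", "black"]

# Step 1: identify the values the variable takes (here, those seen in the sample)
values = sorted(set(hair_colors))

# Step 2: count the number of times each value is observed
counts = Counter(hair_colors)

# Dividing each count by the sample size gives the proportion of the sample
# in each category, i.e. the empirical distribution of the variable
n = len(hair_colors)
summary = {v: (counts[v], counts[v] / n) for v in values}

# Print a small summary table: value, count, proportion
for v in values:
    count, proportion = summary[v]
    print(f"{v:<6} {count:>3} {proportion:.2f}")
```

The counts and proportions are the kind of information a summary table reports, and the same numbers could be drawn as a bar graph or pie chart.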
