Chapter 1 - Looking at Data Distributions
A statistical analysis starts with a set of data. We construct a set of data by first deciding what cases or
units we want to study. For each case, we record information about characteristics that we call variables.
Cases are the objects described by a set of data. Cases may be customers, companies, subjects in a study,
or other objects.
A label is a special variable used in some data sets to distinguish the different cases.
A variable is a characteristic of a case.
Different cases can have different values for the variables.
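To make these terms concrete, here is a minimal Python sketch of a small data set. The customer records, the field names ("id", "region", "purchases"), and the values are all hypothetical and chosen only for illustration: each dictionary is one case, "id" plays the role of the label, and the other keys are variables.

```python
# Hypothetical data set: each dictionary is one case (here, a customer).
cases = [
    {"id": "C001", "region": "East", "purchases": 3},
    {"id": "C002", "region": "West", "purchases": 7},
    {"id": "C003", "region": "East", "purchases": 5},
]

# "id" is the label that distinguishes the cases;
# every other key is a variable recorded for each case.
variables = [name for name in cases[0] if name != "id"]

# Different cases can have different values for the same variable.
purchases_by_label = {case["id"]: case["purchases"] for case in cases}

print(variables)
print(purchases_by_label)
```

Storing each case as one record keeps the label and all of that case's variable values together, which mirrors one row of a data table.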
Categorical and Quantitative Variables
A categorical variable places a case into one of several groups or categories.
A quantitative variable takes numerical values for which arithmetic operations such as adding and
averaging make sense.
The distribution of a variable tells us what values it takes and how often it takes these values.
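As a rough illustration of these definitions, the Python sketch below tabulates the distribution of one categorical and one quantitative variable. The lists `regions` and `purchases` are made-up values, not data from the text. Counting works for both kinds of variable, but averaging is only meaningful for the quantitative one.

```python
from collections import Counter
from statistics import mean

# Hypothetical values of two variables recorded for five cases.
regions = ["East", "West", "East", "South", "East"]  # categorical
purchases = [3, 7, 5, 3, 3]                          # quantitative

# The distribution: which values occur and how often each occurs.
region_counts = Counter(regions)
purchase_counts = Counter(purchases)

# Arithmetic such as averaging makes sense only for the quantitative variable.
average_purchases = mean(purchases)

for value, count in sorted(purchase_counts.items()):
    print(f"{value}: {count}")
```

Here `Counter` is simply a convenient way to pair each observed value with its frequency; a table or graph of those pairs is what the notes mean by a distribution.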
1.1 Displaying Distributions with Graphs
Exploratory data analysis: statistical tools and ideas help us examine data in order to describe their main features.
Two strategies that help us organize our explorati