STAC50H3 Lecture Notes - Exploratory Data Analysis
Document Summary
A statistical analysis starts with a set of data. We construct a set of data by first deciding what cases or units we want to study. For each case, we record information about characteristics that we call variables. Cases are the objects described by a set of data. Cases may be customers, companies, subjects in a study, or other objects. A label is a special variable used in some data sets to distinguish the different cases. A variable is a characteristic of a case. Different cases can have different values for the variables. A categorical variable places a case into one of several groups or categories. A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense. The distribution of a variable tells us what values it takes and how often it takes these values. Exploratory data analysis uses statistical tools and ideas to examine data in order to describe their main features.
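The definitions above can be illustrated with a small sketch in Python. Everything in it is an assumption made for illustration: the four customers, the variable names `id`, `region`, and `purchases`, and the helper `distribution` are hypothetical and do not come from the lecture. Each dictionary is one case, `id` plays the role of a label, `region` is a categorical variable, and `purchases` is a quantitative variable.

```python
from collections import Counter
from statistics import mean

# Hypothetical data set: each dictionary is one case (a customer).
# "id" is a label, "region" is categorical, "purchases" is quantitative.
cases = [
    {"id": "C1", "region": "East", "purchases": 3},
    {"id": "C2", "region": "West", "purchases": 5},
    {"id": "C3", "region": "East", "purchases": 3},
    {"id": "C4", "region": "North", "purchases": 1},
]


def distribution(cases, variable):
    """Count how often each value of `variable` occurs across the cases."""
    return Counter(case[variable] for case in cases)


# Distribution of a categorical variable: which groups occur, and how often.
region_counts = distribution(cases, "region")

# Distribution of a quantitative variable: which values occur, and how often.
purchase_counts = distribution(cases, "purchases")

# Averaging makes sense for a quantitative variable, not for a categorical one.
mean_purchases = mean(case["purchases"] for case in cases)

print(region_counts)
print(purchase_counts)
print(mean_purchases)
```

Counting values works for both kinds of variable, but only the quantitative one supports arithmetic such as the mean. The label is not summarized at all, since its only job is to tell the cases apart.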