Chapter 1-2

SPAN 101 Chapter Notes - Chapter 1-2: Randomness, Data Mining, Mutual Fund

Course Code
SPAN 101
Enrique Manchon

Estimation: how long would it take to move a mountain if you wanted to develop the
land into a shopping mall? How many circus clowns can you fit in a sports car?
Percentages and averages: if your stock portfolio drops by 50% but rebounds by
60%, what is the combined effect on the portfolio’s worth? If you have one foot in
the oven and one in the freezer, are you at room temperature on average?
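The portfolio question above can be checked with a little arithmetic. The sketch below assumes a hypothetical starting value of 100 (the notes give no figure). Percentage changes multiply rather than add, so a 50% drop followed by a 60% rebound does not produce a 10% gain.

```python
def combined_change(changes):
    """Return the net fractional change after applying successive percentage changes."""
    factor = 1.0
    for pct in changes:
        factor *= 1 + pct / 100  # each change scales the current value
    return factor - 1


start = 100.0  # hypothetical starting value, not from the notes
net = combined_change([-50, 60])
end = start * (1 + net)

print(f"net change: {net:.0%}, ending value: {end:.2f}")
```

The growth factors are 0.5 and 1.6, and 0.5 × 1.6 = 0.8, so the portfolio ends 20% below where it started.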
Randomness: when is a cluster of events simply due to chance and when is it
symptomatic of some real effect?
Uncertainty: if the meteorologist predicts a 60% chance of rain today, should the
road paving crew cancel the day’s paving work?
Variation: we can never repeat things exactly. Everything varies, but we are not
always sure how things vary.
Rows/records/cases: individual cases about which we record some characteristics
Respondents: individuals who answer a survey
Subjects: people whom we experiment on
Observations: data values
Variables (columns): characteristics recorded about each individual or case
Example: the height of students in a university class
Data: specific values of variables
Once you measure each student and obtain actual values of height for each
student, you have data
Relational database: two or more separate data tables are linked together so that
information can be merged across them
A table of customers, along with demographic information on each, is such a
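As a rough illustration of how two linked tables can be merged, the sketch below joins a customer table to a purchase table on a shared key. All table names, fields, and values here are hypothetical, and plain Python lists stand in for real database tables.

```python
# Hypothetical data: two tables that share a customer_id key.
customers = [
    {"customer_id": 1, "age": 34, "region": "East"},
    {"customer_id": 2, "age": 51, "region": "West"},
]
purchases = [
    {"customer_id": 1, "item": "laptop"},
    {"customer_id": 2, "item": "printer"},
    {"customer_id": 1, "item": "mouse"},
]

# Index the customer table by its key, then attach demographics to each purchase.
by_id = {c["customer_id"]: c for c in customers}
merged = [{**p, **by_id[p["customer_id"]]} for p in purchases]

for row in merged:
    print(row)
```

Each merged row carries both the purchase and the demographic information of the customer who made it, which is the kind of merging the definition describes.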
Categorical Variable: when a variable names categories and answers questions
about how cases fall into those categories
Descriptive responses, e.g. what type of mutual fund do you invest in?
Phone numbers, postal codes
o Nominal variables: categories used only as labels, with no natural ordering
o Ordinal variables: categories that have a natural ordering
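The difference between the two kinds of categorical variable can be sketched in a few lines. The responses and the low/medium/high ranking below are made up for illustration. Counting works for both kinds, but sorting is only meaningful once an ordering is defined.

```python
from collections import Counter

# Hypothetical survey responses.
fund_types = ["equity", "bond", "equity", "money market"]  # nominal
satisfaction = ["high", "low", "medium", "high"]           # ordinal

# Nominal: counting cases per category is meaningful; ordering them is not.
fund_counts = Counter(fund_types)

# Ordinal: an explicit ranking makes sorting meaningful.
rank = {"low": 0, "medium": 1, "high": 2}
ordered = sorted(satisfaction, key=rank.__getitem__)

print(fund_counts.most_common(1))
print(ordered)
```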
