Session 1 (June 23)
Chapter 1 – Introduction and Data
Collection
I) What is Data?
Data
Consist of information coming from
observations, counts, measurements, or
responses.
• “People who eat three daily servings of whole
grains have been shown to reduce their risk
of…stroke by 37%.” (Source: Whole Grains
Council)
• “Seventy percent of the 1500 U.S. spinal cord
injuries to minors result from vehicle accidents,
and 68 percent were not wearing a seatbelt.”
(Source: UPI)
II) What is Statistics?
Statistics
The science of collecting, organizing,
analyzing, and interpreting data in order to
make decisions.
Summer 2011 Page 1
III) Data Sets
a) Population
The collection of all outcomes, responses,
measurements, or counts that are of interest.
Examples: 1) All the registered voters in
Canada;
2) All students registered in
CQMS102 at Ryerson University;
3) All people in Ontario who have
OHIP.
b) Sample
A subset of the population.
Examples: 1) All the registered voters in
Ontario;
2) All students in this class;
3) All people at Ryerson who have
OHIP.
Example: Identifying Data Sets
In a recent survey, 1708 adults in the United
States were asked if they think global warming
is a problem that requires immediate
government action. Nine hundred thirty-nine of
the adults said yes. Identify the population
and the sample. Describe the data set.
(Adapted from: Pew Research Center)
• The population consists of the responses of
all adults in the U.S.
• The sample consists of the responses of the
1708 adults in the U.S. in the survey.
• The data set consists of 939 yes responses
and 769 other responses (1708 − 939 = 769).
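The counts in this example can also be summarized as a sample proportion. The following is a minimal Python sketch; the variable names are illustrative and do not come from the course notes.

```python
# Counts taken from the survey example above.
sample_size = 1708   # adults in the sample
yes_count = 939      # adults who answered yes

# Proportion of the sample that answered yes.
yes_proportion = yes_count / sample_size

print(f"Proportion answering yes: {yes_proportion:.3f}")
```

This proportion describes only the 1708 sampled adults, not all adults in the U.S.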
Parameter and Statistic
c) Parameter
A number that describes a population
characteristic.
Average age of all people in the United States.
d) Statistic
A number that describes a sample
characteristic.
Average age of people from a sample of three
states.
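The difference between a parameter and a statistic can be sketched in Python. The population below is made up for illustration (randomly generated ages), and the sample is a simple random sample of individuals, which is an assumption of this sketch rather than the three-state sample mentioned above.

```python
import random
import statistics

# Hypothetical population: 10,000 made-up ages (illustration only).
rng = random.Random(0)
population_ages = [rng.randint(0, 90) for _ in range(10_000)]

# Parameter: a number describing the entire population.
population_mean = statistics.mean(population_ages)

# Statistic: a number describing a sample drawn from that population.
sample_ages = rng.sample(population_ages, 100)
sample_mean = statistics.mean(sample_ages)

print(f"Parameter (population mean age): {population_mean:.2f}")
print(f"Statistic (sample mean age):     {sample_mean:.2f}")
```

The two values will usually be close but not equal, because the statistic depends on which individuals happen to be in the sample.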
Example: Distinguish Parameter and
Statistic
Decide whether the numerical value describes
a population parameter or a sample statistic.
1. A recent survey of a sample of MBAs
reported that the average salary for an MBA is
more than $82,000. (Source: The Wall Street
Journal)
Solution:
Sample statistic (the average of $82,000 is
based on a subset of the population)
2. Starting salaries for the 667 MBA graduates
from the University of Chicago Graduate School
of Business increased 8.5% from the previous
year.
Solution:
Population parameter (the percent increase of
8.5% is based on all 667 graduates’ starting
salaries)
IV) Branches of Statistics
1) Descriptive Statistics
Involves organizing, summarizing and
displaying data.
2) Inferential Statistics
Involves using sample data to draw
conclusions about a population.
The course CQMS102 will discuss
Descriptive Statistics, while CQMS202 will
discuss Inferential Statistics.
Example: Descriptive and Inferential
Statistics
Decide which part of the study represents the
descriptive branch of statistics. What
conclusions might be drawn from the study
using inferential statistics?
A large sample of men, aged 48, was studied
for 18 years. For unmarried men,
approximately 70% were alive at age 65. For
married men, 90% were alive at age 65.
(Source: The Journal of Family Issues)
Solution: Descriptive statistics involves
statements such as “For unmarried men,
approximately 70% were alive at age 65” and
“For married men, 90% were alive at age 65.”
A possible inference drawn from the study is
that being married is associated with a longer
life for men.
V) Data Collection
Observational study
• A researcher observes and measures
characteristics of interest of part of a
population.
• Researchers observed and recorded the
mouthing behavior on nonfood objects of
children up to three years old. (Source:
Pediatric Magazine)
Experiment
• A treatment is applied to part of a population
and responses are observed.
• An experiment was performed in which
diabetics took cinnamon extract daily while a
control group took none. After 40 days, the
diabetics who had the cinnamon reduced their
risk of heart disease while the control group
experienced no change. (Source: Diabetes
Care)
Simulation
• Uses a mathematical or physical model to
reproduce the conditions of a situation or
process.
• Often involves the use of computers.
• Automobile manufacturers use simulations
with dummies to study the effects of crashes
on humans.
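A computer simulation can be as small as a few lines of code. The sketch below is a hypothetical illustration, not an example from the course: it uses a simple mathematical model (two fair six-sided dice) to estimate how often the sum is 7, instead of rolling physical dice.

```python
import random

def estimate_seven(trials, seed=0):
    """Estimate the chance that two fair dice sum to 7 using simulated rolls."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Model each die as a random integer from 1 to 6.
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    return hits / trials

estimate = estimate_seven(100_000)
print(f"Simulated estimate: {estimate:.3f} (exact value is 1/6)")
```

More trials generally bring the estimate closer to the exact value, which is why simulations of this kind are usually run on a computer.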
Survey
• An investigation of one or more
characteristics of a population.
• Commonly done by interview, mail, or
telephone.
• A survey is conducted on a sample of female
physicians to determine whether the primary
reason for their career choice is financial
stability.
Consider the following statistical studies. Which
method of data collection would you use to
collect data for each study?
1. A study of the effect of changing flight
patterns on the number of airplane accidents.
Solution: Simulation (It is impractical to create
this situation)
2. A study of the effect of eating oatmeal on
lowering blood pressure.
Solution: Experiment (Measure the effect of a
treatment – eating oatmeal)
3. A study of U.S. residents’ approval rating of
the U.S. president
Solution: Survey (Ask “Do you approve of the
way the president is handling his
job?”)
VI) Data Classification
Types of Data
1) Qualitative Data (Categorical Data)
consists of attributes, labels, or non-numerical
entries.
Another example: The responses to some
questions such as
1) Are you currently looking for a job? (Yes
or No)
2) Which day of the week do you have a
class at Ryerson? (M, T, W, TR, F)
3) Where are you living?
4) What’s the postal code of the address
where you are living?
2) Quantitative data (Numerical Data)
Numerical measurements or counts.
Discrete: From a counting process.
Continuous: From a measuring process.
Another example: The responses to some
questions such as
Discrete:
How many courses are you taking this term?
(1, 2, 3, etc.)
How old are you? (5, 20, 30, 45, etc.)
Continuous:
How tall are you?
How much time do you spend on shopping
each week?
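The classification above can be sketched in Python. The responses are made up, the function name `classify` is hypothetical, and using the Python type of a value is only a rough proxy chosen for this sketch: text is treated as qualitative, whole-number counts as discrete, and decimal measurements as continuous.

```python
def classify(value):
    """Roughly classify a survey response by its Python type (illustration only)."""
    if isinstance(value, str):
        return "qualitative"
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it before the int check.
        raise TypeError("store yes/no answers as text in this sketch")
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    raise TypeError(f"unsupported response type: {type(value).__name__}")

# Hypothetical responses to the questions listed above.
responses = {
    "Are you currently looking for a job?": "Yes",
    "What's your postal code?": "A1A 1A1",
    "How many courses are you taking this term?": 4,
    "How tall are you (in cm)?": 172.5,
}

for question, answer in responses.items():
    print(f"{question} -> {classify(answer)}")
```

A postal code is stored as text here because it is a label: even a code made up entirely of digits would be qualitative, since arithmetic on it has no meaning.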
Levels of Measurement
Data can be classified according to the type of
measurement scale that is i

