Lecture 6

Chebyshev’s Theorem
Chebyshev’s Theorem is a rule that applies to all distributions (with a finite mean and
standard deviation), not only the Normal distribution
Regardless of the shape of a population’s frequency distribution, the proportion of
observations falling within k standard deviations of the mean is at least 1 − 1/k² (given that k
equals 1 or more):
$$\text{Proportion falling within } \pm k\sigma \text{ of } \mu \;\ge\; 1 - \frac{1}{k^{2}}$$
Example (Chebyshev’s Theorem)
At least 75% of all observations lie within k = 2 standard deviations of the mean, since 1 − 1/2² = 0.75
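A minimal R sketch of the bound, checked against a skewed sample. The function names chebyshev_bound() and prop_within(), the seed, and the exponential data are illustrative assumptions, not part of the lecture.

```r
# Chebyshev lower bound on the proportion within k standard deviations
chebyshev_bound <- function(k) {
  stopifnot(k >= 1)
  1 - 1 / k^2
}

# Observed proportion of a sample lying within k standard deviations of its mean
prop_within <- function(x, k) {
  mean(abs(x - mean(x)) <= k * sd(x))
}

set.seed(42)                 # arbitrary seed, for reproducibility only
x <- rexp(10000, rate = 1)   # hypothetical right-skewed (non-Normal) data

for (k in c(1.5, 2, 3)) {
  cat(sprintf("k = %.1f: bound = %.3f, observed = %.3f\n",
              k, chebyshev_bound(k), prop_within(x, k)))
}
```

The observed proportions should sit at or above the bound for every k, usually well above it, because the theorem is a worst-case guarantee rather than an estimate.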
Coefficient of Variation
Used to compare degrees of dispersion among data sets
Ratio of the standard deviation to the arithmetic mean
In R, compute it as sd(x) / mean(x) for a numeric vector x
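As a small sketch of that ratio in use, the helper cv() below wraps it and compares two data sets measured in different units. The helper name and both data vectors are made up for illustration.

```r
# Coefficient of variation: standard deviation relative to the mean
cv <- function(x) sd(x) / mean(x)

# Hypothetical data in different units; the CV is unit-free, so it is comparable
heights_cm <- c(160, 165, 170, 175, 180)
weights_kg <- c(55, 65, 70, 80, 95)

cat("CV of heights:", round(cv(heights_cm), 3), "\n")
cat("CV of weights:", round(cv(weights_kg), 3), "\n")
```

The data set with the larger ratio is the more dispersed relative to its own mean, which is the comparison the raw standard deviations cannot give when the units or scales differ.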
CHAPTER III
Origins of Data
Data can originate in a number of ways
Internal Data
Created as by-products of regular activities
Example (Internal Data)
Customer, employee, production records; government records
External Data (typical source for this course)
Created by entities other than the person, firm, or government that wants to use
the data
Example (External Data)
Print sources, CD-ROMs, web sites
In the past, users were limited to physical media (e.g., print and CD-ROM)
Frequently expensive
Useful for transferring to computer applications
The Internet is an exceptionally efficient source for data
Sampling Versus Census Taking
A census is a complete survey of every member in the population
A sample is a partial survey in which data is collected for only a subset of the
population
Reasons for sampling versus conducting a census
Expense can be prohibitive
Speed of response
A census may be impossible, for example when the population is infinite (i.e., observations
arise from an infinite or recurring process)
Destructive sampling (e.g., testing the lifetime of a light bulb or the safety of an automobile;
a census would destroy all output)
Accuracy can, oddly enough, be better (e.g., a smaller survey can hire fewer people who
can be better trained, yielding higher-quality information)
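The contrast between a census and a sample can be sketched in R as follows. The simulated population, its size, the sample size of 500, and the variable names are all assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical finite population of 100,000 measurements
population <- rnorm(100000, mean = 50, sd = 10)

# Census: use every member of the population
census_mean <- mean(population)

# Sample: use only a randomly chosen subset (drawn without replacement)
srs <- sample(population, size = 500)
sample_mean <- mean(srs)

cat("Census mean:", round(census_mean, 2), "\n")
cat("Sample mean:", round(sample_mean, 2), "\n")
```

Here the sample uses 0.5% of the population, which illustrates the expense and speed arguments: far fewer observations are collected, at the cost of some sampling error in the estimate.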
Types of Samples: Nonprobability
Nonprobability Sample: occurs when a sample is taken from an existing population in a
haphazard fashion without the use of some randomizing device assigning each member
a known (positive) probability of selection
Voluntary response sample (e.g., phone survey [self-selection issue])
Convenience sample (e.

