Study Guide: Final
York University
Administrative Studies
ADMS 2320
Ying Kong
Fall

Chapter 1: What is Statistics?
• Statistics is a way to get information from data. Data → Statistics → Information.
• A Population is the group of all items of interest to a statistics practitioner. E.g. All York students.
• Parameter – A descriptive measure of a population (mean, proportion, etc.).
• A Sample is a set of data drawn from the population. E.g. a sample of 500 students completes a survey.
• Statistic – A descriptive measure of a sample (mean, proportion, etc.).
• Because populations are often very large, and studying them in full is expensive and very time consuming, we use statistics to make
inferences about parameters, i.e., we can make an estimate, prediction, or decision about a population
based on sample data.
• Descriptive Statistics are methods of organizing, summarizing, and presenting data in a convenient
and informative way. These methods include: Graphical Techniques and Numerical Techniques.
• Inferential Statistics is a set of methods used to draw conclusions or inferences about characteristics of
populations based on data from a sample.
• Statistical Inference is the process of making an estimate, prediction, or decision about a population
based on a sample.
• Conclusions and estimates are not always going to be correct, so we build into statistical inference a
measure of reliability: the confidence level and the significance level.
• Confidence Level (1-α) is the proportion of times that an estimating procedure will be correct.
• Significance Level (α) measures how frequently the conclusion will be wrong in the long run.
• E.g. the poll is considered accurate within 3.4 percentage points, 19 times out of 20. Our confidence
level is 95% (19/20 = 0.95), while our significance level is 5%.
Chapter 2: Graphical and Tabular Descriptive Techniques
• Descriptive Statistics involves arranging, summarizing, and presenting a set of data in such a way
that useful information is produced (Graphical Techniques and Numerical Techniques).
• A Variable is some characteristic of a population or sample. E.g. Student grades ( X, Y, Z … )
• The Values of the variable are the possible values for a variable. E.g. Student marks (0 - 100)
• Data are the observed values of a variable. E.g. Student mark: 75, 50, 92, 82, 45, 66 …
• There are three types of data: Interval, Nominal, and Ordinal data.
• Interval Data are real numbers (Quantitative or numerical), such as heights, incomes, distance …
• The values of Nominal Data are categories (Qualitative or Categorical), e.g. sex: female = 1, male = 2
• Ordinal Data appear to be categorical in nature, but their values have an order (Ranking), e.g. poor =
1, fair = 2, good = 3, very good = 4, excellent = 5, etc.
• Hierarchy of data:
o Not categorical → Interval (real numbers; all calculations are valid; may be treated as ordinal or nominal)
o Categorical, ordered → Ordinal (ranked order; calculations based only on the ordering process;
may be treated as nominal but not as interval)
o Categorical, no order → Nominal (arbitrary numbers represent categories; calculations based only
on frequency; may not be treated as ordinal or interval)
• Single set of Nominal Data: Use Frequency Distribution and Relative Frequency Tables, Bar and Pie
Charts. E.g.
Subject       Frequency   Relative Frequency
Accounting    100         50%
Finance       40          20%
Management    60          30%
Total         200         100%
[Figure: bar chart of frequency by subject and pie chart of relative frequency by subject]
• Histogram is the most important graphical method for a single set of Interval Data; it not only
summarizes interval data but also helps explain probabilities.
o Number of class intervals = 1 + 3.3 log10(n)
o Class width = (Largest observation − Smallest observation) / Number of classes
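The two histogram rules above can be tried out with a short sketch. The function names and the data set (200 observations between 0 and 6000) are hypothetical, and rounding the class count to the nearest whole number is an assumption; the class width is taken as the range of the data divided by the number of classes.

```python
import math

def number_of_classes(n: int) -> int:
    """Rule of thumb 1 + 3.3 log10(n), rounded to the nearest whole class."""
    return round(1 + 3.3 * math.log10(n))

def class_width(largest: float, smallest: float, classes: int) -> float:
    """Range of the data divided by the number of classes."""
    return (largest - smallest) / classes

# Hypothetical data set: 200 observations ranging from 0 to 6000.
k = number_of_classes(200)
width = class_width(6000, 0, k)
print(k, round(width, 2))
```

In practice the computed width is usually rounded to a convenient value before the classes are drawn.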
Shapes of Histograms:
• Symmetry: A histogram is said to be symmetric if, when we draw a vertical line down the
center of the histogram, the two sides are identical in shape and size.
• Skewness: A skewed histogram is one with a long tail extending to either the right or the left.
• Modality: A unimodal histogram is one with a single peak, while a bimodal histogram is one with two peaks.
• Bell Shape: A special type of symmetric unimodal histogram.
• Ogive is a graph of a cumulative relative frequency distribution.
Class Limits   Frequency   Relative Frequency   Cumulative Relative Frequency
0-1500         60          60/200 = 0.30        0.30
1500-3000      35          35/200 = 0.175       0.30 + 0.175 = 0.475 (about 0.48)
3000-4500      75          75/200 = 0.375       0.475 + 0.375 = 0.85
4500-6000      30          30/200 = 0.15        0.85 + 0.15 = 1.00
Total          200
[Figure: ogive of cumulative relative frequency (0.00 to 1.00) against income (0 to 6000), passing through 0.30, 0.48, 0.85 and 1.00]
We estimate that about 48% of the students have income lower than $3000, and about 85% have income lower than $4500.
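As a minimal sketch, the cumulative relative frequencies that an ogive plots can be computed directly from the class frequencies of the income example. The variable names are illustrative only.

```python
from itertools import accumulate

# Upper class limits and frequencies from the income example.
upper_limits = [1500, 3000, 4500, 6000]
frequencies = [60, 35, 75, 30]

total = sum(frequencies)
relative = [f / total for f in frequencies]
cumulative = list(accumulate(relative))  # running total of relative frequencies

for limit, rel, cum in zip(upper_limits, relative, cumulative):
    print(f"below {limit}: relative {rel:.3f}, cumulative {cum:.3f}")
```

Plotting `cumulative` against `upper_limits` (starting from 0 at the lowest limit) would give the ogive itself.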
• Two Nominal Variables → Contingency Table and Bar Chart (two-dimensional)
• Two Interval Variables → Scatter diagram to explore the relationship between two interval variables.
The Independent variable (X) → Horizontal axis. The Dependent variable (Y) → Vertical axis.
• Linearity (Linear Relationship) → Positive, negative, weak, or non-linear relationship.
[Figure: scatter diagrams of house price vs. size, car price vs. number of miles, and income vs. height]
• Cross-Sectional Data → Observations measured at the same point in time.
• Time-Series Data → Observations measured at successive points in time → Line chart
[Figure: line chart of income tax (100 to 800) by year, 2000 to 2008]
Chapter 3: Art and Science of Graphical Presentations
• Graphical Excellence is achieved when
1. Large data sets are presented concisely and coherently. If the data set is small, a table or one or two sentences will do.
2. The message being presented by the graph is clear to readers, i.e. the chart replaces a thousand words.
3. The comparison of two or more variables is aided, i.e. a one-variable graph provides little information.
4. The substance of the data, not the form of the graph is important.
5. There is no distortion and deception of the data and findings.
• It’s important to be able to critically evaluate the graphically presented information because graphical
techniques create a visual impression which is easy to distort.
• Be wary of graphs without a scale on one axis; avoid being influenced by the caption.
○ Understand the information being presented. Focus on the numerical values the graph represents:
absolute values or relative values (e.g. percentages, deltas)?
• Are the horizontal or vertical axes distorted in any way?
○ Note the scales on both axes
○ Graphs with unmarked axes
○ The gaps along axes, the size of bar or chart
• Writing a Report:
1. State your objective or purpose of the statistical analysis.
2. Describe the Experiment – ensure the proper conduct of experiment.
3. Results - describe using words, tables, and charts – be honest; don’t mislead reader
4. Discussion of limitations of statistical techniques.
5. Discuss problems with the analysis, including any violations.
• Oral Presentation:
1. Know your audience: What kind of information they will be expecting? What is their level of
statistical knowledge?
2. Restrict your points to the main study objectives: Don’t go into the details of your analysis
3. Stay within time limits: Respect your audience
4. Use graphs: apply the graphical excellence ideas here to explain complex ideas
5. Provide handouts.
Chapter 4: Numerical Descriptive Techniques
Measures of Central Location
Notation: population size N, sample size n. Example data (n = 6): {1 5 7 8 2 7}
• Arithmetic Mean: Population $\mu = \frac{\sum x_i}{N}$; Sample $\bar{x} = \frac{\sum x_i}{n}$
E.g. $\bar{x} = \frac{1+5+7+8+2+7}{6} = 5$
- Seriously affected by extreme values (outliers), e.g. a billionaire in an average income.
- Valid: only interval data.
• Median: sort the observations in order; the median is the middle number, or the average of the middle two numbers.
E.g. sorted: 1 2 5 7 7 8, Median = (5 + 7) / 2 = 6
- Half of the observations are smaller than 6 and half are greater than 6.
- Valid: ordinal or interval data (preferred when there are extreme values).
• Mode: the observation that occurs most frequently. A data set may have one, two, or more modes.
E.g. 7 occurs twice, so Mode = 7
- Mainly useful for nominal data; more relevant for large data sets.
- If every value occurs only once, there is either no mode or every value is a mode.
• Geometric Mean: $R_g = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$. Use when the variable is a growth rate or rate of change.
E.g. 2-year investment: 1st year: 100% growth; 2nd year: 50% loss from the 1st year.
$R_g = \sqrt{(1+1)(1-0.5)} - 1 = 1 - 1 = 0$
• MAD (Mean Absolute Deviation): $MAD = \frac{\sum |x_i - \bar{x}|}{n} = \frac{4+0+2+3+3+2}{6} = 2.33$
• If a distribution is symmetrical, the mean, median and mode may coincide. If the distribution is not symmetrical, and is skewed to
the left or the right, the three measures differ.
Chebysheff's Theorem and the Empirical Rule:
k   Interval             Chebysheff's Theorem (1 − 1/k²)       Empirical Rule (bell shape)
1   (x̄ − 1s, x̄ + 1s)     At least 0%     (1 − 1/1² = 0)        Approximately 68%
2   (x̄ − 2s, x̄ + 2s)     At least 75%    (1 − 1/2² = 0.75)     Approximately 95%
3   (x̄ − 3s, x̄ + 3s)     At least 88.9%  (1 − 1/3² = 0.889)    Approximately 99.7%
E.g. Histogram is bell shaped, Mean = 10%, Standard Deviation = 8%. According to the Empirical Rule:
- Approximately 68% is between 2 and 18 …. Approximately 99.7% is between -14 and 34, i.e. (10 – 3x8) and (10 + 3x8)
E.g. Positively skewed histogram of salaries, Mean = $280, Standard Deviation = $30. According to Chebysheff's Theorem:
- At least 75% of the salaries lie between $220 and $340 …. At least 88.9% lie between $190 and $370
Measures of Variability
• Range = Largest observation – Smallest observation
E.g. for the 6 observations {1 5 7 8 2 7}: Range = 8 – 1 = 7
- Advantage: simplicity.
- Disadvantage: two completely different data sets may have the same range.
• Variance:
Population: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$
Shortcut for the Sample Variance: $s^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right]$
E.g.
i       x     x²
1       1     1
2       5     25
3       7     49
4       8     64
5       2     4
6       7     49
Total   30    192
$s^2 = \frac{1}{6-1}\left[192 - \frac{(30)^2}{6}\right] = \frac{1}{5}(192 - 150) = 8.4$
• Standard Deviation:
Population: $\sigma = \sqrt{\sigma^2}$
Sample: $s = \sqrt{s^2}$
E.g. $s = \sqrt{8.4} = 2.9$
When we compare two standard deviations with similar means, the smaller number means more
consistent, stable, etc., while the larger standard deviation shows higher risk and less consistency.
• Coefficient of Variation:
Population: $CV = \frac{\sigma}{\mu}$
Sample: $cv = \frac{s}{\bar{x}}$
E.g. Mean = 5, Standard deviation = 2.9 ==> cv = 2.9 / 5 = .58
This coefficient provides a proportionate measure of variation. Smaller → more consistency; larger → more inconsistency.
• If data are symmetric, with no serious outliers, use range and standard deviation.
• If comparing variation across two data sets, use coefficient of variation.
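The worked numbers for the six observations {1 5 7 8 2 7} can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative; the shortcut formula is the one given for the sample variance.

```python
import statistics

data = [1, 5, 7, 8, 2, 7]
n = len(data)

# Measures of central location.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of variability.
data_range = max(data) - min(data)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)   # shortcut sample variance
std_dev = variance ** 0.5
cv = std_dev / mean                              # coefficient of variation
mad = sum(abs(x - mean) for x in data) / n       # mean absolute deviation

print(mean, median, mode, data_range)
print(round(variance, 2), round(std_dev, 2), round(cv, 2), round(mad, 2))
```

`statistics.variance(data)` uses the definitional formula with n − 1 in the denominator and should agree with the shortcut result.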
• Measures of Relative Standing are designed to provide information about the position of particular
values relative to the entire data set. The Percentile (P) is the value for which P percent are less than
that value and (100-P)% are greater than that value. E.g. a score in the 60th percentile means 60% of
the other scores were below yours, and 40% of the scores were above yours.
• First (lower) decile = 10th percentile
• First (lower) quartile, Q1, = 25th percentile
• Second (middle) quartile,Q2,= 50th percentile = Median
• Third quartile, Q3, = 75th percentile
• Ninth (upper) decile = 90th percentile
Interquartile range = Q3 − Q1 (measures the spread of the middle 50% of observations)
Location of a Percentile: $L_P = (n+1)\frac{P}{100}$, where $L_P$ is the location of the Pth percentile
E.g. {0 1 2 3 4 5 6 7 8 9}, P = 75: $L_{75}$ = (10+1)(75/100) = 8.25. The third quartile or 75th percentile = 7 + (8-7)(0.25) = 7.25
Measures of Linear Relationship provide information as to the strength and direction of a linear
relationship between two variables (if one exists).
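The covariance and coefficient of correlation covered in this section can be checked numerically. The sketch below uses the hours-of-study (x) and grade (y) figures from this section's worked example and the shortcut formula for the sample covariance; the helper name `sample_cov` is hypothetical.

```python
hours = [0, 5, 8, 10, 15, 20]
grades = [40, 55, 65, 70, 85, 100]

def sample_cov(xs, ys):
    """Shortcut formula: (sum(xy) - sum(x) * sum(y) / n) / (n - 1)."""
    m = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / m) / (m - 1)

s_xy = sample_cov(hours, grades)
# The sample variance is the covariance of a variable with itself.
s_x = sample_cov(hours, hours) ** 0.5
s_y = sample_cov(grades, grades) ** 0.5
r = s_xy / (s_x * s_y)

print(round(s_xy, 4), round(s_x, 4), round(s_y, 4), round(r, 4))
```

A value of r this close to +1 reflects the nearly straight-line pattern in the example data.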
• Covariance:
Population: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$
Sample: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1}$
Shortcut for Sample Covariance: $s_{xy} = \frac{1}{n-1}\left[\sum x_i y_i - \frac{\sum x_i \sum y_i}{n}\right]$
• Coefficient of Correlation:
Population: $\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$, $-1 \le \rho \le +1$
Sample: $r = \frac{s_{xy}}{s_x s_y}$, $-1 \le r \le +1$
r = +1: Strong positive linear relationship
r = 0: No linear relationship
r = –1: Strong negative linear relationship
r = 0.56: Moderately strong positive
r = –0.1: Weak negative
E.g. x is hours of study and y is grade
n       x     y      x²     y²      xy
1       0     40     0      1600    0
2       5     55     25     3025    275
3       8     65     64     4225    520
4       10    70     100    4900    700
5       15    85     225    7225    1275
6       20    100    400    10000   2000
Total   58    415    814    30975   4770
$s_{xy} = \frac{1}{6-1}\left[4770 - \frac{58 \times 415}{6}\right] = 151.6667$
$s_x^2 = \frac{1}{n-1}\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \frac{1}{6-1}\left[814 - \frac{(58)^2}{6}\right] = 50.6667$
$s_y^2 = \frac{1}{n-1}\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right] = \frac{1}{6-1}\left[30975 - \frac{(415)^2}{6}\right] = 454.1667$
$s_x = \sqrt{s_x^2} = \sqrt{50.6667} = 7.1181$
$s_y = \sqrt{s_y^2} = \sqrt{454.1667} = 21.3112$
$r = \frac{s_{xy}}{s_x s_y} = \frac{151.6667}{7.1181 \times 21.3112} = 0.9998$
Chapter 6: Probability
• A Random Experiment is an action or process that leads to one of several possible outcomes.
The listed outcomes must be exhaustive (All possible outcomes included), and the outcomes must
be mutually exclusive (No two outcomes can occur at the same time.)
• Sample Space of a random experiment is a list of exhaustive and mutually exclusive outcomes.
The probability of any outcome is between 0 and 1, and the sum of the probabilities of all the
outcomes equals 1. $S = \{O_1, O_2, \ldots, O_k\}$, $0 \le P(O_i) \le 1$, $\sum_{i=1}^{k} P(O_i) = 1$
• There are three ways to assign a probability to an outcome:
o Classical Approach: counting approach used effectively in games of chance.
o Relative Frequency: assigning probabilities based on experimentation or historical data.
o Subjective Approach: assigning probabilities based on the assignor’s judgment.
• An Event is a collection or set of one or more simple events in a sample space. A Simple Event
is an individual outcome of a sample space.
• The Probability of an Event is the sum of the probabilities of the simple events that constitute
the event.
• The Complement of Event A is defined to be the event consisting of all sample points that are
“not in A”, denoted $A^c$. $P(A) + P(A^c) = 1$
• The Intersection of Events A and B (A and B) is the set of all sample points that are in both A
and B. The Probability of the intersection is called the Joint Probability: P(A and B).
• When two events are Mutually Exclusive (that is the two events cannot occur together), their
joint probability is 0.
• Marginal Probabilities are computed by adding across rows and down columns.
          B1     B2     P(Ai)
A1        .20    .35    .55
A2        .15    .30    .45
P(Bj)     .35    .65    1.00
The four inner entries are joint probabilities; the row and column totals are marginal probabilities.
• The Conditional Probability is used to determine how two events are related. We can determine
the probability of event A given the occurrence of event B: $P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$
• If two events A and B are Independent (the probability of A is not affected by the occurrence of B), then P(A|B) = P(A)
• The Union of events A and B is the event that occurs when either A or B or both occur.
• Complement Rule: $P(A^c) = 1 - P(A)$
• Conditional Probability: $P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$
• Multiplication Rule: P(A and B) = P(A|B) P(B) or P(A and B) = P(B|A) P(A)
• Multiplication Rule for Independent Events: P(A and B) = P(A) P(B)
• Addition Rule: P(A or B) = P(A) + P(B) – P(A and B)
• Addition Rule for Mutually Exclusive Events: P (A or B) = P(A) + P(B)
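The rules above can be applied to the joint probability table from this chapter. This is a minimal sketch that reads the table's rows as A1 and A2 and its columns as B1 and B2; the helper functions are hypothetical names, not part of any library.

```python
# Joint probabilities P(Ai and Bj) from the table.
joint = {
    ("A1", "B1"): 0.20, ("A1", "B2"): 0.35,
    ("A2", "B1"): 0.15, ("A2", "B2"): 0.30,
}

def marginal(event: str) -> float:
    """Add the joint probabilities of every cell that contains the event."""
    return sum(p for cell, p in joint.items() if event in cell)

def conditional(a: str, b: str) -> float:
    """Conditional probability: P(a | b) = P(a and b) / P(b)."""
    return joint[(a, b)] / marginal(b)

p_a1 = marginal("A1")
p_b1 = marginal("B1")
p_a1_given_b1 = conditional("A1", "B1")
# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_a1_or_b1 = p_a1 + p_b1 - joint[("A1", "B1")]
# Independence would require P(A1 | B1) = P(A1).
independent = abs(p_a1_given_b1 - p_a1) < 1e-9

print(round(p_a1, 2), round(p_b1, 2), round(p_a1_given_b1, 4), round(p_a1_or_b1, 2), independent)
```

Because P(A1 | B1) differs from P(A1), the two events in this table are not independent.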
• Probability Tree:
First stage      Second stage        Joint event    Joint probability
P(A) = 0.90      P(B|A) = 0.75       A and B        (0.90)(0.75) = 0.675
                 P(C|A) = 0.20       A and C        (0.90)(0.20) = 0.180
                 P(D|A) = 0.05       A and D        (0.90)(0.05) = 0.045
P(A^c) = 0.10    P(B|A^c) = 0.60     A^c and B      (0.10)(0.60) = 0.060
                 P(C|A^c) = 0.30     A^c and C      (0.10)(0.30) = 0.030
                 P(D|A^c) = 0.10     A^c and D      (0.10)(0.10) = 0.010
Chapter 7: Random Variables and Discrete Probability Distributions
• A Random Variable is a function or rule that assigns a number to each outcome of an
experiment. There are two types of random variables: Discrete Random Variable (Integer or
countable, 1,2,3…) and Continuous Random Variable (Real number or uncountable, time,
distance, etc. )
• A Probability Distribution is a table, formula, or graph that describes the values of a random
variable and the probability associated with these values. P(x) or P(X=x)
• Requirements for a Distribution of a Discrete Random Variable: $0 \le P(x) \le 1$ and $\sum P(x) = 1$
• Population Mean is the weighted average of all of its values. The weights are the probabilities.
This parameter is also called the Expected Value of X: $E(X) = \mu = \sum x P(x)$
• Population Variance: $V(X) = \sigma^2 = \sum (x - \mu)^2 P(x) = \sum x^2 P(x) - \mu^2$ (Shortcut)
• Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$
E.g.
# of Courses (x)   # of Students   P(x)    x P(x)   x²    x² P(x)
0                  5               0.05    0.00     0     0.00
1                  10              0.10    0.10     1     0.10
2                  20              0.20    0.40     4     0.80
3                  40              0.40    1.20     9     3.60
4                  15              0.15    0.60     16    2.40
5                  10              0.10    0.50     25    2.50
Total              100             1.00    2.80           9.40
$E(X) = \mu = \sum x P(x) = 2.8$
$V(X) = \sigma^2 = \sum x^2 P(x) - \mu^2 = 9.4 - (2.8)^2 = 1.56$
$\sigma = \sqrt{\sigma^2} = \sqrt{1.56} = 1.249$
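The expected value and variance in the number-of-courses example can be reproduced with a few lines. This is a sketch with illustrative variable names; the probabilities are the student counts divided by the total of 100.

```python
# Number of courses -> number of students, from the example table.
students = {0: 5, 1: 10, 2: 20, 3: 40, 4: 15, 5: 10}

total = sum(students.values())
pmf = {x: count / total for x, count in students.items()}

mean = sum(x * p for x, p in pmf.items())                        # E(X)
variance = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2   # shortcut formula
std_dev = variance ** 0.5

print(round(mean, 2), round(variance, 2), round(std_dev, 3))
```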
• Laws of Expected Value:
− E(c) = c
− E(X + c) = E(X) + c
− E(cX) = c E(X)
• Laws of Variance:
− V(c) = 0
− V(X + c) = V(X)
− V(cX) = c² V(X)
E.g. Monthly sales have a mean of $25,000 and a standard deviation of $4,000; profits are 30% of sales
less $6,000.
Mean of Monthly Profits:
E(Profit) = E[.30 (Sales) – 6000] = .30 E(Sales) – 6000 = .30 (25000) – 6000 = $1,500
Variance of Monthly Profits:
V(Profit) = V[.30 (Sales) – 6000] = (.30)² V(Sales) = 0.09 × (4000)² = 1,440,000
Standard Deviation of Monthly Profits: σ = √1,440,000 = $1,200
• Binomial Distribution is the probability distribution that results from doing a binomial
experiment. Binomial Experiments have the following properties:
1. A fixed number of trials, represented by n.
2. Each trial has two possible outcomes, a success and a failure.
3. P(success) = p and P(failure) = 1 – p
4. The trials are independent, which means that the outcome of one trial does not affect the
outcomes of any other trials.
• Binomial Probability Distribution: $P(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$ for x = 0, 1, 2, …, n
• The Binomial Table gives cumulative probabilities for P(X ≤ k)
P(X = k) = P(X ≤ k) – P(X ≤ [k – 1])
E.g. P(X = 5) = P(X ≤ 5) – P(X ≤ 4)
P(X < 5) = P(X ≤ 4)
P( 3 ≤ X ≤ 6) = P(X ≤ 6) – P(X ≤ 2)
P(X > 5) = P(X ≥ 6) = 1 – P(X ≤ 5)
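The cumulative-table identities above can be checked with a small sketch that builds the cumulative probabilities from the binomial formula. The values n = 10 and p = 0.5 are hypothetical, and the function names are illustrative.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = n! / (x!(n - x)!) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k), the quantity a cumulative binomial table reports."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))

n, p = 10, 0.5  # hypothetical experiment

p_eq_5 = binomial_cdf(5, n, p) - binomial_cdf(4, n, p)     # P(X = 5)
p_lt_5 = binomial_cdf(4, n, p)                             # P(X < 5)
p_3_to_6 = binomial_cdf(6, n, p) - binomial_cdf(2, n, p)   # P(3 <= X <= 6)
p_gt_5 = 1 - binomial_cdf(5, n, p)                         # P(X > 5)

print(round(p_eq_5, 4), round(p_lt_5, 4), round(p_3_to_6, 4), round(p_gt_5, 4))
```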
• Binomial Random Variable:
o Mean or Expected Value: μ = np
o Variance: $\sigma^2 = np(1-p)$
o Standard Deviation: $\sigma = \sqrt{np(1-p)}$
Chapter 8: Continuous Probability Distribution
• A Continuous Random Variable is one that can assume an uncountable number of values. So,
we cannot list the possible values because there are infinitely many of them, and the probability of each individual
value is 0, i.e., Point Probability is Zero, e.g. P(X = 5) = 0
• A function f(x) is called a Probability Density Function (over the range a ≤ x ≤ b) if it meets the
following requirements:
o f(x) ≥ 0 for all x between a and b
o The Total area under the curve between a and b is 1
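• The Uniform Probability Distribution (the Rectangular Probability Distribution) is described by
the function: $f(x) = \frac{1}{b-a}$ where $a \le x \le b$
$P(x_1 \le X \le x_2) = \text{width} \times \text{height} = (x_2 - x_1) \times \frac{1}{b-a}$
[Figure: rectangle of height 1/(b − a) over the interval from a to b, with the area between x1 and x2 shaded]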
• The Normal Distribution is the most important of all probability distributions because of its
crucial role in statistical inference. It is bell shaped and symmetrical around the mean.
• The Probability density function of a Normal Random Variable is:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, $-\infty < x < \infty$, where e = 2.71828 and π = 3.14159
The Normal Distribution is fully defined by its standard deviation (σ) and mean (μ).
Unlike the range of the uniform distribution (a ≤ x ≤ b), the Normal Distribution ranges from minus
infinity to plus infinity (−∞ < x < ∞)
• A normal distribution whose mean is zero (μ=0) and standard deviation is one (σ=1) is called the
Standard Normal Distribution. Any normal distribution can be converted to a standard normal
distribution with simple algebra which makes calculations much easier.
• For Normal Distributions with the same standard deviation, increasing the mean shifts the curve to the
right.
For Normal Distributions with the same mean, increasing the standard deviation flattens the curve.
• We can calculate Normal Probabilities by converting a normal random variable to a Standard
Normal Random Variable: $Z = \frac{X - \mu}{\sigma}$
E.g. μ = 50, σ = 10
$P(45 < X < 60) = P\left(\frac{45-50}{10} < \frac{X-\mu}{\sigma} < \frac{60-50}{10}\right) = P(-.5 < Z < 1)$
P(−.5 < Z < 1) = .1915 + .3413 = .5328
P(Z < −.5) = P(Z > .5) = 0.5 − .1915 = .3085
P(Z > 1) = 0.5 − .3413 = .1587
P(55 < X < 60) = P(.5 < Z < 1) = .3413 − .1915 = .1498
[Figure: normal curves with μ = 50 and σ = 10, shading the areas between X = 45, 50, 55 and 60 (Z = −.5, 0, .5 and 1)]
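The table look-ups in this example can be cross-checked with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is only a sketch of the same calculations with illustrative variable names.

```python
from statistics import NormalDist

x = NormalDist(mu=50, sigma=10)   # the example's normal random variable
z = NormalDist()                  # standard normal: mean 0, standard deviation 1

p_45_60 = x.cdf(60) - x.cdf(45)   # P(45 < X < 60)
p_z = z.cdf(1) - z.cdf(-0.5)      # the same probability after standardizing
p_below = z.cdf(-0.5)             # P(Z < -.5)
p_above_1 = 1 - z.cdf(1)          # P(Z > 1)
p_55_60 = x.cdf(60) - x.cdf(55)   # P(55 < X < 60)

for value in (p_45_60, p_z, p_below, p_above_1, p_55_60):
    print(round(value, 4))
```

Small differences in the fourth decimal place relative to the printed table (for example in P(55 < X < 60)) come from the table's own rounding.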
• Finding Value of Z: often we’re asked to find some value of Z for
a given probability, i.e., given an area (A) under the curve,
what is the corresponding value of Z ($Z_A$)?
We can find $Z_A$ by “Reverse look up” on the table. E.g. for A = .025, look up the area .5 − .025 = .475 and we find:
$Z_{.025}$ = 1.96    P(-1.96 < Z < 1.96) = .95
$Z_{.05}$ = 1.645    P(-1.645 < Z < 1.645) = .90
$Z_{.01}$ = 2.33     P(-2.33 < Z < 2.33) = .98
[Figure: standard normal curve with right-tail area .025 beyond $Z_A$ and area .475 between 0 and $Z_A$]
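The reverse look-up can also be done with the inverse cumulative distribution function. The sketch below assumes Python 3.8+ for `statistics.NormalDist`; the helper name `z_value` is hypothetical.

```python
from statistics import NormalDist

def z_value(area_in_tail: float) -> float:
    """Return z_A such that P(Z > z_A) = A for a standard normal Z."""
    return NormalDist().inv_cdf(1 - area_in_tail)

for a in (0.025, 0.05, 0.01):
    z_a = z_value(a)
    # The central area between -z_A and +z_A is 1 - 2A.
    print(a, round(z_a, 3), round(1 - 2 * a, 2))
```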
Chapter 5: Data Collection and Sampling
Three Methods of Collecting Data
• Direct Observation – measurements representing a variable of interest are observed and
recorded, without controlling any factor that might influence their values. It is inexpensive, but it is
difficult to produce useful information because results can be affected by other hidden factors. E.g.
Aspirin → Heart Attacks (aspirin takers may also be more health conscious)
• Experiments – measurements representing a variable of interest are observed and recorded,
while controlling factors that might influence their values. Usually more expensive than direct
observation. E.g. Regular Aspirin and Not Regular Aspirin → Heart Attacks.
• Surveys – solicit or ask for information from people. E.g. Gallup polls and Harris Survey, pre-
election polls. Majority of surveys are conducted for private use. E.g. Marketing surveys.
• Key Survey parameter is the Response Rate (i.e. the proportion of all people selected who
complete the survey)
• Surveys may be administered in a variety of ways
○ Personal Interview – High expected response rate, less misunderstanding of questions,
but expensive and prone to bias if the interviewer is not trained properly.
○ Telephone Interview – Less expensive, but lower response rate
○ Self Administered Questionnaire – By mail, inexpensive, but lower response rate and
high number of incorrect responses (misunderstanding of questions)
• Key Questionnaire Design Principles
o Keep the questionnaire as short as possible.
o Ask short, simple, and clearly worded questions.
o Demographic Questions: 2 schools of thought
Keller recommends: Start with demographic questions to help respondents get
started comfortably.
Other experts recommend that these questions be asked at the end.
o Use dichotomous (yes|no) & multiple choice questions
o Use open-ended questions cautiously
o Avoid using leading-questions
o Pretest a questionnaire on a small number of people.
o Think about the way you intend to use the collected data when preparing the
questionnaire.
• Statistical inference permits us to draw conclusions about a population based on a sample.
• Sampling – selecting a sub-set of a whole population is often done for reasons of:
o cost (it’s less expensive to sample 1,000 TV viewers than 100 million TV viewers)
o practicality (e.g. performing a crash test on every automobile produced is impractical).
• In any case, the sampled population and the target population should be similar to one
another.
• A sampling plan is just a method or procedure for specifying how a sample will be taken from a
population. Our focus will be on these three methods:
o Simple Random Sampling (SRS) is a sample selected in such a way that every possible
sample of the same size is equally likely to be chosen. It can be drawn using a random numbers table or a software package.
Example: drawing three names from a hat containing all the names of the students
in the class. Any group of three names is as likely to be picked as any other
group of three names.
o Stratified Random Sampling: a stratified random sample is obtained by separating the
population into mutually exclusive sets, or strata, and then drawing simple random samples
from each stratum.
Example: Male students = 2000, female students = 4000.
Sample of 300 = 100 male (33%) + 200 female (66%)
o Cluster Sampling is a simple random sample of groups or clusters of elements (vs. a
simple random sample of individual objects).
This method is useful when it is difficult or costly to develop a complete list of
the population members or when the population elements are widely dispersed
geographically.
Two stage process:
• Researcher draws a random sample of clusters
• Researcher draws a random sample of elements within each selected
cluster
Can involve multiple stages, with clusters within clusters
Cluster sampling may increase sampling error due to similarities among cluster
members.
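A minimal sketch of the first two sampling plans, using Python's `random` module. The function names and student IDs are hypothetical, the strata sizes are those of the male/female example, and proportional allocation with simple rounding is an assumption (for other strata sizes the rounded counts may not add up exactly to the requested sample size).

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every possible sample of size n is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def proportional_allocation(strata_sizes, sample_size):
    """Split the sample across strata in proportion to stratum size (rounded)."""
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

def stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample from each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}

# Hypothetical student IDs for the 2000 male / 4000 female example.
strata = {
    "male": [f"M{i}" for i in range(2000)],
    "female": [f"F{i}" for i in range(4000)],
}
sample = stratified_sample(strata, 300, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
```

Cluster sampling would follow the same pattern, except that the first random draw selects whole groups rather than individuals.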
• The larger the sample size is, the more accurate we can expect the sample estimates to be.
• Two major types of error can arise when a sample of observations is taken from a population:
• Sampling error refers to:
o differences between the sample and the population that exist only because of the
observations that happened to be selected for the sample
o the differences in results for different samples (of the same size) are due to sampling error
o Increasing the sample size will reduce this type of error.
o Example:
Two samples of size 10 of 1,000 households
If we happened to get the highest income level data points in our first sample and
all the lowest income levels in the second, this delta is due to sampling error.
• Nonsampling errors are more serious
o due to mistakes made in the acquisition of data
o due to the sample observations being selected improperly.
o Increasing the sample size will not reduce this type of error.
• Three types of nonsampling errors:
o Errors in data acquisition arise from the recording of incorrect responses due to:
incorrect measurements being taken because of faulty equipment
mistakes made during transcription from primary sources
inaccurate recording of data due to misinterpretation of terms
inaccurate responses to questions concerning sensitive issues
o Nonresponse errors refer to error (or bias) introduced when responses are not obtained from some members of the sample, i.e. the sample
observations that are collected may not be representative of the target population
Response Rate (i.e. the proportion of all people selected who complete the survey)
is a key survey parameter and helps in understanding the validity of the
survey and the sources of nonresponse error.
o Selection bias occurs when the sampling plan is such that some members of the target
population cannot possibly be selected for inclusion in the sample
o Example: the 1936 Literary Digest Poll predicted that the Republican candidate, Alfred Landon,
would defeat the Democratic incumbent, Franklin D. Roosevelt, by a 3 to 2 margin. Outcome:
Roosevelt defeated Landon with the support of 62% of the electorate.
o 2 mistakes made in the sampling procedure:
Ten million sample ballots were sent out to prospective voters. Most of the names were
taken from the Digest’s subscription list and from telephone directories
(subscribers to the magazine & people with telephones tended to be wealthier than
average, and such people then, as today, tended to vote Republican)
Only 2.3 million ballots were returned (resulting in a self-selected sample).
Chapter 9: Sampling Distribution
• A sampling distribution is created by, as the name suggests, sampling.
• The method we will employ relies on the rules of probability and the laws of expected value and
variance to derive the sampling distribution.
• Sampling Distribution of the Sample Mean
• $E(\bar{X}) = \mu_{\bar{x}} = \mu$, where $\mu = \sum x P(x)$
• $V(\bar{X}) = \sigma_{\bar{x}}^2 = \sigma^2/n$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, where $\sigma^2 = \sum x^2 P(x) - \mu^2$
• If X is normal, $\bar{X}$ is normal. If X is nonnormal, $\bar{X}$ is approximately normal for sufficiently large
sample sizes.
o Note: the definition of “sufficiently large” depends on the extent of nonnormality of X
(e.g. heavily skewed; multimodal)
o If the population is normal, then $\bar{X}$ is normally distributed for all values of n
o If the population is non-normal, then $\bar{X}$ is approximately normal only for larger values
of n
o In many practical situations, a sample size of 30 may be sufficiently large to allow us to
use the normal distribution as an approximation for the sampling distribution of $\bar{X}$.
• The standard deviation of the sampling distribution is called the standard error: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
• Remember that μ and $\sigma^2$ are the parameters of the population of X
• To create the sampling distribution of $\bar{X}$, we repeatedly drew samples of size 2 from the
population and calculated $\bar{X}$ for each sample. Thus, we treat $\bar{X}$ as a brand new random variable, with
its own distribution, mean, and variance
• Central Limit Theorem: The sampling distribution of the mean of a random sample drawn from
any population is approximately normal for a sufficiently large sample size. The larger the sample
size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.
• Statisticians have shown that the mean of the sampling distribution is always equal to the mean
of the population and that the standard error is equal to $\sigma/\sqrt{n}$ for infinitely large populations.
However, if the population is finite the standard error is $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
where N is the population size and $\sqrt{\frac{N-n}{N-1}}$ is called the finite population correction factor. If the
population size is large relative to the sample size the finite population correction factor is close to 1
and can be ignored.
• As a rule of thumb we will treat any population that is at least 20 times larger than the sample
size as large. In practice most applications involve populations that qualify as large. As a
consequence the finite population correction factor is usually omitted.
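To see how little the finite population correction factor matters for large populations, here is a minimal sketch. The function name and the values σ = 10 and n = 25 are hypothetical, chosen only for illustration.

```python
import math
from typing import Optional

def standard_error(sigma: float, n: int, population_size: Optional[int] = None) -> float:
    """Standard error of the sample mean, with an optional finite population correction."""
    se = sigma / math.sqrt(n)
    if population_size is not None:
        # Finite population correction factor: sqrt((N - n) / (N - 1)).
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

sigma, n = 10.0, 25  # hypothetical values

print(standard_error(sigma, n))                 # infinitely large population
print(round(standard_error(sigma, n, 100), 4))  # N only 4 times the sample size
print(round(standard_error(sigma, n, 10_000), 4))  # N far more than 20 times the sample size
```

With N = 10,000 the corrected standard error is almost identical to σ/√n, which is why the factor is usually omitted.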
• The sampling distribution can be used to make inferences about population parameters. In order
to do so, the sample mean can be standardized to the standard normal distribution using the
following formulation: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
• Another way to state the Probability: $P\left(\mu - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
Sampling Distribution of a Proportion
• The estimator of a population proportion of successes is the sample proportion. That is,
we count the number of successes in a sample and compute: $\hat{P} = \frac{X}{n}$
• Using the laws of expected value and variance, we can determine the mean, variance, and
standard deviation of $\hat{P}$. (The standard deviation of $\hat{P}$ is called the standard error of the proportion.)
○ $E(\hat{P}) = \mu_{\hat{p}} = p$
○ $V(\hat{P}) = \sigma_{\hat{p}}^2 = \frac{p(1-p)}{n}$
○ $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$
• Sample proportions can be standardized to a standard normal distribution using this formulation:
o $Z = \frac{\hat{P} - p}{\sqrt{p(1-p)/n}}$
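A short sketch of standardizing a sample proportion. The numbers (p = 0.5, n = 100, and the question of how likely a sample proportion above 0.55 is) are hypothetical, and the function name is illustrative.

```python
import math
from statistics import NormalDist

def proportion_z(p_hat: float, p: float, n: int) -> float:
    """Z = (p_hat - p) / sqrt(p(1 - p) / n)."""
    return (p_hat - p) / math.sqrt(p * (1 - p) / n)

# Hypothetical question: if p = 0.5 and n = 100, how likely is a sample proportion above 0.55?
z = proportion_z(0.55, 0.5, 100)
prob = 1 - NormalDist().cdf(z)
print(round(z, 2), round(prob, 4))
```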
• The final sampling distribution introduced is that of the difference between two sample means.
• This requires independent random samples to be drawn from each of two normal populations.
• If this condition is met, then the sampling distribution of the difference between the two sample
means, i.e. $\bar{X}_1 - \bar{X}_2$, will be normally distributed.
• Note: If the two populations are not both normally distributed, but the sample sizes are “large”
(>30), the distribution of $\bar{X}_1 - \bar{X}_2$ is approximately normal.
• The expected value and variance of the sampling distribution of $\bar{X}_1 - \bar{X}_2$ are given by:
• Mean: $E(\bar{X}_1 - \bar{X}_2) = \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
• Variance: $V(\bar{X}_1 - \bar{X}_2) = \sigma_{\bar{x}_1 - \bar{x}_2}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
• Standard deviation: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ (also called the standard error of the difference
between two means)
• We can compute Z (standard normal random variable) in this way: $Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Chapter 10: Introduction to Estimation
• Statistical inference is the process by which we acquire information and draw conclusions about
populations from samples. There are two general procedures for making inferences about
a population: Estimation and Hypothesis Testing.
• The objective of Estimation is to determine the approximate value of a population parameter on
the basis of a sample statistic. E.g., the sample mean ($\bar{X}$) is employed to estimate the population mean
(μ).
• A Point Estimator draws inferences about a population by estimating the value of an unknown
parameter using a single value or point.
Drawbacks: (1) The point probabilities in continuous distributions were virtually zero.
(2) We’d expect that the point estimator gets closer to the parameter value with an increased sample
size, but (3) point estimators don’t reflect the effects of larger sample sizes.
• An Interval Estimator draws inferences about a population by estimating the value of an
unknown parameter using an interval.
For example, suppose we want to estimate the mean summer income of a class of
business students. For n = 25 students, the sample mean (the point estimate) is calculated to be $400/week. The interval
estimate is: the mean income is between $380 and $420/week.
• Desirable qualities in Estimators include:
• Unbiasedness – An unbiased estimator of a population parameter is an estimator whose
expected value is equal to that parameter. E.g. the sample mean $\bar{X}$ is an unbiased estimator of the
population mean μ, since $E(\bar{X}) = \mu$
• Consistency – An unbiased estimator is said to be consistent if the difference between the
estimator and the parameter grows smaller as the sample size grows larger. E.g. $\bar{X}$ is a consistent
estimator of μ because $V(\bar{X}) = \sigma^2/n$. That is, as n grows larger, the variance of $\bar{X}$ grows
smaller.
• Relative Efficiency – If there are two unbiased estimators of a parameter, the one whose
variance is smaller is said to be relatively efficient. Both the sample median and sample mean are
unbiased estimators of the population mean. However, statisticians have established that the
sample median has a greater variance than the sample mean, so we choose $\bar{X}$ since it is relatively
efficient when compared to the sample median.
• We can calculate an interval estimator from a sampling distribution by:
o Drawing a sample of size n from the population
o Calculating its mean $\bar{X}$. By the central limit theorem, we know that $\bar{X}$ is normally
(or approximately normally) distributed, so $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ will have a standard normal (or approximately
standard normal) distribution.
• Confidence Interval Estimator of μ: $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, where the lower confidence limit is LCL = $\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ and the upper confidence limit is UCL = $\bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
o A larger confidence level produces a wider confidence interval.
o Larger values of the variance $\sigma^2$ produce wider confidence intervals.
o Increasing the sample size decreases the width of the confidence interval while the
confidence level can remain unchanged. Note: this also increases the cost of obtaining
additional data.
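The three effects listed above can be seen in a minimal sketch of the confidence interval estimator. It borrows the sample mean of 400 and n = 25 from the summer income example; the population standard deviation σ = 50 is an assumption made only for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def confidence_interval(x_bar: float, sigma: float, n: int, confidence: float = 0.95):
    """Return (LCL, UCL) for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# x_bar = 400 and n = 25 come from the income example; sigma = 50 is assumed.
lcl, ucl = confidence_interval(400, 50, 25)
print(round(lcl, 2), round(ucl, 2))

# Higher confidence widens the interval; a larger sample narrows it.
wide = confidence_interval(400, 50, 25, confidence=0.99)
narrow = confidence_interval(400, 50, 100)
print(round(wide[1] - wide[0], 2), round(narrow[1] - narrow[0], 2))
```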
• We can control the width of the interval by determining the sample size necessary to produce
narrow intervals. Suppose we want to estimate the mean demand “to within 5 units” (W); i.e. we
want the interval estimate to be $\bar{X} \pm 5$. Since the interval estimator is $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, it follows that
$Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 5$. Solving for n gives:
• Sample Size to Estimate a Mean: $n = \left(\frac{Z_{\alpha/2}\,\sigma}{W}\right)^2$
Chapter 11: Introduction to Hypothesis Testing
• Five critical concepts in hypothesis testing:
1. There are two hypotheses, the null hypothesis ($H_0$) and the
alternative hypothesis ($H_1$).
2. The testing procedure begins with the assumption that the null
hypothesis is true.
3. The goal is to determine whether there is enough evidence to infer
that the alternative hypothesis is true.
4. There are two possible decisions:
Reject $H_0$ – Conclude that there is enough evidence to support the alternative
hypothesis.
Do not reject $H_0$ – Conclude that there is not enough evidence to support the
alternative hypothesis.
5. Two possible errors can be made:
Type I error: Reject a true null hypothesis. P(Type I error) = α
Type II error: Do not reject a false null hypothesis. P(Type II error) = β
                 H0 is true                            H0 is false
Reject H0        Type I Error, P(Type I error) = α     Correct Decision
Do not Reje
