ADMS 2320 Chapters Summary (Final)
Department: Administrative Studies
Course: ADMS 2320
Professor: Ying Kong
Semester: Fall

Chapter 1: What is Statistics?

• Statistics is a way to get information from data: Data → Statistics → Information.
• A Population is the group of all items of interest to a statistics practitioner, e.g. all York students.
• A Parameter is a descriptive measure of a population (mean, proportion, etc.).
• A Sample is a set of data drawn from the population, e.g. a sample of 500 students who complete a survey.
• A Statistic is a descriptive measure of a sample (mean, proportion, etc.).
• Because examining an entire population is usually too large, expensive, and time consuming, we use statistics to make inferences about parameters; that is, we make an estimate, prediction, or decision about a population based on sample data.
• Descriptive Statistics are methods of organizing, summarizing, and presenting data in a convenient and informative way. These methods include graphical techniques and numerical techniques.
• Inferential Statistics is a set of methods used to draw conclusions or inferences about characteristics of populations based on data from a sample.
• Statistical Inference is the process of making an estimate, prediction, or decision about a population based on a sample.
• Conclusions and estimates are not always correct, so we build into statistical inference a measure of reliability: the confidence level and the significance level.
• The Confidence Level (1 − α) is the proportion of times that an estimating procedure will be correct.
• The Significance Level (α) measures how frequently the conclusion will be wrong in the long run.
• E.g. "the poll is considered accurate within 3.4 percentage points, 19 times out of 20": the confidence level is 95% (19/20 = 0.95), while the significance level is 5%.

Chapter 2: Graphical and Tabular Descriptive Techniques

• Descriptive Statistics involves arranging, summarizing, and presenting a set of data in such a way that useful information is produced (graphical techniques and numerical techniques).
• A Variable is some characteristic of a population or sample, e.g. student grades (X, Y, Z, …).
• The Values of a variable are its possible values, e.g. student marks (0-100).
• Data are the observed values of a variable, e.g. student marks: 75, 50, 92, 82, 45, 66, …
• There are three types of data: Interval, Nominal, and Ordinal.
• Interval Data are real numbers (quantitative or numerical), such as heights, incomes, distances.
• The values of Nominal Data are categories (qualitative or categorical), e.g. sex: female = 1, male = 2.
• Ordinal Data appear categorical in nature, but their values have an order (a ranking), e.g. poor = 1, fair = 2, good = 3, very good = 4, excellent = 5.
• Classification of data:
  o Not categorical → Interval: real numbers; all calculations valid; may be treated as ordinal or nominal.
  o Categorical with order → Ordinal: ranked order; calculations based only on the ordering; may be treated as nominal but not as interval.
  o Categorical without order → Nominal: arbitrary numbers represent categories; calculations based only on frequencies; may not be treated as ordinal or interval.
• Single set of Nominal Data: use frequency and relative frequency tables, bar charts, and pie charts. E.g.:

  Subject      Frequency   Relative Frequency
  Accounting        100                  50%
  Finance            40                  20%
  Management         60                  30%
  Total             200                 100%

• The Histogram is the most important graphical method for a single set of Interval Data; it not only summarizes interval data but also helps explain probabilities.
  o Number of class intervals = 1 + 3.3 log(n)
  o Class width = (Largest observation − Smallest observation) / Number of classes
• Shapes of histograms:
  o Symmetry: a histogram is symmetric if, when we draw a vertical line down its center, the two sides are identical in shape and size.
  o Skewness: a skewed histogram is one with a long tail extending to either the right or the left.
  o Modality: a unimodal histogram has a single peak, while a bimodal histogram has two peaks.
  o Bell Shape: a special case of a symmetric, unimodal histogram.
• An Ogive is a graph of a cumulative relative frequency distribution. E.g. (student incomes):

  Class Limits   Frequency   Relative Frequency   Cumulative Relative Frequency
  0-1500                60      60/200 = 0.30                  0.30
  1500-3000             35      35/200 = 0.18       0.30 + 0.18 = 0.48
  3000-4500             75      75/200 = 0.38       0.48 + 0.38 = 0.85
  4500-6000             30      30/200 = 0.15       0.85 + 0.15 = 1.00
  Total                200

  We estimate that about 48% of the students have income lower than $3000, and about 85% have income lower than $4500.
• Two Nominal Variables → Contingency Table and (two-dimensional) Bar Chart.
• Two Interval Variables → Scatter Diagram to explore the relationship between the two variables. The independent variable (X) goes on the horizontal axis; the dependent variable (Y) on the vertical axis. E.g. house price vs. size, car price vs. number of miles, income vs. height.
• Linearity (linear relationship) → positive, negative, weak, or non-linear relationship.
• Cross-Sectional Data → observations measured at the same point in time.
• Time-Series Data → observations measured at successive points in time → Line Chart, e.g. income tax over the years 2000-2008.

Chapter 3: Art and Science of Graphical Presentations

• Graphical Excellence is achieved when:
  1. Large data sets are presented concisely and coherently. (If the data set is small, use a table; for 1 or 2 values, a sentence.)
  2. The message being presented by the graph is clear to readers, i.e. a chart to replace 1000 words.
  3. The comparison of two or more variables is aided. (A one-variable graph conveys little information.)
  4. The substance of the data, not the form of the graph, is what matters.
  5. There is no distortion or deception of the data and findings.
• It is important to be able to critically evaluate graphically presented information, because graphical techniques create a visual impression which is easy to distort.
• Be wary of graphs without a scale on one axis; avoid being influenced by the caption.
  o Understand the information being presented. Focus on the numerical values the graph represents: absolute values or relative values (e.g. percentages, deltas)?
  o Are the horizontal or vertical axes distorted in any way? Note the scales on both axes, graphs with unmarked axes, the gaps along axes, and the size of bars or charts.
• Writing a Report:
  1. State your objective or purpose of the statistical analysis.
  2. Describe the experiment; ensure the proper conduct of the experiment.
  3. Results: describe using words, tables, and charts; be honest and don't mislead the reader.
  4. Discuss the limitations of the statistical techniques.
  5. Discuss problems with the analysis, including violations of assumptions.
• Oral Presentation:
  1. Know your audience: what kind of information will they be expecting, and what is their level of statistical knowledge?
  2. Restrict your points to the main study objectives; don't go into the details of your analysis.
  3. Stay within time limits; respect your audience.
  4. Use graphs; apply the graphical excellence ideas to explain complex ideas.
  5. Provide handouts.

Chapter 4: Numerical Descriptive Techniques

Measures of Central Location (example data set, N = 6: {1 5 7 8 2 7}):
• Arithmetic Mean: population μ = Σx / N; sample x̄ = Σx / n. E.g. (1+5+7+8+2+7)/6 = 5. Seriously affected by extreme values (outliers): one billionaire distorts average income. Valid only for interval data.
• Median: sort the data in order (1 2 5 7 7 8); the median is the middle number, or the average of the middle two numbers: (5 + 7)/2 = 6. Half of the observations are smaller than 6 and half are greater. Valid for ordinal or interval data (robust to extreme values).
• Mode: the observation that occurs most frequently; here 7 occurs twice, so Mode = 7. Mainly useful for nominal data, and more relevant for large data sets. There may be one, two, or more modes; if all values occur only once, there is no mode.
• Geometric Mean: (1 + Rg)ⁿ = (1 + R1)(1 + R2)…(1 + Rn). Use when the variable is a growth rate or rate of change. E.g. a 2-year investment with 100% growth in year 1 and a 50% loss in year 2: Rg = √((1 + 1)(1 − 0.5)) − 1 = √1 − 1 = 0.
• Mean Absolute Deviation: MAD = Σ|x − x̄| / n = (4 + 0 + 2 + 3 + 3 + 2)/6 = 2.33.
• If a distribution is symmetrical, the mean, median, and mode may coincide. If the distribution is non-symmetrical, skewed to the left or the right, the three measures differ.

Chebysheff's Theorem (at least 1 − 1/k²) vs. the Empirical Rule (bell-shaped distributions):

  Interval          Chebysheff's Theorem                Empirical Rule (bell shape)
  x̄ − 1s, x̄ + 1s   At least 0%    (1 − 1/1² = 0)       Approximately 68%
  x̄ − 2s, x̄ + 2s   At least 75%   (1 − 1/2² = 0.75)    Approximately 95%
  x̄ − 3s, x̄ + 3s   At least 88.9% (1 − 1/3² = 0.889)   Approximately 99.7%

• E.g. a bell-shaped histogram with mean = 10% and standard deviation = 8%. By the Empirical Rule, approximately 68% of values lie between 2 and 18, …, and approximately 99.7% lie between −14 and 34, i.e. between (10 − 3×8) and (10 + 3×8).
• E.g. a positively skewed histogram of salaries with mean = $280 and standard deviation = $30. By Chebysheff's Theorem, at least 75% of the salaries lie between $220 and $340, …, and at least 88.9% lie between $190 and $370.

Measures of Variability (same data set {1 5 7 8 2 7}):
• Range = Largest observation − Smallest observation = 8 − 1 = 7. Its advantage is simplicity; its disadvantage is that two completely different data sets may have the same range.
• Variance: population σ² = Σ(xᵢ − μ)² / N; sample s² = Σ(xᵢ − x̄)² / (n − 1).
  Shortcut for the sample variance: s² = [Σxᵢ² − (Σxᵢ)²/n] / (n − 1).
  E.g. Σxᵢ = 30 and Σxᵢ² = 192, so s² = [192 − 30²/6] / (6 − 1) = (192 − 150)/5 = 8.4.
• Standard Deviation: population σ = √σ²; sample s = √s². E.g. s = √8.4 = 2.9. When we compare two standard deviations with similar means, a smaller number means more consistency and stability, while a larger standard deviation shows higher risk and less consistency.
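The shortcut calculation above can be checked with a short Python sketch, using the same data set {1, 5, 7, 8, 2, 7} (the `statistics` module is in the standard library):

```python
from statistics import stdev, variance

data = [1, 5, 7, 8, 2, 7]
n = len(data)

# Shortcut: s^2 = [sum of x^2 - (sum of x)^2 / n] / (n - 1)
s2 = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

print(s2)                     # 8.4
print(variance(data))         # 8.4 -- the library's sample variance agrees
print(round(stdev(data), 1))  # 2.9 -- the sample standard deviation
```

Note that `variance` and `stdev` are the sample versions (dividing by n − 1); the population versions are `pvariance` and `pstdev`.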
• Coefficient of Variation: population CV = σ/μ; sample cv = s/x̄. E.g. with mean = 5 and standard deviation = 2.9, cv = 2.9/5 = 0.58. This coefficient provides a proportionate (relative) measure of variation: smaller means more consistency, larger means more inconsistency.
• If the data are symmetric with no serious outliers, use the range and standard deviation.
• If comparing variation across two data sets, use the coefficient of variation.
• Measures of Relative Standing provide information about the position of particular values relative to the entire data set. The Pth Percentile is the value for which P percent of observations are less than that value and (100 − P)% are greater. E.g. scoring in the 60th percentile means 60% of scores were below yours and 40% were above.
  o First (lower) decile = 10th percentile
  o First (lower) quartile, Q1 = 25th percentile
  o Second (middle) quartile, Q2 = 50th percentile = Median
  o Third quartile, Q3 = 75th percentile
  o Ninth (upper) decile = 90th percentile
• Interquartile Range = Q3 − Q1 (measures the spread of the middle 50% of observations).
• Location of a percentile: Lp = (n + 1) P/100, where Lp is the location of the Pth percentile.
  E.g. {0 1 2 3 4 5 6 7 8 9}: L75 = (10 + 1)(75/100) = 8.25, so the third quartile (75th percentile) = 7 + 0.25(8 − 7) = 7.25.

Measures of Linear Relationship provide information about the strength and direction of a linear relationship between two variables (if one exists). E.g. x is hours of study and y is grade:

   x      y     x²      y²     xy
   0     40      0    1600      0
   5     55     25    3025    275
   8     65     64    4225    520
  10     70    100    4900    700
  15     85    225    7225   1275
  20    100    400   10000   2000
  58    415    814   30975   4770   (column totals)

• Covariance: population σxy = Σ(xᵢ − μx)(yᵢ − μy) / N; sample sxy = Σ(xᵢ − x̄)(yᵢ − ȳ) / (n − 1).
  Shortcut for the sample covariance: sxy = [Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n] / (n − 1)
  = [4770 − (58 × 415)/6] / (6 − 1) = 151.6667.
• Coefficient of Correlation: population ρ = σxy / (σx σy), with −1 ≤ ρ ≤ +1; sample r = sxy / (sx sy), with −1 ≤ r ≤ +1.
  sx² = [Σxᵢ² − (Σxᵢ)²/n] / (n − 1) = [814 − 58²/6] / 5 = 50.6667, so sx = √50.6667 = 7.1181
  sy² = [Σyᵢ² − (Σyᵢ)²/n] / (n − 1) = [30975 − 415²/6] / 5 = 454.1667, so sy = √454.1667 = 21.3112
  r = sxy / (sx sy) = 151.6667 / (7.1181 × 21.3112) = 0.9998
• Interpreting r: r = +1 is a strong positive linear relationship; r = 0 is no linear relationship; r = −1 is a strong negative linear relationship. E.g. r = 0.56 is a moderately strong positive relationship, while r = −0.1 is a weak negative one.

Chapter 6: Probability

• A Random Experiment is an action or process that leads to one of several possible outcomes. The listed outcomes must be exhaustive (all possible outcomes included) and mutually exclusive (no two outcomes can occur at the same time).
• The Sample Space of a random experiment is a list of exhaustive and mutually exclusive outcomes: S = {O1, O2, …, Ok}. The probability of any outcome is between 0 and 1, and the sum of the probabilities of all the outcomes equals 1: 0 ≤ P(Oᵢ) ≤ 1 and ΣP(Oᵢ) = 1.
• There are three ways to assign a probability to an outcome:
  o Classical approach: a counting approach used effectively in games of chance.
  o Relative frequency: assigning probabilities based on experimentation or historical data.
  o Subjective approach: assigning probabilities based on the assignor's judgment.
• An Event is a collection or set of one or more simple events in a sample space. A Simple Event is an individual outcome of a sample space.
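These definitions can be illustrated with a one-die sketch (the die example is mine, not from the notes; it uses the classical counting approach):

```python
from fractions import Fraction

# Sample space of one roll of a fair die: exhaustive and mutually
# exclusive outcomes, each assigned probability 1/6 (classical approach).
sample_space = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

# An event is a set of simple events; its probability is the sum of
# the probabilities of the simple events that constitute it.
even = {2, 4, 6}
p_even = sum(sample_space[o] for o in even)

print(p_even)                      # 1/2
print(sum(sample_space.values()))  # 1 -- outcome probabilities sum to 1
```

Using `Fraction` keeps the probabilities exact, which makes the "probabilities sum to 1" requirement easy to verify.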
• The Probability of an Event is the sum of the probabilities of the simple events that constitute the event.
• The Complement of event A, denoted Aᶜ, is the event consisting of all sample points that are "not in A": P(A) + P(Aᶜ) = 1.
• The Intersection of events A and B is the set of all sample points that are in both A and B. The probability of the intersection is called the Joint Probability, P(A and B).
• When two events are Mutually Exclusive (the two events cannot occur together), their joint probability is 0.
• Marginal Probabilities are computed by adding across the rows and down the columns of a table of joint probabilities:

           B1     B2    P(Aᵢ)
  A1      .20    .35     .55
  A2      .15    .30     .45
  P(Bⱼ)   .35    .65    1.00

• Conditional Probability is used to determine how two events are related. The probability of event A given the occurrence of event B is P(A|B) = P(A and B) / P(B).
• If two events A and B are Independent (P(A) is not affected by whether B occurs), then P(A|B) = P(A).
• The Union of events A and B is the event that occurs when either A or B or both occur.
• Probability rules:
  o Complement Rule: P(Aᶜ) = 1 − P(A)
  o Conditional Probability: P(A|B) = P(A and B) / P(B)
  o Multiplication Rule: P(A and B) = P(A|B) P(B) = P(B|A) P(A)
  o Multiplication Rule for Independent Events: P(A and B) = P(A) P(B)
  o Addition Rule: P(A or B) = P(A) + P(B) − P(A and B)
  o Addition Rule for Mutually Exclusive Events: P(A or B) = P(A) + P(B)
• Probability tree example: with P(A) = 0.90 and P(Aᶜ) = 0.10, the joint probability of each branch is the product along the branch:

  P(B|A)  = 0.75  →  P(A and B)  = (0.90)(0.75) = 0.675
  P(C|A)  = 0.20  →  P(A and C)  = (0.90)(0.20) = 0.180
  P(D|A)  = 0.05  →  P(A and D)  = (0.90)(0.05) = 0.045
  P(B|Aᶜ) = 0.60  →  P(Aᶜ and B) = (0.10)(0.60) = 0.060
  P(C|Aᶜ) = 0.30  →  P(Aᶜ and C) = (0.10)(0.30) = 0.030
  P(D|Aᶜ) = 0.10  →  P(Aᶜ and D) = (0.10)(0.10) = 0.010

Chapter 7: Random Variables and Discrete Probability Distributions

• A Random Variable is a function or rule that assigns a number to each outcome of an experiment.
There are two types of random variables: Discrete (integer or countable values: 1, 2, 3, …) and Continuous (real-number or uncountable values: time, distance, etc.).
• A Probability Distribution is a table, formula, or graph that describes the values of a random variable and the probability associated with those values: P(x) or P(X = x).
• Requirements for a distribution of a discrete random variable: 0 ≤ P(x) ≤ 1 and ΣP(x) = 1.
• The Population Mean is the weighted average of all values of X, the weights being the probabilities. This parameter is also called the Expected Value of X: E(X) = μ = Σx P(x).
• Population Variance: V(X) = σ² = Σ(x − μ)² P(x) = Σx² P(x) − μ² (shortcut).
• Population Standard Deviation: σ = √σ².
• E.g. number of courses taken by 100 students:

  x (# of courses)   # of students   P(x)   x·P(x)    x²   x²·P(x)
  0                        5         0.05    0.00      0     0.00
  1                       10         0.10    0.10      1     0.10
  2                       20         0.20    0.40      4     0.80
  3                       40         0.40    1.20      9     3.60
  4                       15         0.15    0.60     16     2.40
  5                       10         0.10    0.50     25     2.50
                                    Total:   2.80   Total:   9.40

  E(X) = μ = Σx P(x) = 2.8
  V(X) = σ² = Σx² P(x) − μ² = 9.4 − (2.8)² = 1.56
  σ = √1.56 = 1.249
• Laws of Expected Value:    • Laws of Variance:
  o E(c) = c                   o V(c) = 0
  o E(X + c) = E(X) + c        o V(X + c) = V(X)
  o E(cX) = c E(X)             o V(cX) = c² V(X)
• E.g. monthly sales have a mean of $25,000 and a standard deviation of $4,000; profit is 30% of sales minus $6,000.
  Mean of monthly profits: E(Profit) = E[0.30(Sales) − 6000] = 0.30 E(Sales) − 6000 = 0.30(25000) − 6000 = $1,500.
  Variance of monthly profits: V(Profit) = V[0.30(Sales) − 6000] = 0.30² V(Sales) = 0.09 × 4000² = 1,440,000.
  Standard deviation of monthly profits: σ = √1,440,000 = $1,200.
• The Binomial Distribution is the probability distribution that results from doing a binomial experiment. Binomial experiments have the following properties:
  1. A fixed number of trials, represented by n.
  2. Each trial has two possible outcomes, a success and a failure.
  3. P(success) = p and P(failure) = 1 − p.
  4. The trials are independent: the outcome of one trial does not affect the outcomes of any other trials.
• Binomial Probability Distribution: P(x) = [n! / (x!(n − x)!)] pˣ (1 − p)ⁿ⁻ˣ.
• The Binomial Table gives cumulative probabilities P(X ≤ k), so P(X = k) = P(X ≤ k) − P(X ≤ k − 1). E.g.:
  P(X = 5) = P(X ≤ 5) − P(X ≤ 4)
  P(X < 5) = P(X ≤ 4)
  P(3 ≤ X ≤ 6) = P(X ≤ 6) − P(X ≤ 2)
  P(X > 5) = P(X ≥ 6) = 1 − P(X ≤ 5)
• Binomial Random Variable:
  o Mean (expected value): μ = np
  o Variance: σ² = np(1 − p)
  o Standard deviation: σ = √(np(1 − p))

Chapter 8: Continuous Probability Distributions

• A Continuous Random Variable is one that can assume an uncountable number of values. We cannot list the possible values because there are infinitely many, and the probability of each individual value is 0 (point probabilities are zero), e.g. P(X = 5) = 0.
• A function f(x) is called a Probability Density Function (over the range a ≤ x ≤ b) if it meets the following requirements:
  o f(x) ≥ 0 for all x between a and b
  o the total area under the curve between a and b is 1
• The Uniform Probability Distribution (the Rectangular Probability Distribution) is described by the function f(x) = 1/(b − a), where a ≤ x ≤ b.
  P(x1 ≤ X ≤ x2) = width × height = (x2 − x1) × 1/(b − a).
• The Normal Distribution is the most important of all probability distributions because of its crucial role in statistical inference. It is bell shaped and symmetrical around the mean.
• The probability density function of a normal random variable is
  f(x) = (1 / (σ√(2π))) e^(−(x − μ)² / (2σ²)),  −∞ < x < ∞,  where e = 2.71828… and π = 3.14159…
  A normal distribution is fully defined by its mean (μ) and standard deviation (σ). Unlike the range of the uniform distribution (a ≤ x ≤ b), the normal distribution ranges from minus infinity to plus infinity.
• A normal distribution whose mean is zero (μ = 0) and standard deviation is one (σ = 1) is called the Standard Normal Distribution.
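For checking table look-ups, Python's standard library can compute these probabilities directly; a sketch using `statistics.NormalDist` (the μ = 50, σ = 10 figures are the ones used in these notes):

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)   # the standard normal distribution
x = NormalDist(mu=50, sigma=10)

# Standardizing X to Z = (X - mu) / sigma leaves probabilities unchanged:
p_x = x.cdf(60) - x.cdf(45)     # P(45 < X < 60)
p_z = z.cdf(1.0) - z.cdf(-0.5)  # (60-50)/10 = 1 and (45-50)/10 = -0.5

print(round(p_x, 4))  # 0.5328
print(round(p_z, 4))  # 0.5328 -- identical after standardizing
```

`NormalDist.cdf` gives the area to the left of a value, so areas between two values are differences of two cdf calls, unlike the "area from 0" tables used below.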
Any normal distribution can be converted to a standard normal distribution with simple algebra, which makes calculations much easier.
• With the same standard deviation, increasing the mean shifts the normal curve to the right. With the same mean, increasing the standard deviation flattens the curve.
• We calculate normal probabilities by converting the normal random variable X to a standard normal random variable Z = (X − μ)/σ. E.g. with μ = 50 and σ = 10:
  P(45 < X < 60) = P((45 − 50)/10 < Z < (60 − 50)/10) = P(−0.5 < Z < 1) = 0.1915 + 0.3413 = 0.5328
  P(Z < −0.5) = P(Z > 0.5) = 0.5 − 0.1915 = 0.3085
  P(Z > 1) = 0.5 − 0.3413 = 0.1587
  P(55 < X < 60) = P(0.5 < Z < 1) = 0.3413 − 0.1915 = 0.1498
• Finding values of Z: often we are asked to find the value of Z for a given probability, i.e. given an area A under the curve, what is the corresponding value Z_A? We find Z_A by a "reverse look-up" in the table. E.g. for A = 0.025 the table area is 0.5 − 0.025 = 0.475, and:
  Z.025 = 1.96,  so P(−1.96 < Z < 1.96) = 0.95
  Z.05  = 1.645, so P(−1.645 < Z < 1.645) = 0.90
  Z.01  = 2.33,  so P(−2.33 < Z < 2.33) = 0.98

Chapter 5: Data Collection and Sampling

Three methods of collecting data:
• Direct Observation – measurements representing a variable of interest are observed and recorded without controlling any factor that might influence their values. It is inexpensive, but it is difficult to produce useful information because results can be affected by hidden factors. E.g. aspirin takers having fewer heart attacks may simply be more health conscious.
• Experiments – measurements representing a variable of interest are observed and recorded while controlling factors that might influence their values; usually more expensive than direct observation. E.g. comparing heart attacks between a group taking aspirin regularly and a group not taking it.
• Surveys – solicit information from people, e.g. Gallup polls, Harris surveys, and pre-election polls. The majority of surveys are conducted for private use, e.g. marketing surveys.
• A key survey parameter is the Response Rate (the proportion of all people selected who complete the survey).
• Surveys may be administered in a variety of ways:
  o Personal Interview – high expected response rate and less misunderstanding of questions, but expensive, and easily biased if the interviewer is not trained properly.
  o Telephone Interview – less expensive, but a lower response rate.
  o Self-Administered Questionnaire – by mail; inexpensive, but a lower response rate and a high number of incorrect responses (misunderstood questions).
• Key questionnaire design principles:
  o Keep the questionnaire as short as possible.
  o Ask short, simple, and clearly worded questions.
  o Demographic questions: two schools of thought. Keller recommends starting with demographic questions to help respondents get started comfortably; other experts recommend asking them at the end.
  o Use dichotomous (yes/no) and multiple-choice questions.
  o Use open-ended questions cautiously.
  o Avoid using leading questions.
  o Pretest the questionnaire on a small number of people.
  o Think about the way you intend to use the collected data when preparing the questionnaire.
• Statistical inference permits us to draw conclusions about a population based on a sample.
• Sampling – selecting a subset of a whole population – is often done for reasons of:
  o cost (it is less expensive to sample 1,000 TV viewers than 100 million TV viewers), and
  o practicality (e.g. performing a crash test on every automobile produced is impractical).
• In any case, the sampled population and the target population should be similar to one another.
• A sampling plan is a method or procedure for specifying how a sample will be taken from a population. Our focus is on three methods:
  o Simple Random Sampling (SRS): a sample selected in such a way that every possible sample of the same size is equally likely to be chosen.
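A simple random sample is easy to draw with software; a sketch using Python's standard `random` module (the name list is hypothetical):

```python
import random

# Hypothetical class list: every group of 3 names is equally likely.
students = ["Ana", "Ben", "Chen", "Dina", "Eli", "Fay", "Gus", "Hana"]

random.seed(2320)                      # fixed seed for a reproducible draw
sample = random.sample(students, k=3)  # simple random sample of size 3
print(sample)
```

`random.sample` selects without replacement, so no student can appear twice in the same sample.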
    Use a random numbers table or a software package. Example: drawing three names from a hat containing all the names of the students in the class; any group of three names is as likely to be picked as any other group of three names.
  o Stratified Random Sampling: a stratified random sample is obtained by separating the population into mutually exclusive sets, or strata, and then drawing simple random samples from each stratum.
    Example: 2,000 male and 4,000 female students; a sample of 300 is split proportionately into 100 males (33%) and 200 females (67%).
  o Cluster Sampling: a simple random sample of groups or clusters of elements (versus a simple random sample of individual objects).
    - Useful when it is difficult or costly to develop a complete list of the population members, or when the population elements are widely dispersed geographically.
    - Two-stage process: the researcher draws a random sample of clusters, then draws a random sample of elements within each selected cluster.
    - Can involve multiple stages, with clusters within clusters.
    - Cluster sampling may increase sampling error due to similarities among cluster members.
• The larger the sample size, the more accurate we can expect the sample estimates to be.
• Two major types of error can arise when a sample of observations is taken from a population:
• Sampling error refers to:
  o differences between the sample and the population that exist only because of the observations that happened to be selected for the sample;
  o the differences in results for different samples (of the same size), which are due to sampling error;
  o increasing the sample size will reduce this type of error.
  o Example: take two samples of size 10 from 1,000 households. If we happened to get the highest income levels in our first sample and all the lowest income levels in the second, this delta is due to sampling error.
• Non-sampling errors are more serious:
  o due to mistakes made in the acquisition of data, or
  o due to the sample observations being selected improperly;
  o increasing the sample size will not reduce this type of error.
• Three types of non-sampling errors:
  o Errors in data acquisition arise from the recording of incorrect responses, due to:
    - incorrect measurements being taken because of faulty equipment;
    - mistakes made during transcription from primary sources;
    - inaccurate recording of data due to misinterpretation of terms;
    - inaccurate responses to questions concerning sensitive issues.
  o Non-response error refers to error (or bias) introduced when responses are not obtained from some members of the sample, i.e. the sample observations that are collected may not be representative of the target population. The Response Rate (the proportion of all people selected who complete the survey) is a key survey parameter and helps in understanding the validity of the survey and the sources of non-response error.
  o Selection bias occurs when the sampling plan is such that some members of the target population cannot possibly be selected for inclusion in the sample.
    - Example: the 1936 Literary Digest poll predicted that the Republican candidate, Alfred Landon, would defeat the Democratic incumbent, Franklin D. Roosevelt, by a 3-to-2 margin. Outcome: Roosevelt defeated Landon with the support of 62% of the electorate.
    - Two mistakes were made in the sampling procedure: (1) ten million sample ballots were sent to prospective voters, but most of the names were taken from the Digest's subscription list and from telephone directories (magazine subscribers and people with telephones tended to be wealthier than average, and such people then, as today, tended to vote Republican); (2) only 2.3 million ballots were returned, resulting in a self-selected sample.

Chapter 9: Sampling Distributions

• A sampling distribution is created by, as the name suggests, sampling.
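One way to see what a sampling distribution is: repeatedly draw samples from a population and record each sample mean. A simulation sketch (the uniform population and the sample counts are my choices, not from the notes):

```python
import random
import statistics

random.seed(9)
# A decidedly non-normal population of 10,000 values.
population = [random.uniform(0, 100) for _ in range(10_000)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Draw 2,000 samples of size n = 30 and record each sample mean.
n = 30
means = [statistics.mean(random.sample(population, n)) for _ in range(2_000)]

# The mean of the sample means lands near mu, and their spread is
# close to sigma / sqrt(n), the standard error of the mean.
print(round(statistics.mean(means) - mu, 2))
print(round(statistics.stdev(means), 2), "vs", round(sigma / n**0.5, 2))
```

A histogram of `means` would also look roughly bell shaped even though the population is flat, which is the central limit theorem at work.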
• The method we employ relies on the rules of probability and the laws of expected value and variance to derive the sampling distribution.
• Sampling Distribution of the Sample Mean X̄:
  o E(X̄) = μx̄ = μ, where μ = Σx P(x)
  o V(X̄) = σx̄² = σ²/n, so σx̄ = σ/√n, where σ² = Σx² P(x) − μ²
• If X is normal, X̄ is normal. If X is non-normal, X̄ is approximately normal for sufficiently large sample sizes.
  o Note: the definition of "sufficiently large" depends on the extent of non-normality of X (e.g. heavily skewed; multimodal).
  o If the population is normal, then X̄ is normally distributed for all values of n.
  o If the population is non-normal, then X̄ is approximately normal only for larger values of n.
  o In many practical situations, a sample size of 30 may be sufficiently large to allow us to use the normal distribution as an approximation for the sampling distribution of X̄.
• The standard deviation of the sampling distribution is called the standard error: σx̄ = σ/√n.
• Remember that μ and σ² are the parameters of the population of X. To create the sampling distribution of X̄, we repeatedly drew samples from the population and calculated x̄ for each sample. Thus we treat X̄ as a brand new random variable, with its own distribution, mean, and variance.
• Central Limit Theorem: the sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of X̄ will resemble a normal distribution.
• Statisticians have shown that the mean of the sampling distribution is always equal to the mean of the population, and that the standard error is equal to σ/√n for infinitely large populations. However, if the population is finite, the standard error is
  σx̄ = (σ/√n) √((N − n)/(N − 1)),
  where N is the population size and √((N − n)/(N − 1)) is called the finite population correction factor. If the population size is large relative to the sample size, the finite population correction factor is close to 1 and can be ignored.
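The correction can be wrapped in a small helper (a sketch; the function name and the σ = 12, n = 100 numbers are illustrative, not from the notes):

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the sample mean, sigma / sqrt(n); when the
    population size N is given, applies the finite population
    correction factor sqrt((N - n) / (N - 1))."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

print(round(standard_error(12, 100), 3))            # 1.2   (infinite population)
print(round(standard_error(12, 100, N=500), 3))     # 1.074 (N = 5n: FPC matters)
print(round(standard_error(12, 100, N=10_000), 3))  # 1.194 (N = 100n: FPC ~ 1)
```

The last line illustrates the rule of thumb that follows: once N is at least 20 times n, the correction barely changes the result.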
• As a rule of thumb, we treat any population that is at least 20 times larger than the sample size as large. In practice most applications involve populations that qualify as large, so the finite population correction factor is usually omitted.
• The sampling distribution can be used to make inferences about population parameters. To do so, the sample mean can be standardized to the standard normal distribution using the formulation Z = (X̄ − μ) / (σ/√n).
• Another way to state the probability: P(μ − Zα/2 σ/√n < X̄ < μ + Zα/2 σ/√n) = 1 − α.

Sampling Distribution of a Proportion
• The estimator of a population proportion of successes is the sample proportion: we count the number of successes X in a sample of size n and compute P̂ = X/n.
• Using the laws of expected value and variance, we can determine the mean, variance, and standard deviation of P̂ (the standard deviation of P̂ is called the standard error of the proportion):
  o E(P̂) = μp̂ = p
  o V(P̂) = σp̂² = p(1 − p)/n
  o σp̂ = √(p(1 − p)/n)
• Sample proportions can be standardized to a standard normal distribution using Z = (P̂ − p) / √(p(1 − p)/n).

Sampling Distribution of the Difference Between Two Sample Means
• This requires that independent random samples be drawn from each of two normal populations.
• If this condition is met, then the sampling distribution of the difference between the two sample means, X̄1 − X̄2, will be normally distributed.
• Note: if the two populations are not both normally distributed but the sample sizes are "large" (> 30), the distribution of X̄1 − X̄2 is approximately normal.
• The expected value and variance of the sampling distribution of X̄1 − X̄2 are:
  o Mean: E(X̄1 − X̄2) = μ1 − μ2
  o Variance: V(X̄1 − X̄2) = σ1²/n1 + σ2²/n2
  o Standard deviation: σ(X̄1 − X̄2) = √(σ1²/n1 + σ2²/n2), also called the standard error of the difference between two means
• We can compute Z (a standard normal random variable) in this way:
  Z = [(X̄1 − X̄2) − (μ1 − μ2)] / √(σ1²/n1 + σ2²/n2)

Chapter 10: Introduction to Estimation

• Statistical inference is the process by which we acquire information and draw conclusions about populations from samples. There are two general procedures for making inferences about populations: estimation and hypothesis testing.
• The objective of Estimation is to determine the approximate value of a population parameter on the basis of a sample statistic, e.g. the sample mean (x̄) is employed to estimate the population mean (μ).
• A Point Estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point. Drawbacks: (1) point probabilities in continuous distributions are virtually zero; (2) we would expect a point estimator to get closer to the parameter value as the sample size increases, but (3) point estimators don't reflect the effects of larger sample sizes.
• An Interval Estimator draws inferences about a population by estimating the value of an unknown parameter using an interval. For example, suppose we want to estimate the mean summer income of a class of business students. For n = 25 students, x̄ (the point estimate) is calculated to be $400/week; an interval estimate would be "the mean income is between $380 and $420 per week."
• Desirable qualities in estimators include:
  o Unbiasedness – an unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter. E.g. the sample mean X̄ is an unbiased estimator of the population mean μ, since E(X̄) = μ.
  o Consistency – an unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger. E.g. X̄ is a consistent estimator of μ because V(X̄) = σ²/n: as n grows larger, the variance of X̄ grows smaller.
  o Relative Efficiency – if there are two unbiased estimators of a parameter, the one whose variance is smaller is said to be relatively efficient. Both the sample median and the sample mean are unbiased estimators of the population mean; however, statisticians have established that the sample median has a greater variance than the sample mean, so we choose X̄ because it is relatively efficient compared with the sample median.
• We can build an interval estimator from the sampling distribution by:
  o drawing a sample of size n from the population, and
  o calculating its mean x̄. By the central limit theorem, X̄ is normally (or approximately normally) distributed, so Z = (X̄ − μ)/(σ/√n) has a standard normal (or approximately standard normal) distribution.
• Confidence Interval Estimator of μ (LCL and UCL): x̄ ± Zα/2 σ/√n
  o A larger confidence level produces a wider confidence interval.
  o Larger values of the standard deviation σ produce wider confidence intervals.
  o Increasing the sample size decreases the width of the confidence interval while the confidence level remains unchanged. Note: this also increases the cost of obtaining additional data.
• We can control the width of the interval by determining the sample size necessary to produce narrow intervals. Suppose we want to estimate the mean demand "to within 5 units" (W): we want the interval estimate to be x̄ ± 5. Since the interval is x̄ ± Zα/2 σ/√n, it follows that Zα/2 σ/√n = 5, and solving for n gives:
• Sample size to estimate a mean: n = (Zα/2 σ / W)²

Chapter 11: Introduction to Hypothesis Testing

• Five critical concepts in hypothesis testing:
  1. There are two hypotheses: the null hypothesis (H0) and the alternative hypothesis (H1).
  2. The testing procedure begins with the assumption that the null hypothesis is true.
  3. The goal is to determine whether there is enough evidence to infer that the alternative hypothesis is true.
  4. There are two possible decisions:
     - Reject H0: conclude that there is enough evidence to support the alternative hypothesis.
     - Do not reject H0: conclude that there is not enough evidence to support the alternative hypothesis.
  5. Two possible errors can be made:
     - Type I error: rejecting a true null hypothesis. P(Type I error) = α
     - Type II error: not rejecting a false null hypothesis. P(Type II error) = β

                       H0 is true                H0 is false
  Reject H0            Type I Error (P = α)      Correct Decision
  Do not reject H0     Correct Decision          Type II Error (P = β)