
STAT 206 Fall 2013 Course Notes


Department: Statistics
Course Code: STAT 206
Professor: Eddie Dupont

STAT 206 - Statistics for Engineers
Kevin James
Fall 2013

Introduction

Statistics is the collection, organization, analysis, interpretation, and presentation of data. In effect, it is a quantification of uncertainty.

Process

To conduct an empirical or statistical study, we must first identify the population of interest.

population: the set of elements your query pertains to

Individual items of this population are called units.

units: a single element, usually a person or object, whose characteristics we are interested in

We also define the hypothesis, or question we would like answered. We then select a subset of units from the population to form our sample,

sample: a subset of the population from which measurements are actually made

which must have a pre-determined size and should attempt to reduce or eliminate sample error.

sample error: an error which occurs randomly due to the uncertainty of the sample

We must also determine how to measure the variable of interest.

variable of interest: a measure of the interesting characteristic of a unit

This variable can often be measured in a multitude of ways, though many of them will be somewhat lacking in value. You must take into account not only what this variable is and how it is collected, but also ways to minimize bias, such as randomizing and repeating your experiments. We should also attempt to avoid study errors,

study errors: systematic errors which occur because the sample does not accurately reflect the population

or else we will find ourselves with a large amount of error and/or uncertainty.

After the experiment, we need to analyze our data and come to a conclusion. It is generally a good idea to graph the data, as this gives us a highly visual method of analysis. We can use two main branches of statistics to analyze our data: descriptive statistics

descriptive statistics: a summary of the collected data, both visually and numerically

or inferential statistics.

inferential statistics: generalized results for the population based on the sample data

We will be focusing on inferential statistics, which include a quantification of uncertainty, in this course. Finally, we use the results of our study to answer the original hypothesis or research question. We must also be sure to address the limitations of our study.

Types of Variables

Our variables may be categorical,

categorical: a qualitative measure belonging to one of K possible classes

discrete,

discrete: a quantitative measure with some countable value

or continuous.

continuous: a quantitative measure with some uncountable value, such as a range of values

Plots

We can design a stem-and-leaf plot by writing all first digits in a single column and all of the remaining digits on the corresponding right-hand side. For example, for a standard bell-curve grading scheme:

4 | 24
5 | 0068
6 | 24556
7 | 4556678889
8 | 00022223334558
9 | 0334469

We can also use grouped frequency tables with frequency bins, for example:

Average    Frequency
90+        18
80+        43
70+        87
60+        92

Histograms follow a similar pattern: we select bins such as 40-49, 50-59, 60-69, 70-79, 80-89, 90-100 and diagram the count in each bin. If we have differently sized bins (e.g. 1, 2, 3-4) we want to examine the "area" of the bars instead of their "height".
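As a concrete illustration, the stem-and-leaf display above can be generated in a few lines of Python. The grade list below is hypothetical, reconstructed to match the plot shown; a minimal sketch, not part of the course material.

```python
# A minimal sketch: build a stem-and-leaf plot from raw two-digit grades.
# The grade data are made up, chosen to reproduce the display above.
grades = [42, 44, 50, 50, 56, 58, 62, 64, 65, 65, 66,
          74, 75, 75, 76, 76, 77, 78, 78, 78, 79,
          80, 80, 80, 82, 82, 82, 82, 83, 83, 83, 84, 85, 85, 88,
          90, 93, 93, 94, 94, 96, 99]

def stem_and_leaf(values):
    """Group each value by its tens digit (stem) and print the ones digits (leaves)."""
    stems = {}
    for v in sorted(values):
        stems.setdefault(v // 10, []).append(v % 10)
    for stem in sorted(stems):
        print(stem, "|", "".join(str(leaf) for leaf in stems[stem]))

stem_and_leaf(grades)
```

The same dictionary-of-stems idea extends directly to grouped frequency tables: replace the leaf list with a count per bin.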
Measures of Central Tendency

The sample mean of a set of n values is

    \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}

The median is the number \tilde{x} such that half the values are below it and half are above. If we denote the i-th smallest value as x_{(i)}, then

    \tilde{x} = x_{\left(\frac{n+1}{2}\right)}

if n is odd, or

    \tilde{x} = \frac{x_{(n/2)} + x_{(n/2 + 1)}}{2}

if n is even.

Measures of Dispersion

The sample variance of a set of n values is

    s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}

The standard deviation, denoted s, is the square root of the sample variance. The range of the set is the difference between the maximum and minimum values. If we create a graph from the median, quartiles, and extreme values, it is called a box-and-whisker plot.

Probability

Classical probability is the "common sense" probability related to discrete events such as coin flips, dice rolls, etc. Though useful, this form of probability has some severe limitations: namely, the definition of what "equally likely" actually means. In effect, we can use this type of probability to find an answer, but cannot use that answer for anything.

Relative frequency probability is slightly more useful: we repeat an experiment some number of times and record the relative chance of various outcomes. This type of probability analysis, however, is extremely impractical.

Finally, we have subjective probability, which is based on a person's experiences and subjective knowledge. Obviously, this method also has some severe limitations and is far too abstract to be used scientifically.

When discussing probability, we always refer to experiments

experiments: a repeatable phenomenon or process

or their various trials.

trials: an iteration of an experiment

These experiments have a sample space

sample space: the set of possible outcomes for an experiment

which is either discrete or continuous, depending on whether or not this range is countable. We will attach a mathematical model to the sample space to obtain our definition of probability. Any probability model must obey the following axioms, for any sample space S and potential outcomes A and B:

- 0 \le P(A) \le 1 for all A \in S
- P(S) = 1
- P(A \cup B) = P(A) + P(B) for any mutually exclusive outcomes

The classical model would suggest that for a sample set S = {a, b, c}, each outcome has probability P = 1/3. This is referred to as a uniform distribution, and is incorrect for most non-trivial samples.

Permutations and Combinations

A common problem requires us to create an arrangement using r of n objects. The number of such permutations (ordered arrangements) is

    n^{(r)} = \frac{n!}{(n - r)!}

If we don't care about the order of the arrangement, we can use the formula for a combination. The number of ways to choose r of n items is

    \binom{n}{r} = \frac{n!}{r!\,(n - r)!}

Set Operations

A \cap B is the intersection of two events A and B: the event that both occur. It is also written AB. Note that if P(A \cap B) = 0, the two events are mutually exclusive.

P(A \cup B) is the union of events A and B, the probability that at least one of them occurs, defined by

    P(A \cup B) = P(A) + P(B) - P(A \cap B)

We also define the complement of A by P(\bar{A}) = 1 - P(A). This is the probability of the event not occurring.

We define conditional occurrences with the following notation: the probability of A conditional on B is

    P(A \mid B) = \frac{P(A \cap B)}{P(B)}

Obviously, if the probability of B is zero, this is nonsensical.

Two events are independent if and only if P(A \cap B) = P(A)P(B). Note that this also tells us that P(A \mid B) = P(A) and vice versa (the probability of either event is the same regardless of whether the other has occurred).
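The counting formulas and set identities above are easy to sanity-check with Python's standard library (math.perm and math.comb exist in Python 3.8+). The die events below are made up for illustration.

```python
from fractions import Fraction
from math import comb, factorial, perm

# Counting: ordered vs. unordered arrangements of r of n objects.
n, r = 10, 3
assert perm(n, r) == factorial(n) // factorial(n - r)                    # n^(r) = n!/(n-r)! = 720
assert comb(n, r) == factorial(n) // (factorial(r) * factorial(n - r))   # C(n, r) = 120

# Set operations under the classical (uniform) model, on one fair die.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "the roll is even"
B = {1, 2}      # "the roll is at most 2"
P = lambda E: Fraction(len(E), len(S))   # exact probabilities, no rounding

assert P(A | B) == P(A) + P(B) - P(A & B)   # P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert P(A & B) == P(A) * P(B)              # A and B happen to be independent here
```

Using Fraction keeps the arithmetic exact, so the identities can be checked with equality rather than floating-point tolerances.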
Law of Total Probability

If we have a partition of our sample space into disjoint events B_0, B_1, \ldots, B_n, then for any event A we can find

    P(A) = \sum_{i=0}^{n} P(A \mid B_i) P(B_i)

Bayes' Theorem

For any two events in a sample space,

    P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A)} = \frac{P(A \mid B) P(B)}{P(A \mid B) P(B) + P(A \mid \bar{B}) P(\bar{B})}

Discrete Random Variables

A random variable X is one which may take any value in its range R(X) of possible values. We denote random variables with upper-case letters and observed values with lower-case letters. If these variables can take on only two possible values, we refer to them as binary.

We denote the probability distribution (i.e. the chance of the random variable being equal to a certain value) as f(x) = P(X = x). The sum of f(x) over all possible x is equal to 1. We also define the cumulative distribution function as F(x) = P(X \le x).

The mean or expected value of a random variable X is defined as

    \mu = E(X) = \sum_{x} x f(x)

Expectation is linear, thus we have E(aX + bY) = aE(X) + bE(Y).

Variance is the expected squared difference from the mean:

    Var(X) = E\big((X - E(X))^2\big) = \sum_{x} f(x) (x - \mu)^2

We sometimes write this as Var(X) = E(X^2) - E(X)^2.

Bernoulli Distributions

A Bernoulli distribution is formed when an experiment is repeated several times. The outcome of each trial must be independent of the others, the probability of any given outcome must be identical over all trials, and results must be binary.

We say that X follows a Bernoulli distribution (X ~ Bernoulli(p)), where p is the probability of success, if

    f(x) = p if x = 1,  and  f(x) = 1 - p if x = 0

Note that for all Bernoulli distributions E(X) = p and Var(X) = p(1 - p).

Let X be the number of successes obtained from a sequence of n Bernoulli trials. X follows a binomial distribution (X ~ Binomial(n, p)) if

    P(X = x) = f(x) = \binom{n}{x} p^x (1 - p)^{n - x}

We also have E(X) = np and Var(X) = np(1 - p).

When solving for a binomial distribution, we may find that we give ourselves artificial boundaries; for example, with n = 24 trials we cannot solve for x > 24 successes. In this case, we can use limits to find the correct answer (i.e. n = 24z, p_1 = p/z).

Binomial Theorem

For any positive integer n and real numbers a and b,

    (a + b)^n = \sum_{x=0}^{n} \binom{n}{x} a^x b^{n - x}

Poisson Process

In a Poisson process, events occur randomly in time or space according to the following conditions:

- Independence: the numbers of events in disjoint (i.e. non-overlapping) intervals are independent
- Individuality: events occur singly (i.e. no two events can occur at the same instant)
- Homogeneity: events occur according to a uniform (constant) rate or intensity λ

If events occur at an average rate of λ per unit of time and X is the number of events which occur in t units of time, then X ~ Poisson(λt) gives us

    f(x) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}

for any x = 0, 1, 2, \ldots. We also have E(X) = Var(X) = λt.

Hypergeometric Distribution

If we have a collection of N objects which can be sorted into two distinct types (success and failure), there exist r successes and N - r failures. Assume we select a sample of n objects without replacement, and let X be the number of successes selected. X is said to follow a hypergeometric distribution X ~ Hyper(N, r, n) if

    f(x) = \frac{\binom{r}{x} \binom{N - r}{n - x}}{\binom{N}{n}}

for any x = 0, 1, \ldots, \min(r, n). We also have

    E(X) = \frac{nr}{N}  and  Var(X) = \frac{nr(N - r)(N - n)}{N^2 (N - 1)}
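The discrete pmfs above can be computed directly with the standard library. A minimal sketch that checks E(X) = Σ x f(x) against the closed forms np, λt, and nr/N; the parameter values are arbitrary illustrations.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """f(x) = C(n, x) p^x (1 - p)^(n - x)"""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam_t):
    """f(x) = e^(-λt) (λt)^x / x!"""
    return exp(-lam_t) * lam_t**x / factorial(x)

def hypergeom_pmf(x, N, r, n):
    """f(x) = C(r, x) C(N - r, n - x) / C(N, n)"""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Verify E(X) = Σ x f(x) against the closed-form means.
n, p = 24, 0.3
assert abs(sum(x * binomial_pmf(x, n, p) for x in range(n + 1)) - n * p) < 1e-9

lam_t = 2.5   # truncate the infinite Poisson sum; terms far above λt are negligible
assert abs(sum(x * poisson_pmf(x, lam_t) for x in range(100)) - lam_t) < 1e-9

N, r, m = 20, 8, 5
mean_h = sum(x * hypergeom_pmf(x, N, r, m) for x in range(min(r, m) + 1))
assert abs(mean_h - m * r / N) < 1e-9
```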
Geometric Distribution

In this case, Bernoulli trials are repeated until the first success. If X is the number of independent Bernoulli(p) trials up to and including the first success, then X ~ Geometric(p) if

    f(x) = p (1 - p)^{x - 1}

for any x = 1, 2, 3, \ldots. We also have E(X) = 1/p and Var(X) = (1 - p)/p^2.

Continuous Probability

The sample space for continuous probability consists of open intervals. In effect, the probability of an event occurring within a given interval is proportional to the length/size of that interval (for uniform distributions). Note that since ranges have probability (e.g. over a total range of 100, the sub-range from 1 to 8 has probability (8 - 1)/100 = 0.07), individual elements must have probability 0.

The probability density function of a continuous random variable X describes the probability that X takes on a value in the range (a, b):

    \int_a^b f(x)\,dx = P(a < X < b)

The cumulative distribution function is the probability that X < x, or F(x) = P(X < x). Note that these two functions are related in the following way:

    \int_a^b f(x)\,dx = P(a < X < b) = P(X < b) - P(X < a) = F(b) - F(a)

or in other words: pdf(a, b) = cdf(b) - cdf(a).

Uniform Distribution

A continuous random variable is said to have a uniform distribution if the probability of a given subinterval is proportional to the length of that interval. If X is uniformly distributed over (a, b) we write X ~ Unif(a, b), which gives us

    f(x) = \frac{1}{b - a}

Normal Distribution

A random variable X has a normal distribution X ~ N(\mu, \sigma^2) if the pdf takes the form

    f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}}

where the expected value is \mu and the variance is \sigma^2. If X has a normal distribution, then Z = (X - \mu)/\sigma ~ N(0, 1). Also note that the normal distribution is symmetric, i.e. P(Z > z) = P(Z < -z). To solve a normal distribution problem, we reduce it to the form P(Z < x) for some real x and look up the answer in a normal distribution table.

Normal Approximation to the Binomial Distribution

For any X ~ Binomial(n, p), if np > 5 and n(1 - p) > 5 (i.e. n is large and p is not too far from 0.5), we have

    Z = \frac{X - np}{\sqrt{np(1 - p)}} \approx N(0, 1)

Because we are using a continuous distribution to approximate a discrete one, we include a continuity correction:

    P(a < X) \approx P\left( \frac{(a + 0.5) - np}{\sqrt{np(1 - p)}} < Z \right)

    P(X < b) \approx P\left( Z < \frac{(b - 0.5) - np}{\sqrt{np(1 - p)}} \right)
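Instead of a table lookup, the standard normal cdf can be evaluated with the error function, Φ(z) = (1 + erf(z/√2))/2. A minimal sketch comparing the continuity-corrected approximation to the exact binomial sum; the parameters n, p, b are arbitrary illustrations.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal cdf via the error function: Φ(z) = (1 + erf(z/√2)) / 2."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binom_cdf_exact(b, n, p):
    """P(X < b): sum the binomial pmf over x = 0, ..., b - 1."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(b))

def binom_cdf_approx(b, n, p):
    """P(X < b) ≈ Φ(((b - 0.5) - np) / √(np(1 - p))), continuity-corrected."""
    return phi(((b - 0.5) - n * p) / sqrt(n * p * (1 - p)))

n, p, b = 50, 0.4, 25   # np = 20 > 5 and n(1 - p) = 30 > 5, so the rule of thumb applies
print(binom_cdf_exact(b, n, p))    # exact probability
print(binom_cdf_approx(b, n, p))   # normal approximation; close to the exact value
```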
Sampling

Statistical Inference

The goal of statistical inference is to draw conclusions about a population, given only a small sample of said population. We obtain a random subset of the population using random sampling.

Random Sampling

There are many common types of random sampling:

- For simple random sampling, each unit in the population has the same chance of being selected.
- If we have distinct groups, stratified random sampling may be an excellent option. We first divide the population into K distinct strata; from these, we select varying numbers of random units:
  - We could select these through equal allocation, i.e. an equal number from each stratum (by simple random sampling).
  - We could use proportional allocation, i.e. a number of units proportional to the stratum size.
  - We may try Neyman (optimal) allocation, where each sample is weighted by stratum variance.
- For a lower-cost / more efficient solution, we may try cluster sampling. In this case, the population is divided into M natural clusters. We take a simple random sample of the clusters to get m clusters, and from these clusters we perform equal allocation.

Random Sample

A random sample of size n from an infinite population is a set of independent and identically distributed random variables. Each random variable has the same probability distribution, mean, and variance.

Central Limit Theorem

If we have a random sample, then for large sample sizes n (roughly n > 25), we have

    Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \approx N(0, 1)

Confidence Intervals

For some random sample, if the probability function of X depends on some unknown parameter \theta, then we say an estimator of \theta is a function of the sample:

    \hat{\theta} = h(X_1, X_2, \ldots, X_n)

An estimator is said to be unbiased if its expected value is the parameter itself, i.e. E(\hat{\theta}) = \theta. The standard deviation, or standard error, of an estimator is SE(\hat{\theta}) = \sqrt{Var(\hat{\theta})}. Note that if we have two estimators for a parameter, the one with the lower standard error is more efficient.

A 100(1 - \alpha)% confidence interval for a parameter \theta is an observation of the random interval (L(X), U(X)) such that

    P(L(X) < \theta < U(X)) = 1 - \alpha

Note that in this case it is the ends of the interval L(X) and U(X) which are random, not \theta itself. This also gives us probability \alpha that the random interval does not contain the true value of the parameter.

Consider X ~ N(\mu, \sigma^2), where \sigma is a known value. A 100(1 - \alpha)% confidence interval for \mu is given by

    \bar{x} \pm z_{1 - \alpha/2} \frac{\sigma}{\sqrt{n}}

where z_\alpha is the \alpha-quantile, i.e. the value such that P(Z \le z_\alpha) = \alpha. The margin of error of a confidence interval is the distance from the centre of the interval to either endpoint, here z_{1 - \alpha/2}\,\sigma/\sqrt{n}.
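A sketch of the z-interval above in Python. The normal quantile z_{1-α/2} is obtained by inverting the normal cdf numerically (statistics.NormalDist.inv_cdf exists in Python 3.8+) rather than from a table; the sample data are made up for illustration, with σ assumed known.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(xs, sigma, alpha=0.05):
    """100(1 - alpha)% CI for the mean with known sigma: x̄ ± z_{1-α/2} · σ/√n."""
    n = len(xs)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # ≈ 1.96 when alpha = 0.05
    margin = z * sigma / sqrt(n)              # the margin of error (half-width)
    xbar = mean(xs)
    return xbar - margin, xbar + margin

# Made-up observations, with σ = 2.0 assumed known.
sample = [4.1, 5.3, 4.8, 5.9, 5.1, 4.6, 5.5, 5.0, 4.9, 5.2]
lo, hi = z_interval(sample, sigma=2.0)
print(f"95% confidence interval for the mean: ({lo:.2f}, {hi:.2f})")
```

Note how the margin of error shrinks like 1/√n: quadrupling the sample size halves the interval width.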