STAT 206 - Statistics for Engineers
Kevin James
Fall 2013
Introduction
Statistics is the collection, organization, analysis, interpretation, and presentation of data. In effect, it
is a quantification of uncertainty.
Process
To conduct an empirical study or statistical study, we must first identify the population
population: the set of elements your query pertains to
of interest. Individual items of this population are called units
units: a single element, usually a person or object, whose characteristics we are interested in
We also define the hypothesis, or question we would like answered.
We select a subset of units from the population to have in our sample
sample: a subset of the population from which measurements are actually made
which must have a pre-determined size and should make an attempt to reduce or eliminate sample
error
sample error: an error which occurs randomly due to the uncertainty of the sample
We also must determine how we can measure the variable of interest
variable of interest: a measure of the interesting characteristic of a unit
This variable can often be measured in a multitude of ways, though many of these will be of limited
value. You must take into account not only what this variable is and how it is collected, but
also ways to minimize bias, such as by randomizing and repeating your experiments.
We should also attempt to avoid study errors
study errors: systematic errors which occur because the sample does not accurately reflect the population
or else we will find ourselves with a large amount of error and/or uncertainty.
Post-experiment, we need to analyze our data and come to a conclusion. It is generally a good idea to
graph the data, as this gives us a highly visual method of analysis. We can use two main branches of
statistics to analyze our data: descriptive statistics
descriptive statistics: a summary of the collected data, both visually and numerically
or inferential statistics
inferential statistics: generalized results for the population based on the sample data
We will be focusing on inferential statistics, which include a quantification of uncertainty, in this course.
Finally, we use the results of our study to answer the original hypothesis or research question. We also
must be sure to address the limitations of our study.
Types of Variables
Our variables may be either categorical
categorical: a qualitative measure belonging to one of K possible classes
discrete
discrete: a quantitative measure with some countable value
or continuous
continuous: a quantitative measure with some uncountable value, such as a range of values
Plots
We can design a stem-and-leaf plot by writing all first digits in a single column and all of the other
digits in the corresponding right-hand side. For example, for a standard bell-curve grading scheme:
4 | 24
5 | 0068
6 | 24556
7 | 4556678889
8 | 00022223334558
9 | 0334469
We can also use grouped frequency tables by using frequency bins, for example
Average Frequency
90+ 18
80+ 43
70+ 87
60+ 92
Histograms follow a similar pattern, since we select bins such as 40-49, 50-59, 60-69, 70-79, 80-89,
90-100 and diagram the amount in each bin. If we have differently sized bins (e.g. 1, 2, 3-4) we want to
examine the "area" of the bars instead of their "height".
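The stem-and-leaf construction above can be sketched in a few lines of Python (the scores here are a made-up subset, not the full grading data):

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digit (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

# Hypothetical exam scores
scores = [42, 44, 50, 50, 56, 58, 62, 64, 65, 90, 93]
plot = stem_and_leaf(scores)
for stem in sorted(plot):
    print(f"{stem} | {''.join(str(leaf) for leaf in plot[stem])}")
```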
Measures of Certainty
The sample mean of a set of n values is denoted
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
The median is the number \tilde{x} such that half the values are below and half are above. If we denote the
i-th smallest value as x_{(i)}, then
\tilde{x} = x_{\left(\frac{n+1}{2}\right)}
if n is odd, or
\tilde{x} = \frac{x_{(n/2)} + x_{(n/2+1)}}{2}
if n is even.
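These two definitions translate directly into a short Python sketch (the data is made up for illustration):

```python
def sample_mean(xs):
    """Sum of the values divided by their count."""
    return sum(xs) / len(xs)

def sample_median(xs):
    """Middle value for odd n; average of the two middle values for even n."""
    s = sorted(xs)
    n = len(s)
    if n % 2 == 1:
        return s[n // 2]                      # x_((n+1)/2), 0-indexed
    return (s[n // 2 - 1] + s[n // 2]) / 2    # average of x_(n/2) and x_(n/2+1)

data = [3, 1, 4, 1, 5, 9, 2, 6]
print(sample_mean(data))    # 3.875
print(sample_median(data))  # (3 + 4) / 2 = 3.5
```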
Measures of Dispersion
The sample variance of a set of n values is denoted by
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
The standard deviation, denoted s, is the square root of the sample variance.
The range of the set is the difference between the maximum and minimum values.
If we create a graph with the median, mean, etc., it is called a box-and-whiskers
plot.
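A minimal sketch of the sample variance, with its n - 1 divisor, and the standard deviation (data again hypothetical):

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(data)   # 32 / 7
s = math.sqrt(s2)            # standard deviation
```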
Probability
Classical probability is the "common sense" probability related to discrete events such as coin flips,
dice rolls, etc. Though useful, this form of probability has some severe limitations: namely, the definition
of what "equally likely" actually means. In effect, we can use this type of probability to find an answer,
but can not use that answer for anything. Relative frequency probability is slightly more useful:
we repeat an experiment some number of times and record the relative chance of various outcomes.
This type of probability analysis, however, is extremely impractical. Finally, we have subjective
probability, which is based on a person's experiences and subjective knowledge. Obviously, this
method also has some severe limitations and is far too abstract to be used scientifically.
When discussing probability, we always refer to experiments
experiments: a repeatable phenomenon or process
or their various trials
trials: an iteration of an experiment
These experiments have a sample space
sample space: set of discrete outcomes for an experiment
which is obviously either discrete or continuous, depending on whether or not this range is countable.
We will be attaching a mathematical model to the sample space to have our definition of probability.
Any probability model must obey the following axioms:
- 0 ≤ P(A) ≤ 1, A ∈ S
- P(S) = 1
- P(A ∪ B) = P(A) + P(B) for any mutually exclusive outcomes
for any sample space S and potential outcomes A and B.
The classical model would suggest that for a sample set S = {a, b, c}, each outcome has a probability
P = 1/3. This is referred to as a uniform distribution, and is incorrect for most non-trivial samples.
Permutations and Combinations
A common problem requires we create an arrangement using r of n objects. In such a set, the number
of permutations is equal to
n^{(r)} = \frac{n!}{(n - r)!}
If we don't care about the order of the arrangement, we can use the formula for a combination. The
number of ways to choose r of n items is
\binom{n}{r} = \frac{n!}{r!\,(n - r)!}
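Python's standard library exposes both formulas directly, which makes a quick check easy:

```python
import math

n, r = 5, 3
perms = math.perm(n, r)   # n! / (n - r)! = 60 ordered arrangements
combs = math.comb(n, r)   # n! / (r! (n - r)!) = 10 unordered selections

# A combination is a permutation with the r! orderings collapsed
assert combs == perms // math.factorial(r)
```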
Set Operations
A ∩ B is the intersection of two events, event A and event B. In other words, this is the probability
that both events will occur. It is also written as AB. Note that if P(A ∩ B) = 0, the two events are
mutually exclusive.
P(A ∪ B) is the union of events A and B, and is defined by P(A ∪ B) = P(A) + P(B) - P(A ∩ B). This
is the probability of one or the other happening.
We also define the complement of A: P(Ā) = 1 - P(A). This is the probability of the event not occurring.
We define conditional occurrences with the following notation: the probability of A conditional on
B is P(A|B) = P(A ∩ B)/P(B). Obviously, if the probability of B is zero, this is nonsensical.
Two events are independent if and only if P(A ∩ B) = P(A)P(B). Note that this will also tell us that
P(A|B) = P(A) and vice versa (the probability of A or B is the same regardless of whether the other has
occurred).
Law of Total Probability
If we have some distinct partition of our sample set such that B_1 ∪ B_2 ∪ ⋯ ∪ B_n = S, then for any event A
we can find
P(A) = \sum_{i=1}^{n} P(A|B_i) P(B_i)
Bayes’ Theorem
For any two events in a sample set
P(B|A) = \frac{P(AB)}{P(A)} = \frac{P(A|B)P(B)}{P(A|B)P(B) + P(A|\bar{B})P(\bar{B})}
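The law of total probability and Bayes' theorem combine naturally in the classic diagnostic-test calculation. The numbers below are made up purely for illustration:

```python
from fractions import Fraction

# Hypothetical diagnostic test (all rates invented for this sketch)
p_disease = Fraction(1, 100)              # P(B): prior probability of disease
p_pos_given_disease = Fraction(95, 100)   # P(A|B): test sensitivity
p_pos_given_healthy = Fraction(5, 100)    # P(A|B-complement): false positive rate

# Law of total probability: P(A) = P(A|B)P(B) + P(A|B')P(B')
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' theorem: P(B|A) = P(A|B)P(B) / P(A)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)  # 19/118, about 0.161
```

Even with a 95% sensitive test, the low prior keeps the posterior small, which is exactly what the denominator of Bayes' theorem captures.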
Discrete Random Variables
A random variable is one which may have any value, where R(X) is the range of possible values it can
take. We denote random variables with upper-case letters and denote observed variables as lower-case
letters. If these variables can take on only two possible values, we refer to them as binary.
We denote the probability distribution (i.e. the chance of some random variable being equal to a
certain value) as f(x) = P(X = x). The sum of the probability distribution of X over all possible x
is equal to 1.
We also define the cumulative distribution function as F(x) = P(X ≤ x).
The mean or expected value of a random variable X is defined as
μ = E(X) = \sum_x x f(x)
This function is linear, thus we have
E(aX + bY) = aE(X) + bE(Y)
Variance is the expected squared difference from the mean:
Var(X) = E((X - E(X))^2) = \sum_x f(x)(x - μ)^2
We sometimes denote this as Var(X) = E(X^2) - E(X)^2.
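A short sketch of both definitions, using a fair die as the (hypothetical) example distribution:

```python
def expectation(pmf):
    """E(X) = sum over x of x * f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = sum over x of f(x) * (x - mu)^2."""
    mu = expectation(pmf)
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())

# pmf of a fair six-sided die
die = {x: 1 / 6 for x in range(1, 7)}
mu = expectation(die)   # 3.5
var = variance(die)     # 35/12, about 2.9167
```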
Bernoulli Distributions
A Bernoulli distribution will be formed when an experiment is repeated several times. The outcomes of
each trial must be independent, though the probability of any given outcome must be identical over all
experiments. Results must be binary.
We say that X follows a Bernoulli distribution (X ~ Bernoulli(p)), where p is the probability of success,
if
f(x) = p if x = 1, or f(x) = 1 - p if x = 0
Note that for all Bernoulli distributions E(X) = p and V ar(X) = p(1 ▯ p).
Let X be the number of successes obtained from a sequence of n Bernoulli trials. X follows a Binomial
distribution (X ~ Bino(n, p)) if
P(X = x) = f(x) = \binom{n}{x} p^x (1 - p)^{n-x}
We also have E(X) = np and Var(X) = np(1 - p).
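The binomial pmf is easy to evaluate directly with `math.comb`; the parameters below are chosen arbitrarily for the sketch:

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# P(X = 2) for X ~ Bino(4, 0.5): C(4, 2) * 0.5^4 = 6/16 = 0.375
prob = binomial_pmf(2, 4, 0.5)

mean = 4 * 0.5              # E(X) = np
var = 4 * 0.5 * (1 - 0.5)   # Var(X) = np(1 - p)
```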
When solving for a Bernoulli distribution, we may find that we give ourselves artificial boundaries. For
example, if we have n = 24, we can not solve for p > 24. In this case, we can use limits to find the
correct answer (e.g. taking n = 24z and p = p_1/z for large z).
Binomial Theorem
For any positive integer n and real numbers a, b,
(a + b)^n = \sum_{x=0}^{n} \binom{n}{x} a^x b^{n-x}
Poisson Process
In a Poisson Process, events occur randomly in time or space according to the following conditions:
- Independence: the numbers of events in disjoint (i.e. non-overlapping) intervals are independent
- Individuality: events occur singly (i.e. no two events can occur at the same instant)
- Homogeneity: events occur according to a uniform (constant) rate or intensity λ
If events occur with an average rate of λ per unit of time and X is the number of events which occur
in t units of time, then X ~ Poisson(λt) gives us
f(x) = \frac{e^{-λt} (λt)^x}{x!}
for any x = 0, 1, 2, ⋯.
We can also define E(X) = Var(X) = λt.
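As a sketch, the Poisson pmf with rate λ over t time units (rate and horizon invented for the example):

```python
import math

def poisson_pmf(x, lam, t=1.0):
    """P(X = x) = e^(-lam*t) * (lam*t)^x / x!."""
    mu = lam * t
    return math.exp(-mu) * mu ** x / math.factorial(x)

# Events at a rate of 3 per hour, observed over 2 hours: X ~ Poisson(6)
p0 = poisson_pmf(0, 3, 2)   # probability of seeing no events: e^(-6)
```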
Hypergeometric Distribution
If we have a collection of N objects which can be sorted into two distinct types (success and failure),
there exist r successes and N - r failures. Assume we select a sample of n objects without replacement.
Then let X be the number of successes selected. X is said to follow a hypergeometric distribution
X ~ Hyper(N, r, n) if
f(x) = \frac{\binom{r}{x} \binom{N-r}{n-x}}{\binom{N}{n}}
for any x = 0, 1, ⋯, min(r, n). We can also define E(X) = \frac{nr}{N} and Var(X) = \frac{nr(N-r)(N-n)}{N^2(N-1)}.
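The ratio of binomial coefficients maps straight onto `math.comb`; N, r, and n below are arbitrary example values:

```python
import math

def hyper_pmf(x, N, r, n):
    """P(X = x) = C(r, x) * C(N - r, n - x) / C(N, n)."""
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)

# Draw n = 4 from N = 10 objects containing r = 3 successes, no replacement
p1 = hyper_pmf(1, 10, 3, 4)   # C(3,1)*C(7,3)/C(10,4) = 105/210 = 0.5
mean = 4 * 3 / 10             # E(X) = nr/N = 1.2
```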
Geometric Distribution
In this case, Bernoulli trials are repeated until the first success. If X is the number of independent
Bernoulli(p) trials until the first success, then X ~ Geometric(p) if
f(x) = p(1 - p)^{x-1}
for any x = 1, 2, 3, ⋯.
We also define E(X) = \frac{1}{p} and Var(X) = \frac{1-p}{p^2}.
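A one-line sketch of the geometric pmf, with p chosen arbitrarily:

```python
def geometric_pmf(x, p):
    """P(X = x) = p * (1 - p)^(x - 1): x - 1 failures, then one success."""
    return p * (1 - p) ** (x - 1)

p = 0.25
first_four = [geometric_pmf(x, p) for x in range(1, 5)]
mean = 1 / p   # E(X) = 4 trials expected before the first success
```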
Continuous Probability
The sample space for continuous probability is all open intervals. In effect, the probability of an
event occurring within a given interval is proportional to the length/size of that interval (for uniform
distributions).
Note that since ranges have probability (e.g. in a total range of 100, the sub-range 1 to 8 has probability
(8 - 1)/100 = 0.07), individual elements must have 0 probability.
The probability density function of a continuous random variable X describes the probability that
X takes on a value in the range (a, b):
\int_a^b f(x)\,dx = P(a < X < b)
The cumulative density function is the probability that X < x, or F(x) = P(X < x).
Note that these two density functions are related in the following way:
\int_a^b f(x)\,dx = P(a < X < b) = P(X < b) - P(X < a) = F(b) - F(a)
or in other words: the integral of the pdf over (a, b) equals cdf(b) - cdf(a).
Uniform Distribution
A continuous random variable is said to have a uniform distribution if the probability of a given subin-
terval is proportional to the length of that interval. If X is uniformly distributed over (a, b) we write
X ~ Unif(a, b), which gives us
f(x) = \frac{1}{b - a}
Normal Distribution
A random variable X has a normal distribution X ~ N(μ, σ²) if the pdf takes the form
f(x) = \frac{1}{\sqrt{2π}\,σ} e^{-\frac{(x - μ)^2}{2σ^2}}
where the expected value is μ and the variance is σ².
If X has a normal distribution, then Z = \frac{X - μ}{σ} ~ N(0, 1). Also note that the normal distribution is
symmetrical, i.e. P(Z > z) = P(Z < -z).
To solve a normal distribution, we reduce it to P(Z < x), where x ∈ ℝ, and look up the answer in a
normal distribution table.
Binomial Distribution
For any X ~ Binomial(n, p), if np > 5 and n(1 - p) > 5 (i.e. n is large and p is near 0.5), we have
Z = \frac{X - np}{\sqrt{np(1 - p)}} ≈ N(0, 1)
Because we are using a continuous distribution to approximate a discrete distribution, we include a
continuity correction:
P(a < X) ≈ P\left(\frac{(a + 0.5) - np}{\sqrt{np(1 - p)}} < Z\right)
P(X < b) ≈ P\left(Z < \frac{(b - 0.5) - np}{\sqrt{np(1 - p)}}\right)
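Instead of a table lookup, the standard normal cdf can be computed from the error function, which makes the continuity-corrected approximation easy to sketch (n and p here are arbitrary):

```python
import math

def phi(z):
    """Standard normal CDF, expressed via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# X ~ Binomial(100, 0.5); np = 50 > 5 and n(1 - p) = 50 > 5, so approximate
n, p = 100, 0.5
mu = n * p
sd = math.sqrt(n * p * (1 - p))

# P(X < 45) with continuity correction: P(Z < (44.5 - 50) / 5) = Phi(-1.1)
approx = phi((45 - 0.5 - mu) / sd)
```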
Sampling
Statistical Inference
The goal of statistical inference is to draw conclusions about a population, given only a small sample
of said population. We achieve a random subset of a population using random sampling.
Random Sampling
There are many common types of random sampling:
- For Simple Random Sampling, each unit in the population has the same chance of being selected.
- If we have distinct groups, Stratified Random Sampling may be an excellent option. We first
  divide the population into K distinct strata; from these, we select varying amounts of random
  units:
  - We could select these through equal allocation, i.e. an equal number from each stratum
    (simple random sampling).
  - We could use proportional allocation, i.e. a random number of units proportional to the
    stratum size.
  - We may try Neyman (Optimal) Allocation, where each sample is weighted by stratum
    variance.
- For a low-cost / more efficient solution, we may try Cluster Sampling. In this case, the popu-
  lation is divided into M natural clusters. We take a simple random sample of the clusters to get
  m clusters, and from these clusters we perform equal allocation.
Random Sample
A random sample of size n from an infinite population is a set of independent and identically distributed
random variables. Each random variable has the same probability distribution, mean, and variance.
Central Limit Theorem
If we have a random sample, then for large values of n (n > 25), we have
Z = \frac{\bar{X} - μ}{σ/\sqrt{n}} ≈ N(0, 1)
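The theorem can be seen empirically with a small simulation: sample means of Unif(0, 1) draws (a distribution that is nothing like a normal) cluster around μ with spread σ/√n. The sample sizes and repetition count below are arbitrary:

```python
import math
import random

random.seed(0)

# Unif(0, 1) has mean 0.5 and standard deviation sqrt(1/12)
n, reps = 30, 2000
mu, sigma = 0.5, math.sqrt(1 / 12)

# Draw `reps` sample means, each from n uniform observations
means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

grand_mean = sum(means) / reps
# The spread of the sample means should be close to sigma / sqrt(n)
sd_of_means = math.sqrt(sum((m - grand_mean) ** 2 for m in means) / (reps - 1))
```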
Confidence Intervals
For some random sample, if the probability function of X depends on some unknown parameter θ, then
we say an estimator of θ is a function of the sample
\hat{θ} = h(X_1, X_2, \ldots, X_n)
An estimator is said to be unbiased if its expected value is the parameter itself, i.e. E(\hat{θ}) = θ. The standard
deviation, or standard error, of an estimator is equal to SE(\hat{θ}) = \sqrt{Var(\hat{θ})}. Note that if we have two
estimators for a parameter, the one with the lower standard error will be more efficient.
A (1 - α)% confidence interval for a parameter θ is an observation of the random interval (L(X), U(X))
such that
P(L(X) < θ < U(X)) = 1 - α
Note that in this case it is the ends of the interval L(X) and U(X) which are random, not θ itself.
This also gives us α probability that the random interval does not contain the true value of the parameter.
Consider X ~ N(μ, σ²), where σ is a known value. A (1 - α)% confidence interval for μ is given by
\bar{x} \pm z_{1-α/2} \cdot \frac{σ}{\sqrt{n}}
where z_α is the α-quantile, i.e. the value such that P(Z ≤ z_α) = α.
The margin of error of a confidence interval is the distance from the centre of the interval to either
endpoint; here, z_{1-α/2} · σ/√n.
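The known-σ interval can be computed in a couple of lines; the sample values and the z ≈ 1.96 quantile for a 95% interval are illustrative:

```python
import math

def normal_ci(xbar, sigma, n, z):
    """CI for mu with known sigma: xbar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# 95% interval: alpha = 0.05, so z_{1 - alpha/2} = z_0.975, approximately 1.96
low, high = normal_ci(xbar=10.0, sigma=2.0, n=25, z=1.96)
# margin of error = 1.96 * 2 / 5 = 0.784, so the interval is (9.216, 10.784)
```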
