STAT231 Final Exam Review
LaTeXer: W. Kong
PPDAC = Problem / Plan / Data / Analysis / Conclusion (See the ﬁnal page for a summary)
Deﬁnition 1.1. The target population is the set of animals, people or things about which you wish to draw
conclusions. A unit is a singleton of the target population.
Definition 1.2. The study population (S.P.) is a specified subset of the target population (T.P.). A sample is a subset of the study population, and each of its elements is a unit of the study population.
Definition 1.3. A variate is a characteristic of a single unit in a target population and is usually one of the following:
1. Response variates - the variates of interest in the study
2. Explanatory variates - variates that explain why responses vary from unit to unit
(a) Known - variates that are known to cause the responses
i. Focal - known variates that divide the target population into subsets
(b) Unknown - variates that cause the responses but cannot be explained
Definition 1.4. An attribute is a characteristic of a population, usually denoted by a function of the response variate. It has two other names, depending on the population: a parameter when it describes the target population (T.P.) and a statistic when it describes a sample.
Definition 1.5. The aspect is the goal of the study and is generally one of the following: descriptive, comparative, causative, or predictive.
Note 1. $\text{T.P.} \supseteq \text{S.P.} \supseteq \text{Sample}$
Definition 1.6. Let $a(x)$ be an attribute, viewed as a function of some population or sample $x$. We define the study error as
$a(\text{T.P.}) - a(\text{S.P.})$.
Definition 1.7. Similarly, we define the sample error as
$a(\text{S.P.}) - a(\text{sample})$.
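As a small numerical illustration of the two errors, the sketch below takes the attribute to be the average response. Every number, and the choice of the average as the attribute, is an assumption made only for illustration.

```python
from statistics import mean

# Hypothetical response values; every number here is invented for illustration.
target_population = [4, 8, 6, 10, 12, 2, 9, 13]   # T.P.
study_population = target_population[:6]           # S.P., a subset of T.P.
sample = study_population[:3]                      # sample, a subset of S.P.

def a(units):
    """Attribute of interest: here, the average response."""
    return mean(units)

study_error = a(target_population) - a(study_population)
sample_error = a(study_population) - a(sample)

print(f"study error  = {study_error}")
print(f"sample error = {sample_error}")
```

In practice neither error can be computed exactly, since the attributes of the target and study populations are unknown; the sketch only shows what each quantity measures.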
2 Measurement Analysis
The goal of these measures is to describe how spread out our data are and how the data points relate to one another.
2.1 Measurements of Spread
Definition 2.1. Coefficient of Variation (CV)
This measure provides a unit-less measurement of spread: $CV = \frac{\sigma}{\mu} \times 100\%$
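A minimal sketch of why the CV is unit-less, assuming the CV is the standard deviation divided by the mean and treating the data as a whole population. The function name and the height values are hypothetical.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return the CV as a percentage: (standard deviation / mean) * 100."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / mu * 100

# Hypothetical heights, recorded once in centimetres and once in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(f"CV in cm: {cv_cm:.4f}%   CV in m: {cv_m:.4f}%")
```

Changing the unit rescales the standard deviation and the mean by the same factor, so the ratio is unchanged.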
2.2 Measurements of Association
1. Covariance: In theory (a population), the covariance is defined as $\mathrm{Cov}(X,Y) = E((X - \mu_X)(Y - \mu_Y))$, but in practice (in samples) it is defined as $s_{XY} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n-1}$. Note that $\mathrm{Cov}(X,Y), s_{XY} \in \mathbb{R}$ and both give us an idea of the direction of the relationship but not the magnitude.
2. Correlation: In theory (a population), the correlation is defined as $\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$, but in practice (in samples) it is defined as $r_{XY} = \frac{s_{XY}}{s_X s_Y}$. Note that $-1 \leq \rho_{XY}, r_{XY} \leq 1$ and both give us an idea of the direction of the relationship AND the magnitude.
(a) An interpretation of the values is as follows: $|r_{XY}| \approx 1 \implies$ strong relationship, $|r_{XY}| = 1 \implies$ perfectly linear relationship, $r_{XY} > 0 \implies$ positive relationship, $r_{XY} < 0 \implies$ negative relationship, $|r_{XY}| \approx 0 \implies$ weak relationship
3. Relative-risk: From STAT230, this is the probability of something happening under a condition relative to this same thing happening if the condition is not met. Formally, for two events $A$ and $B$, it is defined as $RR = \frac{P(A \mid B)}{P(A \mid \bar{B})}$. An interesting property is that if $RR = 1$ then $A \perp B$ and vice versa.
4. Slope: This will be covered later on.
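The measures of association above can be sketched in a few lines. The function names, data, and counts below are hypothetical, and the relative risk is assumed to be $P(A \mid B)$ divided by $P(A \mid \bar{B})$, estimated from counts.

```python
from math import sqrt

def sample_covariance(xs, ys):
    """s_XY = sum((x_i - x_bar) * (y_i - y_bar)) / (n - 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """r_XY = s_XY / (s_X * s_Y)."""
    s_x = sqrt(sample_covariance(xs, xs))
    s_y = sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

def relative_risk(a_and_b, b_total, a_and_not_b, not_b_total):
    """RR = P(A | B) / P(A | not B), with both probabilities estimated from counts."""
    return (a_and_b / b_total) / (a_and_not_b / not_b_total)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]                 # hypothetical data lying exactly on y = 2x
ys_scaled = [100 * y for y in ys]     # same relationship, different units

print(sample_covariance(xs, ys), sample_covariance(xs, ys_scaled))
print(sample_correlation(xs, ys), sample_correlation(xs, ys_scaled))

# Hypothetical counts: A occurs in 30 of 100 units with B and 10 of 100 without B.
rr = relative_risk(30, 100, 10, 100)
print(rr)
```

Rescaling the data changes the covariance but leaves the correlation alone, which is why only the correlation speaks to the magnitude of the relationship.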
3 Statistical Models
Recall that the goal of statistics is to guess the value of a population parameter on the basis of one (or more) sample statistics.
3.1 Types of Models
Goal of statistical models: explain the relationship between a parameter and a response variate.
The following are the different types of statistical models that we will be examining:
1. Discrete (Binary) Model - either the population data is within parameters or it is not.
2. Response Model - these model the response and at most use the explanatory variate implicitly as a focal explanatory variate.
3. Regression Model - these create a function that relates the response and the explanatory variate (attribute or parameter); note here that we assume $Y_i = Y \mid x_i$.
4 Estimates and Estimators
Here, we only review the main ideas of estimates and estimators.
4.1 Maximum Likelihood Estimation (MLE) Algorithm
1. Define $L = f(y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} f(y_i)$, where we call $L$ a likelihood function. Simplify if possible. Note that $f(y_1, y_2, \ldots, y_n) = \prod_{i=1}^{n} f(y_i)$ because we are assuming random sampling, implying that $y_i \perp y_j$, $\forall i \neq j$.
2. Define $l = \ln(L)$. Simplify $l$ using logarithmic laws.
3. Find $\frac{\partial l}{\partial \theta_1}, \frac{\partial l}{\partial \theta_2}, \ldots, \frac{\partial l}{\partial \theta_n}$, set each of the partials to zero, and solve for each $\theta_i$, $i = 1, \ldots, n$. The solved $\theta_i$'s are called the estimates of $f$ and we add a hat, $\hat{\theta}_i$, to indicate this.
$\hat{\theta}$ is the realization (from a sample) of a distribution of estimates. The distribution is called an estimator and is denoted by $\tilde{\theta}$.
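The MLE steps can be sketched for one concrete model. The example assumes an Exponential model with mean $\theta$, so $f(y; \theta) = \frac{1}{\theta} e^{-y/\theta}$; the model choice, function names, and data are illustrative assumptions, not taken from the notes.

```python
from math import log

# Step 1: L(theta) = prod (1/theta) * exp(-y_i/theta) = theta**(-n) * exp(-sum(y_i)/theta)
# Step 2: l(theta) = ln L = -n*ln(theta) - sum(y_i)/theta
# Step 3: dl/dtheta = -n/theta + sum(y_i)/theta**2 = 0  =>  theta_hat = mean of the y_i

def log_likelihood(theta, ys):
    """l(theta) for the assumed Exponential model with mean theta."""
    return -len(ys) * log(theta) - sum(ys) / theta

def score(theta, ys):
    """Derivative of the log-likelihood with respect to theta."""
    return -len(ys) / theta + sum(ys) / theta**2

ys = [1.2, 0.7, 3.1, 2.0, 0.5]       # hypothetical observations
theta_hat = sum(ys) / len(ys)        # closed-form solution of score = 0

# Numerical sanity check: nearby values of theta give a lower log-likelihood.
grid = [theta_hat + step / 100 for step in range(-50, 51) if step != 0]
is_maximum = all(log_likelihood(theta_hat, ys) > log_likelihood(t, ys) for t in grid)

print(f"theta_hat = {theta_hat:.4f}, score at theta_hat = {score(theta_hat, ys):.2e}")
print(f"beats every grid point: {is_maximum}")
```

The grid check is only a sanity check on the algebra; it confirms that the stationary point found in step 3 is a maximum rather than a minimum.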
4.3 Biases in Statistics
Definition 4.1. We say that an estimator $\tilde{\theta}$ of a model parameter $\theta$ is unbiased if $E(\tilde{\theta}) = \theta$ holds. Otherwise, we say that our estimator is biased.
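To make the definition concrete, the sketch below computes expectations exactly by enumerating every equally likely sample of size 2 drawn with replacement from a tiny hypothetical population. It compares two estimators of the population variance: dividing by $n$ (biased) and dividing by $n - 1$ (unbiased). The population and function names are assumptions for illustration.

```python
from itertools import product
from statistics import mean

population = [1, 2, 3, 4]      # hypothetical finite population
n = 2                          # sample size, drawn with replacement

mu = mean(population)
sigma2 = mean([(y - mu) ** 2 for y in population])   # true variance (the parameter)

def divide_by_n(sample):
    """Variance estimate that divides the sum of squares by n."""
    y_bar = mean(sample)
    return sum((y - y_bar) ** 2 for y in sample) / len(sample)

def divide_by_n_minus_1(sample):
    """Variance estimate that divides the sum of squares by n - 1."""
    y_bar = mean(sample)
    return sum((y - y_bar) ** 2 for y in sample) / (len(sample) - 1)

samples = list(product(population, repeat=n))        # all 16 equally likely samples
expected_mean = mean([mean(s) for s in samples])
expected_biased = mean([divide_by_n(s) for s in samples])
expected_unbiased = mean([divide_by_n_minus_1(s) for s in samples])

print(f"mu = {mu}, E(sample mean) = {expected_mean}")
print(f"sigma^2 = {sigma2}, E(divide by n) = {expected_biased}, "
      f"E(divide by n-1) = {expected_unbiased}")
```

Averaging over all samples is exactly the expectation of the estimator, so the comparison with the true parameter shows bias directly rather than through simulation.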
5 Distribution Theory
We introduce the following new distributions.
• If $X \sim N(0,1)$ then $X^2 \sim \chi^2_1$, which we call a Chi-squared (pronounced "Kai-Squared") distribution on one degree of freedom
• Let $X \sim \chi^2_m$ and $Y \sim \chi^2_n$ be independent. Then $X + Y \sim \chi^2_{n+m}$, which is a Chi-squared on $n + m$ degrees of freedom
• Let $N \sim N(0,1)$, $X \sim \chi^2_v$, $X \perp N$. Then $\frac{N}{\sqrt{X/v}} \sim t_v$, which we call a student's t-distribution on $v$ degrees of freedom
Properties of the Student’s t-Distribution
• This distribution is symmetric
• For $T \sim t_v$ with $v > 30$, the student's t is almost identical to the normal distribution with mean 0 and variance 1
• For $v \leq 30$, $T$ has thicker tails and a flatter, less pronounced center than the standard normal distribution
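The constructions above can be checked loosely by simulation. This is a sketch under stated assumptions: the seed, the degrees of freedom, and the number of draws are arbitrary choices, and the check relies on the standard fact that a Chi-squared on $k$ degrees of freedom has mean $k$.

```python
import random
from math import sqrt

rng = random.Random(231)   # fixed seed so the run is repeatable

def chi_squared(df):
    """Sum of df squared independent N(0,1) draws."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(df))

def student_t(df):
    """N / sqrt(X / df) with N ~ N(0,1) independent of X ~ Chi-squared(df)."""
    return rng.gauss(0, 1) / sqrt(chi_squared(df) / df)

draws = 20000
m, n, v = 3, 2, 5          # hypothetical degrees of freedom

mean_chi1 = sum(chi_squared(1) for _ in range(draws)) / draws
mean_sum = sum(chi_squared(m) + chi_squared(n) for _ in range(draws)) / draws

t_draws = [student_t(v) for _ in range(draws)]
frac_positive = sum(t > 0 for t in t_draws) / draws
tail_t = sum(abs(t) > 3 for t in t_draws) / draws
tail_normal = sum(abs(rng.gauss(0, 1)) > 3 for _ in range(draws)) / draws

print(f"mean of X^2: {mean_chi1:.3f} (theory: 1)")
print(f"mean of chi2_{m} + chi2_{n}: {mean_sum:.3f} (theory: {m + n})")
print(f"fraction of t draws above 0: {frac_positive:.3f} (symmetry suggests 0.5)")
print(f"P(|T| > 3): {tail_t:.4f} versus P(|Z| > 3): {tail_normal:.4f}")
```

With small $v$ the t draws land beyond 3 far more often than standard normal draws do, which is the "thick tails" property in the list above.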
5.1 Least Squares Method
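There are two ways to use this method. For a given model $Y$ and parameter $\theta$, suppose that we get a best fit $\hat{y}$ and define $\hat{\epsilon}_i = |\hat{y}_i - y_i|$. The least squares approach proceeds in either of the following two ways:
1. (Algebraic) Define $W = \sum_{i=1}^{n} \hat{\epsilon}_i^2$. Calculate $\frac{\partial W}{\partial \theta}$ and set it to zero to determine $\hat{\theta}$.
2. (Geometric) Define $W = \sum_{i=1}^{n} \hat{\epsilon}_i^2 = \hat{\epsilon}^t \hat{\epsilon}$, here taking $\hat{\epsilon}$ as the vector of signed residuals $y_i - \hat{y}_i$. Note that $\hat{\epsilon} \perp \mathrm{span}\{\mathbf{1}, x\}$ and so $\hat{\epsilon}^t \mathbf{1} = 0$ and $\hat{\epsilon}^t x = 0$. Use these equations to determine $\hat{\theta}$.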
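Both views can be checked on a small example. The sketch assumes a simple linear model $y = \alpha + \beta x$ (suggested by the span of $\mathbf{1}$ and $x$), uses the closed-form solution obtained by setting the partial derivatives of $W$ to zero, and then verifies the geometric orthogonality conditions. The data and function name are hypothetical.

```python
def least_squares_line(xs, ys):
    """Fit y = alpha + beta * x by minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta = s_xy / s_xx               # from dW/dbeta = 0
    alpha = y_bar - beta * x_bar     # from dW/dalpha = 0
    return alpha, beta

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # hypothetical data
alpha_hat, beta_hat = least_squares_line(xs, ys)

# Geometric check: the signed residual vector is orthogonal to both 1 and x.
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(xs, ys)]
dot_with_ones = sum(residuals)
dot_with_x = sum(r * x for r, x in zip(residuals, xs))

print(f"alpha_hat = {alpha_hat:.4f}, beta_hat = {beta_hat:.4f}")
print(f"residuals . 1 = {dot_with_ones:.2e}, residuals . x = {dot_with_x:.2e}")
```

Both dot products vanish up to floating-point rounding, so the algebraic and geometric routes pick out the same estimates.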
Here, we deviate from the order of lectures and focus on the various types of constructed intervals. However,
in this section, I will only provide the formulas and not the motivation.
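| Name | Formula | Properties |
| --- | --- | --- |
| Confidence Intervals | $EST \pm c\,SE = \hat{\theta} \pm c\sqrt{Var(\tilde{\theta})}$ | If $Var(\tilde{\theta})$ is known, then $C \sim N(0,1)$, and if it is unknown, we replace $Var(\tilde{\theta})$ with $\widehat{Var}(\tilde{\theta})$ and $C \sim t_{n-q}$. When $\alpha = 5\%$, which affects our value of $c$, we are constructing a 95% confidence interval. |
| Predicting Intervals | $EST \pm c\,SE = f(\hat{\theta}) \pm c\sqrt{Var(Y_p)}$ | Same as above, except note that $Y_p$ is different from a standard model $Y_i = f(\theta) + \epsilon_i$ in that the first component is random (i.e. $Y_p = f(\tilde{\theta}) + \epsilon_p$). When $\alpha = 5\%$, which affects our value of $c$, we are constructing a 95% prediction interval. |
| Likelihood Intervals | Solution of $R(\theta) = \frac{L(\theta)}{L(\hat{\theta})} \geq 0.1$, where $L(\theta; y) = \prod_{i=1}^{n} f(y_i; \theta)$ | Solving $R(\theta) \geq 0.1$ gives the 10% likelihood interval for $\theta$, which corresponds roughly to a 95% confidence interval (closer to 97% under the usual chi-squared approximation). This interval is particularly useful for models that are not necessarily normal. |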
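The confidence interval formula can be sketched for the simplest case: estimating a mean when the standard deviation is known, so that $c$ comes from $N(0,1)$. The data, the value of sigma, and the function name are hypothetical. The unknown-variance case needs a t quantile, which the Python standard library does not provide, so it is left out here.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_confidence_interval(ys, sigma, alpha=0.05):
    """EST +/- c * SE for a mean, with known sigma and c taken from N(0,1)."""
    n = len(ys)
    estimate = mean(ys)
    c = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    se = sigma / sqrt(n)
    return estimate - c * se, estimate + c * se

ys = [9.8, 10.4, 10.1, 9.6, 10.6, 9.9, 10.2, 10.0, 9.7]   # hypothetical sample
lower, upper = normal_confidence_interval(ys, sigma=0.3)
print(f"95% confidence interval: ({lower:.4f}, {upper:.4f})")
```

Lowering alpha raises $c$ and widens the interval, which is how alpha "affects our value of c" in the table.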
7 Hypothesis Testing
While our conﬁdence interval does not tell us in a yes or no way whether or not a statistical estimate is true, a
hypothesis test does. Here are the steps:
1. State the hypothesis, $H_0 : \theta = \theta_0$ (this is only an example), called the null hypothesis ($H_1$ is called the alternative hypothesis and makes a statement contrary to the null hypothesis).
2. Calculate the discrepancy (also called the test statistic), denoted by $d = \frac{\hat{\theta} - \theta_0}{\sqrt{Var(\tilde{\theta})}} = \frac{\text{estimate} - H_0 \text{ value}}{SE}$, assuming that $\tilde{\theta}$ is unbiased. The random variable $D$, of which $d$ is the realization, is $N(0,1)$ if $Var(\tilde{\theta})$ is known and $t_{n-q}$ otherwise. Note that $d$ is the number of standard deviations $\hat{\theta}$ is from $\theta_0$.
3. Calculate a p-value given by $p = 2P(D > |d|)$. It is als