# Summary notes

School: UTSG
Department: Statistical Sciences
Course: STA261H1
File: STA261H1.doc
Statistical Inference
RANDOM SAMPLE
Definition: Random Sample
$X_1, \dots, X_n$ is called a random sample from a distribution with pdf $f(x)$ (or pf $P(x)$) if $X_1, \dots, X_n$ are independent and have identical distribution with pdf $f(x)$ (or pf $P(x)$). It is often denoted iid (independent-identically-distributed).
Note
Let $X_1, \dots, X_n$ be a sample from a distribution with pdf $f(x)$. The joint pdf of $X = (X_1, \dots, X_n)$ is
$$f(x_1, \dots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n) = f(x_1) \cdots f(x_n).$$
SAMPLE MEAN
Definition: Sample Mean and Sample Variance
Let $X_1, \dots, X_n$ be a random sample. The sample mean is defined by
$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{X_1 + \cdots + X_n}{n},$$
and the sample variance is defined by
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X}_n)^2}{n - 1}.$$
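The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names `sample_mean` and `sample_variance` and the data values are made up for the example.

```python
def sample_mean(xs):
    """Return the sample mean (x_1 + ... + x_n) / n."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Return S^2 = sum((x_i - mean)^2) / (n - 1); needs n >= 2."""
    n = len(xs)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (n - 1)


# Hypothetical observed values x_1, ..., x_n (illustrative only).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean_value = sample_mean(data)
variance_value = sample_variance(data)
print(mean_value, variance_value)
```

Note the divisor $n - 1$ in the sample variance, as in the definition, rather than $n$.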
Theorem
If $X_1, \dots, X_n$ are iid, each with a $N(\mu, \sigma^2)$ distribution, then $k_1 X_1 + \cdots + k_n X_n$ has a normal distribution with mean $k_1 \mu + \cdots + k_n \mu = (k_1 + \cdots + k_n)\mu$ and variance $(k_1^2 + \cdots + k_n^2)\sigma^2$.
More generally, if $X_1, \dots, X_n$ are independent and each $X_i$ has a $N(\mu_i, \sigma_i^2)$ distribution, then $k_1 X_1 + \cdots + k_n X_n$ has a normal distribution with mean $k_1 \mu_1 + \cdots + k_n \mu_n$ and variance $k_1^2 \sigma_1^2 + \cdots + k_n^2 \sigma_n^2$.
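As a small check of the formulas in the theorem, the sketch below computes the mean and variance of a linear combination of independent normals. The helper name `linear_combination_params` and the numbers are assumptions chosen for illustration. Taking $k_i = 1/n$ recovers the familiar fact that the sample mean of an iid $N(\mu, \sigma^2)$ sample has mean $\mu$ and variance $\sigma^2/n$.

```python
def linear_combination_params(ks, mus, sigma2s):
    """Mean and variance of k_1 X_1 + ... + k_n X_n for independent
    X_i ~ N(mu_i, sigma_i^2), using the formulas from the theorem."""
    if not (len(ks) == len(mus) == len(sigma2s)):
        raise ValueError("ks, mus and sigma2s must have the same length")
    mean = sum(k * mu for k, mu in zip(ks, mus))
    variance = sum(k ** 2 * s2 for k, s2 in zip(ks, sigma2s))
    return mean, variance


# Special case: the sample mean of n = 4 iid N(10, 8) variables (k_i = 1/n).
n, mu, sigma2 = 4, 10.0, 8.0
mean_of_xbar, var_of_xbar = linear_combination_params(
    [1.0 / n] * n, [mu] * n, [sigma2] * n
)
print(mean_of_xbar, var_of_xbar)
```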
USEFUL DISTRIBUTION
Theorem
Suppose $X_1, \dots, X_n$ is a random sample from a $N(\mu, \sigma^2)$ distribution. Then $\bar{X}_n$ and $\sum_{i=1}^{n} (X_i - \bar{X}_n)^2$ are independent, and
$$\frac{\sum_{i=1}^{n} (X_i - \bar{X}_n)^2}{\sigma^2}$$
has a $\chi^2_{n-1}$ distribution.
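A rough simulation can make the theorem plausible. The sketch below is only an illustration under assumed settings ($n = 5$, $\mu = 3$, $\sigma = 2$, a fixed seed): it checks that the average of $\sum (X_i - \bar{X}_n)^2 / \sigma^2$ is close to $n - 1$, the mean of a $\chi^2_{n-1}$ distribution, and that its sample correlation with $\bar{X}_n$ is close to zero. Zero correlation is consistent with independence but does not prove it.

```python
import random


def simulate(n=5, mu=3.0, sigma=2.0, reps=20000, seed=0):
    """Draw `reps` normal samples of size n; return the sample means and
    the scaled sums of squares sum((x_i - xbar)^2) / sigma^2."""
    rng = random.Random(seed)
    means, scaled_ss = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        means.append(xbar)
        scaled_ss.append(sum((x - xbar) ** 2 for x in xs) / sigma ** 2)
    return means, scaled_ss


def correlation(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


means, scaled_ss = simulate()
avg_ss = sum(scaled_ss) / len(scaled_ss)  # should be near n - 1 = 4
corr = correlation(means, scaled_ss)      # should be near 0
print(avg_ss, corr)
```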
The t and F Distribution
Two important distributions useful in statistical inference are:
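1) If $W \sim N(0, 1)$ and $V \sim \chi^2_r$ are independent, then $\dfrac{W}{\sqrt{V / r}}$ has a t-distribution with $r$ degrees of freedom.
2) If $U \sim \chi^2_{r_1}$ and $V \sim \chi^2_{r_2}$ are independent, then $\dfrac{U / r_1}{V / r_2}$ has an F-distribution with $r_1$ and $r_2$ degrees of freedom.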
THE CENTRAL LIMIT THEOREM
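Let $X_1, \dots, X_n$ be a random sample with finite mean $\mu$ and variance $\sigma^2 > 0$. Let $S_n = X_1 + \cdots + X_n$. Then, as $n \to \infty$,
$$\frac{S_n - E(S_n)}{\sqrt{\operatorname{var}(S_n)}} = \frac{\bar{X}_n - E(\bar{X}_n)}{\sqrt{\operatorname{var}(\bar{X}_n)}} = \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2 / n}} \xrightarrow{D} N(0, 1).$$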
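The statement can be illustrated with a short simulation. The sketch below assumes a Uniform(0, 1) population (mean $1/2$, variance $1/12$), sample size $n = 30$ and a fixed seed, all chosen only for the example. It standardizes each sample mean as in the theorem and measures how often the result falls in $[-1.96, 1.96]$, which should be roughly 0.95 if the $N(0, 1)$ approximation is good.

```python
import math
import random


def standardized_means(n=30, reps=20000, seed=1):
    """Return (xbar - mu) / sqrt(sigma^2 / n) for `reps` Uniform(0, 1) samples."""
    rng = random.Random(seed)
    mu, sigma2 = 0.5, 1.0 / 12.0  # mean and variance of Uniform(0, 1)
    zs = []
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        zs.append((xbar - mu) / math.sqrt(sigma2 / n))
    return zs


zs = standardized_means()
# Fraction of standardized means inside the central 95% region of N(0, 1).
inside = sum(1 for z in zs if abs(z) <= 1.96) / len(zs)
print(inside)
```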
STATISTICAL MODEL
Consider a random sample $X_1, \dots, X_n$ from a distribution with pdf $f_\theta(x)$. The family $\{ f_\theta(x) \mid \theta \in \Omega \}$ (where $f_\theta(x)$ is a pdf (or $P_\theta(x)$ a pf), $\theta$ is an unknown parameter, and $\Omega$ is a parameter space) is a statistical model.
We know that the distribution under investigation is in the family, but don't know which one. Based on the sample values $x_1, \dots, x_n$, we find an estimate for $\theta$; once we find $\theta$, we know the distribution.
LIKELIHOOD FUNCTION
Definition: The Likelihood Function
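Let $X_1, \dots, X_n$ be a random sample from a distribution with pdf $f_\theta(x)$ (or pf $P_\theta(x)$). The likelihood is defined by $L: \Omega \to \mathbb{R}$ given by
$$L(\theta \mid x_1, \dots, x_n) = c f_\theta(x_1, \dots, x_n) = c f_\theta(x_1) \cdots f_\theta(x_n),$$
where $c > 0$ (or $L(\theta \mid x_1, \dots, x_n) = c P_\theta(x_1, \dots, x_n) = c P_\theta(x_1) \cdots P_\theta(x_n)$).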
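To make the definition concrete, the sketch below evaluates the likelihood of a Bernoulli($\theta$) sample on a grid of $\theta$ values. The Bernoulli model, the data, the grid and the function name `bernoulli_likelihood` are assumptions chosen for illustration. It also shows that the positive constant $c$ rescales the likelihood without moving the value of $\theta$ at which it is largest.

```python
def bernoulli_likelihood(theta, xs, c=1.0):
    """L(theta | x_1..x_n) = c * prod P_theta(x_i), where
    P_theta(1) = theta and P_theta(0) = 1 - theta."""
    value = c
    for x in xs:
        value *= theta if x == 1 else (1.0 - theta)
    return value


data = [1, 0, 1, 1, 0]                   # hypothetical observed sample
grid = [i / 100 for i in range(1, 100)]  # candidate theta values in (0, 1)

# The maximizing theta on the grid is the same for c = 1 and c = 3.
best_c1 = max(grid, key=lambda t: bernoulli_likelihood(t, data, c=1.0))
best_c3 = max(grid, key=lambda t: bernoulli_likelihood(t, data, c=3.0))
print(best_c1, best_c3)
```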
Definition: The Maximum Likelihood Estimate (MLE)
The function $\hat{\theta}: S \to \Omega$ is called the maximum likelihood estimator. $\hat{\theta}(s)$ is called the maximum likelihood estimate of $\theta$ if for each $\theta \in \Omega$,
$$L(\hat{\theta}(s) \mid x_1, \dots, x_n) \ge L(\theta \mid x_1, \dots, x_n).$$
The Algorithm
This suggests that in order to obtain the MLE of $\theta$, we maximize the likelihood function. Since the version of the likelihood function with $c = 1$ is maximized at the same value of $\theta$, we use this version. In most cases, this is done by differentiation.
1) Write the likelihood function $L(\theta \mid x_1, \dots, x_n) = \prod_{i=1}^{n} f_\theta(x_i)$.
2) Write the log likelihood function defined by $l(\theta \mid x_1, \dots, x_n) = \ln L(\theta \mid x_1, \dots, x_n) = \sum_{i=1}^{n} \ln f_\theta(x_i)$.
3) Write the score function $S(\theta \mid x_1, \dots, x_n) = \dfrac{\partial l(\theta \mid x_1, \dots, x_n)}{\partial \theta}$.
4) Write the score equation $S(\theta \mid x_1, \dots, x_n) = 0$ and solve for $\theta$.
5) Check that the solution is the global maximum. If it is, then it is the MLE of $\theta$.
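The steps above can be followed for a concrete model. As an illustration not taken from the notes, assume an Exponential distribution with rate $\theta$, so $f_\theta(x) = \theta e^{-\theta x}$ for $x > 0$. Then $l(\theta \mid x_1, \dots, x_n) = n \ln \theta - \theta \sum x_i$, the score is $n/\theta - \sum x_i$, and the score equation gives $\hat{\theta} = n / \sum x_i$. The second derivative $-n/\theta^2$ is negative for every $\theta > 0$, so the solution is the global maximum. The sketch below checks this numerically on made-up data; the function names are hypothetical.

```python
import math


def log_likelihood(theta, xs):
    """l(theta | x) = n ln(theta) - theta * sum(x) for the Exponential(rate) model."""
    return len(xs) * math.log(theta) - theta * sum(xs)


def score(theta, xs):
    """Derivative of the log likelihood with respect to theta."""
    return len(xs) / theta - sum(xs)


def exponential_mle(xs):
    """Solution of the score equation: theta_hat = n / sum(x)."""
    return len(xs) / sum(xs)


data = [0.5, 1.5, 2.0, 4.0]  # hypothetical observed values
theta_hat = exponential_mle(data)

# Step 5, checked crudely: no grid point has a larger log likelihood.
grid = [i / 100 for i in range(1, 501)]
is_max_on_grid = all(
    log_likelihood(theta_hat, data) >= log_likelihood(t, data) for t in grid
)
print(theta_hat, score(theta_hat, data), is_max_on_grid)
```

The grid comparison is only a sanity check; the concavity argument in the paragraph above is what establishes the global maximum.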
Theorem
If $\hat{\theta}(x_1, \dots, x_n)$ is the MLE in $\Omega$, and $\phi$ is a 1-1 function on $\Omega$, then the MLE in the new parameterization is $\phi(\hat{\theta}(x_1, \dots, x_n))$.
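Proof: Let $g_{\phi(\theta)} = f_\theta$ denote the pdf in the new parameterization and let $L^*$ be the corresponding likelihood. Then
$$L^*(\phi(\hat{\theta}(x_1, \dots, x_n)) \mid x_1, \dots, x_n) = g_{\phi(\hat{\theta}(x_1, \dots, x_n))}(x_1, \dots, x_n) = f_{\hat{\theta}(x_1, \dots, x_n)}(x_1, \dots, x_n) = L(\hat{\theta}(x_1, \dots, x_n) \mid x_1, \dots, x_n)$$
and, for every $\theta$,
$$L(\theta \mid x_1, \dots, x_n) = f_\theta(x_1, \dots, x_n) = g_{\phi(\theta)}(x_1, \dots, x_n) = L^*(\phi(\theta) \mid x_1, \dots, x_n).$$
Since $L(\hat{\theta}(x_1, \dots, x_n) \mid x_1, \dots, x_n) \ge L(\theta \mid x_1, \dots, x_n)$, it follows that for every $\theta$,
$$L^*(\phi(\hat{\theta}(x_1, \dots, x_n)) \mid x_1, \dots, x_n) \ge L^*(\phi(\theta) \mid x_1, \dots, x_n),$$
and so $\phi(\hat{\theta}(x_1, \dots, x_n))$ is the MLE in the new parameterization.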
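A numerical illustration of the theorem, under assumptions not in the notes: take the Exponential model with rate $\theta$, for which $\hat{\theta} = n / \sum x_i$, and reparameterize by the mean $\psi = \phi(\theta) = 1/\theta$, which is 1-1 on $\theta > 0$. The sketch below maximizes the likelihood directly in $\psi$ over a grid and compares the result with $\phi(\hat{\theta})$. The data, grid and function names are made up for the example.

```python
import math


def log_likelihood_rate(theta, xs):
    """Log likelihood of the Exponential model in terms of the rate theta."""
    return len(xs) * math.log(theta) - theta * sum(xs)


def log_likelihood_mean(psi, xs):
    """Same model, reparameterized by the mean psi = phi(theta) = 1 / theta."""
    return log_likelihood_rate(1.0 / psi, xs)


data = [0.5, 1.5, 2.0, 4.0]          # hypothetical observed values
theta_hat = len(data) / sum(data)    # MLE in the rate parameterization
phi_of_theta_hat = 1.0 / theta_hat   # value predicted by the theorem

# Maximize directly in the new parameterization over a grid of psi values.
grid = [i / 100 for i in range(1, 1001)]
psi_hat_grid = max(grid, key=lambda p: log_likelihood_mean(p, data))
print(phi_of_theta_hat, psi_hat_grid)
```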
The Algorithm: The Multidimensional Case
In the multidimensional case, the parameter space is $\Omega = \{ (\theta_1, \dots, \theta_k) \}$, $k > 1$.
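1) Write the likelihood function $L((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n)$.
2) Write the log likelihood function defined by $l((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n) = \ln L((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n)$.
3) Write the score function
$$S((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n) = \begin{pmatrix} \dfrac{\partial l((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n)}{\partial \theta_1} \\ \vdots \\ \dfrac{\partial l((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n)}{\partial \theta_k} \end{pmatrix}.$$
4) Write the score equation
$$S((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n) = \begin{pmatrix} \dfrac{\partial l((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n)}{\partial \theta_1} \\ \vdots \\ \dfrac{\partial l((\theta_1, \dots, \theta_k) \mid x_1, \dots, x_n)}{\partial \theta_k} \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}$$
and solve.
5) Check that the solution is the global maximum. As a first check, the critical point is a local maximum when the matrix of the second partial derivatives evaluated at $(\hat{\theta}_1, \dots, \hat{\theta}_k)$ is negative definite or, equivalently, when all its eigenvalues are negative.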
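The multidimensional steps can be sketched for a $N(\mu, \sigma^2)$ sample with $(\theta_1, \theta_2) = (\mu, \sigma^2)$, an example chosen here for illustration. Solving the score equation gives the standard results $\hat{\mu} = \bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n} \sum (x_i - \bar{x})^2$ (divisor $n$, not $n - 1$). The code below evaluates the score and the matrix of second partial derivatives at that point and checks that both eigenvalues are negative. The function names and data are hypothetical.

```python
import math


def normal_mle(xs):
    """Closed-form solution of the score equation for (mu, sigma^2)."""
    n = len(xs)
    mu_hat = sum(xs) / n
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, var_hat


def score(mu, var, xs):
    """Partial derivatives of the log likelihood with respect to mu and sigma^2."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return sum(x - mu for x in xs) / var, -n / (2 * var) + ss / (2 * var ** 2)


def hessian(mu, var, xs):
    """Matrix of second partial derivatives of the log likelihood."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    h11 = -n / var
    h12 = -sum(x - mu for x in xs) / var ** 2
    h22 = n / (2 * var ** 2) - ss / var ** 3
    return [[h11, h12], [h12, h22]]


def eigenvalues_2x2_symmetric(h):
    """Eigenvalues of a symmetric 2x2 matrix via the quadratic formula."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    mid = (a + d) / 2
    rad = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mid - rad, mid + rad


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical values
mu_hat, var_hat = normal_mle(data)
score_at_mle = score(mu_hat, var_hat, data)
eigs = eigenvalues_2x2_symmetric(hessian(mu_hat, var_hat, data))
print(mu_hat, var_hat, score_at_mle, eigs)
```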
STANDARD ERROR AND BIAS
Suppose $\hat{\theta}$ is the MLE; $\hat{\phi}(x_1, \dots, x_n) = \phi(\hat{\theta}(x_1, \dots, x_n))$ is the estimate of $\phi(\theta)$. How reliable are the estimates?
One measure of accuracy commonly used is MSE (mean squared error).
Definition: Mean Squared Error