February 4, 2014
Chapter 2 – Stationary Time Series
* Forecasting with optimality
1. The case of normal time series
Let X_1, …, X_n be a time series.
An easy-to-implement criterion for forecasting X_{n+1} is to minimize the expected squared loss.
In other words, we will find a function f such that
E[(X_{n+1} − f(X_1, …, X_n))²] is as small as possible, where f is considered in some class of functions.
Theorem
The optimal f is the conditional expectation E[X_{n+1} | X_1, …, X_n],
i.e. f = E[X_{n+1} | X_1, …, X_n] a.s.
Proof: Let g be any function of X_1, …, X_n.
Let g* = E[X_{n+1} | X_1, …, X_n].

E[(X_{n+1} − g(X_1, …, X_n))²] = E[((X_{n+1} − g*) + (g* − g))²]
  = E[(X_{n+1} − g*)²] + 2E[(X_{n+1} − g*)(g* − g)] + E[(g* − g)²]

For the cross term,
  2E[(X_{n+1} − g*)(g* − g)] = 2E{ E[(X_{n+1} − g*)(g* − g) | X_1, …, X_n] }   (law of iterated conditional expectation)
  = 2E{ (g* − g) E[X_{n+1} − g* | X_1, …, X_n] }   since (g* − g) is a function of X_1, …, X_n
  = 2E{ (g* − g) · 0 } = 0

Hence E[(X_{n+1} − g)²] = E[(X_{n+1} − g*)²] + E[(g* − g)²]
=> for all g, E[(X_{n+1} − g)²] ≥ E[(X_{n+1} − g*)²]  ∎
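The theorem can be checked numerically. A small simulation sketch (the toy model below is not from the notes): for X_{n+1} = X_n + ε with ε ~ N(0, 1), the conditional mean is E[X_{n+1} | X_n] = X_n, so g*(x) = x should achieve a smaller expected squared loss than any competing g, e.g. g(x) = 0.5x.

```python
import random

random.seed(0)

mse_gstar = 0.0  # accumulated squared loss of g*(x) = x
mse_g = 0.0      # accumulated squared loss of a competing g(x) = 0.5 * x
trials = 100_000
for _ in range(trials):
    x_n = random.gauss(0, 1)
    x_next = x_n + random.gauss(0, 1)   # toy model: X_{n+1} = X_n + eps
    mse_gstar += (x_next - x_n) ** 2
    mse_g += (x_next - 0.5 * x_n) ** 2
mse_gstar /= trials
mse_g /= trials

print(mse_gstar, mse_g)  # mse_gstar ≈ 1.0 (= Var eps), mse_g ≈ 1.25
```

The exact values follow from the decomposition in the proof: E[(X_{n+1} − g)²] = E[(X_{n+1} − g*)²] + E[(g* − g)²] = 1 + E[(0.5 X_n)²] = 1.25.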
Theorem

Def: Let X be a p-dimensional random vector.
We say that X ~ N(µ, Σ) iff aᵀX ~ N(aᵀµ, aᵀΣa) for any p-dimensional constant vector a.
(Σ is a p×p matrix; X, µ, and a are vectors.)

If (X_1, X_2)ᵀ ~ N( (µ_1, µ_2)ᵀ, [[Σ_11, Σ_12], [Σ_21, Σ_22]] ) (a partitioned normal vector),
then the conditional distribution of X_1 given X_2 = a is N(µ*, Σ*),
where µ* = µ_1 + Σ_12 Σ_22^{-1} (a − µ_2)
and   Σ* = Σ_11 − Σ_12 Σ_22^{-1} Σ_21.
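In the bivariate (scalar blocks) case the theorem reduces to ordinary division. A numeric sketch (the values below are illustrative, not from the notes):

```python
# Bivariate normal: mu* = mu1 + (s12 / s22) * (a - mu2),
#                   sigma* = s11 - s12**2 / s22
mu1, mu2 = 0.0, 0.0
s11, s22, s12 = 1.0, 1.0, 0.5  # Sigma = [[1, 0.5], [0.5, 1]]
a = 2.0                        # observed value of X2

mu_star = mu1 + (s12 / s22) * (a - mu2)
sigma_star = s11 - s12 ** 2 / s22
print(mu_star, sigma_star)  # 1.0 0.75
```

Note the conditional variance Σ* = 0.75 is smaller than the marginal variance Σ_11 = 1: observing X_2 reduces the uncertainty about X_1.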
Corollary

Suppose X_1, …, X_n, X_{n+1} is a normal time series. Write X⃗_n = (X_1, …, X_n)ᵀ, and let
E[X⃗_n] = µ⃗_n and Cov[X⃗_n] = Σ_n.
Further assume that E[X_{n+1}] = µ_{n+1} and Cov[X_{n+1}, X⃗_n] = γ_n.

Then the best forecast of X_{n+1} is
E[X_{n+1} | X_n, X_{n-1}, …, X_1] = µ_{n+1} + γ_nᵀ Σ_n^{-1} (X⃗_n − µ⃗_n)

Proof: Plug into the above Theorem.
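A sketch of the corollary for n = 2, under an assumed AR(1)-style covariance structure (not given in the notes): constant mean µ and Cov[X_i, X_j] = ρ^|i−j|. Then Σ_n = [[1, ρ], [ρ, 1]] and γ_n = (ρ², ρ)ᵀ, and the forecast formula should collapse to µ + ρ(X_2 − µ), i.e. only the most recent observation matters.

```python
mu, rho = 5.0, 0.6
x1, x2 = 4.0, 6.5  # observed values of X_1, X_2

# invert the 2x2 matrix Sigma_n = [[1, rho], [rho, 1]] by hand
det = 1 - rho ** 2
inv = [[1 / det, -rho / det], [-rho / det, 1 / det]]
gamma = [rho ** 2, rho]  # Cov[X_3, X_1], Cov[X_3, X_2]

# weights w = Sigma_n^{-1} gamma_n
w = [inv[0][0] * gamma[0] + inv[0][1] * gamma[1],
     inv[1][0] * gamma[0] + inv[1][1] * gamma[1]]

# forecast = mu_{n+1} + gamma_n^T Sigma_n^{-1} (X_n - mu_n)
forecast = mu + w[0] * (x1 - mu) + w[1] * (x2 - mu)
print(forecast)  # ≈ 5.9, matching mu + rho * (x2 - mu)
```

Here w works out to (0, ρ): under this covariance structure the optimal weight on X_1 vanishes, which is the Markov property of an AR(1) series.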
Observation: The optimal forecast depends critically on the covariance structure of the time series.
2. What if the time series is not normal?

One solution: look for the best linear forecast.
In other words, try to find a function a_0 + a_1 X_1 + … + a_n X_n such that
E[(X_{n+1} − (a_0 + a_1 X_1 + … + a_n X_n))²]   (call this **)   is as small as possible.
Find the solution!
Let Y_i = X_i − µ_i, where µ_i = E[X_i].

** = E[(Y_{n+1} + µ_{n+1} − a_0 − a_1(Y_1 + µ_1) − … − a_n(Y_n + µ_n))²]
   = E[(Y_{n+1} − a_0* − a_1 Y_1 − … − a_n Y_n)²]
where a_0* = a_0 + a_1 µ_1 + … + a_n µ_n − µ_{n+1}.

Let a⃗ = (a_1, …, a_n)ᵀ and Y⃗_n = (Y_1, …, Y_n)ᵀ.

=> ** = E[(Y_{n+1} − a_0* − a⃗ᵀY⃗_n)²]
Open the squares:
** = E[Y_{n+1}²] + (a_0*)² + E[(a⃗ᵀY⃗_n)²] − 2a_0* E[Y_{n+1}] − 2E[Y_{n+1} a⃗ᵀY⃗_n] + 2a_0* E[a⃗ᵀY⃗_n]

Since E[Y_i] = 0 for all i, the terms linear in a_0* vanish, so a_0* enters ** only through (a_0*)²; to minimize **, a_0* should be 0.
** = E[Y_{n+1}²] + E[(a⃗ᵀY⃗_n)²] − 2a⃗ᵀE[Y_{n+1} Y⃗_n]

Let γ_n = Cov[X_{n+1}, X⃗_n] and Σ_n = Cov[X⃗_n].
By the fact that E[Y_i] = 0 for all i, we have
E[Y_{n+1} Y⃗_n] = γ_n and
E[Y⃗_n Y⃗_nᵀ] = Σ_n

Observe that E[(a⃗ᵀY⃗_n)²] = E[a⃗ᵀY⃗_n (a⃗ᵀY⃗_n)ᵀ] = E[a⃗ᵀY⃗_n Y⃗_nᵀ a⃗] = a⃗ᵀE[Y⃗_n Y⃗_nᵀ] a⃗ = a⃗ᵀΣ_n a⃗

** = E[Y_{n+1}²] + a⃗ᵀΣ_n a⃗ − 2a⃗ᵀγ_n
Since ** is a convex function of a⃗, differentiate ** wrt a⃗ and set the derivative to 0:
∂**/∂a⃗ = Σ_n a⃗ + Σ_nᵀ a⃗ − 2γ_n = 2Σ_n a⃗ − 2γ_n = 0   (Σ_n is symmetric)
=> a⃗ = Σ_n^{-1} γ_n

Finally, a_0* = 0, i.e. a_0 + a_1 µ_1 + … + a_n µ_n − µ_{n+1} = 0
=> a_0 = µ_{n+1} − a⃗ᵀµ⃗_n
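The two steps above can be assembled into the full best linear forecast a_0 + a_1 X_1 + … + a_n X_n. A sketch for n = 2 (the means and covariances below are illustrative numbers, not from the notes):

```python
mu = [1.0, 2.0]                    # mu_1, mu_2
mu_next = 3.0                      # mu_{n+1}
sigma = [[2.0, 0.8], [0.8, 1.0]]   # Sigma_n = Cov[X_n]
gamma = [0.6, 0.5]                 # gamma_n = Cov[X_3, (X_1, X_2)]

# solve Sigma_n a = gamma_n via the 2x2 inverse (Cramer's rule)
det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
a1 = (sigma[1][1] * gamma[0] - sigma[0][1] * gamma[1]) / det
a2 = (-sigma[1][0] * gamma[0] + sigma[0][0] * gamma[1]) / det

# intercept from a_0* = 0:  a_0 = mu_{n+1} - a^T mu_n
a0 = mu_next - (a1 * mu[0] + a2 * mu[1])

# forecast of X_3 given observed x1, x2
x1, x2 = 1.5, 2.5
forecast = a0 + a1 * x1 + a2 * x2
print(a0, a1, a2, forecast)
```

Note the forecast can equivalently be written in centered form as µ_{n+1} + a_1(x_1 − µ_1) + a_2(x_2 − µ_2), matching the Corollary of Section 1: for a normal series the best linear forecast coincides with the best forecast overall.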
