CSE 150 Lecture Notes - Lecture 24: Reinforcement Learning, Product Rule

54 views3 pages
HMMs
1) How to compute likelihood P(O1,...,OT)? O(n2T)
2) HOw to decode {S1*,...,Sn*} = argmaxS1,...,ST P(S1,...,ST|O1,...,OT)? O(n2T)
3) How to update beliefs in real time? P(St = 1|O1,..,Ot)? O(n2) per time step
4) Learning HMMs
Given sequence(s) of observations {O1,...,OT}
Goal: estimate {π,aij,bil} to maximize P(O1,...,OT)
Assume the # of hidden states is known in advance, fix n
EM algorithm
CPTs to estimate:
πi = P(S1 = i)
aij = P(St+1 = j|St = i) ← transition matrix
bik = P(Ot = k|St = i) ← emission matrix
E-step: compute posterior probs
P(S1 = i|O1,...,OT)
P(St = i,St+1 = j|O1,...,OT)
P(St = i,Ot = k|O1,...,OT) = I(Ot,k) P(St = i|O1,...,Ok) ← term vanishes if k is not observed
at the t’th
observation!
Assume we can compute the above “efficiently”
M-step in HMMs: (for one observation sequence)
πiP(S1 = i|O1,...,OT)
aijΣt P(St = i, St+1 = j|O1,...,OT)
Σt P(St = i|O1,...,OT)
bikΣt I(Otk) P(St = i|O1,...,OT)
Σt P(St = i|O1,...,OT)
How to compute posterior probs in E-step? (i.e. “efficient” inference in HMMs?)
Recall ɑit def P(O1,...,Ot,St = i)
n x T matrix Aside: ɑi1 = πi bi(O1)
← T → ɑj,t+1 = Σi=1
n ɑit aij bj(Ot+1)
|
n
|
Define: βit def P(Ot+1,Ot+2,...,OT|St = i) starts at Ot+1!!!
conditioned on St = i
Recursion for computing β-matrix
(1) base case
(a) βiT = P( -- |ST = i) = 1 for ALL i
Unlock document

This preview shows page 1 of the document.
Unlock all 3 pages and 3 million more documents.

Already have an account? Log in

Document Summary

Assume the # of hidden states is known in advance , fix n. I = p(s 1 = i) a ij = p(s t+1 = j|s t = i) transition matrix b ik = p(o t = k|s t = i) emission matrix. P(s t = i,o t = k|o 1 ,,o t ) = i(o t ,k) p(s t = i|o 1 ,,o k ) term vanishes if k is not observed at the t"th observation! I p(s 1 = i|o 1 ,,o t ) a ij t p(s t = i, s t+1 = j|o 1 ,,o t ) b ik t i(o t k) p(s t = i|o 1 ,,o t ) How to compute posterior probs in e-step? (i. e. efficient inference in hmms?) Recall it def p(o 1 ,,o t ,s t = i) n x t matrix. Conditioned on s t = i n it a ij b j (o t+1 )

Get access

Grade+20% off
$8 USD/m$10 USD/m
Billed $96 USD annually
Grade+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
40 Verified Answers
Class+
$8 USD/m
Billed $96 USD annually
Class+
Homework Help
Study Guides
Textbook Solutions
Class Notes
Textbook Notes
Booster Class
30 Verified Answers

Related Documents