# Computational Genomics Lecture 10: Hidden Markov Models (HMMs)

## Outline

## Finite, or Discrete, Markov Models

• Q1: Computing the probability of a given observation.

• A1: The forward-backward dynamic programming algorithm.

• Q2: Finding the most probable sequence of states, given an observation.

• A2: Viterbi's dynamic programming algorithm.
• Q3: Learning the best model, given an observation.

• A3: Expectation Maximization (EM), a heuristic.

## A discrete (finite) system

• N distinct states.
• Begins (at time t=1) in some initial state(s).
• At each time step (t=1,2,…) the system moves from the current state to the next state (possibly the same as the current state), according to transition probabilities associated with the current state.
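The stepping rule above can be sketched in a few lines of Python. The two-state chain and its transition probabilities here are hypothetical, chosen only to illustrate the mechanics:

```python
import random

# Hypothetical two-state chain (0 = rainy, 1 = sunny); the numbers are
# illustrative, not taken from the lecture.
# transition[i][j] = probability of moving from state i to state j.
transition = [
    [0.6, 0.4],  # from rainy
    [0.3, 0.7],  # from sunny
]

def step(state, rng):
    """Pick the next state according to the current state's transition row."""
    r = rng.random()
    cumulative = 0.0
    for next_state, p in enumerate(transition[state]):
        cumulative += p
        if r < cumulative:
            return next_state
    return len(transition[state]) - 1  # guard against rounding when r is ~1.0

def run_chain(initial_state, steps, seed=0):
    """Simulate `steps` moves of the chain, returning all visited states."""
    rng = random.Random(seed)
    states = [initial_state]
    for _ in range(steps):
        states.append(step(states[-1], rng))
    return states

path = run_chain(initial_state=1, steps=10)
```

Note that the next state depends only on the current state, which is exactly the Markov property discussed next.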

## After Andrei Andreyevich Markov (1856–1922)

## Biological Relevance

• The probability to start in a given state i is πi; the vector π represents these start probabilities.

## Markov Property

## Markov Chains

## Simple Minded Weather Example

## Coke vs. Pepsi (a central cultural dilemma)

• The balance equations must hold, yielding p0 = 2/3, p1 = 1/3.
• If the Markov chain is positive recurrent, there exists a stationary distribution. If it is positive recurrent and irreducible, there exists a unique stationary distribution, and the process constructed by taking the stationary distribution as the initial distribution is ergodic. In that case the average of a function f over samples of the Markov chain equals its average with respect to the stationary distribution.
• Pπ = π: the stationary distribution π is an eigenvector of the transition matrix, associated with the eigenvalue 1.
• Problem: given that the weather on day 1 (t=1) is sunny (state 3), what is the probability of the observation O?

## Ergodic model

• Strongly connected: there is a directed path with positive probabilities from each state i to each state j (but not necessarily a complete directed graph).

## Third Example: A Friendly Gambler

## Fourth Example: A Friendly Gambler

• Our next destination: hidden Markov chains.

## Hidden Markov Models (probabilistic finite state automata)

• Often we face scenarios where states cannot be directly observed.
• We need an extension: Hidden Markov Models.

## Hidden Markov Models - HMM

## Example: Dishonest Casino

## Coin-Tossing Example

• The likelihood of an observation measures the quality of the model M. Viewed this way, it enables discrimination/selection among alternative models.

## C-G Islands Example

• In the human genome, CG dinucleotides are relatively rare.

• CG pairs undergo a process called methylation that modifies the C nucleotide.
• A methylated C mutates (with relatively high chance) to a T.
• Promoter regions are CG-rich.

• These regions are not methylated, and thus mutate less often
• These are called CpG islands.
• Using maximum likelihood estimates from 60K nucleotides, we get two models.
• Given a sequence X1,…,Xn we compute the likelihood ratio.

## Empirical Evaluation

• What do we do when the window intersects the boundary of a CpG island?
• A transition from a '-' state to a '+' state marks the start of a CpG island.

## A Different C-G Islands Model

• The likelihood is obtained by summing over all possible state sequences.
• Can this be made more efficient?
• Why isn't it efficient? It is O(2L·Q^L):

• For a given state sequence of length L we have about 2L calculations:
• P(Q|M) = π_Q1 · a_Q1,Q2 · a_Q2,Q3 · … · a_Q(L-1),QL
• P(O|Q) = b_Q1,O1 · b_Q2,O2 · … · b_QL,OL
• There are Q^L possible state sequences.
• So, if Q = 5 and L = 100, the algorithm requires 200 × 5^100 computations.
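The contrast can be made concrete with a small sketch: brute-force enumeration over all Q^L state sequences versus the dynamic-programming forward recursion. The 2-state, 2-symbol model and all probabilities below are hypothetical, chosen only for illustration:

```python
from itertools import product

# Hypothetical 2-state, 2-symbol HMM (illustrative numbers only).
pi = [0.5, 0.5]                  # start probabilities
a = [[0.9, 0.1], [0.1, 0.9]]     # a[i][j]: transition probability i -> j
b = [[0.5, 0.5], [0.75, 0.25]]   # b[i][o]: probability of emitting symbol o in state i
Q = len(pi)

def brute_force_likelihood(obs):
    """P(O|M) by enumerating all Q^L state sequences: O(L * Q^L) work."""
    total = 0.0
    for q in product(range(Q), repeat=len(obs)):
        p = pi[q[0]] * b[q[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= a[q[t - 1]][q[t]] * b[q[t]][obs[t]]
        total += p
    return total

def forward_likelihood(obs):
    """The same P(O|M) by dynamic programming: O(L * Q^2) work."""
    # alpha[j] = P(o_1..o_t, state at time t is j)
    alpha = [pi[j] * b[j][obs[0]] for j in range(Q)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * a[i][j] for i in range(Q)) * b[j][o]
                 for j in range(Q)]
    return sum(alpha)

obs = [0, 1, 0, 0, 1]
assert abs(forward_likelihood(obs) - brute_force_likelihood(obs)) < 1e-12
```

For Q = 5 and L = 100 the brute-force sum is hopeless, while the recursion needs only on the order of L·Q² multiplications.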
• We can use the forward-backward (F-B) algorithm to do things efficiently.

## A white board presentation

• Option 1) The likelihood is measured using any sequence of states of length T.

• This is known as the “Any Path” Method
• Option 2) We can choose an HMM by the probability generated using the best possible sequence of states.

• We'll refer to this method as the "Best Path" Method.
• Recall that given a model M, a sequence of observations O, and a sequence of states Q, we can efficiently compute P(O|Q,M) (watch out for numeric underflows).

## A white board presentation of Viterbi's algorithm

• Computing posterior probabilities for "fair" at each point in a long sequence.
• Question III can be viewed as a learning problem: we want to use the sequence of observations to "train" an HMM and learn the optimal underlying model parameters (transition and output probabilities).
• Nka = the number of times hi = k and xi = a.
• These counts are inaccessible, since we do not observe hi.
• If we have the counts Akl and Bka, we can compute the transition probabilities; similarly, the emission probabilities.

## Reiterate

• P(x1,…,xn : Akl, Bka) ≤ P(x1,…,xn : A'kl, B'ka)

• Likelihood grows in each iteration
• If P(x1,…,xn : Akl, Bka) = P(x1,…,xn : A'kl, B'ka), then Akl, Bka is a stationary point of the likelihood:

• either a local maximum, a local minimum, or a saddle point.

## Compute forward and backward messages
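A minimal sketch of these messages, together with the expected-count accumulation and re-estimation they feed (one EM iteration on a hypothetical 2-state, 2-symbol model; all parameter values are illustrative):

```python
# One Baum-Welch (EM) iteration, sketched for a hypothetical
# 2-state, 2-symbol HMM. All parameter values are illustrative.

Q, S = 2, 2                      # number of states, number of symbols
pi = [0.5, 0.5]                  # start probabilities
a = [[0.8, 0.2], [0.2, 0.8]]     # a[k][l]: transition probability k -> l
b = [[0.6, 0.4], [0.3, 0.7]]     # b[k][s]: probability of emitting symbol s in state k

def baum_welch_step(obs):
    n = len(obs)
    # Forward messages: f[t][k] = P(o_1..o_t, state at time t is k)
    f = [[pi[k] * b[k][obs[0]] for k in range(Q)]]
    for t in range(1, n):
        f.append([sum(f[t - 1][i] * a[i][k] for i in range(Q)) * b[k][obs[t]]
                  for k in range(Q)])
    # Backward messages: bk[t][k] = P(o_{t+1}..o_n | state at time t is k)
    bk = [[1.0] * Q for _ in range(n)]
    for t in range(n - 2, -1, -1):
        bk[t] = [sum(a[k][l] * b[l][obs[t + 1]] * bk[t + 1][l] for l in range(Q))
                 for k in range(Q)]
    likelihood = sum(f[n - 1][k] for k in range(Q))

    # E-step: expected transition counts A[k][l] and emission counts B[k][s]
    A = [[0.0] * Q for _ in range(Q)]
    B = [[0.0] * S for _ in range(Q)]
    for t in range(n):
        for k in range(Q):
            B[k][obs[t]] += f[t][k] * bk[t][k] / likelihood
            if t + 1 < n:
                for l in range(Q):
                    A[k][l] += (f[t][k] * a[k][l] * b[l][obs[t + 1]]
                                * bk[t + 1][l] / likelihood)

    # M-step: re-estimate parameters by normalizing the expected counts
    new_a = [[A[k][l] / sum(A[k]) for l in range(Q)] for k in range(Q)]
    new_b = [[B[k][s] / sum(B[k]) for s in range(S)] for k in range(Q)]
    return new_a, new_b, likelihood

new_a, new_b, lik = baum_welch_step([0, 0, 1, 1, 1, 0])
```

Repeating this step with the re-estimated parameters is exactly the "reiterate" loop above; as noted, the likelihood cannot decrease between iterations.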

• Time & Space complexity: O(nL)
## Accumulate expected counts

• We often do not know how many hidden values we should have or can learn.

## Communication Example
