# Computational Genomics, Lecture 10: Hidden Markov Models (HMMs)

## Outline


• ## Q1: Computing the probability of a given observation.

• A1: The Forward-Backward (Baum-Welch) dynamic programming algorithm.

• ## Q2: Finding the most probable sequence of states, given an observation.

• A2: Viterbi's dynamic programming algorithm.

• ## Q3: Learning the best model, given an observation.

• A3: Expectation Maximization (EM): a heuristic.

## Finite, or Discrete, Markov Models

• ## A discrete (finite) system:

• N distinct states.
• Begins (at time t=1) in some initial state(s).
• At each time step (t=1,2,…) the system moves from the current state to the next state (possibly the same as the current state) according to transition probabilities associated with the current state.

• A stationary distribution π satisfies πP = π, where P is the transition matrix (row i holds the transition probabilities out of state i) and π is a row vector of state probabilities.
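A minimal numerical sketch of the stationary-distribution condition, using an illustrative 2-state transition matrix (toy numbers, not from the lecture) and the row-vector convention πP = π:

```python
import numpy as np

# Hypothetical 2-state transition matrix; row i holds P(next = j | current = i).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# Power iteration: repeatedly push an initial distribution through P.
pi = np.array([1.0, 0.0])
for _ in range(100):
    pi = pi @ P

# At the fixed point, pi @ P equals pi: pi is the stationary distribution.
print(np.allclose(pi @ P, pi))
print(pi)
```

For this P the fixed point works out to π = (5/6, 1/6); any starting distribution converges to it because the chain is ergodic.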

• ## Ergodic model

• Strongly connected: there is a directed path with positive transition probabilities from each state i to each state j (but not necessarily a complete directed graph).

## Hidden Markov Models (probabilistic finite state automata)

• Often we face scenarios where the states cannot be directly observed.
• We need an extension: Hidden Markov Models.

• ## In the human genome, CG dinucleotides are relatively rare

• CG pairs undergo a process called methylation that modifies the C nucleotide.
• A methylated C mutates (with relatively high probability) to a T.
• ## Promoter regions are CG-rich

• These regions are not methylated, and thus mutate less often.
• Such CG-rich regions are called CpG islands.
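The classic way to model this is a two-state HMM: a '+' state for CpG islands and a '-' state for background sequence, each with its own nucleotide emission distribution. A minimal sketch of such a parameterization; all numbers below are illustrative assumptions, not measured genome frequencies:

```python
# Two-state HMM sketch for CpG-island detection.
# States: '+' (inside a CpG island) and '-' (background).
# All probabilities are made-up illustrative values.
transitions = {
    '+': {'+': 0.9, '-': 0.1},
    '-': {'+': 0.1, '-': 0.9},
}
emissions = {
    '+': {'A': 0.15, 'C': 0.35, 'G': 0.35, 'T': 0.15},  # CG-rich: islands
    '-': {'A': 0.30, 'C': 0.20, 'G': 0.20, 'T': 0.30},  # CG-poor: methylated background
}

# Sanity check: every probability row must sum to 1.
for table in (transitions, emissions):
    for state, row in table.items():
        assert abs(sum(row.values()) - 1.0) < 1e-9
```

The self-transition probabilities being close to 1 encodes the idea that islands and background stretches are long runs, not single bases.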

• ## Why isn’t the naive approach efficient? – O(2L·Q^L)

• For a given state sequence of length L we need about 2L multiplications:
• P(Q|M) = πQ1 aQ1Q2 aQ2Q3 … aQL-1QL
• P(O|Q) = bQ1O1 bQ2O2 bQ3O3 … bQLOL
• There are Q^L possible state sequences.
• So, if Q=5 and L=100, the naive algorithm requires about 200·5^100 computations.
• We can use the forward-backward (F-B) algorithm to do this efficiently.
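The forward pass replaces the sum over all Q^L paths with a recursion over L positions, costing O(L·Q²) instead. A minimal sketch with toy parameters (the model numbers are illustrative, not from the lecture):

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """P(observations | model) via the forward recursion.
    pi: (Q,) initial distribution; A: (Q, Q) transition matrix;
    B: (Q, M) emission matrix; obs: sequence of symbol indices."""
    alpha = pi * B[:, obs[0]]             # alpha_1(i) = pi_i * b_i(O_1)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # alpha_t = (alpha_{t-1} A) * b(O_t)
    return alpha.sum()                    # sum over final states = P(O | M)

# Toy 2-state, 2-symbol model (illustrative numbers).
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(forward_likelihood(pi, A, B, [0, 1, 0]))
```

Each step is a Q×Q matrix-vector product, so the whole pass is O(L·Q²) time and O(Q) space, versus the exponential cost of enumerating every state sequence.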

• ## Option 1) The likelihood is computed by summing over every sequence of states of length T

• This is known as the “Any Path” method.
• ## Option 2) We can score an HMM by the probability obtained using the single best sequence of states

• We’ll refer to this method as the “Best Path” method.
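The “Best Path” is exactly what Viterbi's dynamic programming algorithm computes: it replaces the sum of the forward recursion with a max, and keeps backpointers to recover the path. A minimal sketch with the same toy parameters as above (illustrative numbers):

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most probable state sequence ("Best Path") for an observation."""
    Q, L = A.shape[0], len(obs)
    delta = np.log(pi) + np.log(B[:, obs[0]])   # log-probs avoid underflow
    back = np.zeros((L, Q), dtype=int)
    for t in range(1, L):
        scores = delta[:, None] + np.log(A)     # scores[i, j]: best path into j via i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Trace back from the best final state.
    path = [int(delta.argmax())]
    for t in range(L - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(viterbi(pi, A, B, [0, 0, 1, 1]))
```

Like the forward pass this runs in O(L·Q²) time; working in log-space keeps long sequences from underflowing to zero.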

• ## P(x1,…,xn: Akl, Bka) ≤ P(x1,…,xn: A’kl, B’ka)

• The likelihood does not decrease in any iteration.
• ## If P(x1,…,xn: Akl, Bka) = P(x1,…,xn: A’kl, B’ka), then (Akl, Bka) is a stationary point of the likelihood

• either a local maximum, a local minimum, or a saddle point.

• ## Compute forward and backward messages

• Time & Space complexity: O(nL)
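Combining the two message passes gives the state posteriors P(state at time t | observations), the quantities EM re-estimates from. A minimal sketch with toy parameters (illustrative numbers, same notation as above):

```python
import numpy as np

def posteriors(pi, A, B, obs):
    """State posteriors gamma[t, i] = P(q_t = i | O) from forward
    and backward messages."""
    Q, L = A.shape[0], len(obs)
    alpha = np.zeros((L, Q))
    beta = np.zeros((L, Q))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, L):                        # forward messages
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(L - 2, -1, -1):               # backward messages
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta                         # proportional to P(q_t = i, O)
    return gamma / gamma.sum(axis=1, keepdims=True)

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(posteriors(pi, A, B, [0, 1, 0]))
```

Each row of the result is a distribution over states for one position, so every row sums to 1.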

