
Date: 13.02.2018

Markov Chains

Camille Crumpton, Sisi Xiong, Tyler McDaniel



Questions

• What is the matrix A for the following graph when calculating PageRank according to the simple definition of PageRank?

[Graph: four vertices labeled 0, 1, 2, 3 with directed edges]

• What is the first name of the man credited for Markov Chains?



• What is one application of Markov Chains that you have used personally?


Tyler McDaniel

• From Asheville, NC
• CS PhD student; works on high-performance dense linear algebra libraries at ICL
• Prior work with graph algorithms in the financial sector
• Also worked in microcontrollers/IoT


Sisi Xiong

• Changde, Hunan, China

• PhD student majoring in Electrical Engineering.

• Research interests: processing large datasets using probabilistic approaches, e.g., Bloom filters


Camille Crumpton

• Hometown: Knoxville/Maryville, TN



• Computer Science graduate student

• Harpist

• Endurance athlete



Outline

• Overview of Markov Chains
• History of Markov Chains
• PageRank – Most Famous Use of Markov Chains
  • Algorithm
  • Implementation
  • Experiments
• Text Algorithm
  • Algorithm
  • Implementation
  • Applications – “Fun” uses!
  • Experiments
• More applications
  • Population processes
  • Adaptive cruise control
• Open Problem
• Discussion


Overview of Markov Chains

• What is a Markov Chain?
  • A discrete-time stochastic process that satisfies the Markov property
• Then what is the Markov property?
  • Predictions about the future of the process can be made using only the present state (no knowledge of past states is required)
  • Sometimes called the “memoryless” property


Overview of Markov Chains

• Often represented as a graph with probabilities as weights
• In the example to the right, each edge weight represents the probability of the Markov process changing from one state to another
• Can think of a Markov Chain as a stochastic (probability-driven) finite state machine



Overview of Markov Chains

• Can also be represented as a transition matrix:

        1     2     3     4
    1   0.6   0.2   0     0.2
    2   0.6   0.4   0     0
    3   0     0.3   0.4   0.3
    4   0.3   0     0     0.7

Overview of Markov Chains: Simple Example

• Let’s use a Markov Chain to make a simple prediction of the weather:
  • Raining today: 40% rain tomorrow, 60% sunny tomorrow
  • Sunny today: 20% rain tomorrow, 80% sunny tomorrow
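Iterating this two-state chain shows it settling into a long-run steady state. A minimal sketch (the state encoding and iteration count are our own choices, not from the slides):

```python
# States: 0 = rain, 1 = sunny; P[i][j] = Pr(tomorrow = j | today = i).
P = [[0.4, 0.6],
     [0.2, 0.8]]

dist = [1.0, 0.0]  # start from a rainy day
for _ in range(100):
    # One day forward: new_dist[j] = sum_i dist[i] * P[i][j]
    dist = [sum(dist[i] * P[i][j] for i in range(2)) for j in range(2)]
```

Regardless of the starting weather, `dist` converges to [0.25, 0.75]: in the long run it rains a quarter of the time.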



Overview of Markov Chains: Simple Example

http://setosa.io/ev/markov-chains/



Overview of Markov Chains: More Complexity

History behind Markov Chains

Andrey Andreyevich Markov
• Born: June 1856
• Died: July 1922 (aged 66)
• Russian mathematician
• Alma mater: St. Petersburg University
• Asked to stay at St. Petersburg University as a researcher upon graduation
• Studied stochastic processes



Markov Chain Beginnings: Probability + Poetry

• Andrey Markov spent hours poring over Alexander Pushkin’s novel in verse, Eugene Onegin
• He discovered patterns in certain letters following other letters
• He created an analysis of vowel and consonant patterns – the precursor to Markov Chains

http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain



Markov Chain Beginnings: Probability + Poetry

• “If the current letter I’m reading is a vowel, what is the probability that the next letter is a vowel?
  • A consonant?
  • What about three letters later?
  • Ten letters later?”

http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain



Markov Chain Beginnings: Probability + Poetry

"My uncle's shown his good intentions

By falling desperately ill;

His worth is proved; of all inventions

Where will you find one better still?

He's an example, I'm averring;

But, God, what boredom -- there, unstirring,

By day, by night, thus to be bid

To sit beside an invalid!

Low cunning must assist devotion

To one who is but half-alive:

You puff his pillow and contrive

Amusement while you mix his potion;

You sigh, and think with furrowed brow --

'Why can't the devil take you now?' "

http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain



Markov Chain Beginnings: Probability + Poetry

• Andrey Markov created a new branch of probability with his findings, now known as Markov chains
• He extended probability beyond coin flipping and dice rolling (independent events) to chains of linked events, where what happens next depends on the current state of the system

http://www.americanscientist.org/issues/pub/first-links-in-the-markov-chain



Markov Chains: Two Algorithms/Applications

• PageRank
  • Application – we use it every day!
  • Algorithm
  • Implementation
  • Experiments
• Markov Chain Text Algorithm (a.k.a. Markov Chain Algorithm)
  • Algorithm
  • Implementation
  • Applications
  • Experiments


PageRank: Introduction

• Inputs
  • A set of webpages (vertices), each of which has links to other webpages (edges)
  • Directed graph!

[Graph: webpages A–E connected by directed link edges]


PageRank: Introduction

• Purpose
  • Rank all webpages in terms of “importance”.
• How to define “importance”?
  • Analogy: citations – papers with more citations are more important.
  • Option: count how many backlinks a webpage has.
  • Caveat: if a page has a backlink from an “important” page, it should also be somewhat important.
  • Weighted directed graph!



PageRank: Introduction 

• Original paper
  • Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. Stanford InfoLab.
• “PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is. The underlying assumption is that more important websites are likely to receive more links from other websites.”*

*https://web.archive.org/web/20111104131332/https://www.google.com/competition/howgooglesearchworks.html

PageRank: Definition

• Intuitively
  • A webpage has a high rank if the sum of the ranks of its backlinks is large.
• Simple definition
  • u: a webpage
  • R(u): rank of u
  • F(u): the set of webpages that u points to
  • B(u): the set of webpages that point to u
  • N(u) = |F(u)|: the number of links from u (forward links)
  • c: a factor for normalization, i.e., a constant

    R(u) = c · Σ_{v ∈ B(u)} R(v) / N(v)


PageRank: Definition

• Construct an n × n matrix A:

    A_{u,v} = 1/N(v)  if there is an edge from v to u
            = 0       otherwise

• R = [R(1), R(2), …, R(n)]
• R = cAR
• R is an eigenvector of A (with eigenvalue 1/c, since AR = (1/c)R).


PageRank: Example

[Graph: four vertices 0, 1, 2, 3; 0 links to 1 and 2, 1 links to 2 and 3, 2 links to 3, 3 links to 0]

With A_{u,v} = 1/N(v) if there is an edge from v to u, and 0 otherwise:

    [R0]   [  0    0   0   1 ]   [R0]
    [R1] = [ 1/2   0   0   0 ] × [R1]
    [R2]   [ 1/2  1/2  0   0 ]   [R2]
    [R3]   [  0   1/2  1   0 ]   [R3]

Equivalently:

    R0 = 1 × R3
    R1 = 1/2 × R0
    R2 = 1/2 × R0 + 1/2 × R1
    R3 = 1/2 × R1 + 1 × R2
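The example system R = AR can be solved numerically by repeated multiplication (power iteration). This is an illustrative sketch with no damping factor; for this particular graph the iteration converges to R = [4/13, 2/13, 3/13, 4/13].

```python
# Link matrix A from the example: A[u][v] = 1/N(v) if v links to u.
A = [
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.0],
]

# Power iteration: start uniform and repeatedly apply R <- A*R.
R = [0.25] * 4
for _ in range(200):
    R = [sum(A[i][j] * R[j] for j in range(4)) for i in range(4)]

# Each column of A sums to 1, so the entries of R keep summing to 1
# and no renormalization is needed.
```

Plugging the limit back into the four equations above confirms it is a fixed point: e.g. R1 = (1/2)·R0 = (1/2)·(4/13) = 2/13.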


PageRank: Rank sink issue

• A loop (u, v) accumulates rank but never distributes it (no forward links leave the loop).

[Graph: w links into a loop between u and v]

• Modified PageRank:

    R′(u) = c · Σ_{v ∈ B(u)} R′(v) / N(v) + c·E(u)

• E(u) is some vector over the webpages that is considered a source of rank

PageRank: Random Surfer Model 

• A “random surfer” keeps clicking on successive links at random.
• If the surfer enters a rank sink, they “get bored” and jump to a random page.

    R(u) = c · Σ_{v ∈ B(u)} R(v) / N(v) + (1 − c)/N,   with c = 0.85

[Graph: pages u, v, w, x illustrating a random jump out of a rank sink]

PageRank: Markov chain

• The sum of the values in each column of A is 1.
• A is a Markov matrix!
• All probabilities are non-negative, so a steady-state vector is guaranteed to exist.
• PageRank always has a steady state.

    [R0]   [  0    0   0   1 ]   [R0]
    [R1] = [ 1/2   0   0   0 ] × [R1]
    [R2]   [ 1/2  1/2  0   0 ]   [R2]
    [R3]   [  0   1/2  1   0 ]   [R3]

PageRank: Convergence properties

• Experiments on a database of 322 million links
• Converged in 52 iterations.
• Scales very well.



PageRank: Implementation 

• R(0) = [1/N, 1/N, …, 1/N]
• ε = 0.0001
• while true:
    R(i+1) = 0.85 · A · R(i) + 0.15/N
    delta = ‖R(i+1) − R(i)‖₁
    if delta < ε: break
    R(i) = R(i+1)
• return R(i+1)
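The pseudocode above can be sketched as a runnable Python function. This is a minimal illustration, not the paper's implementation: the adjacency-list input format `links[v]` is our own choice, and dangling pages (no forward links) simply distribute nothing in this sketch.

```python
def pagerank(links, n, c=0.85, eps=1e-4, max_iter=1000):
    """links: dict mapping page v -> list of pages v points to; n: page count."""
    r = [1.0 / n] * n                       # R(0) = [1/N, ..., 1/N]
    for _ in range(max_iter):
        nxt = [(1.0 - c) / n] * n           # the 0.15/N random-jump term
        for v, outs in links.items():
            if outs:                        # spread c*R(v) over v's forward links
                share = c * r[v] / len(outs)
                for u in outs:
                    nxt[u] += share
        delta = sum(abs(a - b) for a, b in zip(nxt, r))  # L1 distance
        r = nxt
        if delta < eps:                     # convergence test from the slide
            break
    return r

# Example: the 4-page graph from the earlier slide.
ranks = pagerank({0: [1, 2], 1: [2, 3], 2: [3], 3: [0]}, 4)
```

Since this example has no dangling pages, total rank stays at 1 every iteration; a production implementation would also redistribute rank from dangling pages.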


PageRank: Experiment configuration

• Datasets
  • Test graphs from homework 2.
  • Several graphs from http://snap.stanford.edu/.
• Metrics
  • Convergence speed, in terms of iterations.
  • PageRank results: sparse vs. dense graphs.
  • PageRank results on real graphs.


PageRank: Experiment results

Iterations to convergence of all test graphs from hw2

[Chart: iterations to convergence vs. graph size (V, E) for each test graph]

Observations:
1. All graphs converge in fewer than 20 iterations.
2. No obvious pattern between convergence speed and graph size.



PageRank: Experiment results

PageRank results comparison between dense and sparse graphs

Observation: PageRank results of dense graphs are more spread out.


PageRank: Experiment results

• Why is there a jump there?
• If a webpage (vertex) has no backlinks (no webpage links to it), its only rank source is “random jumps”, which contribute a relatively small probability.
• Hypothesis: if a vertex has no backlinks, its PageRank is the smallest.
• Experiment results: matching!

    Graph                                          Random.500.01.txt    Random.1000.01.txt
    Number of webpages with smallest PageRank      101                  104
    Number of webpages with no backlinks           101                  104

PageRank: Experiment results

http://snap.stanford.edu/data/wiki-Vote.html
http://snap.stanford.edu/data/cit-HepTh.html

• Citation graph from the e-print arXiv: V = 27,770, E = 352,807; iterations = 17; vertices with no backlinks = 4734
• Wikipedia administrator vote graph: V = 7,115, E = 103,689; iterations = 17; vertices with no backlinks = 4590


Next Algorithm/Application! Markov Chain Text

Markov Chain Text (Intro)

• Process input text, analyzing letter/word (token) probabilities
• Create a graph (transition matrix) to represent those probabilities
• Generate “similar” text using the graph
• Order: how many prior tokens to use when generating the next token
• Example:
  • Input: Foo bar foo bar bar

[Graph: “foo” → “bar” with probability 1; “bar” → “foo”, “bar”, or end-of-line (\n), each with probability .33]

Markov Chain Text (Algorithm)

• Simple (bare bones!) pseudocode for order 1:
• PROCESS:
    Create dictionary
    For line in input:
        For (token_1, token_2) in line:
            Add {token_1 -> token_2} to dictionary
• GENERATE:
    Choose first word from dictionary -> token_1
    For each desired word in output:
        Look up token_2 in dictionary using token_1
        token_2 -> token_1
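The order-1 pseudocode above can be fleshed out into a runnable sketch. This version stores successor lists in a dictionary (the implementation slides note a transition matrix is used instead); all function names are illustrative.

```python
import random
from collections import defaultdict

def process(text):
    """PROCESS: record every adjacent (token_1, token_2) pair."""
    table = defaultdict(list)
    for line in text.splitlines():
        tokens = line.split()
        for t1, t2 in zip(tokens, tokens[1:]):
            table[t1].append(t2)    # duplicates encode the probabilities
    return table

def generate(table, n_words, seed=None):
    """GENERATE: walk the table, always choosing a recorded successor."""
    rng = random.Random(seed)
    word = rng.choice(list(table))  # choose a starting token
    out = [word]
    for _ in range(n_words - 1):
        successors = table.get(word)
        if not successors:          # dead end: token was never followed
            break
        word = rng.choice(successors)
        out.append(word)
    return " ".join(out)

# The slide's example input: "foo" is always followed by "bar",
# while "bar" is followed by "foo" or "bar".
table = process("foo bar foo bar bar")
```

Storing duplicate successors is a simple way to weight the random choice; picking uniformly from `["bar", "bar"]` versus `["foo", "bar"]` reproduces the edge probabilities without explicit counts.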



Markov Chain Text (Implementation)

• Store probabilities for each word based on order; use a transition matrix rather than a dictionary.
• Gather metadata and use parsing rules to format output correctly (punctuation, titles before names, etc.)
• May be complemented by other methods: n-gram methods for partial words, “fat-finger” grace techniques


Markov Chain Text (Applications)

• One method used in autocomplete software

• Also used in some bots

https://filiph.github.io/markov/



Applications – Autocomplete on Phones

Alternative caption: “Although the Markov-chain text model is still rudimentary, it recently gave me ‘Massachusetts Institute of America’. Although I have to admit it sounds prestigious.”


Applications – Autocomplete on Phones

• Most text-messaging applications use a Markov Chain model to predict the word you wish to type next, based on the last word.


Applications – Subreddit generation

https://www.reddit.com/r/SubredditSimulator/comments/3g9ioz/what_is_rsubredditsimulator/

• A fully-automated subreddit that generates random submissions and comments using Markov chains


Applications – Subreddit generation

• Explanation of the Markov Chain on Reddit:

• “Basically, you feed in a bunch of sentences, and even though it has no understanding of the meaning of the text, it picks up on patterns like ‘word A is often followed by word B’.”
• “Then when you want to generate a new sentence, it ‘walks’ its way through from the start of a sentence to the end of one, picking sequences of words that it knows are valid based on that initial analysis.”
• “So generally short sequences of words in the generated sentences will make sense, but often not the whole thing.”



Markov Chain Text (Experiment)

• Writing a sonnet using a Markov chain built from Billy Shakespeare’s extant sonnets:
• C++ (processing: .05 seconds; generating 100 lines: .91 seconds)

My days are

In thy wrong, 

that beauty tempting her pretty wrongs that thereby beauty's form in this.

Black lines of that I better state out of thine

in their wills count the lesson true,

the other mine, 

thou live twice in lease

find true that wild music sadly?

sweets dost deceive.



Markov Chain Text (Experiment)

• Writing a sonnet using a Markov chain built from Billy Shakespeare’s extant sonnets:
• Python (processing: .51 seconds; generating 100 lines: .71 seconds)

Hence, thou wouldst thou thy might

To bear greater wrong, than my invention quite

Dulling my friend

O for thy soul's imaginary sight

Presents thy dial's shady stealth mayst in their virtue answer Muse, 

Make answer, this fair were born

And dost beguile the conquest of heaven it hath masked not false to ruminate

That on the world, unbless some other write to thy monument

When that thou age black ink my transgression bow



Markov Chain Text (Experimental Results)

[Chart: processing time (seconds) as a function of word count (0–50,000), C++ vs. Python]

[Chart: generation time (seconds) as a function of word count (0–50,000), C++ vs. Python]

Results from a Haswell i7, OS X El Capitan



Markov Chain Text (Some Conclusions)

• Implementations were simple, sequential programs modified from existing public GitHub repos.
• The C++ version processed text much more quickly, as expected.
• The Python version outperformed in generating text; why?
  • Implementation detail: the Python script stores the processed text in a local SQL dictionary.
  • The C++ version operates on vectors of strings.


Markov Chains: More Applications

• Since the next state in a Markov chain is simply a function of the last state and a random variable, we can easily see applications for Markov chains…

• Queue lengths in call centers

• Stresses on materials

• Waiting time in production facilities

• Inventories in supply chains

• Water levels in dams

• Stock prices

• … and more


Population Process: Introduction

• Markov process
  • State: number of individuals in a population
  • Changes: addition or removal of individuals
• Applications
  • Ecology
  • Telecommunications
  • Queueing theory
  • Chemical kinetics
• An example: the birth-death process
  • A special case of a continuous-time Markov process
  • Two types of changes: birth and death

Population process: Birth-death process

[Diagram: states 0, 1, 2, 3, 4, 5, … with birth transitions n → n + 1 at rate λ_n and death transitions n → n − 1 at rate μ_n]

Birth rate: λ_n
Death rate: μ_n

• Pure birth process
  • Poisson process

Population process: Birth-death process

[Diagram: pure birth (Poisson) process: states 0, 1, 2, 3, 4, 5, … with only birth transitions, each at rate λ]

Population dynamics: Birth-death process

[Diagram: population vs. time; between t and t + h the population moves from n to n − 1, stays at n, or moves to n + 1]

    P_n(t + h) = P_n(t)·(1 − λ_n h − μ_n h) + P_{n−1}(t)·λ_{n−1} h + P_{n+1}(t)·μ_{n+1} h + o(h)

Letting h → 0:

    dP_n(t)/dt = −(λ_n + μ_n)·P_n(t) + λ_{n−1}·P_{n−1}(t) + μ_{n+1}·P_{n+1}(t)

Transition rate matrix:

    Q = [ −λ0      λ0
           μ1   −(λ1+μ1)     λ1
                   μ2     −(λ2+μ2)     λ2
                             μ3     −(λ3+μ3)     λ3
                                       μ4     −(λ4+μ4)  … ]
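The dynamics above can also be simulated event by event. The sketch below uses a Gillespie-style simulation with assumed rate forms λ_n = λ (constant immigration-style births) and μ_n = μ·n (per-individual deaths); these are one common choice, not the only one, and the function name is ours.

```python
import random

def simulate(lam, mu, n0, t_end, seed=0):
    """Simulate a birth-death chain with lambda_n = lam, mu_n = mu * n."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    history = [(t, n)]
    while t < t_end:
        birth, death = lam, mu * n          # state-dependent rates
        total = birth + death
        if total == 0:
            break                            # absorbed (no events possible)
        t += rng.expovariate(total)          # exponential time to next event
        if rng.random() < birth / total:     # choose which event fired
            n += 1
        else:
            n -= 1
        history.append((t, n))
    return history

hist = simulate(lam=2.0, mu=0.5, n0=5, t_end=10.0)
```

Because the death rate vanishes at n = 0, the population can never go negative, matching the state diagram where state 0 has no death transition.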

Population dynamics: Birth-death process

• Applications
  • Predicting extinction times
  • Chemistry
    • Radioactive transformations: birth process
    • New-atom decay: death process
  • Queueing theory
    • M/M/1 model
    • M/M/k model


Applications: Adaptive Cruise Control

• Some cruise-control systems on vehicles use Markov chains to calculate speed
• The operating environment is stochastic
• The road grade along a route can also be modeled as stochastic



Applications: Adaptive Cruise Control

• Vehicle speed and following distance can be optimized for the best average fuel economy and for travel time
• More in Optimization and Optimal Control in Automotive Systems



Open Problem

Open Problem: Cutoffs

• Stationary distribution: a vector representing the proportion of time a given Markov chain occupies each state, given enough time.
• Total variation distance: the distance between two probability distributions, equal to the maximum difference between the probabilities the two distributions assign to any event.
• Mixing time: the number of steps required for a chain’s distance from the stationary distribution to become small.



Open Problem: Cutoffs

• Build a Markov chain via a random walk over the symmetric group on n symbols
• The resulting chain’s stationary distribution will be uniform; this can be interpreted as a “fully mixed” deck.
• Define a “cutoff” as the existence of a sequence c(n) such that, as n approaches infinity, the distance to uniformity drops sharply to zero around step c(n).
• If we use a random-to-random shuffle, does such a cutoff exist?
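One step of the shuffle in question is easy to state in code. The sketch below is illustrative (the function name is ours): remove a uniformly random card and reinsert it at a uniformly random position; repeating this step is exactly the random walk on the symmetric group whose cutoff is the open question.

```python
import random

def random_to_random(deck, rng=random):
    """One random-to-random shuffle step: pick a card, reinsert it anywhere."""
    deck = list(deck)
    card = deck.pop(rng.randrange(len(deck)))      # remove a random card
    deck.insert(rng.randrange(len(deck) + 1), card)  # reinsert at a random slot
    return deck

# Repeated steps walk toward the uniform distribution on permutations.
deck = list(range(10))
for _ in range(100):
    deck = random_to_random(deck)
```

Each step is a permutation of the deck, so the walk stays inside the symmetric group; the open question concerns how abruptly the distribution of `deck` approaches uniform as the deck size grows.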



Open Problem: Cutoffs

http://pages.uoregon.edu/dlevin/MARKOV/markovmixing.pdf



References

Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. Stanford InfoLab.
snap.stanford.edu
https://en.wikipedia.org/wiki/PageRank
https://math.dartmouth.edu/~pw/math100w13/li.pdf
https://en.wikipedia.org/wiki/Population_process
https://en.wikipedia.org/wiki/Birth%E2%80%93death_process
https://en.wikipedia.org/wiki/Queueing_theory
https://en.wikipedia.org/wiki/Andrey_Markov
http://setosa.io/ev/markov-chains/
http://www.suaybarslan.com/birthdeathprocess4datamodelling.pdf
Hayes, B. (2013). First Links in the Markov Chain. American Scientist, 101(2), 92.
Waschl, H., Kolmanovsky, I., Steinbuch, M., & del Re, L. (Eds.). Optimization and Optimal Control in Automotive Systems.
https://books.google.com/books?id=ebS5BQAAQBAJ&lpg=PA148&dq=how%20does%20cruise%20control%20work%20markov%20chains&pg=PA142#v=onepage&q=how%20does%20cruise%20control%20work%20markov%20chains&f=false



Discussion

Questions Revisited

• What is the matrix A for the following graph when calculating PageRank according to the simple definition of PageRank?

[Graph: four vertices labeled 0, 1, 2, 3 with directed edges]

• What is the first name of the man credited for Markov Chains?



• What is one application of Markov Chains that you have used personally?

