Lecture Notes in Computer Science

Part I, LNCS 4984, pp. 93–101, 2008

bet	11/88
Sana	16.12.2017
Hajmi	12.42 Mb.
	#22381

1 ... 7 8 9 10 11 12 13 14 ... 88

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 93–101, 2008.

c Springer-Verlag Berlin Heidelberg 2008

A. Ichiki and M. Shiino

involvement of synaptic noise can be analyzed by the cavity method to derive

the TAP equation in the thermodynamic limit, since the energy concept appears

as an eﬀective Hamiltonian in this limit. It is natural to consider such neural

network models with ﬂuctuating synaptic couplings, since the synaptic couplings

in real biological systems are updated by learning rules and the time-sequence of

the synaptic couplings may be stochastic under the inﬂuence of noisy external

stimuli. Thus the study on such networks is required to understand the retrieval

process of the realistic networks. The aim of this paper is two-fold: (i) we will

investigate the networks with which realization of synaptic noise have the en-

ergy concept to apply the cavity method to derive the TAP equations, (ii) we

will show the TAP equations for the networks when the concept of the eﬀective

Hamiltonian appears.

This paper is organized as the follows: in the next section, we will brieﬂy review

how the energy concept appears in the network with synaptic noise and derive the

TAP equations and the order parameter equations by using the cavity method

and the SCSNA [8]. Once the eﬀective Hamiltonian is found, the replica method

is also applicable to derive the order parameter equations. However in the present

paper, to make clear the relationship between the TAP equations and the order

parameter equations, we do not use the replica trick. In section 3, we will inves-

tigate the cases where the energy concept appears in the networks with synap-

tic noise. We will see that the TAP equations and the order parameter equa-

tions for some models can be derived in the framework mentioned in section 2.

We will also mention that some diﬃculties to derive the TAP equations arise

in some models with other involvements of synaptic noise. In the last section,

we will make discussions on the structure of the TAP equations for the network

with temporally ﬂuctuating synaptic noise and conclude this paper.

Brief Review on Eﬀective Hamiltonian, TAP Equations

and Order Parameter Equations

In this section, we brieﬂy review the cavity method becomes applicable to the

network with ﬂuctuating synaptic couplings [8]. Then we derive the TAP equa-

tions and the order parameter equations self-consistently in the framework of

the SCSNA.

In this section, we deal with the following stochastic analog neural network

of N neurons with temporally ﬂuctuating synaptic noise (multiplicative noise):

˙x

−φ (x

) +

j(=i)

(t)x

+ η

(t),

(1)

(t)η

(t ) = 2Dδ

δ(t

− t ),

(2)

where x

(i = 1,

· · · , N) represents a state of the neuron at site i taking a

continuous value, φ(x

) is a potential of an arbitrary form which determines the

probability distribution of x

in the case without the input

j(=i)

, η

the

Langevin white noise with its noise intensity 2D and J

(t) the synaptic coupling.

TAP Equation for Associative Memory Neural Network Models

We note here that, in the case of associative memory neural network, the synaptic

coupling J

is usually deﬁned by the well-known Hebb learning rule. However,

in the present paper, we will deal with the coupling J

ﬂuctuating around the

Hebb rule with a white noise:

(t) = ¯

(t),

(3)

(t)

(t ) =

2 ˜

δ(t

− t ),

(4)

where ¯

is deﬁned by the usual Hebb learning rule ¯

≡

μ=1

with

p = αN the number of patterns embedded in the network, ξ

i

=

±1 is the μ

embedded pattern at neuron i, and

(t) denotes the synaptic noise independent

of η

i

(t), which we assume in the present model as a white noise with its intensity

2 ˜

D/N .

Using Ito integral, we obtain the Fokker-Planck equation corresponding to the

Langevin equation (1) as

∂P (t, x)

∂t

−

i=1

∂

∂x

⎧

⎨

⎩

−φ (x

) +

j(=i)

− D + ˜

D ˆ

∂

∂x

⎫

⎬

⎭

P (t, x), (5)

where ˆ

q

≡

j(=i)

. Since the self-averaging property holds in the thermody-

namic limit N

→ ∞, ˆq is identiﬁed as

q =

i=1

(6)

The order parameter ˆ

q is obtained self-consistently in our framework as seen be-

low. Supposing ˆ

q is given, one can easily ﬁnd the equilibrium probability density

for the Fokker-Planck equation (5) as

(x) = Z

−1

exp

⎧

⎨

⎩

−β

eﬀ

⎛

⎝

i=1

φ(x

)

−

⎞

⎠

⎫

⎬

⎭

(7)

where Z denotes the normalization constant and

−1

eﬀ

≡ D + ˜

D ˆ

(8)

plays the role of the eﬀective temperature of the network. The temperature of the

system is modiﬁed as a consequence of the multiplicative noise and it depends on

the order parameter ˆ

q. Notice here that the equilibrium distribution of the system

becomes Gibbs distribution in the thermodynamic limit N

→ ∞. The equilib-

rium solution for equation (5) in the ﬁnite N -body system indeed diﬀers from the

probability density (7). However, since

N

j(=i)

−

= O(1/

√

N ), the

diﬀerence between the probability densities in the ﬁnite N -body system P

and

in the system in the thermodynamic system P

→∞

is P

→∞

−P

= O(1/

√

N ).

A. Ichiki and M. Shiino

Thus one can conclude that the equilibrium density for equation (5) converges

to the probability density (7) in the thermodynamic limit N

→ ∞. Since

we have explicitly written down the equilibrium probability density (7) as a

Gibbsian form, one can deﬁne the eﬀective Hamiltonian of (suﬃciently large)

N -body system as

N

≡

i=1

φ(x

)

−

(9)

Since we have found the eﬀective Hamiltonian and the eﬀective temperature,

one can apply the usual cavity method [3] to this system and derive the TAP

equation.

According to the cavity method, we divide the Hamiltonian of N -body system

(9) into that of (N

−1)-body system and the part involving the state of i

neuron

= φ(x

)

− h

+ H

−1

where h

≡

j(=i)

is the local ﬁeld at site i and the Hamiltonian of (N

−1)-

body system H

−1

is given as H

−1

≡

j(=i)

φ(x

)

−

. The

marginal probability density of x

and the local ﬁeld h

is given as

N

(x

, h

) =

⎡

⎣

j(=i)

⎤

⎦ δ

⎛

⎝h

−

j(=i)

⎞

⎠ P

(x)

= ˜

−1

exp

{ −β

eﬀ

[φ(x

)

− h

]

} P

−1

where ˜

Z is the normalization constant and P

−1

) denotes the probability

density of the local ﬁeld h

in the (N

− 1)-body system deﬁned as

−1

)

≡ Z

−1

⎡

⎣

j(=i)

⎤

⎦ δ

⎛

⎝h

−

j(=i)

⎞

⎠ exp [−β

eﬀ

−1

] ,

where Z

−1

denotes the normalization constant. Since the local ﬁeld is given as

the summation of a suﬃciently large number of random variables and their cross-

correlations are expected to be O(1/

√

N ), one can expect that P

−1

) turns

out to be a Gaussian density in the thermodynamic limit N

→ ∞ according to

the central limit theorem:

−1

) =

√

2πσ

exp

−

− h

i N

−1

)

2σ

where

−1

represents the thermal average with respect to the (N

− 1)-body

probability density P

−1

(x) and σ

is the variance of P

−1

(h

), which is eval-

uated later self-consistently in the framework of the SCSNA. Then taking the

TAP Equation for Associative Memory Neural Network Models

average of x

with respect to the marginal probability P

(x

i

, h

) straightfor-

wardly yields

= F ( h

i N

−1

(10)

where F is a transfer function deﬁned as

F (y)

≡

dx x exp

−β

eﬀ

φ(x)

− yx −

eff

dx exp

−β

eﬀ

φ(x)

− yx −

eff

(11)

Similarly h

i N

−1

is obtained as

i N

−1

= h

− β

eﬀ

Thus we have the pre-TAP equation

= F

⎛

⎝

j(=i)

− Γ

Ons

⎞

⎠ ,

(12)

where Γ

Ons

≡ β

eﬀ

. Since the concrete form of the transfer function F depends

on the eﬀective temperature β

eﬀ

and the variance of the local ﬁeld σ

, it is

necessary to obtain β

eﬀ

and σ

to have the TAP equation [7,9].

Then we can apply the SCSNA to equation (12) to determine β

eﬀ

and σ

self-

consistently. We assume that the only one condensed pattern

{ξ

} is retrieved.

Using the overlap order parameter m

≡

i=1

and the concept of the

SCSNA [7,9], we have

i

= ξ

+ ξ

+ z

iμ

+ Γ

SCSNA

(13)

where

≥2

= ξ

+ z

iμ

+ γ x

, Γ

SCSNA

≡ γ − α and z

iμ

is a Gaus-

sian random variable with zero mean. We will evaluate the overlap m

self-

consistently and obtain z

iμ

and γ. Substituting equation (13) into the pre-TAP

equation (12) reads

= F ξ

+ ξ

+ z

iμ

+ (Γ

SCSNA

− Γ

Ons

) x

and comparing this equation with equation (10) yields [7]

SCSNA

= Γ

Ons

since h

i N

−1

is considered to be a Gaussian random variable which should not

contain the Onsager reaction term. Noting that m

= O(1/

√

N ) for μ

≥ 2, one

can obtain the overlap for noncondensed patterns as

μ

=

N (1

− U)

j=1

F (ξ

+ z

jμ

(14)

≡

j=1

F (ξ

+ z

jμ

(15)

A. Ichiki and M. Shiino

where F denotes the derivative of the transfer function F and the order expan-

sion of F with respect to 1/

√

N has been applied to x

= F (ξ

iμ

+ξ

Using equation (14) and the deﬁnitions of z

iμ

and γ, one ﬁnds

γ =

− U

iμ

N (1

− U)

ν(=1,μ) j(=i)

F (ξ

+ z

jν

Thus the variance of z

iμ

is evaluated as

2

z

− U)

(ξm

+ z)

ξ,z

(16)

where

ξ,z

represents the average over a random variables ξ =

±1 and the

Gaussian variable z, and the self-averaging property has been used. Similarly

one obtains the set of order parameter equations as

1

= ξF (ξm

+ z)

ξ,z

(17)

U = F (ξm

+ z)

ξ,z

(18)

Ons

= Γ

SCSNA

αU

− U

(19)

For the present model, it is required to evaluate the order parameter ˆ

q to deter-

mine the form of the transfer function F . Since ˆ

q is related to the susceptibility

and, by deﬁnition of F (11), the order parameter U corresponds with the sus-

ceptibility as U = β

eﬀ

( x

− x

), one ﬁnds

q =

eﬀ

− U)

(20)

The set of equations (16), (17), (18), (19), (20) takes a closed form and one can

determine the form of F self-consistently as well as the set of order parameters.

Therefore substituting into the pre-TAP equation (12) the solutions β

eﬀ

and

Ons

that are self-consistently obtained within this framework yields the TAP

equation.

Models with Eﬀective Hamiltonian and Eﬀective

Temperature

In this section, we examine the applicability of the concept shown in the previ-

ous section for the network models with various types of involvement of synaptic

noise. As is shown in the previous section, the network has the eﬀective Hamilto-

nian when the equilibrium density becomes Gibbsian one in the thermodynamic

limit. Since the forms of the equilibrium probability densities are estimated by

investigating the forms of the Fokker-Planck equations corresponding to the

TAP Equation for Associative Memory Neural Network Models

original models, we will see the concrete forms of the drift and diﬀusion coef-

ﬁcients of the Fokker-Planck equations in the thermodynamic limit. Once the

Fokker-Planck equation for each model is obtained, the form of the equilibrium

distribution is estimated and hence we can know whether the network has energy

concept in the thermodynamic limit.

We deal with the following models: the state of i

neuron x

obeys the dy-

namics (1) and (2). The synaptic couplings J

are given in Table 1:

Table 1.

case

(

noise

(i)

(

t ) = 2δ

δ(t − t )/ ˜βN

(ii)

(

t ) = 2/ ˜βNδ

(

t − t )

(iii)

(

t)˜

(

t )

√

/ ˜βN

(

t − t )

(

t)˜

(

t )

√

/ ˆβN

−γ

(

t − t )

(

t)˜

(

t ) = 0

(iv)

(1 +

(

t))

(

t ) = 2δ

δ(t − t )/ ˜β

(v)

(1 +

(

t))

(

t ) = 2δ

δ(t − t )/ ˜β

(vi)

(1 +

(

t))

(

t ) = 2/ ˜βδ

(

t − t )

(vii) ¯

(1 +

(

t)˜

(

t))

(

t )

√

(

t − t )/ ˜βN

˜

i

(

t)˜

(

t )

√

(

t − t )/ ˆβN

−γ

(

t)˜

(

t ) = 0

(viii)

(

t ) = 2δ

δ(t − t )/ ˜βN

(ix)

¯

J

(

t ) = 2δ

δ(t − t )/ ˜βN

(x)

¯

J

(1 +

(

t))

(

t ) = 2δ

δ(t − t )/ ˜β

(xi)

¯

J

+ (

(

t) (t ) = 2δ(t − t )/ ˜βN

(xii)

(1 + (

t))

(

t) (t ) = 2δ(t − t )/ ˜β

Note that, in Table 1,

· denotes the average with respect to the noise, β and

β relate with the noise intensity of

and ˜

respectively and γ arbitrary real

number. The case (i) has been treated in the previous section.

Using Ito integral, we can straightforwardly ﬁnd the diﬀusion coeﬃcient D

(2)

ij

for each model as shown in Table 2. In Table 2, the order parameters ˆ

q and M

are deﬁned as ˆ

≡

i

/N , M

≡

√

N and the local ﬁeld is deﬁned as

i

≡

j(=i)

. The drift coeﬃcients D

(1)

for all cases are same as D

(1)

i

=

−φ (x

) +

j(=i)

Thus there exists the energy concept or the eﬀective Hamiltonian for the mod-

els (i)-(vii). In these models, the eﬀective temperature 1/β

eﬀ

= D

(2)

also exits.

100

A. Ichiki and M. Shiino

Therefore the TAP equations and the order parameter equations for these models

can be obtained in the framework described in the previous section. However, in

the case (viii), the diﬀusion coeﬃcient D

(2)

contains non-zero oﬀ-diagonal ele-

ments and hence the equilibrium distribution does not take the form of the Gibbs

distribution. This diﬃculty occurs also in the cases (xi) and (xii). In the case

(ix), the eﬀective Hamiltonian and the eﬀective temperature exist. However, in

this case, evaluating the quantity M is required to determine the eﬀective tem-

perature. For the quantity M the self-averaging property does not hold and thus

the eﬀective temperature can not be found in the framework mentioned previ-

ously. In the case (x), the local ﬁeld h

itself is the thermally ﬂuctuating random

variable, and hence the dynamics for the local ﬁelds h

’s are also required to

estimate the eﬀective temperature. Note that the equilibrium distribution for

this model may not become the Gibbs distribution since the dynamics of the

local ﬁelds h

’s should be also considered. In the case (xi), there exist the oﬀ-

diagonal elements in the diﬀusion coeﬃcient and it depends on M , which can

not be evaluated by the self-averaging property. In the case (xii), the oﬀ-diagonal

elements on the diﬀusion coeﬃcient also exist and it depends on the temporally

ﬂuctuating random variables h

’s. Therefore deriving the TAP equations and the

order parameter equations for the models (viii)-(xii) is the future problem.

Table 2.

case

(2)

energy concept

(i)

(1

/β + ˆq/ ˜β)δ

exist

(ii)

/β + ˆq/ ˜β)δ

exist

(iii) (1

/β + ˆq/ ˜βˆβ)δ

exist

(iv) (1

/β + αˆq/ ˜β)δ

exist

(v)

/β + αˆq/ ˜β)δ

exist

(vi) (1

/β + αˆq/ ˜β)δ

exist

(vii) (1

/β + αˆq/ ˜βˆβ)δ

exist

(viii)

/β + ˆq/ ˜β

not exist

(ix) (1

/β + M

/ ˜β)δ

not exist

(x)

(1

/β + h

/ ˜β)δ

not exist

(xi)

δ

ij

/β + M

/ ˜β

not exist

(xii)

/β + h

/ ˜β

not exist

Conclusion

In this paper we have derived the TAP equations and the order parameter equa-

tions for the neural network models with synaptic noise. In the models presented

TAP Equation for Associative Memory Neural Network Models

101

in this paper, the concept of energy or the Hamiltonian does not exist in the orig-

inal ﬁnite N -body system. As we have seen in the previous section, the eﬀective

Hamiltonian and the eﬀective temperature exist in the thermodynamic limit and

then the TAP equations together with the order parameter equations are derived

by the use of the cavity method and the SCSNA as shown in section 2 for the

cases (i)-(vii). For the cases (viii)-(xii), however, some diﬃculties mentioned in

the previous section arise. To derive the TAP equations and the order parameter

equations for these cases remains as the future problem.

One of the authors (A. I.) was supported by the 21st Century COE Program at

Tokyo Tech ”Nanometer-Scale Quantum Physics” by the Ministry of Education,

Culture, Sports, Science and Technology.

References

1. Sherrington, D., Kirkpatrick, S.: Solvable Model of a Spin-Glass. Phys. Rev. Lett. 35,

1792–1796 (1975)

2. Amit, D.J., Geutfreund, H., Sompolinsky, H.: Storing Inﬁnite Numbers of Patterns

in a Spin-Glass Model of Neural Networks. Phys. Rev. Lett. 55, 1530–1533 (1985)

3. M´

ezard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond. World

Scientiﬁc, Singapore (1987)

4. Thouless, D.J., Anderson, P.W., Palmer, R.G.: Solution of ’solvable model of a spin

glass’. Philos. Mag. 35, 593–601 (1977)

5. Morita, T., Horiguchi, T.: Exactly solvable model of a spin glass. Solid. State.

Comm. 19, 833–835 (1976)

6. Shiino, M., Fukai, T.: Self-consistent signal-to-noise analysis and its application to

analogue neural networks with asymmetric connections. J. Phys. A: Math. Gen. 25,

L375-L381 (1992)

7. Shiino, M., Yamana, M.: Statistical mechanics of stochastic neural networks: Re-

lationship between the self-consistent signal-to-noise analysis. Thouless-Anderson-

Palmer equation, and replica symmetric calculation approaches. Phys. Rev. E. 69,

011904-1-13 (2004)

8. Ichiki, A., Shiino, M.: Thouless-Anderson-Palmer equation for analog neural network

with temporally ﬂuctuating white synaptic noise. J. Phys. A: Math. Theor. 40, 9201–

9211 (2007)

9. Shiino, M., Fukai, T.: Self-consistent signal-to-noise analysis of the statistical be-

havior of analog neural networks and enhancement of the storage capacity. Phys.

Rev. E. 48, 867–897 (1993)

Spike-Timing Dependent Plasticity in

Recurrently Connected Networks with Fixed

External Inputs

Matthieu Gilson

1,2,3

, David B. Grayden

1,2,3

, J. Leo van Hemmen

,

Doreen A. Thomas

1,3

, and Anthony N. Burkitt

1,2,3

1

Department of Electrical and Electronic Engineering, The University of Melbourne

The Bionic Ear Institute, Melbourne, Australia

NICTA, The Victorian Research Lab, The University of Melbourne

Physik Department, die Technische Universit¨

at M¨

unchen, Germany

mgilson@bionicear.org

Abstract. This paper investigates spike-timing dependent plasticity

(STDP) for recurrently connected weights in a network with ﬁxed ex-

ternal inputs (homogeneous Poisson pulse trains). We use a dynamical

system to model the network activity and predict its asymptotic evolu-

tion, which turns out to qualitatively depend on the learning parameters

and the correlation structure of the inputs. Our predictions are supported

by numerical simulations of Poisson neuron networks in general cases as

well as for certain cases when using Integrate-And-Fire (IF) neurons.

Keywords: Spiking neurons, neurodynamics, learning, STDP, dynami-

cal system, recurrent network.

Introduction

The hypothesis that changes in the eﬃcacies of neuronal connections depend

upon the correlations in timing of pre- and postsynaptic action potentials (or

spikes) has received considerable experimental support [1]. Such a learning mech-

anism is related to the notion of Hebbian learning [6] and can implement func-

tional organisation in neural networks. The understanding of the underlying

mechanisms behind learning in the brain is crucial both from a physiological

level and for applications (eg. neural prostheses, robotics).

Spike-timing dependent plasticity (STDP, [3]) is a fruitful candidate for this

mechanism, that has been analysed in previous studies for feed-forward network

architectures of Poisson neurons [7,8]. This theory was recently extended to the

case of recurrently connected architectures, and applied to a fully connected

network with no external input [2].

This paper pursues the development of this framework [2] based on the linear

Poisson neuron model and additive STDP (as described in Sec. 2). The net-

work activity is characterised by the ﬁring rates of and the correlations between

the neurons. The evolution of the network activity (or neural dynamics) is de-

scribed by a dynamical system (as discussed in Sec. 3). It is analysed in terms of

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 102–111, 2008.

c Springer-Verlag Berlin Heidelberg 2008

Spike-Timing Dependent Plasticity in Recurrently Connected Networks

103

equilibria (or ﬁxed points) and stability in order to predict the asymptotic net-

work behaviour when stimulated by steady external inputs (Sec. 4).

Models

2.1

Linear Poisson Neuron

In this study, we use the linear Poisson neuron [7]. The neuronal output ﬁring

mechanism is approximated by an inhomogeneous Poisson process generated by

the Poisson parameter ρ(t) (it determines the probability of spiking an action

potential, considered as an instantaneous event); ρ(t) evolves according to the

sum of all the synaptic contributions (cf. Fig. 1.A)

ρ(t) = ν

(t)

− t

k,n

− d

)

(1)

As in [7], the total synaptic inﬂux is the temporal and spatial (over the synapses)

summation of the post-synaptic potentials (PSP) modelled by the synaptic re-

sponse kernel

weighted by the synaptic eﬃcacies J

(or weights; note that we

only consider positive weights here).

models the current injected into the post-

synaptic neuron induced by one single spike (indexed by n) at time t

k,n

at the

synaptic connection, after the ﬁxed delay d

. Finally, ν

is the spontaneous

ﬁring rate caused by background activity (identical for all the neurons).

2.2

Additive STDP

Gerstner et al. [3] ﬁrst proposed a framework to study additive STDP, where the

change of synaptic weight J

induced by a sole pair of pre- and post-synaptic

pulses at times (t

, t

out

) is

ΔJ

∝ w

+ w

out

+ W (t

− t

out

(2)

is the weight of the k

synapse. w

and w

out

are the rate-based learning

parameters (one contribution per pulse at the pulse time); W is the STDP

learning window (cf. Fig. 1.B) so that a synaptic weight is potentiated for a

given pulse pair if a presynaptic input precedes a postsynaptic spike (i.e. “takes

part” in the output ﬁring), and is depressed otherwise [3]. Usually, such changes

are scaled by a learning rate η.

2.3

Link with the Physiology

In our abstract model, the neurons are excited by external inputs, which convey

“neural information” that is assumed to be encoded in their ﬁring rate and

correlation structures. The neurons also have a spontaneous activity that can

be understood as a consequence of a background activity. There are thus two

sources of stochasticity, which can be interpreted as follows: one is related to

the external inputs (pulse trains described by their ﬁring rate and correlation

104

M. Gilson et al.

Fig. 1.

A: Linear Poisson neuron model. The output pulse train S(t) (top) is generated

using the inhomogeneous Poisson parameter ρ(t) (middle). Each pre-synaptic pulse at

the k

synapse (bottom) induces a variation of ρ determined by the post-synaptic re-

sponse kernel , the synaptic weight J

(t) and the delay d

. B : STDP window function

W (in arbitrary unit). Each side of the real axis is determined by a negative exponential:

→ c

exp(

−|u|/τ

) with the index x = D for the depression part (u > 0) and x = P

for the potentiation part (u < 0), cf. [2, Sec. 5] for details and the parameter values.

structures) and it can be linked to the variability observed in physiological data;

the other is due to the additional impact of the background activity upon the

generation of the spike output of each neuron (intrinsic stochasticity modelled

by the Poisson process).

Theoretical Analysis

3.1

Characterisation of the Neural Activity

We consider a network of N Poisson neurons (referred to as internal neurons)

stimulated with M Poisson pulse trains (or external inputs), as shown in Fig. 2.

The activity of the neural network can be described using the ﬁring rates and

the pairwise correlations, plus the weights. In the terminology of the theory

of dynamical systems, these variables characterise the “state” of the network

activity at each time t and their evolution is referred to as neural dynamics.

Similar to [7,2], we consider the time-averaged ﬁring rates ν

(t) (for the inter-

nal neuron indexed by i; T is a given time period)

(t)

−T

(t ) dt

(3)

where S

(t) is the spike-time series of the i

neuron and the brackets . . .

denotes the ensemble averaging. Likewise for the coeﬃcient correlations (time-

average correlation function convoluted with the STDP window function W ):

W

ik

(t) between the i

internal neuron and the k

external input

W

ik

(t)

∞

−∞

W (u)

−T

(t ) ˆ

(t + u) dt

(4)

and Q

(t) between the i

and j

internal neurons (with S

and S

) [2, Sec. 3].

Spike-Timing Dependent Plasticity in Recurrently Connected Networks

105

Fig. 2. Presentation of the network and the notation. The internal neurons (within

the network) are indexed by i

∈ [1..N], and their output pulse trains are denoted

(t) (it can be understood as a sum of “Dirac functions” at each spiking time [2, Sec.

2]). Likewise for the external input pulse trains ˆ

k

(t) (k

∈ [1..M]). The time-averaged

ﬁring rates are denoted by ν

(t), cf. Eq. 3; the correlation coeﬃcients within the network

by Q

(t) (resp. D

(t) between a neuron in the network and an external input; and

Q

kl

(t) between two external inputs), cf. Eq. 4. The weight of the connection from the

external input onto the i

internal neuron is denoted K

(t) (resp. J

(t) from the

internal neuron onto the i

internal neuron).

The stimulation parameters (which represent the “information” carried in the

external inputs) are determined by the time-averaged input spiking rates (ˆ

k

(t),

deﬁned similarly to ν

(t)) and their correlation coeﬃcients ( ˆ

(t), deﬁned simi-

larly to D

ik

(t) and Q

(t)) [2, Sec. 3]. In this paper, we only consider stimulation

parameters that are constant in time.

3.2

Learning Equations

Learning equations can be derived from Eq. 2 as in Kempter et al. [7]. This re-

quires the assumption that the internal pulse trains are statistically independent

(this can be considered valid when the number of recurrent connection is large

enough) and a small learning rate η. This leads to the matrix equation Eq. 10 for

the weights between internal neurons (resp. to Eq. 9 for the input weights K).

3.3

Activation Dynamics

In order to study the evolution of the weights described by Eqs. 9 and 10, we

need to evaluate the neuron time-average ﬁring rates (the vector ν(t)) and their

time-average correlation coeﬃcients (the matrices D

(t) and Q

(t)). Similar

to Kempter et al. [7] and Burkitt et al. [2], we approximate the instantaneous

ﬁring rate S

(t) of the i

Poisson neuron by its expected inhomogeneous Pois-

son parameter ρ(t) (cf. Eq. 1) and we neglect the impact of the short-time

dynamics (the synaptic response kernel

and the synaptic delays ˆ

and d

)

by using time averaged variables (over a “long” period T ). We require that the

106

M. Gilson et al.

learning occurs slowly compared to the activation mechanisms (cf. the neuron

and synapse models) so that T is large compared to the time scale of these mech-

anisms but small compared to the inverse of the learning parameter η

−1

. This

leads to the consistency matrix equations of the ﬁring rates (Eq. 6) and of the

correlation coeﬃcients (Eqs. 7 and 8). See Burkitt et al. [2, Sec. 3] for details of

the derivation.

Note that the consistency equations of the correlation coeﬃcients as deﬁned

in [2, Sec. 3 and 4] have been reformulated to express the usual covariance using

the assumption that the correlations are quasi-constant in time [5] (this implies

that ˆ

Q

V T

[2, Sec. 3] is actually equal to ˆ

). Eqs. 7 and 8 express the impact

of the connectivity (through the term [I

− J]

−1

K) on the internal ﬁring rates

and the cross covariances in terms of the input covariance ˆ

− W ˆν ˆν

(5)

where W

W (u) du evaluates the balance between the potentiation and the

depression of our STDP rule. These equations are obtained by combining the

equations in [2] with the ﬁring-rate consistency equation Eq. 6.

3.4

Network Dynamical System

Putting everything together, the network dynamics is described by

ν = [I

− J]

−1

E + K ˆ

(6)

− W ν ˆν

= [I

− J]

−1

− W ˆν ˆν

(7)

− W ν ν

= [I

− J]

−1

− W ˆν ˆν

− J]

−1T

(8)

= Φ

E ˆ

+ w

out

ν ˆ

+ D

(9)

= Φ

E ν

+ w

out

ν E

+ Q

(10)

where E is the unit vector of N elements (likewise for ˆ

E with M elements); Φ

is a projector on the space of N

×N matrices to which J belongs [2, Sec. 3]. The

eﬀect of Φ

is to nullify the coeﬃcients corresponding to a missing connection in

the network, viz. all the diagonal terms because the self-connection of a neuron

onto itself is forbidden. More generally, such an projection operator can account

for any network connectivity [2, Sec. 3].

Note that time has been rescaled, in order to remove η from these equations

and for simplicity of notation the dependence on time t will be omitted in the

rest of this paper. The matrix [I

− J(t)] is assumed invertible at all time (the

contrary would imply a diverging behaviour of the ﬁring rates [2, Sec. 4]).

Recurrent Network with Fixed Input Weights

We now examine the case of a recurrently connected network with ﬁxed input

weights K and learning on the recurrent weights J . Only Eqs. 6, 8 and 10 remain,

Spike-Timing Dependent Plasticity in Recurrently Connected Networks

107

which simpliﬁes the analysis. In the case of full recurrent connectivity, Φ

Eq. 10 only nulliﬁes the diagonal terms of the square matrix in its argument.

4.1

Analytical Predictions

Homeostatic equilibrium. Similarly to [7,2], we derive the scalar equations of the

mean ﬁring rate ν

−1

and weight J

(N (N

− 1))

−1

i=j

. It

consists of neglecting the inhomogeneities of the ﬁring rates and of the weights

over the network, as well as of the connectivity, and we obtain

av

=

+ (K ˆ

ν)

− (N − 1)J

(11)

= w

+ w

out

+ W + C ν

where (K ˆ

ν)

av

denotes the mean of the matrix product K ˆ

ν and C is deﬁned as

(K ˆ

)

[ν

+ (K ˆ

ν)

]

(12)

Eqs. 11 are thus the same equations as for the case with no input [2, Sec. 5], with

and W replaced by, resp., ν

+ (K ˆ

ν)

and W + C. It qualitatively implies the

same dynamical behaviour and, provided that W + C < 0, the network exhibits

a homeostatic equilibrium (when it is realisable, it requires in particular μ > 0),

and the means of the ﬁring rates and of the weights converge towards

∗

= μ

−

+ w

out

W + C

(13)

∗

− ν

− (K ˆν)

− 1)μ

If the correlation inputs are time-invariant functions and if the correlations

are positive (i.e. more likely to ﬁre in a synchronous way) and homogeneous

among the correlated input pool, it follows that C is of the same sign as W .

Therefore, the condition for stability reverts to W < 0 as in [7,2].

Structural dynamics. For the particular case of uncorrelated inputs (or more

generally when K ˆ

= 0), the ﬁxed-point structure of the dynamical sys-

tem is qualitatively the same as for the case of no external input [2, Sec. 5]: a

homogeneous distribution for the ﬁring rates and a continuous manifold of ﬁxed

points for the internal weights.

In the case of “spatial” inhomogeneities over the input correlations, the net-

work dynamics shows a diﬀerent evolution. To illustrate and to compare this

with the case of feed-forward connections, we consider the network conﬁgura-

tion described in Fig. 3 and inspired by [7], where one input pool has correlation

while the other pool has uncorrelated sources. In general, the equations that

108

M. Gilson et al.

Fig. 3. Architecture of the simulated network. The inputs are divided into two sub-

pools, each feeding half of the internal network with means K

and K

for the input

weights from each subpool. Similarly to the recurrent weights J and the ﬁring rates ˆ

and ν, the inhomogeneities are neglected within each subpool and they are assumed

to be all identical to their mean. The weights and delays are initially set with 10%

random variation around a mean.

determine the ﬁxed points have no accurate solution. Yet, we can reduce the

dimensionality by neglecting the variance within each subpool and make ap-

proximations to evaluate the asymptotic distribution of the ﬁring rates, which

turns out to be bimodal.

4.2

Simulation Protocol and Results

We simulated a network of Poisson neurons as described in Fig. 3 with random

initial recurrent weights (uniformly distributed in a given interval, as well as

all the synaptic delays). An issue with such simulations is to maintain positive

internal weights during their homeostatic convergence because they individually

diverge. Thus, all equilibria are not realisable depending on the initial distribu-

tions of K and J and the weight bounds [7,2]. See [2, Sec. 5] for details about

the simulation parameters.

An interesting ﬁrst case to test the analytical predictions consists of two pools

of uncorrelated inputs, each feeding half of the network with distinct weights

= K

and ˆ

= 0). Each half of the internal network thus has distinct

ﬁring rates initially and, as predicted, the outcome is a convergence of these

ﬁring-rates towards a uniform value (similar to the case of no external input [2,

Sec. 5]).

In the case of a full connected (for both the K and the J according to Fig. 3)

network stimulated by one correlated input pool (short-time correlation inspired

by [4] so that ˆ

W

= 0) and one uncorrelated one, both the internal ﬁring rates

and the internal weights also exhibit a homeostatic equilibrium. As shown in Fig.

4, the means over the network (thick solid lines) converge towards the predicted

equilibrium values (dashed lines). Furthermore, the individual ﬁring rates tend

to stabilise and their distribution remains bimodal (the subpool #1 excited by

correlated inputs ﬁres at a lower rate eventually when W < 0). The recurrent

weights individually diverge similarly to [2, Sec. 5] and reorganise so that the

outgoing weights from the subpool #1 (see the means over each weight subpool

Spike-Timing Dependent Plasticity in Recurrently Connected Networks

109

10000

20000

time (sec)

firing rates (Hz)

10000

20000

0.005

0.01

0.015

0.02

0.025

0.03

time (sec)

weights

J21

J22

J12

J11

Fig. 4. Evolution of the ﬁring rates (left) and of the recurrent weights (right) for

N = 30 fully-connected Poisson neurons (cf. Fig.3) with short-time correlated inputs.

The outcome is a quasi bimodal distribution of the ﬁring rates (the grey bundle, the

mean is the thick solid line) around the predicted homeostatic equilibrium (dashed

line). The subgroup #1 that receives correlated inputs is more excited initially but ﬁres

at a lower rate at the end of the simulation (cf. the two thin black solid lines which

represent the mean over each subpool). The internal weights individually diverge, while

their mean (thick line) converges towards the predicted equilibrium value (dashed line).

They reorganise themselves so that the weights outgoing from the subpool #2 (that

receives uncorrelated inputs) become silent, while the ones from #1 are strengthened.

Note that the homeostatic equilibrium is preserved even when some weights saturate.

10000

20000

time (sec)

firing rates (Hz)

Fig. 5. Evolution of the ﬁring rates for a partially connected network of N = 75

neurons. Both the K and the J have 40% probability of connection with the same

setup as the network in Fig. 4 (to preserve the total input synaptic strength). The

mean ﬁring rate (very thick line) still converges towards the predicted equilibrium

value (dashed line) and the two subgroups (grey bundles, each mean represented by a

thin black solid line) get separated similarly to the case of full connectivity. The internal

weights (not shown) exhibit similar dynamics as for the case of full connectivity.

11

and J

in Fig. 4) are strengthened while the other ones almost become silent

(see J

12

and J

). In other words, the subpool that receives correlated inputs

takes the upper hand in the recurrent architecture.

110

M. Gilson et al.

10000

time (sec)

firing rates (Hz)

10000

time (sec)

firing rates (Hz)

Fig. 6. Simulation of networks of IF neurons with partial random connectivity of 50%.

The network qualitatively exhibits the expected behaviour in the case of uncorrelated

inputs (left) and inputs with one correlated pool and one uncorrelated pool (right).

In the case of partial connectivity for both the K and the J , the behaviour

of the individual ﬁring rates (cf. Fig. 5) still follows the predictions but they

are more dispersed, their convergence is slower and the bimodal distribution is

not always observed as clearly as in the case of full connectivity (in Fig. 5 the

means of the two internal neuron subpools clearly remain separated though). The

homeostatic equilibrium of the internal weights also holds and they individually

diverge. The partial connectivity needs to be rich enough for the predictions of

the mean variable to be accurate enough.

First results with IF neurons comply with the analytical predictions remain

valid even if the activation mechanisms are more complex (here with a connec-

tivity of 50%, cf. Fig. 6).

Discussion and Future Work

The analytical results presented here are preliminary, and further investigation

is needed to gain better understanding of the interplay between the input cor-

relation structure and STDP. Nevertheless, our results illustrate two points:

STDP induces a stable activity in recurrent architectures similar to that for

feed-forward ones (homeostatic regulation of the network activity under the con-

dition W < 0); and the qualitative structure of the internal ﬁring rates is mainly

determined by the input correlation structure. Namely, a “poor” correlation

structure (uncorrelated or delta-correlated inputs, so that ˆ

W

= 0) induces a

homogenisation of the ﬁring activity. Finally, partial connectivity impacts upon

the structure of the internal ﬁring rates, but such networks still exhibit similar

behaviour to fully connected ones.

Preliminary results involving more complex patterns of K ˆ

W

K

suggest

more complex interplay between the input correlation structure and the equi-

librium distribution of the network ﬁring rates and weights. Such cases are

under investigation and may constitute more “interesting” dynamic behaviour

of the network from a cognitive modelling point of view, namely through the

Spike-Timing Dependent Plasticity in Recurrently Connected Networks

111

relationship between the attractors of the network activity and the input struc-

ture. The case of learning for both the input connections and the recurrent ones

will also form part of a future study.

Comparison with IF neurons suggests that the impact of the neuron activation

mechanisms on the weight dynamics may not be signiﬁcant. It can be linked to

the separation of the time scales between them in the case of slow learning.

Note that with our approximations the IF neurons are assumed to be in “linear

input-output regime” (no bursting for instance).

Acknowledgments

The authors thank Iven Mareels, Chris Trengove, Sean Byrnes and Hamish Mef-

ﬁn for useful discussions that introduced signiﬁcant improvements. MG is funded

by two scholarships from The University of Melbourne and NICTA. ANB and

DBG acknowledge funding from the Australian Research Council (ARC Discov-

ery Projects #DP0453205 and #DP0664271) and The Bionic Ear Institute.

References

1. Bi, G.Q., Poo, M.M.: Synaptic modiﬁcation by correlated activity: Hebb’s postulate

revisited. Annual Review of Neuroscience 24, 139–166 (2001)

2. Burkitt, A.N., Gilson, M., van Hemmen, J.L.: Spike-timing-dependent plasticity for

neurons with recurrent connections. Biological Cybernetics 96, 533–546 (2007)

3. Gerstner, W., Kempter, R., van Hemmen, J.L., Wagner, H.: A neuronal learning

rule for sub-millisecond temporal coding. Nature 383, 76–78 (1996)

4. Gutig, R., Aharonov, R., Rotter, S., Sompolinsky, H.: Learning input correlations

through nonlinear temporally asymmetric hebbian plasticity. Journal of Neuro-

science 23, 3697–3714 (2003)

5. Hawkes, A.G.: Point spectra of some mutually exciting point processes. Journal of

the Royal Statistical Society Series B-Statistical Methodology 33, 438–443 (1971)

6. Hebb, D.O.: The organization of behavior: a neuropsychological theory. Wiley,

Chichester (1949)

7. Kempter, R., Gerstner, W., van Hemmen, J.L.: Hebbian learning and spiking neu-

rons. Physical Review E 59, 4498–4514 (1999)

8. van Rossum, M.C.W., Bi, G.Q., Turrigiano, G.G.: Stable hebbian learning from

spike timing-dependent plasticity. Journal of Neuroscience 20, 8812–8821 (2000)

A Comparative Study of Synchrony Measures

for the Early Detection of Alzheimer’s Disease

Based on EEG

Justin Dauwels, Fran¸cois Vialatte, and Andrzej Cichocki

RIKEN Brain Science Institute, Saitama, Japan

justin@dauwels.com,

{fvialatte,cia}@brain.riken.jp

Abstract. It has repeatedly been reported in the medical literature

that the EEG signals of Alzheimer’s disease (AD) patients are less syn-

chronous than in age-matched control patients. This phenomenon, how-

ever, does at present not allow to reliably predict AD at an early stage,

so-called mild cognitive impairment (MCI), due to the large variabil-

ity among patients. In recent years, many novel techniques to quan-

tify EEG synchrony have been developed; some of them are believed

to be more sensitive to abnormalities in EEG synchrony than tradi-

tional measures such as the cross-correlation coeﬃcient. In this paper,

a wide variety of synchrony measures is investigated in the context of

AD detection, including the cross-correlation coeﬃcient, the mean-square

and phase coherence function, Granger causality, the recently proposed

corr-entropy coeﬃcient and two novel extensions, phase synchrony in-

dices derived from the Hilbert transform and time-frequency maps,

information-theoretic divergence measures in time domain and time-

frequency domain, state space based measures (in particular, non-linear

interdependence measures and the S-estimator), and at last, the recently

proposed stochastic-event synchrony measures. For the data set at hand,

only two synchrony measures are able to convincingly distinguish MCI

patients from age-matched control patients (p < 0.005), i.e., Granger

causality (in particular, full-frequency directed transfer function) and

stochastic event synchrony (in particular, the fraction of non-coincident

activity). Combining those two measures with additional features may

eventually yield a reliable diagnostic tool for MCI and AD.

Introduction

Many studies have shown that the EEG signals of AD patients are generally less

coherent than in age-matched control patients (see [1] for an in-depth review).

It is noteworthy, however, that this eﬀect is not always easily detectable: there

tends to be a large variability among AD patients. This is especially the case for

patients in the pre-symptomatic phase, commonly referred to as Mild Cognitive

Impairment (MCI), during which neuronal degeneration is occurring prior to the

clinical symptoms appearance. On the other hand, it is crucial to predict AD at

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 112–125, 2008.

c Springer-Verlag Berlin Heidelberg 2008

A Comparative Study of Synchrony Measures for the Early Detection of AD

113

an early stage: medication that aims at delaying the eﬀects of AD (and hence

intend to improve the quality of life of AD patients) are the most eﬀective if

applied in the pre-symptomatic phase.

In recent years, a large variety of measures has been proposed to quantify EEG

synchrony (we refer to [2]–[5] for recent reviews on EEG synchrony measures);

some of those measures are believed to be more sensitive to perturbations in EEG

synchrony than classical indices as for example the cross-correlation coeﬃcient or

the coherence function. In this paper, we systematically investigate the state-of-

the-art of measuring EEG synchrony with special focus on the detection of AD

in its early stages. (A related study has been presented in [6,7] in the context

of epilepsy.) We consider various synchrony measures, stemming from a wide

spectrum of disciplines, such as physics, information theory, statistics, and signal

processing. Our aim is to investigate which measures are the most suitable for

detecting the eﬀect of synchrony perturbations in MCI and AD patients; we

also wish to better understand which aspects of synchrony are captured by the

diﬀerent measures, and how the measures are related to each other.

This paper is structured as follows. In Section 2 we review the synchrony

measures considered in this paper. In Section 3 those measures are applied to

EEG data, in particular, for the purpose of detecting MCI; we describe the EEG

data set, elaborate on various implementation issues, and present our results. At

the end of the paper, we brieﬂy relate our results to earlier work, and speculate

about the neurophysiological interpretation of our results.

Synchrony Measures

We brieﬂy review the various families of synchrony measures investigated in this

paper: cross-correlation coeﬃcient and analogues in frequency and time-frequency

domain, Granger causality, phase synchrony, state space based synchrony,

information theoretic interdependence measures, and at last, stochastic-event

synchrony measures, which we developed in recent work.

2.1

Cross-Correlation Coeﬃcient

The cross-correlation coeﬃcient r is perhaps one of the most well-known mea-

sures for (linear) interdependence between two signals x and y. If x and y are

not linearly correlated, r is close to zero; on the other hand, if both signals are

identical, then r = 1 [8].

2.2

Coherence

The coherence function quantiﬁes linear correlations in frequency domain. One

distinguishes the magnitude square coherence function c(f ) and the phase co-

herence function φ(f ) [8].

114

J. Dauwels, F. Vialatte, and A. Cichocki

2.3

Corr-Entropy Coeﬃcient

The corr-entropy coeﬃcient r

is a recently proposed [9] non-linear extension of

the correlation coeﬃcient r; it is close to zero if x and y are independent (which

is stronger than being uncorrelated).

2.4

Coh-Entropy and Wav-Entropy Coeﬃcient

One can deﬁne a non-linear magnitude square coherence function, which we will

refer to as “coh-entropy” coeﬃcient c

(f ); it is an extension of the corr-entropy

coeﬃcient to the frequency domain. The corr-entropy coeﬃcient r

can also be

extended to the time-frequency domain, by replacing the signals x and y in the

deﬁnition of r

by their time-frequency (“wavelet”) transforms. In this paper,

we use the complex Morlet wavelet, which is known to be well-suited for EEG

signals [10]. The resulting measure is called “wav-entropy” coeﬃcient w

(f ).

(To our knowledge, both c

(f ) and w

(f ) are novel).

2.5

Granger Causality

Granger causality

refers to a family of synchrony measures that are derived

from linear stochastic models of time series; as the above linear interdependence

measures, they quantify to which extent diﬀerent signals are linearly interde-

pendent. Whereas the above linear interdependence measures are bivariate, i.e.,

they can only be applied to pairs of signals, Granger causality measures are

multivariate, they can be applied to multiple signals simultaneously.

Suppose that we are given n signals X

(k), X

(k), . . . , X

(k), each stemming

from a diﬀerent channel. We consider the multivariate autoregressive (MVAR)

model:

X(k) =

A(j)X(k

− ) + E(k),

(1)

where X(k) = (X

(k), X

(k), . . . , X

(k))

, p is the model order, the model

coeﬃcients A(j) are n

× n matrices, and E(k) is a zero-mean Gaussian random

vector of size n. In words: Each signal X

(k) is assumed to linearly depend on its

own p past values and the p past values of the other signals X

(k). The deviation

between X(k) and this linear dependence is modeled by the noise component

E(k). Model (1) can also be cast in the form:

E(k) =

p

=0

A(j)X(k

− ),

(2)

The Granger causality measures we consider here are implemented in the BioSig

library, available from http://biosig.sourceforge.net/

A Comparative Study of Synchrony Measures for the Early Detection of AD

115

where ˜

A(0) = I (identity matrix) and ˜

A(j) =

−A(j) for j > 0. One can

transform (2) into the frequency domain (by applying the z-transform and by

substituting z = e

−2πiΔt

, where 1/Δt is the sampling rate):

X(f ) = ˜

−1

(f )E(f ) = H(f )E(f ).

(3)

The power spectrum matrix of the signal X(k) is determined as

S(f ) = X(f )X(f )

∗

= H(f )VH

∗

(f ),

(4)

where V stands for the covariance matrix of E(k). The Granger causality mea-

sures are deﬁned in terms of coeﬃcients of the matrices A, H, and S. Due to

space limitations, only a short description of these methods is provided here,

additional information can be found in existing literature (e.g., [4]).

From these coeﬃcients, two symmetric measures can be deﬁned:

– Granger coherence

(f )

| ∈ [0, 1] describes the amount of in-phase com-

ponents in signals i and j at the frequency f .

– Partial coherence (PC)

ij

(f )

| ∈ [0, 1] describes the amount of in-phase

components in signals i and j at the frequency f when the inﬂuence (i.e.,

linear dependence) of the other signals is statistically removed.

The following asymmetric (“directed”) Granger causality measures capture

causal relations:

– Directed transfer function (DTF) γ

ij

(f ) quantiﬁes the fraction of inﬂow to

channel i stemming from channel j.

– Full frequency directed transfer function (ﬀDTF)

2

ij

(f ) =

(f )

j=1

(f )

∈ [0, 1],

(5)

is a variation of γ

ij

(f ) with a global normalization in frequency.

– Partial directed coherence (PDC)

(f )

| ∈ [0, 1] represents the fraction of

outﬂow from channel j to channel i

– Direct directed transfer function (dDTF) χ

ij

(f ) = F

(f )C

(f ) is non-zero

if the connection between channel i and j is causal (non-zero F

(f )) and

direct (non-zero C

ij

(f )).

2.6

Phase Synchrony

Phase synchrony refers to the interdependence between the instantaneous phases

and φ

of two signals x and y; the instantaneous phases may be strongly

synchronized even when the amplitudes of x and y are statistically independent.

The instantaneous phase φ

of a signal x may be extracted as [11]:

(k) = arg [x(k) + i˜

x(k)] ,

(6)

116

J. Dauwels, F. Vialatte, and A. Cichocki

where ˜

x is the Hilbert transform of x. Alternatively, one can derive the instan-

taneous phase from the time-frequency transform X(k, f ) of x:

(k, f ) = arg[X(k, f )].

(7)

The phase φ

(k, f ) depends on the center frequency f of the applied wavelet.

By appropriately scaling the wavelet, the instantaneous phase may be computed

in the frequency range of interest.

The phase synchrony index γ for two instantaneous phases φ

and φ

is deﬁned

as [11]:

γ =

i(nφ

−mφ

)

∈ [0, 1],

(8)

where n and m are integers (usually n = 1 = m). We will use the notation

H

and γ

to indicate whether the instantaneous phases are computed by the

Hilbert transform or time-frequency transform respectively. In this paper, we

will consider two additional phase synchrony indices, i.e., the evolution map

approach (EMA) and the instantaneous period approach (IPA) [12]. Due to

space constraints, we will not describe those measures here, instead we refer

the reader to [12]

; additional information about phase synchrony can be found

in [6].

2.7

State Space Based Synchrony

State space based synchrony (or “generalized synchronization”) evaluates syn-

chrony by analyzing the interdependence between the signals in a state space

reconstructed domain (see e.g., [7]). The central hypothesis behind this ap-

proach is that the signals at hand are generated by some (unknown) deter-

ministic, potentially high-dimensional, non-linear dynamical system. In order to

reconstruct such system from a signal x, one considers delay vectors X(k) =

(x(k), x(k

− τ), . . . , x(k − (m − 1) τ))

, where m is the embedding dimension

and τ denotes the time lag. If τ and m are appropriately chosen, and the signals

are indeed generated by a deterministic dynamical system (to a good approxi-

mation), the delay vectors lie on a smooth manifold (“mapping”) in

, apart

from small stochastic ﬂuctuations.

The S-estimator [13], here denoted by S

est

, is a state space based measure

obtained by applying principal component analysis (PCA) to delay vectors

. We

also considered three measures of nonlinear interdependence, S

, H

, and N

(see [6] for details

).

2.8

Information-Theoretic Measures

Several interdependence measures have been proposed that have their roots

in information theory. Mutual Information I is perhaps the most well-known

Program code is available at www.agnld.uni-potsdam.de/%7Emros/dircnew.m

We used the S Toolbox downloadable from http://aperest.epfl.ch/docs/

software.htm

Software is available from http://www.vis.caltech.edu/~rodri/software.htm

A Comparative Study of Synchrony Measures for the Early Detection of AD

117

information-theoretic interdependence measure; it quantiﬁes the amount of in-

formation the random variable Y contains about random variable X (and vice

versa); it is always positive, and it vanishes when X and Y are statistically in-

dependent. Recently, a sophisticated and eﬀective technique to compute mutual

information between time series was proposed [14]; we will use that method in

this paper

. The method of [14] computes mutual information in time-domain;

alternatively, this quantity may also be determined in time-frequency domain

(denoted by I

), more speciﬁcally, from normalized spectrograms [15,16] (see

also [17,18]).

We will also consider several information-theoretic measures that quantify the

dissimilarity (or “distance”) between two random variables (or signals). In con-

trast to the previously mentioned measures, those divergence measures vanish if

the random variables (or signals) are identical ; moreover, they are not necessar-

ily symmetric, and therefore, they can not be considered as distance measures

in the strict sense. Divergences may be computed in time domain and time-

frequency domain; in this paper, we will only compute the divergence measures

in time-frequency domain, since the computation in time domain is far more

involved. We consider the Kullback-Leibler divergence K, the R´

enyi divergence

, the Jensen-Shannon divergence J , and the Jensen-R´

enyi divergence J

. Due

to space constraints, we will not review those divergence measures here; we refer

the interested reader to [15,16].

2.9

Stochastic Event Synchrony (SES)

Stochastic event synchrony, an interdependence measure we developed in ear-

lier work [19], describes the similarity between the time-frequency transforms

of two signals x and y. As a ﬁrst step, the time-frequency transform of each

signal is approximated as a sum of (half-ellipsoid) basis functions, referred to as

“bumps” (see Fig. 1 and [20]). The resulting bump models, representing the most

prominent oscillatory activity, are then aligned (see Fig. 2): bumps in one time-

frequency map may not be present in the other map (“non-coincident bumps”);

other bumps are present in both maps (“coincident bumps”), but appear at

slightly diﬀerent positions on the maps. The black lines in Fig. 2 connect the

centers of coincident bumps, and hence, visualize the oﬀset in position between

pairs of coincident bumps. Stochastic event synchrony consists of ﬁve parameters

that quantify the alignment of two bump models:

– ρ: fraction of non-coincident bumps,

– δ

and δ

: average time and frequency oﬀset respectively between coincident

bumps,

– s

and s

: variance of the time and frequency oﬀset respectively between

coincident bumps.

The alignment of the two bump models (cf. Fig. 2 (right)) is obtained by iterative

max-product message passing on a graphical model; the ﬁve SES parameters

are determined from the resulting alignment by maximum a posteriori (MAP)

The program code (in C) is available at www.klab.caltech.edu/~kraskov/MILCA/

118

J. Dauwels, F. Vialatte, and A. Cichocki

Fig. 1. Bump modeling

200

400

600

800

Bump models of two EEG signals

f

00

200

400

600

800

Coincident bumps (ρ = 27%)

Fig. 2. Coincident and non-coincident activity (“bumps”); (left) bump models of two

signals; (right) coincident bumps; the black lines connect the centers of coincident

bumps

estimation [19]. The parameters ρ and s

are the most relevant for the present

study, since they quantify the synchrony between bump models (and hence, the

original time-frequency maps); low ρ and s

implies that the two time-frequency

maps at hand are well synchronized.

Detection of EEG Synchrony Abnormalities in MCI

Patients

In the following section, we describe the EEG data we analyzed. In Section 3.2

we address certain technical issues related to the synchrony measures, and in

Section 3.3, we present and discuss our results.

3.1

EEG Data

The EEG data

analyzed here have been analyzed in previous studies concern-

ing early detection of AD [21]–[25]. They consist of rest eyes-closed EEG data

We are grateful to Prof. T. Musha for providing us the EEG data.

A Comparative Study of Synchrony Measures for the Early Detection of AD

119

recorded from 21 sites on the scalp based on the 10–20 system. The sampling

frequency was 200 Hz, and the signals were band pass ﬁltered between 4 and

30Hz. The subjects comprised two study groups. The ﬁrst consisted of a group

of 25 patients who had complained of memory problems. These subjects were

then diagnosed as suﬀering from MCI and subsequently developed mild AD.

The criteria for inclusion into the MCI group were a mini mental state exam

(MMSE) score above 24, the average score in the MCI group was 26 (SD of 1.8).

The other group was a control set consisting of 56 age-matched, healthy subjects

who had no memory or other cognitive impairments. The average MMSE of this

control group was 28.5 (SD of 1.6). The ages of the two groups were 71.9

10.2 and 71.7

± 8.3, respectively. Pre-selection was conducted to ensure that the

data were of a high quality, as determined by the presence of at least 20 sec. of

artifact free data. Based on this requirement, the number of subjects in the two

groups described above was reduced to 22 MCI patients and 38 control subjects.

3.2

Methods

In order to reduce the computational complexity, we aggregated the EEG signals

into 5 zones (see Fig. 3); we computed the synchrony measures (except the

S-estimator) from the averages of each zone. For all those measures except SES,

we used the arithmetic average; in the case of SES, the bump models obtained

from the 21 electrodes were clustered into 5 zones by means of the aggregation

algorithm described in [20]. We evaluated the S-estimator between each pair of

zones by applying PCA to the state space embedded EEG signals of both zones.

We divided the EEG signals in segments of equal length L, and computed

the synchrony measures by averaging over those segments. Since spontaneous

EEG is usually highly non-stationary, and most synchrony measures are strictly

speaking only applicable to stationary signals, the length L should be suﬃciently

small; on the other hand, in order to obtain reliable measures for synchrony, the

length should be chosen suﬃciently large. Consequently, it is not a priori clear

how to choose the length L, and therefore, we decided to test several values, i.e.,

L = 1s, 5s, and 20s.

In the case of Granger causality measures, one needs to specify the model

order p. Similarly, for mutual information (in time domain) and the state space

based measures, the embedding dimension m and the time lag τ needs to be

chosen; the phase synchrony indices IPA and EMA involve a time delay τ . Since

it is not obvious which parameter values amount to the best performance for

detecting AD, we have tried a range of parameter settings, i.e., p = 1, 2,. . . , 10,

and m = 1, 2,. . . , 10; the time delay was in each case set to τ = 1/30s, which is

the period of the fastest oscillations in the EEG signals at hand.

3.3

Results and Discussion

Our main results are summarized in Table 1, which shows the sensitivity of the

synchrony measures for detecting MCI. Due to space constraints, the table only

shows results for global synchrony, i.e., the synchrony measures were averaged

120

J. Dauwels, F. Vialatte, and A. Cichocki

Download 12.42 Mb.

Do'stlaringiz bilan baham:

1 ... 7 8 9 10 11 12 13 14 ... 88