Lecture Notes in Computer Science


5.2 The Testing Procedure

The recognition algorithm can be summarized by the following steps.

Step 1: The unknown speaker's speech is recorded.

Step 2: The start and end points of the utterance are detected, and the speech is passed through the filtering process.

Step 3: Speech features are extracted from the speech signal and used to create the testing vector (acoustic vector) for that utterance.

Step 4: The testing vector is fed into the vector quantizer.

Step 5: The vector quantizer uses the predefined knowledge to calculate the spectral distortion (distance) for each utterance, and the smallest distance value is selected.

Step 6: The smallest distance value is compared with a threshold value, and a decision is made on whether or not the unknown speaker is recognized [6].
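Steps 4 to 6 can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the "predefined knowledge" is assumed to be one codebook per enrolled speaker, the spectral distortion is assumed to be the mean squared Euclidean distance to the nearest codeword, and the names `average_distortion`, `recognize`, the toy codebooks and the threshold value are all hypothetical.

```python
import numpy as np

def average_distortion(features, codebook):
    """Mean squared Euclidean distance from each vector to its nearest codeword."""
    # features: (T, D) acoustic vectors of one utterance; codebook: (M, D) codewords
    diffs = features[:, None, :] - codebook[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=2)              # shape (T, M)
    return float(np.mean(np.min(sq_dist, axis=1)))

def recognize(features, codebooks, threshold):
    """Return (speaker, distortion); speaker is None when the utterance is rejected."""
    distortions = {name: average_distortion(features, cb)
                   for name, cb in codebooks.items()}
    best = min(distortions, key=distortions.get)      # Step 5: smallest distance
    if distortions[best] <= threshold:                # Step 6: threshold decision
        return best, distortions[best]
    return None, distortions[best]

# Toy data (hypothetical): two enrolled speakers, 2-dimensional features.
codebooks = {
    "speaker_a": np.array([[0.0, 0.0], [1.0, 1.0]]),
    "speaker_b": np.array([[5.0, 5.0], [6.0, 6.0]]),
}
test_vectors = np.array([[0.1, 0.0], [0.9, 1.0]])
result = recognize(test_vectors, codebooks, threshold=0.5)
```

In this toy case the test vectors lie close to the codewords of `speaker_a`, so that speaker is accepted; an utterance far from every codebook exceeds the threshold and is rejected. In a real system the threshold would have to be tuned on held-out data.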

An Automatic Speaker Recognition System


6 Result

In this work, utterances of several speakers were collected, and the data were divided into music (rock and melody) and sample voice data (Bengali and English). Each data sample is used to train the vector quantizer, and then all the utterances are used for recognition (testing). The input to the VQ is obtained by frequency analysis of the given input utterances and is represented in the form of a matrix. About 70 data samples were used; the results are shown as percentages below.

Data Type          Correct Recognition   False Inclusion   False Rejection
Music              86 %                  6 %               8 %
Speech (English)   92 %                  3 %               5 %
Speech (Bengali)   91 %                  4 %               5 %

7 Conclusion

This paper deals with an automatic speaker recognition system using vector quantization. There are two main modules: feature extraction and feature matching. The speaker-specific features are extracted using a Mel-Frequency Cepstrum Coefficient (MFCC) processor. This yields a set of Mel-frequency cepstrum coefficients, called acoustic vectors, which are the extracted features of the speakers. These acoustic vectors are used for feature matching with the vector quantization technique. There are other techniques for feature matching in speaker recognition, such as the Hidden Markov Model (HMM) and the Artificial Neural Network (ANN). We have used VQ because its computational complexity is lower than that of the others. Vector quantization is a typical feature matching technique in which a VQ codebook is generated from the training data. Test data are then matched against the training data by searching for the nearest neighbor. The system recognizes speakers from music and speech data (the speech in both English and Bengali). The correct recognition rate is about ninety percent (86 % to 92 %, depending on the data type). This is comparatively better than the Hidden Markov Model (HMM) and the Artificial Neural Network (ANN), for which the correct recognition rate is below ninety percent.
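The codebook generation mentioned above can be illustrated with plain Lloyd (k-means) iterations. This is a minimal sketch under stated assumptions rather than the procedure used in the paper: the deterministic initialisation, the function name `train_codebook` and the toy training vectors are hypothetical, and a splitting scheme in the style of reference [4] would normally replace the simple initialisation used here.

```python
import numpy as np

def train_codebook(vectors, n_codewords, n_iter=20):
    """Lloyd (k-means) iterations; a simple stand-in for LBG-style codebook design."""
    # Deterministic initialisation (an assumption): evenly spaced training vectors.
    idx = np.linspace(0, len(vectors) - 1, n_codewords).astype(int)
    codebook = vectors[idx].astype(float)
    for _ in range(n_iter):
        # Assign every training vector to its nearest codeword.
        d = np.sum((vectors[:, None, :] - codebook[None, :, :]) ** 2, axis=2)
        nearest = np.argmin(d, axis=1)
        # Move each codeword to the centroid of its cell.
        for m in range(n_codewords):
            members = vectors[nearest == m]
            if len(members) > 0:        # keep the old codeword if its cell is empty
                codebook[m] = members.mean(axis=0)
    return codebook

# Toy training set (hypothetical): two well-separated groups of acoustic vectors.
training = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
                     [4.0, 4.0], [4.2, 4.0], [4.0, 4.2]])
codebook = train_codebook(training, n_codewords=2)
```

Each codeword ends up at the centroid of one group. The nearest-neighbor search used at test time is then a distance computation against these codewords.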

Future work is to generate a VQ codebook with many predefined spectral vectors. It will then be possible to add a large amount of training data to that codebook in a training session; the main problem is that the network size and training time become prohibitively large as the data size increases. To overcome these limitations, a time alignment technique can be applied, so that a continuous speaker recognition system becomes possible.

Acknowledgement. This work was supported by grants to Shahjahan from KUET,

and to KM from the Japanese Society for the Promotion of Science, the Yazaki

Memorial Foundation for Science and Technology, and the University of Fukui.


P. Chakraborty et al.

References

1. Rabiner, L., Juang, B.H.: Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs (1993)

2. Rabiner, L., Schafer, R.W.: Digital Processing of Speech Signals. Prentice Hall, Englewood Cliffs (1978)

3. Davis, S.B., Mermelstein, P.: Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-28(4) (August 1980)

4. Linde, Y., Buzo, A., Gray, R.M.: An Algorithm for Vector Quantizer Design. IEEE Transactions on Communications 28, 84–95 (1980)

5. Furui, S.: Speaker Independent Isolated Word Recognition Using Dynamic Features of Speech Spectrum. IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-34(1), 52–59 (1986)

6. Furui, S.: An Overview of Speaker Recognition Technology. In: ESCA Workshop on Automatic Speaker Recognition, Identification & Verification, pp. 1–9 (1994)

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 527–536, 2008.

Modified Modulated Hebb-Oja Learning Rule:

A Method for Biologically Plausible Principal

Component Analysis

Marko Jankovic, Pablo Martinez, Zhe Chen, and Andrzej Cichocki

Laboratory for Advanced Brain Signal Processing, RIKEN Brain Science Institute, Wako-shi, Saitama, 351-0198, Japan
{elmarkoni,cia}@brain.riken.jp

Neuroscience Statistics Research Laboratory, Department of Anesthesia and Critical Care, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
zhechen@neurostat.mgh.harvard.edu

Abstract. This paper presents the Modified Modulated Hebb-Oja (MMHO) method, which performs principal component analysis. The method is based on applying the Time-Oriented Hierarchical Method to a recently proposed principal subspace analysis rule, the Modulated Hebb-Oja (MHO) learning rule. Compared to some other well-known methods for principal component analysis, the proposed method has one feature that could be seen as desirable from the biological point of view: the synaptic efficacy learning rule does not need explicit information about the values of the other efficacies to modify an individual efficacy. The simplicity of the “neural circuits” that perform global computations, and the fact that their number does not depend on the number of input and output neurons, could also be seen as good features of the proposed method. Only one global calculation circuit is necessary. Some similarity to a part of the frog retinal circuit is also suggested.

Keywords: Principal component analysis, time oriented hierarchy, Stiefel

submanifold, local learning rules.

1 Introduction

Neural networks provide a way to compute principal component analysis (PCA) or principal subspace analysis (PSA) in parallel and on-line. Due to their parallelism and adaptivity to input data, such algorithms and their implementations in neural networks are potentially useful in feature extraction and data compression. It is well known that in the first step of any pattern recognition scheme, which is the representation of the objects from a usually large amount of raw data, some preprocessing and data compression are essential. In that case, minimal loss of information is a central objective. PSA and PCA are powerful and popular tools in statistical signal processing and data compression; their objective is to reduce the complexity of the input data while keeping as much of the input data information as possible.

In recent years, various PCA and PSA learning algorithms have been proposed and mathematically investigated (see [1-5], [8-15], [17-28]). Most of the proposed algorithms are based on local Hebbian learning; due to this locality, it has been argued that these algorithms are biologically plausible.

In [11], biologically inspired PSA methods, named the Modulated Hebbian (MH) and Modulated Hebb-Oja (MHO) learning rules, were introduced. The major objectives in deriving the methods were:

– to obtain a network in which the learning rule for an individual synaptic efficacy requires the least possible amount of explicit information about the other synaptic efficacies, especially those related to other neurons;

– to minimize the neural hardware that is necessary for implementation of the proposed learning rule.

In this paper, a modification of the MHO algorithm, the MMHO algorithm, is analyzed. The MMHO algorithm performs PCA. The new algorithm is obtained by applying the time-oriented hierarchical method (TOHM) [14] to the MHO method. TOHM is a general method that transforms PSA methods into PCA methods. It is based on learning on an approximate principal Grassmann/Stiefel submanifold [14].

Section 2 reviews the basic theory of the modulated Hebb (MH) and modulated Hebb-Oja (MHO) learning rules and suggests their computational circuit in the context of retinal processing. Section 3 introduces the new MMHO learning rule. Section 4 contains some simulation results. Section 5 suggests a speculative role of the proposed method in early visual processing in part of the frog retinal circuit. Section 6 gives some conclusions.

2 Modulated Hebb Learning Rule and Computational Circuit

2.1 Modulated Hebbian Rule: The Case N=K

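Let $\mathbf{x} \in \Re^{K}$ denote the input random vector with zero mean, and let $\mathbf{y} = \mathbf{W}^T \mathbf{x} \in \Re^{N}$ denote the output, where $\mathbf{W} \in \Re^{K \times N}$ denotes the synaptic weight matrix. It is assumed that $N = K$. In scalar form, for the kth output, we have $y_k = \mathbf{w}_k^T \mathbf{x}$ ($k = 1, \ldots, N$), where $\mathbf{w}_k \in \Re^{K}$ denotes the kth column vector of $\mathbf{W}$.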

The Modulated Hebbian (MH) learning rule was introduced in [11] and analyzed in [13], [15]. Here we give a slightly different interpretation. The MH rule can be derived as a gradient descent learning rule for minimization of the following cost function

[Equation (1) could not be recovered from the extracted text. The surviving fragments show a cost function J(W) written in terms of x, Wy, WW^T x, C, W^T CW and W^T CWW^T W, stated under the assumption W^T W = I.]   (1)


or in compact notation

$$J_{LoPCA}(\mathbf{W}) = \frac{1}{4}\, E\left\{ \left( \sum_{i=1}^{N} x_i^2 - \sum_{i=1}^{N} y_i^2 \right)^{2} \right\} \qquad (2)$$

It leads to the following gradient descent algorithm:

$$\mathbf{W}(i+1) = \mathbf{W}(i) + \gamma(i) \left( \sum_{k=1}^{N} x_k^2(i) - \sum_{k=1}^{N} y_k^2(i) \right) \mathbf{x}(i)\, \mathbf{y}^T(i) \qquad (3)$$

Note that only the implicit constraint $\mathbf{W}^T\mathbf{W} = \mathbf{I}$ is used; there is no explicit constraint. As shown in [13], this algorithm, named Modulated Hebbian (MH) learning, leads toward a solution that ensures $\mathbf{W}^T\mathbf{W} = \mathbf{I}$, although no additional terms that would keep the weight vectors orthonormal are added.
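A small NumPy sketch makes the role of the implicit constraint concrete. It assumes that equation (3) reads W(i+1) = W(i) + γ(i)(‖x(i)‖² − ‖y(i)‖²) x(i) yᵀ(i); this is a reading of the garbled original, and the function name `mh_update`, the step size and the test vectors are hypothetical.

```python
import numpy as np

def mh_update(W, x, gamma):
    """One Modulated Hebbian step: W + gamma * (||x||^2 - ||y||^2) * x y^T."""
    y = W.T @ x                       # network output, y = W^T x
    modulation = x @ x - y @ y        # single global modulating signal
    return W + gamma * modulation * np.outer(x, y)

x = np.array([1.0, 2.0, 2.0])

# Orthonormal W (N = K): ||y|| equals ||x||, so the modulating signal vanishes
# and the weights are left unchanged.
W_orth = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])
W_same = mh_update(W_orth, x, gamma=0.01)

# Shrunken W = 0.5 I: ||y||^2 < ||x||^2, so the Hebbian term is switched on.
W_small = 0.5 * np.eye(3)
W_grown = mh_update(W_small, x, gamma=0.01)
```

With an orthonormal square W the update is exactly zero, which is consistent with the statement that solutions satisfy WᵀW = I without any explicit orthonormalizing term. The sketch shows only this fixed-point property, not convergence.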

It is also interesting to note that the algorithm can be seen as a variant of a moving threshold algorithm, similar in form to the BCM learning rule [3], where the moving threshold is equal to the power contained in the input signals.

From equation (3) we can see that the original Hebb learning rule is applied, modulated by a single signal. The network structure of interest is depicted in Fig. 1. The proposed learning method possesses locality and homogeneity, which are desirable properties for implementing artificial neural networks in parallel hardware [21].

Compared to some other approaches (e.g. [19] and [26]), the update rule for an individual synaptic efficacy does not require explicit information about the values of the other synaptic efficacies. Also, the number of circuits that perform global calculations (in this case summators) is two, regardless of the dimension of the input and output vectors.

2.2 Modulated Hebb-Oja Rule: Case N < K

The MH learning rule is of low practical interest, since the output has the same dimensionality as the input (it can be used, for instance, for statistical orthogonalization of a matrix). When the number of output neurons is lower than the number of input neurons, that is, if N < K, the MH algorithm will produce a matrix W with orthogonal columns, but we can no longer count on the orthonormality of W. In order to maintain orthonormality of the solution, an additional term in the learning rule is needed.

One possibility is to use the approach recently proposed in [12]. However, in that case the equations become very complex. Another possibility is to apply Oja's stabilizing term separately to each weight vector, as is done in [15]. Doing so, we obtain the following learning equation:

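$$\mathbf{W}(i+1) = \mathbf{W}(i) + \gamma(i) \left( \mathbf{x}^T(i)\mathbf{x}(i) - \mathbf{y}^T(i)\mathbf{y}(i) \right) \left( \mathbf{x}(i)\mathbf{y}^T(i) - \mathbf{W}(i)\, \mathrm{diag}\!\left( \mathbf{y}(i)\mathbf{y}^T(i) \right) \right). \qquad (4)$$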
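The matrix update can be sketched in NumPy as follows. The sketch assumes that equation (4) is Oja's stabilized Hebbian step, applied to each weight vector separately and multiplied by the global signal xᵀx − yᵀy; the grouping of the terms is an assumption, since the original equation is garbled, and `mho_update` and the example values are hypothetical.

```python
import numpy as np

def mho_update(W, x, gamma):
    """One Modulated Hebb-Oja step (assumed form of equation (4))."""
    y = W.T @ x                                        # outputs, shape (N,)
    modulation = x @ x - y @ y                         # global modulating signal
    oja_term = np.outer(x, y) - W @ np.diag(y * y)     # x y^T - W diag(y y^T)
    return W + gamma * modulation * oja_term

# Example with K = 3 inputs and N = 2 outputs (values are arbitrary).
W = np.array([[0.5, 0.0],
              [0.0, 0.5],
              [0.0, 0.0]])
x = np.array([1.0, 2.0, 2.0])
W_next = mho_update(W, x, gamma=0.01)
```

Column k of the result depends only on its own weight vector, its own output y_k, the input x and the single modulating signal. This is the locality property emphasized in the text.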

This is called the Modulated Hebb-Oja (MHO) learning rule. It generates a solution W whose range is equal to the subspace spanned by the N principal eigenvectors (those corresponding to the N largest eigenvalues) of the input signal covariance matrix $\mathbf{C} = E(\mathbf{x}\mathbf{x}^T)$. The structure of the neural network of interest can also be represented by Fig. 1. The only difference is that some additional calculations at the output are necessary. In [15] it was shown that the matrix W(i) has bounded elements $W_{kn}(i)$.

Fig. 1. MH(O) principal subspace network (small empty circle at the end of an arrow denotes the square function)

3 Introduction of the New PCA Learning Rule

We now explain how the MHO algorithm can be transformed into a PCA algorithm. First we recall the definitions of the Grassmann and Stiefel manifolds, which can be found in [7]. The Grassmann manifold is defined as the space of matrices $\mathbf{W} \in O^{K \times N} \subset \Re^{K \times N}$ ($N \le K$) such that $\mathbf{W}^T\mathbf{W} = \mathbf{I}$, together with a homogeneous function $J: O^{K \times N} \to \Re$ such that $J(\mathbf{W}) = J(\mathbf{W}\mathbf{Q})$ for any $N \times N$ orthogonal matrix $\mathbf{Q}$.

The Stiefel manifold is defined as the space of matrices $\mathbf{W} \in O^{K \times N} \subset \Re^{K \times N}$ ($N \le K$) and a function $J: O^{K \times N} \to \Re$.
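The difference between the two definitions is whether the function J can distinguish W from a rotated basis WQ of the same subspace. The following sketch checks this numerically for two example criteria. Both criteria, the covariance matrix and the weighting matrix are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

C = np.diag([3.0, 2.0, 1.0])                   # example covariance matrix (assumed)
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])                     # orthonormal columns, K = 3, N = 2
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # 2 x 2 orthogonal matrix

def subspace_criterion(W):
    """J(W) = tr(W^T C W): depends only on the subspace spanned by W."""
    return np.trace(W.T @ C @ W)

def weighted_criterion(W):
    """J(W) = tr(W^T C W D) with distinct weights D: depends on the basis itself."""
    D = np.diag([2.0, 1.0])
    return np.trace(W.T @ C @ W @ D)

same_subspace = subspace_criterion(W @ Q) - subspace_criterion(W)
same_basis = weighted_criterion(W @ Q) - weighted_criterion(W)
```

Here `same_subspace` is zero up to rounding, so the first criterion satisfies J(W) = J(WQ) and is of the Grassmann type (a PSA criterion). `same_basis` is clearly nonzero, so the second criterion is sensitive to the individual columns, as a PCA criterion on the Stiefel manifold must be.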

Some learning algorithms can be seen as learning on the Stiefel manifold if the weight matrix is automatically kept orthonormal at each update step. Such algorithms are known as strongly orthonormally constrained (SOC) algorithms (e.g. [9]).

Here we use the recently proposed time-oriented hierarchical method (TOHM) (or, more generally, GTOHM [14]) to transform the PSA algorithm MHO into a PCA method. The method is based on the following idea:

Each neuron tries to do what is best for its family, and then what is best for itself.

We shall call this idea “the family principle”. In other words, the algorithm consists of two parts: the first part is responsible for learning the family-desirable feature, and the second part is responsible for learning the individual-neuron-desirable feature. The second part is taken with a weight coefficient whose absolute value is smaller than 1. This means that a time-oriented hierarchy is imposed on the realization of the family and individual parts of the learning rule.

[Fig. 1 diagram labels: Input x ($x_1, x_2, \ldots, x_K$); Output y ($y_1, y_2, \ldots, y_N$); two summation units Σ.]

In order to realize “the family principle”, we propose the following class of learning rules, which can be used for parallel extraction of principal components:

$$\Delta \mathbf{W}_{PCA} = \Delta \mathbf{W}_{PSA} + \mathbf{IP}_{SPCAorSMCA}\, \mathbf{D}(i), \qquad (5)$$

where $\Delta \mathbf{W}_{PCA}$ represents the modification of the weights, $\Delta \mathbf{W}_{PSA}$ defines the family part of the learning rule (that is, PSA), $\mathbf{IP}_{SPCAorSMCA}$ represents the individual part of the learning rule (a single-unit PCA or MCA algorithm), and $\mathbf{D}(i)$ is a diagonal matrix whose diagonal elements can be functions of time (in the case of a homogeneous algorithm, all diagonal elements are equal to $\alpha$). In the most general case, all individual parts can be implemented as different learning rules. It is interesting to note that the individual part can pursue a minor component while the whole algorithm pursues the principal components.
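The structure of this class of rules can be sketched in NumPy. This is only an illustration under stated assumptions: the family part is taken to be an MHO-style increment, the individual part is taken to be Oja's single-neuron rule applied column by column (one possible single-unit PCA choice, not necessarily the authors'), the step size `gamma` is added for convenience, and the homogeneous case D = αI is used. All function names are hypothetical.

```python
import numpy as np

def family_part(W, x):
    """PSA (family) increment: modulated Hebb-Oja term (assumed form)."""
    y = W.T @ x
    return (x @ x - y @ y) * (np.outer(x, y) - W * (y * y))

def individual_part(W, x):
    """Single-unit increments, one column per neuron: Oja's rule (assumed choice)."""
    y = W.T @ x
    return np.outer(x, y) - W * (y * y)   # column k is y_k x - y_k^2 w_k

def tohm_update(W, x, gamma, alpha):
    """W + gamma * (family part + individual part @ D), with D = alpha * I."""
    D = alpha * np.eye(W.shape[1])        # homogeneous case
    return W + gamma * (family_part(W, x) + individual_part(W, x) @ D)

W = np.array([[0.6, 0.0],
              [0.0, 0.8],
              [0.1, 0.2]])
x = np.array([1.0, -1.0, 2.0])
W_family_only = tohm_update(W, x, gamma=0.01, alpha=0.0)
W_combined = tohm_update(W, x, gamma=0.01, alpha=0.1)
```

Setting `alpha` to zero recovers the pure family (PSA) update. A small nonzero `alpha` adds the individual part as a weak correction, which is the two-level hierarchy that the family principle describes.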

The intuition is as follows. We use two time scales: if $\alpha$ is sufficiently small, the term multiplied by $\alpha$ does not affect the PSA learning direction. Once the algorithm reaches the principal subspace, the part of the algorithm that is multiplied by $\alpha$ performs learning on the approximate principal Grassmann/Stiefel submanifold (definitions are given in [14]) and rotates the weight vectors toward the principal components.

By implementing the proposed principle, the new learning rule can be written in the form

$$\mathbf{w}_k(i+1) = \mathbf{w}_k(i) + \gamma(i) \left( \|\mathbf{x}(i)\|^2 - \|\mathbf{y}(i)\|^2 \right) \left( y_k(i)\,\mathbf{x}(i) - y_k^2(i)\,\mathbf{w}_k(i) \right) + \alpha\gamma(i) \left( \|\mathbf{x}(i)\|^2 - \sum_{j=1}^{k} y_j^2(i) \right) \left( y_k(i)\,\mathbf{x}(i) - y_k^2(i)\,\mathbf{w}_k(i) \right), \quad k = 1, \ldots, N. \qquad (6)$$

We can see that we actually have a system of equations that share the same family part of the learning rule, while all the individual parts are different. By applying the same method that was proposed in [15], it can be shown that the synaptic vectors have bounded norms under some reasonable assumptions. It can be proved that the stable points of the algorithm are the principal eigenvectors of the input signal covariance matrix. However, the proof is lengthy and is omitted here. A sketch of the proof is as follows:

–  the difference equations (6) can be related to their differential counterparts by means of stochastic approximation [16], [20];

–  it is possible to show that all the equations actually represent eigenvector equations for a set of matrices whose eigenvectors are equal to the eigenvectors of the input signal covariance matrix $\mathbf{C} = E(\mathbf{x}\mathbf{x}^T)$;

–  if the weight matrix is full rank for all t, then the columns of the synaptic matrix will converge toward some of the principal eigenvectors of the input covariance matrix;

–  then, using the method proposed in [15], it is possible to prove that those eigenvectors will correspond to the principal eigenvectors of the input covariance matrix.
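The target of the convergence claim, the principal eigenvectors of the input covariance matrix, can be computed directly for comparison with learned weight vectors. The sketch below is a reference computation under stated assumptions, not part of the proof: the tiny data set, the helper `alignment` and the example weight matrix are hypothetical, and the covariance is estimated as the sample average of x xᵀ for zero-mean data.

```python
import numpy as np

# Zero-mean samples, one per row (toy data).
X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
C = X.T @ X / len(X)                     # sample estimate of C = E(x x^T)

eigvals, eigvecs = np.linalg.eigh(C)     # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # reorder: largest eigenvalue first
principal = eigvecs[:, order]            # columns are the principal eigenvectors

def alignment(W, principal):
    """Absolute cosine between each weight vector and the matching eigenvector."""
    Wn = W / np.linalg.norm(W, axis=0)
    return np.abs(np.sum(Wn * principal, axis=0))

# A weight matrix whose columns point along the eigenvectors (up to sign and scale).
W_learned = np.array([[3.0, 0.0],
                      [0.0, -2.0]])
scores = alignment(W_learned, principal)
```

An alignment score of 1 for column k means that the kth weight vector is parallel to the kth principal eigenvector. The sign is irrelevant because eigenvectors are defined only up to sign.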
