Lecture Notes in Computer Science
Neural Network Based Boolean Factor Analysis
2.4 Neural Network Based Boolean Factor Analysis

The NBFA is a powerful method for revealing the information redundancy of high-dimensional binary signals [7]. It allows every signal (vector of variables) from a binary data matrix X of observations to be expressed as a superposition of binary factors:

X = ⋁_{l=1}^{L} S_l f_l,   (4)

where S_l is a component of the factor scores, f_l is a vector of factor loadings, and ∨ denotes Boolean summation (0 ∨ 0 = 0, 1 ∨ 0 = 1, 0 ∨ 1 = 1, 1 ∨ 1 = 1). If we mark Boolean matrix multiplication by the symbol ∘, then we can express the approximation of the data matrix X in matrix notation as

X ≈ S ∘ F,   (5)

where S is the matrix of factor scores and F is the matrix of factor loadings. Boolean factor analysis implies that the components of the original signals, the factor loadings, and the factor scores are binary values. Finding the optimal decomposition (5) by brute-force search is an NP-hard problem and as such is not feasible for high-dimensional data. On the other hand, the classical linear methods cannot take into account the non-linearity of Boolean summation and are therefore inadequate for this task.

The NBFA is based on a Hopfield-like neural network [12,13]. A fully connected network of N neurons with binary activity (1 - active, 0 - inactive) is used. Each pattern X^m of the learning set is stored in the matrix of synaptic connections J according to the Hebbian rule:

J_ij = Σ_{m=1}^{M} (X_i^m − q^m)(X_j^m − q^m),   i, j = 1, ..., N, i ≠ j,   J_ii = 0,   (6)

where M is the number of patterns in the learning set and the bias q^m = Σ_{i=1}^{N} X_i^m / N is the total relative activity of the m-th pattern. This form of bias corresponds to biologically plausible global inhibition proportional to the overall neuron activity.

866 V. Snášel et al.

One special inhibitory neuron was added to the N principal neurons of the Hopfield network. This neuron was activated during the presentation of every pattern of the learning set and was connected with all the principal neurons by bidirectional connections. The patterns of the learning set are stored in the vector j of these connections according to the Hebbian rule:

j_i = Σ_{m=1}^{M} (X_i^m − q^m) = M(q_i − q),   i = 1, ..., N,   (7)

where q_i = Σ_{m=1}^{M} X_i^m / M is the mean activity of the i-th neuron in the learning set and q is the mean activity of all neurons in the learning set. We also supposed that the excitability of the introduced inhibitory neuron decreases inversely proportionally to the size of the learning set, being 1/M after all patterns are stored. In the recall stage its activity is then

A(t) = (1/M) Σ_{i=1}^{N} j_i X_i(t) = (1/M) j^T X(t),

where j^T is the transpose of j. The inhibition produced in all principal neurons of the network is given by the vector j A(t) = (1/M) j j^T X(t). Thus, the inhibition is equivalent to the subtraction of

J̃ = (1/M) j j^T = M q̃ q̃^T   (8)

from J, where q̃ is a vector with components q_i − q. Adding the inhibitory neuron is thus equivalent to replacing the ordinary connection matrix J by the matrix J′ = J − J̃.
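As a concrete illustration of the model (4)-(6), the following sketch builds a synthetic Boolean-factor dataset and stores it with the Hebbian rule. This is our own construction for illustration (numpy assumed; the sizes N, L, M are arbitrary), not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, M = 64, 8, 200                    # signal dimension, factors, observations

# Binary factor loadings: each factor switches on a few components.
F = (rng.random((L, N)) < 0.1).astype(np.uint8)    # L x N
# Binary factor scores: which factors are present in each observation.
S = (rng.random((M, L)) < 0.3).astype(np.uint8)    # M x L

# Boolean matrix product X = S ∘ F : X_mi = OR_l (S_ml AND F_li), Eq. (5).
X = (S @ F > 0).astype(np.uint8)

# Check Eq. (4): row m is the Boolean superposition of its active factors.
manual = np.zeros(N, dtype=np.uint8)
for l in range(L):
    if S[0, l]:
        manual |= F[l]
assert np.array_equal(X[0], manual)

# Hebbian storage, Eq. (6): J_ij = sum_m (X_i^m - q^m)(X_j^m - q^m), J_ii = 0,
# with q^m the relative activity of pattern m.
q = X.mean(axis=1, keepdims=True)       # M x 1
C = X - q                               # centered patterns
J = C.T @ C                             # N x N connection matrix
np.fill_diagonal(J, 0.0)
```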
To reveal factors we suggest the following two-run recall procedure. Its initialization starts with the presentation of a random initial pattern X^in with k^in = r^in N active neurons. The activity k^in is supposed to be smaller than the activity of any factor. On presentation of X^in, the network activity X evolves to some attractor. This evolution is determined by synchronous discrete-time dynamics. At each time step:

X_i(t + 1) = Θ(h_i(t) − T(t)),   i = 1, ..., N,   X_i(0) = X_i^in,   (9)

where h_i are the components of the vector of synaptic excitations

h(t) = JX(t),   (10)

Θ is the step function, and T(t) is an activation threshold. At each time step of the recall process the threshold T(t) was chosen in such a way that the level of network activity was kept constant and equal to k^in. Thus, at each time step the k^in "winners" (the neurons with the greatest synaptic excitation) were chosen, and only they were active at the next time step.

Pattern Discovery for High-Dimensional Binary Datasets 867

As shown in [12], this choice of activation threshold enables the network activity to stabilize in point attractors or in cyclic attractors of length two. The fixed level of activity at this stage of the recall process could be ensured by a biologically plausible non-linear negative feedback control accomplished by inhibitory interneurons. It is worth noting that although the convergence of synchronous network dynamics to point attractors or cyclic attractors of length two was established earlier [14], it was done for a fixed activation threshold T; for fixed network activity it was first done in [12].

When the activity stabilizes at the initial level k^in, the k^in + 1 neurons with maximal synaptic excitation are chosen for the next iteration step, and the network activity evolves to an attractor at the new activity level k^in + 1. The level of activity then increases to k^in + 2, and so on, until the number of active neurons reaches the final level k^f = r^f N, where r = k/N is the relative network activity. Thus, one trial of the recall procedure contains (r^f − r^in)N external steps and several internal steps (usually 2-3) inside each external step to reach an attractor for the given level of activity. At the end of each external step, when the network activity stabilizes at the level of k active neurons, a Lyapunov function was calculated by the formula

λ = X^T(t + 1) J X(t) / k,   (11)

where X^T(t + 1)
and X(t) are two network states in a cyclic attractor (for a point attractor, X^T(t + 1) = X(t)). The identification of factors along the trajectories of the network dynamics was based on the analysis of the change of the Lyapunov function and the activation threshold along each trajectory. In our definition of the Lyapunov function, its value gives the mean synaptic excitation of the neurons belonging to an attractor at the end of each external step.

2.5 Statistical Clustering Methods

Clustering methods help to reveal groups of black pixels, i.e., typical parts of images. However, traditional hard clustering has the drawback of producing only disjunctive clusters.

The HAA algorithm starts with each pixel in a group of its own. It then merges clusters until only one large cluster remains, which includes all pixels. The user must choose a dissimilarity or similarity measure and an agglomerative procedure. At the beginning, when each pixel represents its own cluster, the dissimilarity between two pixels is defined by the chosen dissimilarity measure. However, once several pixels have been linked together, we need a linkage or amalgamation rule to determine when two clusters are sufficiently similar to be linked together. Several linkage rules have been proposed; e.g., the distance between two clusters can be determined by the greatest distance between two pixels in the clusters (Complete Linkage method - CL), or by the average distance between all pairs of objects in the two clusters (Average Linkage Between Groups - ALBG). Hierarchical clustering is based on the proximity matrix (dissimilarities for all pairs of pixels) and is independent of the order of the pixels. For binary data, we can choose, for example, the Jaccard and Ochiai (cosine) similarity measures of two pixels. The former can be expressed as

S_J = a / (a + b + c),

where a is the number of common occurrences of ones and b + c is the number of pairs in which one value is one and the other is zero. The latter can be expressed as

S_O = a / √((a + b)(a + c)).

In the TSCA algorithm, the pixels are arranged into subclusters, known as "cluster features". These cluster features are then clustered into k groups using a traditional hierarchical clustering procedure. A cluster feature (CF) represents a set of summary statistics on a subset of the data. The algorithm consists of two phases. In the first, an initial CF tree is built (a multi-level compression of the data that tries to preserve the inherent clustering structure of the data). In the second, an arbitrary clustering algorithm is used to cluster the leaf nodes of the CF tree. A disadvantage of this method is its sensitivity to the order of the objects (pixels in our case). The log-likelihood distance measure can be used to calculate the distance between two clusters.

3 Experimental Results

For testing the above-mentioned methods, we used a generated collection of 1600 32 × 32 black-and-white images containing different combinations of horizontal and vertical lines (bars). The probabilities of the bars occurring in the images were the same and equal to 10/64, i.e., the images contain 10 bars on average. An example of several images from the generated collection is shown in Figure 3a.

For a first view of the image structures, we applied traditional cluster analysis. We clustered the 1024 (32 × 32) positions into 64 and 32 clusters. The problem with traditional cluster analysis is that it produces disjunctive clusters, which means we can find only horizontal bars and parts of vertical bars, or vice versa. We applied HAA and TSCA clustering as implemented in the SPSS system. The problem with the TSCA method is that it depends on the order of the analyzed images, so we used two different orders.

For HAA, we tried different similarity measures. We found that the linkage methods have more influence on the clustering results than the similarity measures. We used the Jaccard and Ochiai (cosine) similarity measures, which are suitable for asymmetric binary attributes. We found both to be suitable for the identification of the bars or their parts. For 64 clusters, the differences were only in a few assignments of positions by the ALBG and CL methods with the Jaccard and Ochiai measures.

The following figures illustrate the application of some of these techniques for 64 and 32 clusters. Figures 2a and 2b show the results of the ALBG method with the Jaccard measure: Figure 2a for 32 clusters and Figure 2b for 64 clusters. In the case of 32 clusters, we found 32 horizontal bars (see Figure 2c) by the TSCA method for the second order of features.

Many of the tested methods were able to generate a set of base images or factors, which should ideally record all possible bar positions. However, not all methods were truly successful in this. With the SVD, we obtain classic singular vectors, the most general being among the first. The first few are shown in Figure 3b. We can see that the bars are not separated and different shades of gray appear. The NMF methods yield different results.
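For illustration, the bar collection and a Jaccard-based pixel clustering can be reproduced in a few lines. This is a sketch under our own assumptions (scipy's average-linkage hierarchical clustering standing in for the SPSS ALBG procedure, and a smaller collection for speed), not the authors' workflow:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n_images, side = 300, 32        # 300 images keep the sketch fast; the paper uses 1600

# Each of the 64 bars (32 rows + 32 columns) occurs with probability 10/64.
images = np.zeros((n_images, side, side), dtype=bool)
for m in range(n_images):
    images[m, rng.random(side) < 10 / 64, :] = True   # horizontal bars
    images[m, :, rng.random(side) < 10 / 64] = True   # vertical bars, Boolean superposition

# Each pixel position becomes a binary vector over the image collection.
pixels = images.reshape(n_images, side * side).T      # 1024 x 300

# Jaccard distance = 1 - S_J; average linkage corresponds to ALBG.
dist = pdist(pixels, metric="jaccard")
labels = fcluster(linkage(dist, method="average"), t=64, criterion="maxclust")
print(len(np.unique(labels)))   # number of pixel clusters found (at most 64)
```

Pixels belonging to the same bar tend to co-occur across images, so they end up in the same cluster; a cluster mixing a row and a column cannot be represented, which is the disjunctivity limitation discussed above.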
The original NMF method, based on the adjustment of random matrices W and H, provides hardly recognizable images even for k = 100 and 1000 iterations (we used 100 iterations for the other experiments).
Fig. 2. (a) 64 and (b) 32 clusters of pixels by the ALBG method (Jaccard coefficient); (c) 32 clusters of pixels by the TSCA method

Fig. 3. (a) Several bars from the generated collection; (b) first 64 base images of bars for the SVD method; (c) first 64 factors for the original NMF method

Moreover, these base images still contain significant salt-and-pepper noise and have poor contrast. The factors are shown in Figure 3c. We must also note that the NMF decomposition yields slightly different results each time it is run, because the matrices are pre-filled with random values.

The GD-CLS modification of the NMF method tries to improve the decomposition by solving a constrained least squares problem. This leads to better overall quality; however, the decomposition still depends on the pre-filled random matrix H. The result is shown in Figure 4a.

The SDD method differs slightly from the previous methods, since each factor contains only the values {−1, 0, 1}. In the factors shown in Figure 4b, gray represents 0; −1 and 1 are represented by black and white, respectively. The base vectors in Figure 4b can be divided into three categories: base vectors containing only one bar; base vectors containing one horizontal and one vertical bar; and other base vectors, containing several bars and in some cases even noise.

Finally, we made a decomposition of the images into binary vectors by the NBFA method. Here the factors contain only the values {0, 1} and Boolean arithmetic is used. The factor search was performed under the assumption that the number of ones in a factor is not less
than 5 and not greater than 200. Since the images are obtained by Boolean summation of binary bars, it is not surprising that the NBFA is able to reconstruct all bars as base vectors, providing an ideal solution, as we can see in Figure 4. It is clear that the classical linear methods cannot take into account the non-linearity of Boolean summation and are therefore inadequate for the bars problem. But the linear methods are fast and well elaborated, so it was very interesting to compare the linear approach with the NBFA and to compare the results qualitatively.

Fig. 4. First 64 (a) factors for the GD-CLS NMF method and first 64 base vectors for the (b) SDD and (c) NBFA methods
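As an aside, the winner-take-all recall step of Section 2.4 (Eqs. 9-11) can be sketched as follows. The couplings here are random stand-ins rather than a Hebbian-trained J, so this only illustrates the fixed-activity dynamics, not the full factor search:

```python
import numpy as np

def recall_step(J, X, k):
    """One synchronous step (Eq. 9): keep active the k 'winners' with the
    largest synaptic excitation h = J X (Eq. 10); the threshold T(t) is
    thereby set implicitly so that exactly k neurons stay active."""
    h = J @ X
    winners = np.argsort(h)[-k:]
    X_new = np.zeros_like(X)
    X_new[winners] = 1.0
    return X_new

def lyapunov(J, X_next, X_prev, k):
    """Eq. (11): mean synaptic excitation of the attractor's neurons."""
    return float(X_next @ J @ X_prev / k)

# Toy run: random symmetric couplings, fixed activity level k.
rng = np.random.default_rng(1)
N, k = 50, 5
J = rng.standard_normal((N, N))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)

X = np.zeros(N)
X[rng.choice(N, size=k, replace=False)] = 1.0
for _ in range(20):               # iterate toward a point or length-2 cyclic attractor
    X = recall_step(J, X, k)
print(lyapunov(J, recall_step(J, X, k), X, k))
```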
In this paper, we have compared NBFA and several dimension reduction approaches - including clustering methods - on the so-called bars problem. It was shown that NBFA perfectly found the basis (factors) from which all the learning pictures can be reconstructed. First, this is because the model on which the BFA is based is the same as that used for data generation. Secondly, it is because of the robustness of the BFA implementation based on a recurrent neural network. Some experiments show that the resistance against noise is very high. We hypothesize that this is due to the self-reconstruction ability of our neural network.

While the SVD is known to provide quality eigenfaces, it is computationally expensive, and in case we only need to beat the "curse of dimensionality" by reducing the dimension, SDD may suffice. As expected, among the methods that allow direct querying in the reduced space, SVD was the slowest but most exact method. The NMF and SDD methods may also be used, but not with the L2 metric, since the distances are not preserved well enough in this case.

There are some other newly proposed methods which may be interesting for future testing, e.g. SparseMap [15]. Additionally, a faster pivot selection technique for FastMap [16] may be considered. Finally, testing the used dimension reduction methods with deviation metrics on metric structures should answer the question of projected data indexability (which is poor for SVD-reduced data).
Cluster analysis in this application is focused on finding the original factors from which the images were generated. The applied clustering methods were quite successful in finding these factors. The problem with cluster analysis is that it provides only disjunctive clusters, so only some bars or their parts were revealed. However, from the general view of the 64 clusters, it is obvious that the images are composed of vertical and horizontal bars (lines). By two-step cluster analysis, 32 horizontal lines were revealed by clustering into 32 clusters.
Acknowledgment. 1M6840070004 and partly by the Institutional Research Plan AV0Z10300504 "Computer Science for the Information Society: Models, Algorithms, Applications" and by the project 1ET100300419 of the Program Information Society of the Thematic Program II of the National Research Program of the Czech Republic.
References

1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB 1994: Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487–499. Morgan Kaufmann Publishers Inc., San Francisco (1994)
2. Brin, S., Motwani, R., Ullman, J.D., Tsur, S.: Dynamic itemset counting and implication rules for market basket data. In: SIGMOD 1997: Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, pp. 255–264. ACM Press, New York (1997)
3. Spellman, P.T., Sherlock, G., Zhang, M.Q., Anders, V.I.K., Eisen, M.B., Brown, P., Botstein, D., Futcher, B.: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9, 3273–3297 (1998)
4. Koyutürk, M., Grama, A., Ramakrishnan, N.: Nonorthogonal decomposition of binary matrices for bounded-error data compression and analysis. ACM Trans. Math. Softw. 32(1), 33–69 (2006)
5. Földiák, P.: Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics 64(2), 165–170 (1990)
6. Frolov, A., Húsek, D., Polyakov, P., Řezanková, H.: New Neural Network Based Approach Helps to Discover Hidden Russian Parliament Voting Patterns. In: International Joint Conference on Neural Networks, pp. 6518–6523. Omnipress (2006)
7. Frolov, A.A., Húsek, D., Muravjev, P., Polyakov, P.: Boolean Factor Analysis by Attractor Neural Network. IEEE Transactions on Neural Networks 18(3), 698–707 (2007)
8. Berry, M., Dumais, S., Letsche, T.: Computational Methods for Intelligent Information Access. In: Proceedings of the 1995 ACM/IEEE Supercomputing Conference, San Diego, California, USA (1995)
9. Kolda, T.G., O'Leary, D.P.: Computation and uses of the semidiscrete matrix decomposition. In: ACM Transactions on Information Processing (2000)
10. Shahnaz, F., Berry, M., Pauca, P., Plemmons, R.: Document clustering using nonnegative matrix factorization. Journal on Information Processing and Management 42, 373–386 (2006)
11. Spratling, M.W.: Learning Image Components for Object Recognition. Journal of Machine Learning Research 7, 793–815 (2006)
12. Frolov, A.A., Húsek, D., Muravjev, P.: Informational efficiency of sparsely encoded Hopfield-like autoassociative memory. Optical Memory and Neural Networks (Information Optics), 177–198 (2003)
13. Frolov, A.A., Sirota, A.M., Húsek, D., Muravjev, P.: Binary factorization in Hopfield-like neural networks: single-step approximation and computer simulations. Neural Network World, 139–152 (2004)
14. Goles-Chacc, E., Fogelman-Soulie, F.: Decreasing energy functions as a tool for studying threshold networks. Discrete Mathematics, 261–277 (1985)
15. Faloutsos, C.: Gray Codes for Partial Match and Range Queries. IEEE Transactions on Software Engineering 14(10) (1988)
16. Faloutsos, C., Lin, K.: FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets. ACM SIGMOD Record 24(2), 163–174 (1995)
Expand-and-Reduce Algorithm of Particle Swarm Optimization

Eiji Miyagawa and Toshimichi Saito

Hosei University, Koganei, Tokyo, 184-0002 Japan

Abstract. This paper presents an optimization algorithm: particle swarm optimization with expand-and-reduce ability. When particles are trapped in a local optimal solution, a new particle is added and the trapped particle(s) can escape from the trap. The deletion of particles is also used in order to suppress excessive swarm growth. The efficiency of the algorithm is verified through basic numerical experiments.

1 Introduction

Particle swarm optimization (PSO) has been used extensively as an effective technique for solving a variety of optimization problems [1] - [6]. It goes without saying that it is hard to find a real optimal solution in practical complex problems; the PSO algorithm tries to find a solution with sufficient accuracy in practical cases. The original PSO was developed by Kennedy and Eberhart [1]. The PSO shares the best information in the swarm and is used for function optimization problems. In order to realize an effective PSO search, several interesting approaches exist, including particle dynamics by random sampling from a flexible distribution [3], an improved PSO by speciation [4], growing multiple subswarms [5], and adaptive PSO [6]. The prospective applications of the PSO are many, including the iterated prisoner's dilemma, optimizing RNA secondary structure, mobile sensor networks, and nonlinear state estimation [7] - [10]. However, the basic PSO may not find a desired solution in complex problems. For example, if one particle finds a local optimal solution, the entire swarm may be trapped in it. Once the swarm is trapped, it is difficult to escape from the trap.

This paper presents a novel version of the PSO: PSO with expand-and-reduce ability (ERPSO). When particles are trapped in a local optimal solution, a new particle is added and the particle swarm can grow.
In a swarm, particles share the information of added particles. The trapped particle(s) can escape from the trap provided the parameters are selected suitably. The deletion of particle(s) is also introduced in order to suppress excessive swarm growth. Without the deletion, the swarm grows excessively and computation overload is caused. By performing basic numerical experiments, the effectiveness of the ERPSO algorithm is verified. It should be noted that the growing and/or reducing of swarms has been a key technique in several learning algorithms, including self-organizing maps, and in practical applications [11].

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 873–881, 2008.
© Springer-Verlag Berlin Heidelberg 2008
2 Fundamental Version of PSO

As preparation, we introduce the global best version of the PSO with an elemental numerical experiment. The PSO is an optimization algorithm in which particles within the swarm learn from each other and move to become more similar to their "better" neighbors. The social structure of the PSO is determined through the formation of neighborhoods for communication. The global best (gbest) version is the fundamental one, where each particle can communicate with every other particle, forming a fully connected social network. The i-th particle at time t is characterized by its position x_i(t), and the position is updated based on its velocity v_i(t). We summarize the gbest version for finding the global minimum of a function F, where the dimensions of F and x_i are the same:

Step 1: Initialization. Let t = 0. The particle position x_i(t) is located randomly in the swarm S(t), where i = 1 ∼ N and N is the number of particles.

Step 2: Compare the value of F of each particle to its best value (pbest_i) so far: if F(x_i(t)) < pbest_i then (a) pbest_i = F(x_i(t)) and (b) x_pbest_i = x_i(t).

Step 3: Compare the value of F of each particle to the global best value: if F(x_i(t)) < gbest then (a) gbest = F(x_i(t)) and (b) x_gbest = x_i(t).

Fig. 1. Search process by the gbest version. The dot denotes a particle. N = 10 and t_max = 50. ρ_1 = ρ_2 is a uniform random number on [0, 2].

Step 4: Change the velocity vector for each particle:

v_i(t) = W v_i(t) + ρ_1 (x_pbest_i − x_i(t)) + ρ_2 (x_gbest − x_i(t)),   (1)

where W is the inertial term defined by

W = W_max − ((W_max − W_min) / t_max) × t,   (2)

and ρ_1 and ρ_2 are random variables.

Step 5: Move each particle to a new position: x_i(t) = x_i(t) + v_i(t).
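A minimal sketch of the gbest version (Steps 1-5 above, iterated as in Step 6), assuming the update rules (1)-(2); the parameter choices follow the Fig. 1 caption, and everything else (position clipping, seeding) is our own illustration:

```python
import numpy as np

def gbest_pso(F, dim, n_particles=10, t_max=50, w_max=0.9, w_min=0.4,
              lo=-50.0, hi=50.0, seed=0):
    """Global-best PSO: every particle sees the swarm-wide best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))   # Step 1: random positions
    v = np.zeros((n_particles, dim))
    pbest_val = np.full(n_particles, np.inf)
    pbest_pos = x.copy()
    gbest_val, gbest_pos = np.inf, x[0].copy()
    for t in range(t_max):
        vals = np.apply_along_axis(F, 1, x)
        improved = vals < pbest_val               # Step 2: personal bests
        pbest_val[improved] = vals[improved]
        pbest_pos[improved] = x[improved]
        if vals.min() < gbest_val:                # Step 3: global best
            gbest_val = float(vals.min())
            gbest_pos = x[vals.argmin()].copy()
        w = w_max - (w_max - w_min) / t_max * t   # Eq. (2): decreasing inertia
        rho1 = rng.uniform(0, 2, (n_particles, 1))
        rho2 = rng.uniform(0, 2, (n_particles, 1))
        v = w * v + rho1 * (pbest_pos - x) + rho2 * (gbest_pos - x)  # Eq. (1)
        x = np.clip(x + v, lo, hi)                # Step 5: move
    return gbest_val, gbest_pos

f1 = lambda p: p[0] ** 2 + p[1] ** 2              # the test problem (3)
best, pos = gbest_pso(f1, dim=2)
print(best)   # the swarm converges toward the optimum at the origin
```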
Step 6: Let t = t + 1. Go to Step 2, and repeat until t = t_max.

In order to demonstrate the performance of the PSO, we apply the gbest algorithm to the following elemental problem:

minimize f_1(x_1, x_2) = x_1² + x_2²   subject to |x_1| ≤ 50, |x_2| ≤ 50   (3)

f_1 has a unique extremum at the origin (x_1, x_2) = (0, 0), which is the optimal solution. Fig. 1 illustrates the learning process of the particles in the gbest version. We can see that the entire swarm converges to the optimal solution. The new algorithm is an improvement of this algorithm: the network has a ring structure, the number of particles is time-variant, and the global best value is replaced with the local best value within the closest neighbors. The detailed definition is given in Section 3.

3 Expand-and-Reduce Algorithm

The fundamental PSO can find the optimal solution in simple problems, as shown in the previous section. However, if the objective function has a complex shape with many local extrema, the entire swarm may be trapped in a local optimal solution. The velocity of each particle is changed by the random variables ρ_1 and ρ_2; however, it is very hard to adjust these parameters for various problems. Here we present a novel type of PSO in which the swarm can be expanded and reduced. In this expand-and-reduce particle swarm optimization (ERPSO), the particles can escape from the trap without adjusting the parameters.

In the ERPSO, the i-th particle at time t is characterized by its position x_i(t) and a counter c_i(t) that counts the time-invariance of pbest_i. The closest neighbors in the ring structure are used to update the particles, and the number of particles N is a variable depending on the values of c_i(t). We define the ERPSO for a swarm with ring structure and for the problem of finding the global minimum of a function F.

Step 1: Initialization. Let t = 0; the number of particles N and the values of the counters c_i(t) are initialized. The position x_i(t) is located randomly in the swarm.
Fig. 2. Expansion of PSO and escape from a trap

Step 2: Compare the cost of each particle with its best cost so far: if F(x_i(t)) < pbest_i then (a) pbest_i = F(x_i(t)) and (b) x_pbest_i = x_i(t). The counter value increases if the improvement of pbest_i is not sufficient:

c_i(t) = c_i(t) + 1   if |F(x_i(t)) − pbest_i| < ε,   (4)

where ε is a small value.

Step 3 (expansion): If a particle is trapped in a local or global optimum, as shown in Fig. 2 (a) and (b), the value of c_i becomes large. If the counter value exceeds a threshold, a new particle is inserted into the closest neighborhood of the trapped particle and is located away from the trap, as shown in Fig. 2 (c) and (d). In the case where the i-th particle is trapped,

x_new(t) = x_i(t) + r   if c_i(t) > T_int,   (5)

where r is a random variable; the suffixes are reassigned (j = j + 1 for i < j) and the new particle becomes x_{i+1}. Let N = N + 1, and let the counters be initialized: c_i(t) = 0 for all i. This expansion can be effective for escaping from the trap.

Step 4 (reduction): If the value of gbest(t) does not change during some time interval, starting from the time when a new particle was inserted, one of the trapped particles is removed. The reduction aims at suppressing the excessive growth that causes unnecessary computation time.
Step 5: Compare the cost of each particle with its local best (lbest) cost so far: if F(x_i(t)) < lbest_i then (a) lbest_i = F(x_i(t)) and (b) x_lbest_i = x_i(t), where lbest is taken over both closest neighbors of the i-th particle.

Step 6: Change the velocity vector for each particle:

v_i(t) = W v_i(t) + ρ_1 (x_pbest_i − x_i(t)) + ρ_2 (x_lbest_i − x_i(t)),   (6)

where W is the inertial term defined by Eq. (2) and ρ_1 and ρ_2 are random variables.

Fig. 3. Searching the minimum value of Equation (3). The red line is the solution. (a) lbest version without expand-and-reduce, N = 7. (b) ERPSO, N(0) = 3. Parameter values: T_int = 30, W_max = 0.9, W_min = 0.4, t_max = 750, ε = 10⁻³. The random variables ρ_1 = ρ_2 are given by a uniform distribution on [0, 2]. The time average of N(t) is 7.
Step 7: Move each particle to a new position: x_i(t) = x_i(t) + v_i(t).

Step 8: Let t = t + 1. Go to Step 2, and repeat until t = t_max.

4 Numerical Experiments

In order to investigate the efficiency of the ERPSO, we have performed basic numerical experiments. First, we applied the ERPSO to the problem defined by Equation (3). The result is shown in Fig. 3, where the lbest version is defined by Steps 1 to 7 without Steps 3 and 4. This result suggests that the ERPSO can find the optimal solution slightly faster than the lbest version. The efficiency of the ERPSO appears more remarkable in more complicated problems.
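The ERPSO steps above can be sketched as follows. Several details are our own assumptions, since the text does not pin them down: the insertion offset r is uniform on [−2, 2], the reduction removes the particle with the largest counter after a fixed waiting interval (`reduce_wait`), and the test function is a shifted quadratic:

```python
import numpy as np

def erpso(F, dim, n_init=3, t_max=750, w_max=0.9, w_min=0.4,
          t_int=30, eps=1e-3, reduce_wait=30, lo=0.0, hi=10.0, seed=0):
    """Expand-and-reduce PSO sketch: ring topology, lbest update,
    particle insertion on stagnation (Step 3) and deletion (Step 4)."""
    rng = np.random.default_rng(seed)
    x = [rng.uniform(lo, hi, dim) for _ in range(n_init)]
    v = [np.zeros(dim) for _ in range(n_init)]
    pb_val = [np.inf] * n_init            # pbest values
    pb_pos = [p.copy() for p in x]        # pbest positions
    cnt = [0] * n_init                    # stagnation counters c_i
    gbest, gbest_pos = np.inf, x[0].copy()
    since_insert = None                   # steps since the last insertion
    for t in range(t_max):
        vals = [float(F(p)) for p in x]
        for i in range(len(x)):
            if abs(vals[i] - pb_val[i]) < eps:   # Eq. (4): weak improvement
                cnt[i] += 1
            if vals[i] < pb_val[i]:              # Step 2: personal best
                pb_val[i], pb_pos[i] = vals[i], x[i].copy()
        if min(vals) < gbest - eps:
            gbest, gbest_pos = min(vals), x[int(np.argmin(vals))].copy()
            since_insert = None
        trapped = [i for i in range(len(x)) if cnt[i] > t_int]
        if trapped:                       # Step 3 (expansion), Eq. (5)
            i = trapped[0]
            x.insert(i + 1, np.clip(x[i] + rng.uniform(-2, 2, dim), lo, hi))
            v.insert(i + 1, np.zeros(dim))
            pb_val.insert(i + 1, np.inf)
            pb_pos.insert(i + 1, x[i + 1].copy())
            cnt = [0] * len(x)
            since_insert = 0
        elif since_insert is not None:
            since_insert += 1             # Step 4 (reduction): gbest stalled
            if since_insert > reduce_wait and len(x) > n_init:
                j = int(np.argmax(cnt))
                for lst in (x, v, pb_val, pb_pos, cnt):
                    del lst[j]
                since_insert = None
        n = len(x)
        w = w_max - (w_max - w_min) / t_max * t
        new_v = []
        for i in range(n):                # Steps 5-6: lbest ring update, Eq. (6)
            nb = [(i - 1) % n, i, (i + 1) % n]
            lb = pb_pos[min(nb, key=lambda k: pb_val[k])]
            r1, r2 = rng.uniform(0, 2), rng.uniform(0, 2)
            new_v.append(w * v[i] + r1 * (pb_pos[i] - x[i]) + r2 * (lb - x[i]))
        for i in range(n):                # Step 7: move
            v[i] = new_v[i]
            x[i] = np.clip(x[i] + v[i], lo, hi)
    return gbest, gbest_pos

f_shift = lambda p: (p[0] - 8.0) ** 2 + (p[1] - 9.0) ** 2   # toy test function
best, pos = erpso(f_shift, dim=2, t_max=300)
print(best)
```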
Fig. 4. Expand-and-reduce search process in f_2(x_1, x_2). Parameter values are defined in the Fig. 5 caption.
Fig. 5. Searching the minimum value of Equation (7). The red line is the solution. (a) lbest version without expand-and-reduce, with N = 9, which is the time average of N(t) in the ERPSO. (b) ERPSO, N(0) = 3. Parameter values: T_int = 30, W_max = 0.9, W_min = 0.4, t_max = 750, ε = 10⁻³. The random variables ρ_1 = ρ_2 are drawn from a uniform distribution on [0, 2].

Second, we have applied the ERPSO to a multi-extrema function: the modified Shekel's Foxholes function studied in [12]:

minimize f_2(x_1, x_2) = − Σ_{i=1}^{30} 1 / ((x_1 − α_i)² + (x_2 − β_i)² + γ_i)
subject to 0 ≤ x_1 ≤ 10, 0 ≤ x_2 ≤ 10   (7)

Table 1. Success rate for 1,000 trials. Parameters are as in Fig. 5.

Function  Algorithm  Success rate
f_1       PSO        100 %
          ERPSO      100 %
f_2       PSO        37.5 %
          ERPSO      56.2 %
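Using the foxhole parameters (α_i, β_i, γ_i) tabulated below, Eq. (7) can be evaluated directly; a minimal numpy sketch of ours, useful for checking the reported optimum:

```python
import numpy as np

# Foxhole parameters (alpha_i, beta_i, gamma_i), i = 1..30, as tabulated.
ALPHA = np.array([9.681, 9.400, 8.025, 2.196, 8.074, 7.650, 1.256, 8.314, 0.226, 7.305,
                  0.652, 2.699, 8.327, 2.132, 4.707, 8.304, 8.632, 4.887, 2.440, 6.306,
                  0.652, 5.558, 3.352, 8.798, 1.460, 0.432, 0.679, 4.263, 9.496, 4.138])
BETA = np.array([0.667, 2.041, 9.152, 0.415, 8.777, 5.658, 3.605, 2.261, 8.858, 2.228,
                 7.027, 3.516, 3.897, 7.006, 5.579, 7.559, 4.409, 9.112, 6.686, 8.583,
                 2.343, 1.272, 7.549, 0.880, 8.057, 8.645, 2.800, 1.074, 4.830, 2.562])
GAMMA = np.array([0.806, 0.517, 0.100, 0.908, 0.965, 0.669, 0.524, 0.902, 0.531, 0.876,
                  0.462, 0.491, 0.463, 0.714, 0.352, 0.869, 0.813, 0.811, 0.828, 0.964,
                  0.789, 0.360, 0.369, 0.992, 0.332, 0.817, 0.632, 0.883, 0.608, 0.326])

def f2(x1, x2):
    """Modified Shekel's Foxholes function, Eq. (7): 30 foxholes in [0, 10]^2."""
    return float(-np.sum(1.0 / ((x1 - ALPHA) ** 2 + (x2 - BETA) ** 2 + GAMMA)))

# The reported optimum lies near (8.02, 9.14), close to foxhole 3
# at (alpha_3, beta_3) = (8.025, 9.152), which has the smallest gamma.
print(round(f2(8.02, 9.14), 2))
```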
(α_1, ..., α_10) = (9.681, 9.400, 8.025, 2.196, 8.074, 7.650, 1.256, 8.314, 0.226, 7.305)
(α_11, ..., α_20) = (0.652, 2.699, 8.327, 2.132, 4.707, 8.304, 8.632, 4.887, 2.440, 6.306)
(α_21, ..., α_30) = (0.652, 5.558, 3.352, 8.798, 1.460, 0.432, 0.679, 4.263, 9.496, 4.138)
(β_1, ..., β_10) = (0.667, 2.041, 9.152, 0.415, 8.777, 5.658, 3.605, 2.261, 8.858, 2.228)
(β_11, ..., β_20) = (7.027, 3.516, 3.897, 7.006, 5.579, 7.559, 4.409, 9.112, 6.686, 8.583)
(β_21, ..., β_30) = (2.343, 1.272, 7.549, 0.880, 8.057, 8.645, 2.800, 1.074, 4.830, 2.562)
(γ_1, ..., γ_10) = (0.806, 0.517, 0.100, 0.908, 0.965, 0.669, 0.524, 0.902, 0.531, 0.876)
(γ_11, ..., γ_20) = (0.462, 0.491, 0.463, 0.714, 0.352, 0.869, 0.813, 0.811, 0.828, 0.964)
(γ_21, ..., γ_30) = (0.789, 0.360, 0.369, 0.992, 0.332, 0.817, 0.632, 0.883, 0.608, 0.326)

This function has many local optimal solutions, as shown in Fig. 4(a). The optimal solution is f_2(x_1, x_2) = −12.12 at (x_1, x_2) = (8.02, 9.14). A search process is illustrated in Fig. 5: we can see that the particles escape from the trap and move toward the optimal solution. We have confirmed that this optimal solution is hard to find with the fundamental gbest version. Fig. 5 and Table 1 compare the results of the ERPSO with those of the lbest version. In these results, the lbest version has difficulty finding the solution, while the ERPSO can find it. In the lbest version, the particles are almost always trapped in some local minimum.

5 Conclusions

A novel version of the PSO has been presented in this paper. In the algorithm, the expanding function is effective for escaping from a trap, and the reducing function is effective for suppressing computation cost. The efficiency of the algorithm is suggested through basic numerical experiments. This paper is a first step, and many problems remain, including the following: 1) analysis of the role of the parameters, 2) analysis of the effect of the topology of the particle network, 3) automatic adjustment of parameters, 4) application to more complicated benchmarks, and 5) application to practical problems.

References

1. Kennedy, J., Eberhart, R.: Particle Swarm Optimization. In: Proc. of IEEE/ICNN, pp. 1942–1948 (1995)
2. Engelbrecht, A.P.: Computational Intelligence, An Introduction, pp. 185–198. Wiley, Chichester (2004)
3. Richer, T.J., Blackwell, T.M.: The Levy Particle Swarm. In: Proc. Congr. Evol. Comput., pp. 3150–3157 (2006)
4. Parrott, D., Li, X.: Locating and tracking multiple dynamic optima by a particle swarm model using speciation. IEEE Trans. Evol. Comput. 10(4), 440–458 (2006)
5. Brits, R., Engelbrecht, A.P., van den Bergh, F.: A Niching Particle Swarm Optimizer. In: Proc. of SEAL, vol. 1079 (2002)
6. Hu, X., Eberhart, R.C.: Adaptive Particle Swarm Optimization: Detection and Response to Dynamic Systems. In: Proc. of IEEE/CEC, pp. 1666–1670 (2002)
7. Franken, N., Engelbrecht, A.P.: Particle swarm optimization approaches to coevolve strategies for the Iterated Prisoner's Dilemma. IEEE Trans. Evol. Comput. 9(6), 562–579 (2005)
8. Neethling, M., Engelbrecht, A.P.: Determining RNA secondary structure using set-based particle swarm optimization. In: Proc. Congr. Evol. Comput., pp. 6134–6141 (2006)
9. Jatmiko, W., Sekiyama, K., Fukuda, T.: A PSO-based mobile sensor network for odor source localization in dynamic environment: theory, simulation and measurement. In: Proc. Congr. Evol. Comput., pp. 3781–3788 (2006)
10. Tong, G., Fang, Z., Xu, X.: A particle swarm optimized particle filter for nonlinear system state estimation. In: Proc. Congr. Evol. Comput., pp. 1545–1549 (2006)
11. Oshime, T., Saito, T., Torikai, H.: ART-based parallel learning of growing SOMs and its application to TSP. In: King, I., Wang, J., Chan, L.-W., Wang, D. (eds.) ICONIP 2006. LNCS, vol. 4232, pp. 1004–1011. Springer, Heidelberg (2006)
12. Bersini, H., Dorigo, M., Langerman, S., Geront, G., Gambardella, L.: Results of the first international contest on evolutionary optimisation (1st ICEO). In: Proc. of IEEE/ICEC, pp. 611–615 (1996)