Lecture Notes in Computer Science
Neural Network Based Boolean Factor Analysis
2.4 Neural Network Based Boolean Factor Analysis

The NBFA is a powerful method for revealing the information redundancy of high-dimensional binary signals [7]. It allows every signal (vector of variables) from a binary data matrix X of observations to be expressed as a superposition of binary factors:

X = ⋁_{l=1}^{L} S_l f_l,   (4)

where S_l is a component of the factor scores, f_l is a vector of factor loadings, and ∨ denotes Boolean summation (0 ∨ 0 = 0, 1 ∨ 0 = 1, 0 ∨ 1 = 1, 1 ∨ 1 = 1). If we mark Boolean matrix multiplication by the symbol ∘, then we can express the approximation of the data matrix X in matrix notation as

X ≈ S ∘ F,   (5)

where S is the matrix of factor scores and F is the matrix of factor loadings. Boolean factor analysis implies that the components of the original signals, the factor loadings, and the factor scores are binary values. Finding the optimal decomposition (5) by brute-force search is an NP-hard problem and as such is not feasible for high-dimensional data. On the other hand, the classical linear methods cannot take into account the non-linearity of Boolean summation and are therefore inadequate for this task.

The NBFA is based on a Hopfield-like neural network [12,13]. A fully connected network of N neurons with binary activity (1 - active, 0 - inactive) is used. Each pattern X^m of the learning set is stored in the matrix of synaptic connections J according to the Hebbian rule:

J_ij = Σ_{m=1}^{M} (X_i^m − q^m)(X_j^m − q^m),   i, j = 1, ..., N, i ≠ j,   J_ii = 0,   (6)

where M is the number of patterns in the learning set and the bias q^m = Σ_{i=1}^{N} X_i^m / N is the total relative activity of the m-th pattern. This form of bias corresponds to biologically plausible global inhibition proportional to the overall neuron activity.

866 V. Snášel et al.

One special inhibitory neuron was added to the N principal neurons of the Hopfield network. This neuron was activated during the presentation of every pattern of the learning set and was connected with all the principal neurons by bidirectional connections. The patterns of the learning set are stored in the vector j of these connections according to the Hebbian rule:

j_i = Σ_{m=1}^{M} (X_i^m − q^m) = M(q_i − q),   i = 1, ..., N,   (7)

where q_i = Σ_{m=1}^{M} X_i^m / M is the mean activity of the i-th neuron in the learning set and q is the mean activity of all neurons in the learning set. We also supposed that the excitability of the introduced inhibitory neuron decreases inversely proportionally to the size of the learning set, being 1/M after all patterns are stored. In the recall stage its activity is then

A(t) = (1/M) Σ_{i=1}^{N} j_i X_i(t) = (1/M) j^T X(t),

where j^T is the transpose of j. The inhibition produced in all principal neurons of the network is given by the vector j A(t) = (1/M) j j^T X(t). Thus, the inhibition is equivalent to the subtraction of

J̃ = (1/M) j j^T = M q̃ q̃^T   (8)

from J, where q̃ is a vector with components q_i − q. Adding the inhibitory neuron is thus equivalent to replacing the ordinary connection matrix J by the matrix J′ = J − J̃.
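As a concrete illustration of the model (4)-(6), the following sketch builds a synthetic Boolean-factor dataset and stores it with the Hebbian rule. This is our own construction for illustration (numpy assumed; the sizes N, L, M are arbitrary), not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, M = 64, 8, 200                    # signal dimension, factors, observations

# Binary factor loadings: each factor switches on a few components.
F = (rng.random((L, N)) < 0.1).astype(np.uint8)    # L x N
# Binary factor scores: which factors are present in each observation.
S = (rng.random((M, L)) < 0.3).astype(np.uint8)    # M x L

# Boolean matrix product X = S ∘ F : X_mi = OR_l (S_ml AND F_li), Eq. (5).
X = (S @ F > 0).astype(np.uint8)

# Check Eq. (4): row m is the Boolean superposition of its active factors.
manual = np.zeros(N, dtype=np.uint8)
for l in range(L):
    if S[0, l]:
        manual |= F[l]
assert np.array_equal(X[0], manual)

# Hebbian storage, Eq. (6): J_ij = sum_m (X_i^m - q^m)(X_j^m - q^m), J_ii = 0,
# with q^m the relative activity of pattern m.
q = X.mean(axis=1, keepdims=True)       # M x 1
C = X - q                               # centered patterns
J = C.T @ C                             # N x N connection matrix
np.fill_diagonal(J, 0.0)
```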
To reveal factors we suggest the following two-run recall procedure. Its initialization starts with the presentation of a random initial pattern X^in with k^in = r^in N active neurons. The activity k^in is supposed to be smaller than the activity of any factor. On presentation of X^in, the network activity X evolves to some attractor. This evolution is determined by synchronous discrete-time dynamics. At each time step:

X_i(t + 1) = Θ(h_i(t) − T(t)),   i = 1, ..., N,   X_i(0) = X_i^in,   (9)

where h_i are the components of the vector of synaptic excitations

h(t) = JX(t),   (10)

Θ is the step function, and T(t) is an activation threshold. At each time step of the recall process the threshold T(t) was chosen in such a way that the level of network activity was kept constant and equal to k^in. Thus, at each time step the k^in "winners" (the neurons with the greatest synaptic excitation) were chosen, and only they were active at the next time step.

Pattern Discovery for High-Dimensional Binary Datasets 867

As shown in [12], this choice of activation threshold enables the network activity to stabilize in point attractors or in cyclic attractors of length two. The fixed level of activity at this stage of the recall process could be ensured by a biologically plausible non-linear negative feedback control accomplished by inhibitory interneurons. It is worth noting that although the convergence of synchronous network dynamics to point attractors or cyclic attractors of length two was established earlier [14], it was done for a fixed activation threshold T; for fixed network activity it was first done in [12].

When the activity stabilizes at the initial level k^in, the k^in + 1 neurons with maximal synaptic excitation are chosen for the next iteration step, and the network activity evolves to an attractor at the new activity level k^in + 1. The level of activity then increases to k^in + 2, and so on, until the number of active neurons reaches the final level k^f = r^f N, where r = k/N is the relative network activity. Thus, one trial of the recall procedure contains (r^f − r^in)N external steps and several internal steps (usually 2-3) inside each external step to reach an attractor for the given level of activity. At the end of each external step, when the network activity stabilizes at the level of k active neurons, a Lyapunov function was calculated by the formula

λ = X^T(t + 1) J X(t) / k,   (11)

where X^T(t + 1)
and X(t) are two network states in a cyclic attractor (for a point attractor, X^T(t + 1) = X(t)). The identification of factors along the trajectories of the network dynamics was based on the analysis of the change of the Lyapunov function and the activation threshold along each trajectory. In our definition of the Lyapunov function, its value gives the mean synaptic excitation of the neurons belonging to an attractor at the end of each external step.

2.5 Statistical Clustering Methods

Clustering methods help to reveal groups of black pixels, i.e., typical parts of images. However, traditional hard clustering has the drawback of producing only disjunctive clusters.

The HAA algorithm starts with each pixel in a group of its own. It then merges clusters until only one large cluster remains, which includes all pixels. The user must choose a dissimilarity or similarity measure and an agglomerative procedure. At the beginning, when each pixel represents its own cluster, the dissimilarity between two pixels is defined by the chosen dissimilarity measure. However, once several pixels have been linked together, we need a linkage or amalgamation rule to determine when two clusters are sufficiently similar to be linked together. Several linkage rules have been proposed; e.g., the distance between two clusters can be determined by the greatest distance between two pixels in the clusters (Complete Linkage method - CL), or by the average distance between all pairs of objects in the two clusters (Average Linkage Between Groups - ALBG). Hierarchical clustering is based on the proximity matrix (dissimilarities for all pairs of pixels) and is independent of the order of the pixels. For binary data, we can choose, for example, the Jaccard and Ochiai (cosine) similarity measures of two pixels. The former can be expressed as

S_J = a / (a + b + c),

where a is the number of common occurrences of ones and b + c is the number of pairs in which one value is one and the other is zero. The latter can be expressed as

S_O = a / √((a + b)(a + c)).

In the TSCA algorithm, the pixels are arranged into subclusters, known as "cluster features". These cluster features are then clustered into k groups using a traditional hierarchical clustering procedure. A cluster feature (CF) represents a set of summary statistics on a subset of the data. The algorithm consists of two phases. In the first, an initial CF tree is built (a multi-level compression of the data that tries to preserve the inherent clustering structure of the data). In the second, an arbitrary clustering algorithm is used to cluster the leaf nodes of the CF tree. A disadvantage of this method is its sensitivity to the order of the objects (pixels in our case). The log-likelihood distance measure can be used to calculate the distance between two clusters.

3 Experimental Results

For testing the above-mentioned methods, we used a generated collection of 1600 32 × 32 black-and-white images containing different combinations of horizontal and vertical lines (bars). The probabilities of the bars occurring in the images were the same and equal to 10/64, i.e., the images contain 10 bars on average. An example of several images from the generated collection is shown in Figure 3a.

For a first view of the image structures, we applied traditional cluster analysis. We clustered the 1024 (32 × 32) positions into 64 and 32 clusters. The problem with traditional cluster analysis is that it produces disjunctive clusters, which means we can find only horizontal bars and parts of vertical bars, or vice versa. We applied HAA and TSCA clustering as implemented in the SPSS system. The problem with the TSCA method is that it depends on the order of the analyzed images, so we used two different orders.

For HAA, we tried different similarity measures. We found that the linkage methods have more influence on the clustering results than the similarity measures. We used the Jaccard and Ochiai (cosine) similarity measures, which are suitable for asymmetric binary attributes. We found both to be suitable for the identification of the bars or their parts. For 64 clusters, the differences were only in a few assignments of positions by the ALBG and CL methods with the Jaccard and Ochiai measures.

The following figures illustrate the application of some of these techniques for 64 and 32 clusters. Figures 2a and 2b show the results of the ALBG method with the Jaccard measure: Figure 2a for 32 clusters and Figure 2b for 64 clusters. In the case of 32 clusters, we found 32 horizontal bars (see Figure 2c) by the TSCA method for the second order of features.

Many of the tested methods were able to generate a set of base images or factors, which should ideally record all possible bar positions. However, not all methods were truly successful in this. With the SVD, we obtain classic singular vectors, the most general being among the first. The first few are shown in Figure 3b. We can see that the bars are not separated and different shades of gray appear. The NMF methods yield different results.
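For illustration, the bar collection and a Jaccard-based pixel clustering can be reproduced in a few lines. This is a sketch under our own assumptions (scipy's average-linkage hierarchical clustering standing in for the SPSS ALBG procedure, and a smaller collection for speed), not the authors' workflow:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n_images, side = 300, 32        # 300 images keep the sketch fast; the paper uses 1600

# Each of the 64 bars (32 rows + 32 columns) occurs with probability 10/64.
images = np.zeros((n_images, side, side), dtype=bool)
for m in range(n_images):
    images[m, rng.random(side) < 10 / 64, :] = True   # horizontal bars
    images[m, :, rng.random(side) < 10 / 64] = True   # vertical bars, Boolean superposition

# Each pixel position becomes a binary vector over the image collection.
pixels = images.reshape(n_images, side * side).T      # 1024 x 300

# Jaccard distance = 1 - S_J; average linkage corresponds to ALBG.
dist = pdist(pixels, metric="jaccard")
labels = fcluster(linkage(dist, method="average"), t=64, criterion="maxclust")
print(len(np.unique(labels)))   # number of pixel clusters found (at most 64)
```

Pixels belonging to the same bar tend to co-occur across images, so they end up in the same cluster; a cluster mixing a row and a column cannot be represented, which is the disjunctivity limitation discussed above.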
The original NMF method, based on the adjustment of random matrices W and H, provides hardly recognizable images even for k = 100 and 1000 iterations (we used 100 iterations for the other experiments).
Fig. 2. (a) 64 and (b) 32 clusters of pixels by the ALBG method (Jaccard coefficient); (c) 32 clusters of pixels by the TSCA method

Fig. 3. (a) Several bars from the generated collection; (b) first 64 base images of bars for the SVD method; (c) first 64 factors for the original NMF method

Moreover, these base images still contain significant salt-and-pepper noise and have poor contrast. The factors are shown in Figure 3c. We must also note that the NMF decomposition yields slightly different results each time it is run, because the matrices are pre-filled with random values.

The GD-CLS modification of the NMF method tries to improve the decomposition by solving a constrained least squares problem. This leads to better overall quality; however, the decomposition still depends on the pre-filled random matrix H. The result is shown in Figure 4a.

The SDD method differs slightly from the previous methods, since each factor contains only the values {−1, 0, 1}. In the factors shown in Figure 4b, gray represents 0; −1 and 1 are represented by black and white, respectively. The base vectors in Figure 4b can be divided into three categories: base vectors containing only one bar; base vectors containing one horizontal and one vertical bar; and other base vectors, containing several bars and in some cases even noise.

Finally, we made a decomposition of the images into binary vectors by the NBFA method. Here the factors contain only the values {0, 1} and Boolean arithmetic is used. The factor search was performed under the assumption that the number of ones in a factor is not less
than 5 and not greater than 200. Since the images are obtained by Boolean summation of binary bars, it is not surprising that the NBFA is able to reconstruct all bars as base vectors, providing an ideal solution, as we can see in Figure 4. It is clear that the classical linear methods cannot take into account the non-linearity of Boolean summation and are therefore inadequate for the bars problem. But the linear methods are fast and well elaborated, so it was very interesting to compare the linear approach with the NBFA and to compare the results qualitatively.

Fig. 4. First 64 (a) factors for the GD-CLS NMF method and first 64 base vectors for the (b) SDD and (c) NBFA methods
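As an aside, the winner-take-all recall step of Section 2.4 (Eqs. 9-11) can be sketched as follows. The couplings here are random stand-ins rather than a Hebbian-trained J, so this only illustrates the fixed-activity dynamics, not the full factor search:

```python
import numpy as np

def recall_step(J, X, k):
    """One synchronous step (Eq. 9): keep active the k 'winners' with the
    largest synaptic excitation h = J X (Eq. 10); the threshold T(t) is
    thereby set implicitly so that exactly k neurons stay active."""
    h = J @ X
    winners = np.argsort(h)[-k:]
    X_new = np.zeros_like(X)
    X_new[winners] = 1.0
    return X_new

def lyapunov(J, X_next, X_prev, k):
    """Eq. (11): mean synaptic excitation of the attractor's neurons."""
    return float(X_next @ J @ X_prev / k)

# Toy run: random symmetric couplings, fixed activity level k.
rng = np.random.default_rng(1)
N, k = 50, 5
J = rng.standard_normal((N, N))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)

X = np.zeros(N)
X[rng.choice(N, size=k, replace=False)] = 1.0
for _ in range(20):               # iterate toward a point or length-2 cyclic attractor
    X = recall_step(J, X, k)
print(lyapunov(J, recall_step(J, X, k), X, k))
```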
In this paper, we have compared NBFA and several dimension reduction approaches - including clustering methods - on the so-called bars problem. It was shown that NBFA perfectly found the basis (factors) from which all the learning pictures can be reconstructed. First, this is because the model on which the BFA is based is the same as that used for data generation. Secondly, it is because of the robustness of the BFA implementation based on a recurrent neural network. Some experiments show that the resistance against noise is very high. We hypothesize that this is due to the self-reconstruction ability of our neural network.

While the SVD is known to provide quality eigenfaces, it is computationally expensive, and in case we only need to beat the "curse of dimensionality" by reducing the dimension, SDD may suffice. As expected, among the methods that allow direct querying in the reduced space, SVD was the slowest but most exact method. The NMF and SDD methods may also be used, but not with the L2 metric, since the distances are not preserved well enough in this case.

There are some other newly proposed methods which may be interesting for future testing, e.g. SparseMap [15]. Additionally, a faster pivot selection technique for FastMap [16] may be considered. Finally, testing the used dimension reduction methods with deviation metrics on metric structures should answer the question of projected data indexability (which is poor for SVD-reduced data).
Cluster analysis in this application is focused on finding the original factors from which the images were generated. The applied clustering methods were quite successful in finding these factors. The problem with cluster analysis is that it provides only disjunctive clusters, so only some bars or their parts were revealed. However, from the general view of the 64 clusters, it is obvious that the images are composed of vertical and horizontal bars (lines). By two-step cluster analysis, 32 horizontal lines were revealed by clustering into 32 clusters.
Acknowledgment. 1M6840070004 and partly by the Institutional Research Plan AV0Z10300504 "Computer Science for the Information Society: Models, Algorithms, Applications" and by the project 1ET100300419 of the Program Information Society of the Thematic Program II of the National Research Program of the Czech Republic.
References

1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: VLDB 1994: Proceedings of the 20th International Conference on Very Large Data Bases, pp. 487–499. Morgan Kaufmann Publishers Inc., San Francisco (1994)
2. Brin, S., Motwani, R., Ullman, J.D., Tsur, S.: Dynamic itemset counting and implication rules for market basket data. In: SIGMOD 1997: Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, pp. 255–264. ACM Press, New York (1997)
3. Spellman, P.T., Sherlock, G., Zhang, M.Q., Anders, V.I.K., Eisen, M.B., Brown, P., Botstein, D., Futcher, B.: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 9, 3273–3297 (1998)
4. Koyutürk, M., Grama, A., Ramakrishnan, N.: Nonorthogonal decomposition of binary matrices for bounded-error data compression and analysis. ACM Trans. Math. Softw. 32(1), 33–69 (2006)
5. Földiák, P.: Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics 64(2), 165–170 (1990)
6. Frolov, A., Húsek, D., Polyakov, P., Řezanková, H.: New Neural Network Based Approach Helps to Discover Hidden Russian Parliament Voting Patterns. In: International Joint Conference on Neural Networks, pp. 6518–6523. Omnipress (2006)
7. Frolov, A.A., Húsek, D., Muravjev, P., Polyakov, P.: Boolean Factor Analysis by Attractor Neural Network. IEEE Transactions on Neural Networks 18(3), 698–707 (2007)
8. Berry, M., Dumais, S., Letsche, T.: Computational Methods for Intelligent Information Access. In: Proceedings of the 1995 ACM/IEEE Supercomputing Conference, San Diego, California, USA (1995)
9. Kolda, T.G., O'Leary, D.P.: Computation and uses of the semidiscrete matrix decomposition. In: ACM Transactions on Information Processing (2000)
10. Shahnaz, F., Berry, M., Pauca, P., Plemmons, R.: Document clustering using nonnegative matrix factorization. Journal on Information Processing and Management 42, 373–386 (2006)
11. Spratling, M.W.: Learning Image Components for Object Recognition. Journal of Machine Learning Research 7, 793–815 (2006)
12. Frolov, A.A., Húsek, D., Muravjev, P.: Informational efficiency of sparsely encoded Hopfield-like autoassociative memory. Optical Memory and Neural Networks (Information Optics), 177–198 (2003)
13. Frolov, A.A., Sirota, A.M., Húsek, D., Muravjev, P.: Binary factorization in Hopfield-like neural networks: single-step approximation and computer simulations. Neural Network World, 139–152 (2004)
14. Goles-Chacc, E., Fogelman-Soulie, F.: Decreasing energy functions as a tool for studying threshold networks. Discrete Mathematics, 261–277 (1985)
15. Faloutsos, C.: Gray Codes for Partial Match and Range Queries. IEEE Transactions on Software Engineering 14(10) (1988)
16. Faloutsos, C., Lin, K.: FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets. ACM SIGMOD Record 24(2), 163–174 (1995)
Expand-and-Reduce Algorithm of Particle Swarm Optimization

Eiji Miyagawa and Toshimichi Saito

Hosei University, Koganei, Tokyo, 184-0002 Japan

Abstract. This paper presents an optimization algorithm: particle swarm optimization with expand-and-reduce ability. When particles are trapped in a local optimal solution, a new particle is added and the trapped particle(s) can escape from the trap. The deletion of particles is also used in order to suppress excessive swarm growth. The efficiency of the algorithm is verified through basic numerical experiments.

1 Introduction

Particle swarm optimization (PSO) has been used extensively as an effective technique for solving a variety of optimization problems [1] - [6]. It goes without saying that it is hard to find a real optimal solution in practical complex problems; the PSO algorithm tries to find a solution with sufficient accuracy in practical cases. The original PSO was developed by Kennedy and Eberhart [1]. The PSO shares the best information in the swarm and is used for function optimization problems. In order to realize an effective PSO search, several interesting approaches exist, including particle dynamics by random sampling from a flexible distribution [3], an improved PSO by speciation [4], growing multiple subswarms [5], and adaptive PSO [6]. The prospective applications of the PSO are many, including the iterated prisoner's dilemma, optimizing RNA secondary structure, mobile sensor networks, and nonlinear state estimation [7] - [10]. However, the basic PSO may not find a desired solution in complex problems. For example, if one particle finds a local optimal solution, the entire swarm may be trapped in it. Once the swarm is trapped, it is difficult to escape from the trap.

This paper presents a novel version of the PSO: PSO with expand-and-reduce ability (ERPSO). When particles are trapped in a local optimal solution, a new particle is added and the particle swarm can grow.
In a swarm, particles share the information of added particles. The trapped particle(s) can escape from the trap provided the parameters are selected suitably. The deletion of particle(s) is also introduced in order to suppress excessive swarm growth. Without the deletion, the swarm grows excessively and computation overload is caused. By performing basic numerical experiments, the effectiveness of the ERPSO algorithm is verified. It should be noted that the growing and/or reducing of swarms has been a key technique in several learning algorithms, including self-organizing maps, and in practical applications [11].

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 873–881, 2008.
© Springer-Verlag Berlin Heidelberg 2008
2 Fundamental Version of PSO

As preparation, we introduce the global best version of the PSO with an elemental numerical experiment. The PSO is an optimization algorithm in which particles within the swarm learn from each other and move to become more similar to their "better" neighbors. The social structure of the PSO is determined through the formation of neighborhoods for communication. The global best (gbest) version is the fundamental one, where each particle can communicate with every other particle, forming a fully connected social network. The i-th particle at time t is characterized by its position x_i(t), and the position is updated based on its velocity v_i(t). We summarize the gbest version for finding the global minimum of a function F, where the dimensions of F and x_i are the same:

Step 1: Initialization. Let t = 0. The particle position x_i(t) is located randomly in the swarm S(t), where i = 1 ∼ N and N is the number of particles.

Step 2: Compare the value of F of each particle to its best value (pbest_i) so far: if F(x_i(t)) < pbest_i then (a) pbest_i = F(x_i(t)) and (b) x_pbest_i = x_i(t).

Step 3: Compare the value of F of each particle to the global best value: if F(x_i(t)) < gbest then (a) gbest = F(x_i(t)) and (b) x_gbest = x_i(t).

Fig. 1. Search process by the gbest version. The dot denotes a particle. N = 10 and t_max = 50. ρ_1 = ρ_2 is a uniform random number on [0, 2].

Step 4: Change the velocity vector for each particle:

v_i(t) = W v_i(t) + ρ_1 (x_pbest_i − x_i(t)) + ρ_2 (x_gbest − x_i(t)),   (1)

where W is the inertial term defined by

W = W_max − ((W_max − W_min) / t_max) × t,   (2)

and ρ_1 and ρ_2 are random variables.

Step 5: Move each particle to a new position: x_i(t) = x_i(t) + v_i(t).
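A minimal sketch of the gbest version (Steps 1-5 above, iterated as in Step 6), assuming the update rules (1)-(2); the parameter choices follow the Fig. 1 caption, and everything else (position clipping, seeding) is our own illustration:

```python
import numpy as np

def gbest_pso(F, dim, n_particles=10, t_max=50, w_max=0.9, w_min=0.4,
              lo=-50.0, hi=50.0, seed=0):
    """Global-best PSO: every particle sees the swarm-wide best."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))   # Step 1: random positions
    v = np.zeros((n_particles, dim))
    pbest_val = np.full(n_particles, np.inf)
    pbest_pos = x.copy()
    gbest_val, gbest_pos = np.inf, x[0].copy()
    for t in range(t_max):
        vals = np.apply_along_axis(F, 1, x)
        improved = vals < pbest_val               # Step 2: personal bests
        pbest_val[improved] = vals[improved]
        pbest_pos[improved] = x[improved]
        if vals.min() < gbest_val:                # Step 3: global best
            gbest_val = float(vals.min())
            gbest_pos = x[vals.argmin()].copy()
        w = w_max - (w_max - w_min) / t_max * t   # Eq. (2): decreasing inertia
        rho1 = rng.uniform(0, 2, (n_particles, 1))
        rho2 = rng.uniform(0, 2, (n_particles, 1))
        v = w * v + rho1 * (pbest_pos - x) + rho2 * (gbest_pos - x)  # Eq. (1)
        x = np.clip(x + v, lo, hi)                # Step 5: move
    return gbest_val, gbest_pos

f1 = lambda p: p[0] ** 2 + p[1] ** 2              # the test problem (3)
best, pos = gbest_pso(f1, dim=2)
print(best)   # the swarm converges toward the optimum at the origin
```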
Step 6: Let t = t + 1. Go to Step 2, and repeat until t = t_max.

In order to demonstrate the performance of the PSO, we apply the gbest algorithm to the following elemental problem:

minimize f_1(x_1, x_2) = x_1² + x_2²   subject to |x_1| ≤ 50, |x_2| ≤ 50   (3)

f_1 has a unique extremum at the origin (x_1, x_2) = (0, 0), which is the optimal solution. Fig. 1 illustrates the learning process of the particles in the gbest version. We can see that the entire swarm converges to the optimal solution. The new algorithm is an improvement of this algorithm: the network has a ring structure, the number of particles is time-variant, and the global best value is replaced with the local best value within the closest neighbors. The detailed definition is given in Section 3.

3 Expand-and-Reduce Algorithm

The fundamental PSO can find the optimal solution in simple problems, as shown in the previous section. However, if the objective function has a complex shape with many local extrema, the entire swarm may be trapped in a local optimal solution. The velocity of each particle is changed by the random variables ρ_1 and ρ_2; however, it is very hard to adjust these parameters for various problems. Here we present a novel type of PSO in which the swarm can be expanded and reduced. In this expand-and-reduce particle swarm optimization (ERPSO), the particles can escape from the trap without adjusting the parameters.

In the ERPSO, the i-th particle at time t is characterized by its position x_i(t) and a counter c_i(t) that counts the time-invariance of pbest_i. The closest neighbors in the ring structure are used to update the particles, and the number of particles N is a variable depending on the values of c_i(t). We define the ERPSO for a swarm with ring structure and for the problem of finding the global minimum of a function F.

Step 1: Initialization. Let t = 0; the number of particles N and the values of the counters c_i(t) are initialized. The position x_i(t) is located randomly in the swarm.
Fig. 2. Expansion of PSO and escape from a trap

Step 2: Compare the cost of each particle with its best cost so far: if F(x_i(t)) < pbest_i then (a) pbest_i = F(x_i(t)) and (b) x_pbest_i = x_i(t). The counter value increases if the improvement of pbest_i is not sufficient:

c_i(t) = c_i(t) + 1   if |F(x_i(t)) − pbest_i| < ε,   (4)

where ε is a small value.

Step 3 (expansion): If a particle is trapped in a local or global optimum, as shown in Fig. 2 (a) and (b), the value of c_i becomes large. If the counter value exceeds a threshold, a new particle is inserted into the closest neighborhood of the trapped particle and is located away from the trap, as shown in Fig. 2 (c) and (d). In the case where the i-th particle is trapped,

x_new(t) = x_i(t) + r   if c_i(t) > T_int,   (5)

where r is a random variable; the suffixes are reassigned (j = j + 1 for i < j) and the new particle becomes x_{i+1}. Let N = N + 1, and let the counters be initialized: c_i(t) = 0 for all i. This expansion can be effective for escaping from the trap.

Step 4 (reduction): If the value of gbest(t) does not change during some time interval, starting from the time when a new particle was inserted, one of the trapped particles is removed. The reduction aims at suppressing the excessive growth that causes unnecessary computation time.
Step 5: Compare the cost of each particle with its local best (lbest) cost so far: if F(x_i(t)) < lbest_i then (a) lbest_i = F(x_i(t)) and (b) x_lbest_i = x_i(t), where lbest is taken over both closest neighbors of the i-th particle.

Step 6: Change the velocity vector for each particle:

v_i(t) = W v_i(t) + ρ_1 (x_pbest_i − x_i(t)) + ρ_2 (x_lbest_i − x_i(t)),   (6)

where W is the inertial term defined by Eq. (2) and ρ_1 and ρ_2 are random variables.

Fig. 3. Searching the minimum value of Equation (3). The red line is the solution. (a) lbest version without expand-and-reduce, N = 7. (b) ERPSO, N(0) = 3. Parameter values: T_int = 30, W_max = 0.9, W_min = 0.4, t_max = 750, ε = 10⁻³. The random variables ρ_1 = ρ_2 are given by a uniform distribution on [0, 2]. The time average of N(t) is 7.
Step 7: Move each particle to a new position: x_i(t) = x_i(t) + v_i(t).

Step 8: Let t = t + 1. Go to Step 2, and repeat until t = t_max.

4 Numerical Experiments

In order to investigate the efficiency of the ERPSO, we have performed basic numerical experiments. First, we applied the ERPSO to the problem defined by Equation (3). The result is shown in Fig. 3, where the lbest version is defined by Steps 1 to 7 without Steps 3 and 4. This result suggests that the ERPSO can find the optimal solution slightly faster than the lbest version. The efficiency of the ERPSO appears more remarkable in more complicated problems.
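The ERPSO steps above can be sketched as follows. Several details are our own assumptions, since the text does not pin them down: the insertion offset r is uniform on [−2, 2], the reduction removes the particle with the largest counter after a fixed waiting interval (`reduce_wait`), and the test function is a shifted quadratic:

```python
import numpy as np

def erpso(F, dim, n_init=3, t_max=750, w_max=0.9, w_min=0.4,
          t_int=30, eps=1e-3, reduce_wait=30, lo=0.0, hi=10.0, seed=0):
    """Expand-and-reduce PSO sketch: ring topology, lbest update,
    particle insertion on stagnation (Step 3) and deletion (Step 4)."""
    rng = np.random.default_rng(seed)
    x = [rng.uniform(lo, hi, dim) for _ in range(n_init)]
    v = [np.zeros(dim) for _ in range(n_init)]
    pb_val = [np.inf] * n_init            # pbest values
    pb_pos = [p.copy() for p in x]        # pbest positions
    cnt = [0] * n_init                    # stagnation counters c_i
    gbest, gbest_pos = np.inf, x[0].copy()
    since_insert = None                   # steps since the last insertion
    for t in range(t_max):
        vals = [float(F(p)) for p in x]
        for i in range(len(x)):
            if abs(vals[i] - pb_val[i]) < eps:   # Eq. (4): weak improvement
                cnt[i] += 1
            if vals[i] < pb_val[i]:              # Step 2: personal best
                pb_val[i], pb_pos[i] = vals[i], x[i].copy()
        if min(vals) < gbest - eps:
            gbest, gbest_pos = min(vals), x[int(np.argmin(vals))].copy()
            since_insert = None
        trapped = [i for i in range(len(x)) if cnt[i] > t_int]
        if trapped:                       # Step 3 (expansion), Eq. (5)
            i = trapped[0]
            x.insert(i + 1, np.clip(x[i] + rng.uniform(-2, 2, dim), lo, hi))
            v.insert(i + 1, np.zeros(dim))
            pb_val.insert(i + 1, np.inf)
            pb_pos.insert(i + 1, x[i + 1].copy())
            cnt = [0] * len(x)
            since_insert = 0
        elif since_insert is not None:
            since_insert += 1             # Step 4 (reduction): gbest stalled
            if since_insert > reduce_wait and len(x) > n_init:
                j = int(np.argmax(cnt))
                for lst in (x, v, pb_val, pb_pos, cnt):
                    del lst[j]
                since_insert = None
        n = len(x)
        w = w_max - (w_max - w_min) / t_max * t
        new_v = []
        for i in range(n):                # Steps 5-6: lbest ring update, Eq. (6)
            nb = [(i - 1) % n, i, (i + 1) % n]
            lb = pb_pos[min(nb, key=lambda k: pb_val[k])]
            r1, r2 = rng.uniform(0, 2), rng.uniform(0, 2)
            new_v.append(w * v[i] + r1 * (pb_pos[i] - x[i]) + r2 * (lb - x[i]))
        for i in range(n):                # Step 7: move
            v[i] = new_v[i]
            x[i] = np.clip(x[i] + v[i], lo, hi)
    return gbest, gbest_pos

f_shift = lambda p: (p[0] - 8.0) ** 2 + (p[1] - 9.0) ** 2   # toy test function
best, pos = erpso(f_shift, dim=2, t_max=300)
print(best)
```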
Fig. 4. Expand-and-reduce search process in f_2(x_1, x_2). Parameter values are defined in the Fig. 5 caption.
Fig. 5. Searching the minimum value of Equation (7). The red line is the solution. (a) lbest version without expand-and-reduce, with N = 9, which is the time average of N(t) in the ERPSO. (b) ERPSO, N(0) = 3. Parameter values: T_int = 30, W_max = 0.9, W_min = 0.4, t_max = 750, ε = 10⁻³. The random variables ρ_1 = ρ_2 are drawn from a uniform distribution on [0, 2].

Second, we have applied the ERPSO to a multi-extrema function: the modified Shekel's Foxholes function studied in [12]:

minimize f_2(x_1, x_2) = − Σ_{i=1}^{30} 1 / ((x_1 − α_i)² + (x_2 − β_i)² + γ_i)
subject to 0 ≤ x_1 ≤ 10, 0 ≤ x_2 ≤ 10   (7)

Table 1. Success rate for 1,000 trials. Parameters are as in Fig. 5.

Function  Algorithm  Success rate
f_1       PSO        100 %
          ERPSO      100 %
f_2       PSO        37.5 %
          ERPSO      56.2 %
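Using the foxhole parameters (α_i, β_i, γ_i) tabulated below, Eq. (7) can be evaluated directly; a minimal numpy sketch of ours, useful for checking the reported optimum:

```python
import numpy as np

# Foxhole parameters (alpha_i, beta_i, gamma_i), i = 1..30, as tabulated.
ALPHA = np.array([9.681, 9.400, 8.025, 2.196, 8.074, 7.650, 1.256, 8.314, 0.226, 7.305,
                  0.652, 2.699, 8.327, 2.132, 4.707, 8.304, 8.632, 4.887, 2.440, 6.306,
                  0.652, 5.558, 3.352, 8.798, 1.460, 0.432, 0.679, 4.263, 9.496, 4.138])
BETA = np.array([0.667, 2.041, 9.152, 0.415, 8.777, 5.658, 3.605, 2.261, 8.858, 2.228,
                 7.027, 3.516, 3.897, 7.006, 5.579, 7.559, 4.409, 9.112, 6.686, 8.583,
                 2.343, 1.272, 7.549, 0.880, 8.057, 8.645, 2.800, 1.074, 4.830, 2.562])
GAMMA = np.array([0.806, 0.517, 0.100, 0.908, 0.965, 0.669, 0.524, 0.902, 0.531, 0.876,
                  0.462, 0.491, 0.463, 0.714, 0.352, 0.869, 0.813, 0.811, 0.828, 0.964,
                  0.789, 0.360, 0.369, 0.992, 0.332, 0.817, 0.632, 0.883, 0.608, 0.326])

def f2(x1, x2):
    """Modified Shekel's Foxholes function, Eq. (7): 30 foxholes in [0, 10]^2."""
    return float(-np.sum(1.0 / ((x1 - ALPHA) ** 2 + (x2 - BETA) ** 2 + GAMMA)))

# The reported optimum lies near (8.02, 9.14), close to foxhole 3
# at (alpha_3, beta_3) = (8.025, 9.152), which has the smallest gamma.
print(round(f2(8.02, 9.14), 2))
```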
(α_1, ..., α_10) = (9.681, 9.400, 8.025, 2.196, 8.074, 7.650, 1.256, 8.314, 0.226, 7.305)
(α_11, ..., α_20) = (0.652, 2.699, 8.327, 2.132, 4.707, 8.304, 8.632, 4.887, 2.440, 6.306)
(α_21, ..., α_30) = (0.652, 5.558, 3.352, 8.798, 1.460, 0.432, 0.679, 4.263, 9.496, 4.138)
(β_1, ..., β_10) = (0.667, 2.041, 9.152, 0.415, 8.777, 5.658, 3.605, 2.261, 8.858, 2.228)
(β_11, ..., β_20) = (7.027, 3.516, 3.897, 7.006, 5.579, 7.559, 4.409, 9.112, 6.686, 8.583)
(β_21, ..., β_30) = (2.343, 1.272, 7.549, 0.880, 8.057, 8.645, 2.800, 1.074, 4.830, 2.562)
(γ_1, ..., γ_10) = (0.806, 0.517, 0.100, 0.908, 0.965, 0.669, 0.524, 0.902, 0.531, 0.876)
(γ_11, ..., γ_20) = (0.462, 0.491, 0.463, 0.714, 0.352, 0.869, 0.813, 0.811, 0.828, 0.964)
(γ_21, ..., γ_30) = (0.789, 0.360, 0.369, 0.992, 0.332, 0.817, 0.632, 0.883, 0.608, 0.326)

This function has many local optimal solutions, as shown in Fig. 4(a). The optimal solution is f_2(x_1, x_2) = −12.12 at (x_1, x_2) = (8.02, 9.14). A search process is illustrated in Fig. 5: we can see that the particles escape from the trap and move toward the optimal solution. We have confirmed that this optimal solution is hard to find with the fundamental gbest version. Fig. 5 and Table 1 compare the results of the ERPSO with those of the lbest version. In these results, the lbest version has difficulty finding the solution, while the ERPSO can find it. In the lbest version, the particles are almost always trapped in some local minimum.

5 Conclusions

A novel version of the PSO has been presented in this paper. In the algorithm, the expanding function is effective for escaping from a trap, and the reducing function is effective for suppressing computation cost. The efficiency of the algorithm is suggested through basic numerical experiments. This paper is a first step, and many problems remain, including the following: 1) analysis of the role of the parameters, 2) analysis of the effect of the topology of the particle network, 3) automatic adjustment of parameters, 4) application to more complicated benchmarks, and 5) application to practical problems.

References

1. Kennedy, J., Eberhart, R.: Particle Swarm Optimization. In: Proc. of IEEE/ICNN, pp. 1942–1948 (1995)
2. Engelbrecht, A.P.: Computational Intelligence, An Introduction, pp. 185–198. Wiley, Chichester (2004)
3. Richer, T.J., Blackwell, T.M.: The Levy Particle Swarm. In: Proc. Congr. Evol. Comput., pp. 3150–3157 (2006)
4. Parrott, D., Li, X.: Locating and tracking multiple dynamic optima by a particle swarm model using speciation. IEEE Trans. Evol. Comput. 10(4), 440–458 (2006)
5. Brits, R., Engelbrecht, A.P., van den Bergh, F.: A Niching Particle Swarm Optimizer. In: Proc. of SEAL, vol. 1079 (2002)
6. Hu, X., Eberhart, R.C.: Adaptive Particle Swarm Optimization: Detection and Response to Dynamic Systems. In: Proc. of IEEE/CEC, pp. 1666–1670 (2002)
7. Franken, N., Engelbrecht, A.P.: Particle swarm optimization approaches to coevolve strategies for the Iterated Prisoner's Dilemma. IEEE Trans. Evol. Comput. 9(6), 562–579 (2005)
8. Neethling, M., Engelbrecht, A.P.: Determining RNA secondary structure using set-based particle swarm optimization. In: Proc. Congr. Evol. Comput., pp. 6134–6141 (2006)
9. Jatmiko, W., Sekiyama, K., Fukuda, T.: A PSO-based mobile sensor network for odor source localization in dynamic environment: theory, simulation and measurement. In: Proc. Congr. Evol. Comput., pp. 3781–3788 (2006)
10. Tong, G., Fang, Z., Xu, X.: A particle swarm optimized particle filter for nonlinear system state estimation. In: Proc. Congr. Evol. Comput., pp. 1545–1549 (2006)
11. Oshime, T., Saito, T., Torikai, H.: ART-based parallel learning of growing SOMs and its application to TSP. In: King, I., Wang, J., Chan, L.-W., Wang, D. (eds.) ICONIP 2006. LNCS, vol. 4232, pp. 1004–1011. Springer, Heidelberg (2006)
12. Bersini, H., Dorigo, M., Langerman, S., Geront, G., Gambardella, L.: Results of the first international contest on evolutionary optimisation (1st ICEO). In: Proc. of IEEE/ICEC, pp. 611–615 (1996)