Lecture Notes in Computer Science
[Figure] Fig. 5. Distribution of features for the Majority data set as selected by TFS (axes: features vs. frequencies).

relevant ones) are added. Ten different data samples of 1000 examples each are generated with n = 21. The ground truth for this data set is: 8 relevant features (1-8), 8 irrelevant (9-16) and 5 redundant (17-21). We were interested in analyzing the frequency distribution of the features selected by TFS according to their type. The results are shown in Fig. 5: it is remarkable that TFS gives priority to all the relevant features and rejects all the irrelevant ones. Redundant features are sometimes allowed, compensating for the absence of some relevant ones. Average performance as given by 1NN is 0.95. This figure should be compared to the performance of 0.77 obtained with the full feature set.

6 Conclusions

An algorithm for feature selection based on simulated annealing has been introduced. A notable characteristic, compared with other search algorithms for this task, is its capability to momentarily accept worse solutions. The algorithm has been evaluated against the Sequential Forward Floating Search (SFFS), one of the most reliable algorithms for moderate-sized problems. In comparative experiments with SFFS (using a number of inducers as wrappers), the feature selection process shows remarkable results, superior in all cases to the full feature set and
692 F.F. González and L.A. Belanche

substantially better than those achieved by SFFS alone. The proposed algorithm finds higher-evaluating solutions, both when their size is bigger and when it is smaller than those found by SFFS, and offers a solid and reliable framework for feature subset selection tasks. As future work, we plan to use the concept of thermostatistical persistency [13] to improve the algorithm while reducing computational cost.

References

1. Yang, J., Honavar, V.: Feature Subset Selection Using a Genetic Algorithm. In: Motoda, H., Liu, H. (eds.) Feature Extraction, Construction, and Subset Selection: A Data Mining Perspective. Kluwer, New York (1998)
2. Schlapbach, A., Kilchherr, V., Bunke, H.: Improving Writer Identification by Means of Feature Selection and Extraction. In: 8th Int. Conf. on Document Analysis and Recognition, pp. 131–135 (2005)
3. Debuse, J.C., Rayward-Smith, V.: Feature Subset Selection within a Simulated Annealing Data Mining Algorithm. J. of Intel. Inform. Systems 9, 57–81 (1997)
4. Pudil, P., Ferri, F., Novovicova, J., Kittler, J.: Floating search methods for feature selection. Pattern Recognition Letters 15(11), 1119–1125 (1994)
5. Jain, A., Zongker, D.: Feature selection: Evaluation, application, and small sample performance. IEEE Trans. Pattern Anal. Mach. Intell. 19(2), 153–158 (1997)
6. Kudo, M., Somol, P., Pudil, P., Shimbo, M., Sklansky, J.: Comparison of classifier-specific feature selection algorithms. In: Procs. of the Joint IAPR Intl. Workshop on Advances in Pattern Recognition, pp. 677–686 (2000)
7. Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E.: Equations of state calculations by fast computing machines. J. of Chem. Phys. 21 (1953)
8. Kirkpatrick, S.: Optimization by simulated annealing: Quantitative studies. Journal of Statistical Physics 34 (1984)
9. Bishop, C.: Neural networks for pattern recognition. Oxford Press, Oxford (1996)
10. Reeves, C.R.: Modern Heuristic Techniques for Combinatorial Problems. McGraw Hill, New York (1995)
11. Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence 97(1-2), 273–324 (1997)
12. Duda, R.O., Hart, P., Stork, G.: Pattern Classification. Wiley & Sons, Chichester (2001)
13. Chardaire, P., Lutton, J.L., Sutter, A.: Thermostatistical persistency: A powerful improving concept for simulated annealing algorithms. European Journal of Operational Research 86(3), 565–579 (1995)

Solvable Performances of Optimization Neural Networks with Chaotic Noise and Stochastic Noise with Negative Autocorrelation

Mikio Hasegawa 1 and Ken Umeno 2,3
1 Tokyo University of Science, Tokyo 102-0073, Japan
2 ChaosWare Inc., Koganei-shi, 183-8795, Japan
3 NICT, Koganei-shi, 183-8795, Japan

Abstract. Adding chaotic sequences to a neural network that solves combinatorial optimization problems improves its performance substantially more than adding random number sequences does. A previous study has already shown that a specific autocorrelation of the chaotic noise has a positive effect on this high performance. The autocorrelation of such effective chaotic noise takes a negative value at lag 1 and decreases with damped oscillation as the lag increases. In this paper, we generate a stochastic noise whose autocorrelation is C(τ) ≈ C × (−r)^τ, similar to the effective chaotic noise, and evaluate the performance of the neural network with this stochastic noise. First, we show that the appropriate amplitude of the additive noise changes depending on the negative autocorrelation parameter r. We also show that the performance with negative autocorrelation noise is better than with white Gaussian noise or positive autocorrelation noise, and almost the same as with chaotic noise. Based on these results, the high solvable performance of additive chaotic noise can be attributed to its negative autocorrelation.
1 Introduction

Chaotic dynamics have been shown to be effective for combinatorial optimization problems in many studies [1,2,3,4,5,6,7,8,9]. In one of these approaches, which adds chaotic sequences to each neuron of a mutually connected neural network solving an optimization problem in order to avoid the local minimum problem, it has been clearly shown that chaotic noise is more effective than random noise for improving the performance [8,9]. To realize higher performance using chaotic dynamics, chaos has also been utilized in optimization methods with heuristic algorithms that are applicable to large-scale problems [3], and this approach has been shown to be more effective than tabu search and simulated annealing [5,6]. In experimental analyses of the chaotic search, several factors enabling high performance were found. One is that chaotic dynamics close to the edge of chaos give high performance [7]. Another is that a specific autocorrelation function of the chaotic dynamics has a positive effect [9] in the approach that adds chaotic sequences as additive noise to a mutually connected neural network.

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 693–702, 2008. © Springer-Verlag Berlin Heidelberg 2008

As another application of chaotic dynamics, chaos has been applied to CDMA communication systems [10,11,12]. For the spreading code, the cross-correlation between code sequences should be as small as possible to reduce the bit error rate caused by mutual interference. In the chaotic CDMA approach, it has been shown that chaotic sequences, which have a negative autocorrelation at lag 1, make the cross-correlation smaller [11]. For generating such an optimum code sequence, whose autocorrelation is C(τ) ≈ C × (−r)^τ, an FIR filter has also been proposed, and the advantages of such code sequences have been shown experimentally [12].
In this paper, we analyze the effects of chaotic noise on combinatorial optimization by introducing stochastic noise whose autocorrelation is C(τ) ≈ C × (−r)^τ, the same as the sequences utilized in chaotic CDMA [12], because the effective chaotic noise shown in Ref. [9] has a similar autocorrelation. We evaluate the performance while varying the variance and the parameters of the chaotic and stochastic noise, and investigate which factor makes chaotic noise beneficial for combinatorial optimization problems.

2 Optimization by Neural Networks with Chaotic Noise

We use Traveling Salesman Problems (TSPs) and Quadratic Assignment Problems (QAPs) to evaluate the performance of each type of noise. In this paper, the Hopfield-Tank neural network [13] is the basis of the solution search. It is well known that the energy function

E(t) = -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} \sum_{l=1}^{n} w_{ikjl}\, x_{ik}(t)\, x_{jl}(t) + \sum_{i=1}^{n} \sum_{k=1}^{n} \theta_{ik}\, x_{ik}(t)    (1)

always decreases when the neurons are updated by the following equation:

x_{ik}(t+1) = f\Big[ \sum_{j=1}^{n} \sum_{l=1}^{n} w_{ikjl}\, x_{jl}(t) + \theta_{ik} \Big],    (2)

where x_{ik}(t) is the output of the (i,k)-th neuron at time t, w_{ikjl} is the connection weight between the (i,k)-th and (j,l)-th neurons, and θ_{ik} is the threshold of the (i,k)-th neuron. Because the original Hopfield-Tank neural network stops searching at a local minimum, chaotic noise and other random dynamics have been added to the neurons to avoid trapping in such undesirable states and to achieve much higher performance. As in many conventional studies, the energy function for solving the TSPs [13,1,2,8,9] can be defined by the following equation:

E_{tsp} = A\Big[\Big\{\sum_{i=1}^{N}\Big(\sum_{k=1}^{N} x_{ik}(t) - 1\Big)^{2}\Big\} + \Big\{\sum_{k=1}^{N}\Big(\sum_{i=1}^{N} x_{ik}(t) - 1\Big)^{2}\Big\}\Big] + B \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{k=1}^{N} d_{ij}\, x_{ik}(t)\, \{x_{j,k+1}(t) + x_{j,k-1}(t)\},    (3)
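As a quick concreteness check of Eq. (3), the sketch below evaluates E_tsp on a small hypothetical 4-city instance (the distance matrix and tours are made up for illustration): for a valid permutation matrix the constraint terms vanish, and the objective term counts each tour edge twice.

```python
import numpy as np

# Hypothetical 4-city instance (distances made up for illustration).
N = 4
D = np.array([[0., 1., 2., 1.],
              [1., 0., 1., 2.],
              [2., 1., 0., 1.],
              [1., 2., 1., 0.]])

def tsp_energy(x, D, A=1.0, B=1.0):
    """E_tsp of Eq. (3); x[i, k] = 1 iff city i is visited at step k."""
    N = len(D)
    rows = ((x.sum(axis=1) - 1) ** 2).sum()        # each city exactly once
    cols = ((x.sum(axis=0) - 1) ** 2).sum()        # one city per step
    tour = sum(D[i, j] * x[i, k] * (x[j, (k + 1) % N] + x[j, (k - 1) % N])
               for i in range(N) for j in range(N) for k in range(N))
    return A * (rows + cols) + B * tour

x = np.eye(N)                 # tour 0 -> 1 -> 2 -> 3 -> 0, length 4
print(tsp_energy(x, D))       # 8.0: constraints = 0, objective = 2 * tour length
```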
[Figure] Fig. 1. Solvable performance of chaotic noise (a = 3.82, 3.92, 3.95) and white Gaussian noise on a 20-city TSP (axes: noise amplitude β vs. solvable performance (%)).

where N is the number of cities, d_{ij} is the distance between cities i and j, and A and B are the weights of the constraint term (formation of a closed tour) and the objective term (minimization of total tour length), respectively. From Eqs. (1) and (3), the connection weights w_{ikjl} and the thresholds θ_{ik} can be obtained as follows:

w_{ikjl} = -A\{\delta_{ij}(1 - \delta_{kl}) + \delta_{kl}(1 - \delta_{ij})\} - B\, d_{ij}\,(\delta_{l,k+1} + \delta_{l,k-1}),    (4)

\theta_{ik} = 2A,    (5)

where δ_{ij} = 1 if i = j and δ_{ij} = 0 otherwise. For the QAPs, whose objective function is

F(p) = \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij}\, b_{p(i)p(j)},    (6)

we use the following energy function:

E_{qap} = A\Big[\Big\{\sum_{i=1}^{N}\Big(\sum_{k=1}^{N} x_{ik}(t) - 1\Big)^{2}\Big\} + \Big\{\sum_{k=1}^{N}\Big(\sum_{i=1}^{N} x_{ik}(t) - 1\Big)^{2}\Big\}\Big] + B \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{k=1}^{N} \sum_{l=1}^{N} a_{ij}\, b_{kl}\, x_{ik}(t)\, x_{jl}(t).    (7)

From Eqs. (1) and (7), the connection weights and thresholds for the QAPs are obtained as follows:

w_{ikjl} = -A\{\delta_{ij}(1 - \delta_{kl}) + \delta_{kl}(1 - \delta_{ij})\} - B\, a_{ij}\, b_{kl},    (8)

\theta_{ik} = 2A.    (9)
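The weight construction of Eqs. (4)-(5) can be sketched directly (same made-up 4-city distances as in the earlier example). The assertion confirms that the tensor is symmetric under exchanging the neuron pairs (i,k) and (j,l), which the energy function (1) requires.

```python
import numpy as np

N, A, B = 4, 1.0, 1.0
D = np.array([[0., 1., 2., 1.],
              [1., 0., 1., 2.],
              [2., 1., 0., 1.],
              [1., 2., 1., 0.]])

delta = lambda a, b: 1.0 if a == b else 0.0        # Kronecker delta

w = np.zeros((N, N, N, N))                         # w[i, k, j, l], Eq. (4)
for i in range(N):
    for k in range(N):
        for j in range(N):
            for l in range(N):
                w[i, k, j, l] = (-A * (delta(i, j) * (1 - delta(k, l))
                                       + delta(k, l) * (1 - delta(i, j)))
                                 - B * D[i, j] * (delta(l, (k + 1) % N)
                                                  + delta(l, (k - 1) % N)))

theta = 2 * A * np.ones((N, N))                    # Eq. (5)

# Symmetry w_{ikjl} = w_{jlik}; no self-excitation since d_ii = 0.
assert np.allclose(w, w.transpose(2, 3, 0, 1))
```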
[Figure] Fig. 2. Solvable performance of chaotic noise (a = 3.82, 3.92, 3.95) and white Gaussian noise on a QAP with 12 nodes (Nug12) (axes: noise amplitude β vs. solvable performance (%)).

In order to introduce additive noise into the update equation of each neuron, we use the following equation:

x_{ik}(t+1) = f\Big[ \sum_{j=1}^{N} \sum_{l=1}^{N} w_{ikjl}\, x_{jl}(t) + \theta_{ik} + \beta z_{ik}(t) \Big],    (10)

where z_{ik}(t) is a noise sequence added to the (i,k)-th neuron, β is the amplitude of the noise, and f is the sigmoidal output function, f(y) = 1/(1 + exp(−y/ε)).
The noise sequence introduced as z_{ik}(t) is normalized to zero mean and unit variance. In Figs. 1 and 2, we compare the solvable performances of the above neural networks using, as the additive noise sequence z_{ik}(t), logistic map chaos, z^c_{ik}(t+1) = a z^c_{ik}(t)(1 − z^c_{ik}(t)), and white Gaussian noise. For the chaotic noise, we used 3.82, 3.92, and 3.95 for the parameter a. The abscissa is the amplitude of the noise, β in Eq. (10). The solvable performance on the ordinate is defined as the percentage of runs obtaining the optimum solution among 1000 runs with different initial conditions; a run is counted as successful if it hits the optimum solution state at least once within a fixed number of iterations. For the problems introduced in this paper, the exact optimum solution of each is known, obtained by an algorithm which guarantees exactness of the solution but requires a much larger amount of computation time. In this paper, the solvable performances of each type of noise are evaluated using a small, fixed computational budget. We set the cutoff time longer for the QAP because it is more difficult than the TSP and includes it as a sub-problem. The cutoff times for each run are set to 1024 iterations for the TSP and 4096 iterations for the QAP, respectively. The parameters of the neural network are A = 1, B = 1, ε = 0.3 for the TSP, and A = 0.35, B = 0.2, ε = 0.075 for the QAP, respectively. The problems introduced in this paper are a 20-city TSP from [9] and a QAP with 12 nodes, Nug12, from QAPLIB [14].
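The chaotic-noise preparation described above can be sketched as follows (the sequence length and the initial value 0.3 are arbitrary illustrative choices); the sequence is normalized to zero mean and unit variance before being injected as z_{ik}(t) in Eq. (10):

```python
import numpy as np

a, T = 3.95, 5000                 # logistic map parameter, sequence length
z = np.empty(T)
z[0] = 0.3                        # arbitrary initial condition in (0, 1)
for t in range(T - 1):
    z[t + 1] = a * z[t] * (1 - z[t])      # z(t+1) = a z(t)(1 - z(t))

# Normalize to zero mean, unit variance before adding to the neurons (Eq. (10)).
z = (z - z.mean()) / z.std()
print(z.mean(), z.std())          # ~0.0 and 1.0
```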
[Figure] Fig. 3. Solvable performance of logistic map chaos with different noise amplitudes β (= 0.325, 0.375, 0.425) on a 20-city TSP (axes: logistic map parameter a vs. solvable performance (%); the top panel plots z(t+1) = a z(t)(1 − z(t)) against a).

The results in both Figs. 1 and 2 show that the neural network with chaotic noise performs better than that with white Gaussian noise, comparing the best solvable performance of each noise. The results also show that the noise amplitude giving the highest solvable performance differs among the noise sequences. This is not due to differences in the variances of the original noise sequences, because each sequence is normalized before being added to the neurons as z_{ik}(t), as described above. Figure 3 shows the solvable performance of the neural network with logistic map chaos for different noise amplitudes β. From the results, it can be seen that the value of the parameter a giving high performance changes depending on the noise amplitude. It is obvious that the temporal structure of the chaotic time series changes depending on a. Therefore, both the amplitude and the autocorrelation should be investigated at the same time to analyze the effects of the additive noise.
[Figure] Fig. 4. Autocorrelation coefficients (lags τ = 0–10) of the chaotic noise that gives high solvable performance (a = 3.82, 3.92, 3.95) and of white Gaussian noise.

3 Negative Autocorrelation Noise

The autocorrelation coefficients of the chaotic sequences used for the results of Figs. 1 and 2 are shown in Fig. 4. The figure shows that the autocorrelation of the effective chaotic sequences takes a negative value at lag 1. In other research using chaos, chaotic CDMA utilizes similar sequences whose autocorrelation is C(τ) ≈ C × (−r)^τ [12]. Such an autocorrelation has been shown to lead to small cross-correlation among the sequences; therefore, chaotic codes having this autocorrelation are effective for reducing bit error rates caused by mutual interference. In this paper, we introduce stochastic noise having the autocorrelation C(τ) ≈ C × (−r)^τ to evaluate the effect of negative autocorrelation on the solvable performance of the neural networks, because it is similar to the effective autocorrelation function of the chaotic noise shown in Fig. 4 and makes it possible to isolate the effect of the negatively damped oscillation in the autocorrelation.

We generate a stochastic sequence with autocorrelation C(τ) by the following procedure. First, a target autocorrelation sequence C(τ) is generated. By applying the FFT to C(τ), its power spectrum is obtained. Only the phase of the spectrum is randomized, without changing the power spectrum. Then a stochastic sequence whose autocorrelation is C(τ) is generated by applying the IFFT to the phase-randomized spectrum. When we apply the generated noise sequence to the neural network as z_{ik}(t) in Eq. (10), the sequence is normalized to zero mean and unit variance, as already described. We evaluate the solvable performance of this stochastic noise while changing the autocorrelation parameter r and the noise amplitude β. Figures 5 and 6 show the results for the TSP and the QAP solved by the neural networks with stochastic noise whose autocorrelation is C(τ) ≈ C × (−r)^τ; β and the negative correlation parameter r are varied, and the lower panels (2) show another view, along the r axis.
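The phase-randomization procedure just described can be sketched as follows (the length N, parameter r, and seed are arbitrary choices). Because only the phases are randomized, the circular autocorrelation of the output matches the target C(τ) = (−r)^τ almost exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 4096, 0.5

# 1. Target circular autocorrelation C(tau) = (-r)^|tau| (damped oscillation).
d = np.minimum(np.arange(N), N - np.arange(N))
c = np.power(-r, d)

# 2. Its power spectrum (real and non-negative for |r| < 1; clip guards rounding).
P = np.clip(np.fft.rfft(c).real, 0.0, None)

# 3. Randomize only the phases, keeping the power spectrum unchanged.
phase = rng.uniform(0.0, 2.0 * np.pi, P.size)
phase[0] = phase[-1] = 0.0          # DC and Nyquist bins must stay real
X = np.sqrt(P) * np.exp(1j * phase)

# 4. Inverse FFT gives the noise; normalize to zero mean and unit variance.
z = np.fft.irfft(X, n=N)
z = (z - z.mean()) / z.std()

acf1 = np.mean(z * np.roll(z, 1))   # circular lag-1 autocorrelation, ~ -r
acf2 = np.mean(z * np.roll(z, 2))   # ~ r**2
print(acf1, acf2)
```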
[Figure] Fig. 5. Solvable performance of a neural network with stochastic additive noise whose autocorrelation is C(τ) ≈ C × (−r)^τ, on a 20-city TSP, as the noise amplitude β and the negative autocorrelation parameter r are varied; panel (2) shows the view along the r axis.

Figures 5 (1) and 6 (1) demonstrate that the amplitude β giving the best solvable performance varies depending on the negative autocorrelation parameter r; the best β increases as r increases. In conventional stochastic resonance, only the amplitude of the noise has been considered important for the phenomenon, and temporal structure such as autocorrelation has received little attention. However, the results in Figs. 5 (1) and 6 (1) suggest that the temporal structure of the additive noise is also important when analyzing its effects.

Figures 5 (2) and 6 (2) clearly demonstrate that positive r (negative autocorrelation) induces higher performance. Negatively autocorrelated, oscillating noise, corresponding to r > 0, performs better than the white Gaussian noise corresponding to r = 0 and the positively autocorrelated noise corresponding to r < 0. Comparing the best results of Figs. 5 and 6 with the results in Figs. 1 and 2, we can see that the stochastic noise with negative autocorrelation
[Figure] Fig. 6. Solvable performance of a neural network with stochastic additive noise whose autocorrelation is C(τ) ≈ C × (−r)^τ, on a QAP with 12 nodes (Nug12), as the noise amplitude β and the negative autocorrelation parameter r are varied; panel (2) shows the view along the r axis.

has almost the same high performance as the chaotic noise. As shown in Fig. 4, the chaotic noise also has a negative autocorrelation with damped oscillation, similar to the introduced stochastic noise with C(τ) ≈ C × (−r)^τ. Based on these results, we conclude that the effectiveness of chaotic dynamics as additive sequences for the optimization neural network is due to its negative autocorrelation.

4 Conclusion

Chaotic noise has been shown to be more effective than white Gaussian noise in conventional chaotic-noise approaches to combinatorial optimization problems; we reproduced this in Figs. 1 and 2. In a previous study, specific characteristics of the autocorrelation function of chaotic noise were shown to be effective by the method of surrogate data [9].
To investigate the essential effect of the additive noise, we introduced stochastic noise whose autocorrelation is C(τ) ≈ C × (−r)^τ, because the chaotic sequences effective for optimization have a similar autocorrelation with negative damped oscillation. Using this noise, we showed that the solvable performance depends not only on the noise amplitude but also on the negative autocorrelation parameter r; the best amplitude of the additive noise varies depending on the autocorrelation of the noise. We also showed that noise whose autocorrelation takes negative values with damped oscillation is more effective than white Gaussian noise and positively autocorrelated noise. Moreover, such stochastic noise leads to almost the same high solvable performance as the chaotic noise. From these results, we conclude that the essential effect of the specific autocorrelation of chaotic noise shown in Ref. [9] is this negative autocorrelation with damped oscillation.

In chaotic CDMA, noise sequences whose autocorrelation takes negative values with damped oscillation have been shown to be effective for minimizing the cross-correlation among the sequences. This paper showed that adding such a set of noise sequences, whose mutual cross-correlation is low, leads to high performance of the neural network. Clarifying how this cross-correlation property of the noise relates to solvable performance is important future work. In this paper, we dealt with just one approach to chaotic optimization, based on mutually connected neural networks. However, this approach has poor performance compared with heuristic approaches using chaos [3,5,6]. It is therefore also important future work to design a new chaotic algorithm within a heuristic approach, using the obtained results on the effective features of chaotic dynamics.

References

1.
Nozawa, H.: A neural network model as a globally coupled map and applications based on chaos. Chaos 2, 377–386 (1992)
2. Chen, L., Aihara, K.: Chaotic Simulated Annealing by a Neural Network Model with Transient Chaos. Neural Networks 8(6), 915–930 (1995)
3. Hasegawa, M., Ikeguchi, T., Aihara, K.: Combination of Chaotic Neurodynamics with the 2-opt Algorithm to Solve Traveling Salesman Problems. Physical Review Letters 79(12), 2344–2347 (1997)
4. Ishii, S., Niitsuma, H.: Lambda-opt neural approaches to quadratic assignment problems. Neural Computation 12(9), 2209–2225 (2000)
5. Hasegawa, M., Ikeguchi, T., Aihara, K., Itoh, K.: A Novel Chaotic Search for Quadratic Assignment Problems. European Journal of Operational Research 139, 543–556 (2002)
6. Hasegawa, M., Ikeguchi, T., Aihara, K.: Solving Large Scale Traveling Salesman Problems by Chaotic Neurodynamics. Neural Networks 15, 271–283 (2002)
7. Hasegawa, M., Ikeguchi, T., Matozaki, T., Aihara, K.: Solving Combinatorial Optimization Problems by Nonlinear Neural Dynamics. In: Proc. of IEEE International Conference on Neural Networks, pp. 3140–3145 (1995)
8. Hayakawa, Y., Marumoto, A., Sawada, Y.: Effects of the Chaotic Noise on the Performance of a Neural Network Model for Optimization Problems. Physical Review E 51(4), 2693–2696 (1995)
9. Hasegawa, M., Ikeguchi, T., Matozaki, T., Aihara, K.: An Analysis on Additive Effects of Nonlinear Dynamics for Combinatorial Optimization. IEICE Trans. Fundamentals E80-A(1), 206–212 (1997)
10. Umeno, K., Kitayama, K.: Spreading Sequences using Periodic Orbits of Chaos for CDMA. Electronics Letters 35(7), 545–546 (1999)
11. Mazzini, G., Rovatti, R., Setti, G.: Interference minimization by autocorrelation shaping in asynchronous DS-CDMA systems: chaos-based spreading is nearly optimal. Electronics Letters 35(13), 1054–1055 (1999)
12. Umeno, K., Yamaguchi, A.: Construction of Optimal Chaotic Spreading Sequence Using Lebesgue Spectrum Filter. IEICE Trans. Fundamentals E85-A(4), 849–852 (2002)
13. Hopfield, J., Tank, D.: Neural Computation of Decision in Optimization Problems. Biological Cybernetics 52, 141–152 (1985)
14. Burkard, R., Karish, S., Rendl, F.: QAPLIB - A Quadratic Assignment Problem Library. Journal of Global Optimization 10, 391–403 (1997), http://www.seas.upenn.edu/qaplib

Solving the k-Winners-Take-All Problem and the Oligopoly Cournot-Nash Equilibrium Problem Using the General Projection Neural Networks

Xiaolin Hu 1 and Jun Wang 2
1 Tsinghua National Lab of Information Science & Technology, State Key Lab of Intelligent Technology & Systems, and Department of Computer Science & Technology, Tsinghua University, Beijing 100084, China xiaolin.hu@gmail.com
2 Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, China jwang@mae.cuhk.edu.hk

Abstract. The k-winners-take-all (k-WTA) problem is to select the k largest inputs from a set of inputs in a network, which has many applications in machine learning. The Cournot-Nash equilibrium is an important problem in economic models. The two problems can be formulated as linear variational inequalities (LVIs).
In this paper, a linear case of the general projection neural network (GPNN) is applied to solving the resulting LVIs, and consequently the two practical problems. Compared with existing recurrent neural networks capable of solving these problems, the designed GPNN is superior in its stability results and architecture complexity.

1 Introduction

Following the seminal work of Hopfield and Tank [1], numerous neural network models have been developed for solving optimization problems, from earlier proposals such as the neural network proposed by Kennedy and Chua [2] based on the penalty method and the switched-capacitor neural network by Rodríguez-Vázquez et al. [3], to the latest developments by Xia and Wang et al. [5,8]. In recent years, new research in this direction has focused on solving the variational inequality (VI), a problem closely related to optimization that itself has many applications in a variety of disciplines (see, e.g., [13]). For solving VIs, a recurrent neural network called the projection neural network has been developed (see [5,8] and references therein). An extension of this neural network, termed the general projection neural network (GPNN), is presented in [6]; it is primarily designed for solving general variational inequalities (GVIs) with bound constraints on the variables. Building on that work, it is found that the GPNN, with proper formulations, is capable of solving the linear case of the VI (for short, LVI) with linear constraints. In view of this observation, we apply the formulated GPNNs to solving two typical LVI problems, the k-WTA problem and the oligopoly Cournot-Nash equilibrium problem, whose descriptions can be found in Section 3 and Section 4, respectively.

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 703–712, 2008. © Springer-Verlag Berlin Heidelberg 2008

2 Neural Network Model

Consider the following linear variational inequality (LVI): find x^* ∈ Ω such that

(M x^* + p)^T (x - x^*) \ge 0, \quad \forall x \in \Omega,    (1)

where M \in \mathbb{R}^{n \times n}, p \in \mathbb{R}^{n}, and

\Omega = \{x \in \mathbb{R}^{n} \mid Ax \in Y,\; Bx = c,\; x \in X\}    (2)

with A \in \mathbb{R}^{m \times n}, B \in \mathbb{R}^{r \times n}, c \in \mathbb{R}^{r}. Here X and Y are two box sets defined as

X = \{x \in \mathbb{R}^{n} \mid \underline{x} \le x \le \bar{x}\}, \quad Y = \{y \in \mathbb{R}^{m} \mid \underline{y} \le y \le \bar{y}\},

where \underline{x}, \bar{x} \in \mathbb{R}^{n} and \underline{y}, \bar{y} \in \mathbb{R}^{m}. Note that any component of \underline{x}, \underline{y} may be set to −∞ and any component of \bar{x}, \bar{y} may be set to ∞. Without loss of generality, we assume \underline{y} < \bar{y}, since if \underline{y}_i = \bar{y}_i for some i, the corresponding inequality constraint can be incorporated into Bx = c. Denote by Ω^* the solution set of (1). Throughout the paper it is assumed that Ω^* ≠ ∅. Moreover, we assume Rank(B) = r in (2), which is always true for a well-posed problem. The above LVI is closely related to the following quadratic programming problem:
\text{minimize } \tfrac{1}{2} x^T M x + p^T x \quad \text{subject to } x \in \Omega,    (3)

where the parameters are the same as in (1). It is well known (e.g., [13]) that if M is symmetric and positive semi-definite, the two problems are equivalent. If M = 0, the problem degenerates to a linear programming problem.

Since Rank(B) = r, without loss of generality B can be partitioned as B = [B_I\; B_{II}], where B_I \in \mathbb{R}^{r \times r}, B_{II} \in \mathbb{R}^{r \times (n-r)} and \det(B_I) \neq 0. Then Bx = c can be decomposed into

(B_I, B_{II}) \begin{pmatrix} x_I \\ x_{II} \end{pmatrix} = c,

where x_I \in \mathbb{R}^{r}, x_{II} \in \mathbb{R}^{n-r}, which yields x_I = -B_I^{-1} B_{II} x_{II} + B_I^{-1} c and

x = Q x_{II} + q,    (4)

where

Q = \begin{pmatrix} -B_I^{-1} B_{II} \\ I \end{pmatrix} \in \mathbb{R}^{n \times (n-r)}, \quad q = \begin{pmatrix} B_I^{-1} c \\ O \end{pmatrix} \in \mathbb{R}^{n}.    (5)

Similarly, x^* = Q x_{II}^* + q. Substituting x and x^* into (1) gives a new LVI. Let u = x_{II},

\bar{M} = Q^T M Q, \quad \bar{p} = Q^T M q + Q^T p, \quad \bar{A} = \begin{pmatrix} AQ \\ -B_I^{-1} B_{II} \end{pmatrix},

V = \left\{ v \in \mathbb{R}^{m+r} \,\middle|\, \begin{pmatrix} \underline{y} - Aq \\ \underline{x}_I - B_I^{-1} c \end{pmatrix} \le v \le \begin{pmatrix} \bar{y} - Aq \\ \bar{x}_I - B_I^{-1} c \end{pmatrix} \right\}, \quad U = \{u \in \mathbb{R}^{n-r} \mid \underline{x}_{II} \le u \le \bar{x}_{II}\},

where \underline{x}_I = (\underline{x}_1, \cdots, \underline{x}_r)^T, \underline{x}_{II} = (\underline{x}_{r+1}, \cdots, \underline{x}_n)^T, \bar{x}_I = (\bar{x}_1, \cdots, \bar{x}_r)^T, \bar{x}_{II} = (\bar{x}_{r+1}, \cdots, \bar{x}_n)^T. Then the new LVI can be rewritten in the following compact form:

(\bar{M} u^* + \bar{p})^T (u - u^*) \ge 0, \quad \forall u \in \bar{\Omega},    (6)

where \bar{\Omega} = \{u \in \mathbb{R}^{n-r} \mid \bar{A}u \in V,\; u \in U\}. Compared with the original LVI (1), the new formulation (6) has fewer variables, and the equality constraints are absorbed. By following techniques similar to [10], it is not difficult to obtain the following theorem.

Theorem 1. u^* \in \bar{\Omega}
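The variable elimination of Eqs. (4)-(5) can be sketched numerically as follows (random B and c, seeded for reproducibility; B_I is assumed nonsingular). Every x = Qu + q then satisfies Bx = c exactly, whatever u is:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 2
B = rng.standard_normal((r, n))            # assume B_I = B[:, :r] is nonsingular
c = rng.standard_normal(r)

B_I, B_II = B[:, :r], B[:, r:]
Q = np.vstack([-np.linalg.solve(B_I, B_II), np.eye(n - r)])   # Eq. (5)
q = np.concatenate([np.linalg.solve(B_I, c), np.zeros(n - r)])

u = rng.standard_normal(n - r)             # any reduced variable u = x_II
x = Q @ u + q                              # Eq. (4)
assert np.allclose(B @ x, c)               # equality constraints absorbed
```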
is a solution of (6) if and only if there exists v^* \in \mathbb{R}^{m+r} such that w^* = ((u^*)^T, (v^*)^T)^T satisfies \tilde{N} w^* \in W and

(\tilde{M} w^* + \tilde{p})^T (w - \tilde{N} w^*) \ge 0, \quad \forall w \in W,    (7)

where

\tilde{M} = \begin{pmatrix} \bar{M} & -\bar{A}^T \\ O & I \end{pmatrix}, \quad \tilde{N} = \begin{pmatrix} I & O \\ \bar{A} & O \end{pmatrix}, \quad \tilde{p} = \begin{pmatrix} \bar{p} \\ O \end{pmatrix}, \quad W = U \times V.

Inequality (7) represents a class of general variational inequality [12]. Thus the following general projection neural network presented in [6] can be applied to solve the problem.

– State equation

\frac{dw}{dt} = \lambda (\tilde{M} + \tilde{N})^T \{ -\tilde{N} w + P_W((\tilde{N} - \tilde{M}) w - \tilde{p}) \},    (8a)

– Output equation

x = Q u + q,    (8b)

where λ > 0 is a scalar, Q and q are defined in (5), and P_W(·) is the standard projection operator [5,8]. The architecture of the neural network can be drawn in a fashion similar to [6] and is omitted here for brevity. Clearly, when \bar{M} \ge 0, \tilde{M} + \tilde{N} is nonsingular and \tilde{M}^T \tilde{N} is positive semi-definite. We then have the following theorem, which follows directly from [6, Corollary 4].

Theorem 2. Consider neural network (8) for solving LVI (6) with Ω defined in (2). (i) If \bar{M} \ge 0, then the state of the neural network is stable in the sense of Lyapunov and globally convergent to an equilibrium, which corresponds to an exact solution of the LVI via the output equation. (ii) Furthermore, if \bar{M} > 0, then the output x(t) of the neural network is globally asymptotically stable at the unique solution x^* of the LVI.

Other neural networks in the literature can also be adopted to solve (6). In Table 1 we compare the proposed GPNN with several salient neural network models in terms of stability conditions and number of neurons. It is seen
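A minimal numerical sketch of the dynamics (8a), under the matrix definitions reconstructed above: the problem data (M̄, p̄, Ā, bounds) are made up for illustration, forward Euler replaces the analog dynamics, and P_W is the componentwise clip onto the box W = U × V. For this toy problem the box-constrained minimizer is u* = (1, 1).

```python
import numpy as np

# Toy data (made up): minimize u1^2 + u2^2 - 6 u1 - 2 u2 over U = [0, 1]^2,
# with a loose extra constraint u1 + u2 in V = [-10, 10].
Mb = 2.0 * np.eye(2)                       # \bar{M}
pb = np.array([-6.0, -2.0])                # \bar{p}
Ab = np.array([[1.0, 1.0]])                # \bar{A}
lo = np.array([0.0, 0.0, -10.0])           # lower bounds of W = U x V
hi = np.array([1.0, 1.0,  10.0])           # upper bounds of W

# Block matrices of Theorem 1 (as reconstructed above).
Mt = np.block([[Mb, -Ab.T], [np.zeros((1, 2)), np.eye(1)]])
Nt = np.block([[np.eye(2), np.zeros((2, 1))], [Ab, np.zeros((1, 1))]])
pt = np.concatenate([pb, np.zeros(1)])

P_W = lambda w: np.clip(w, lo, hi)         # projection onto the box W

w = np.zeros(3)                            # state w = (u, v)
lam, h = 1.0, 0.005
for _ in range(20_000):                    # forward Euler on Eq. (8a)
    w = w + h * lam * (Mt + Nt).T @ (-Nt @ w + P_W((Nt - Mt) @ w - pt))

print(w[:2])                               # close to the solution u* = (1, 1)
```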
Table 1. Comparison with several salient neural networks for solving LVI (6)

              | Refs. [7]    | Ref. [4]                            | Ref. [11]                           | Present paper
Conditions    | \bar{M} > 0  | \bar{M} \ge 0, \bar{M}^T = \bar{M}  | \bar{M} > 0, \bar{M}^T = \bar{M}    | \bar{M} \ge 0
Neurons       | n + 2m + r   | n + 2m + r                          | n + m                               | n + m

that the GPNN requires the weakest conditions and the fewest neurons. Note that fewer neurons imply fewer connection weights in the network, and consequently lower structural complexity.

3 k-Winners-Take-All Network

Consider the following k-winners-take-all (k-WTA) problem:

x_i = f(\sigma_i) = \begin{cases} 1, & \text{if } \sigma_i \in \{k \text{ largest elements of } \sigma\} \\ 0, & \text{otherwise}, \end{cases}

where σ ∈ ℝ^n is the input vector and x ∈ {0, 1}^n is the output vector. The k-WTA operation accomplishes the task of selecting the k largest inputs out of n inputs in a network. It has been shown to be a computationally powerful operation compared with standard neural network models with threshold logic gates [14]. In addition, it has important applications in machine learning as well, such as k-neighborhood classification and k-means clustering. Many attempts have been made to design neural networks performing the k-WTA operation [15,16,17]. Recently, the problem was formulated as a quadratic programming problem, and a simplified dual neural network with n neurons was developed for k-WTA [11]. The limitation of that approach lies in its finite resolution: when the k-th largest input σ_k is equal to the (k+1)-th largest input, the network cannot be applied.

The k-WTA problem is equivalent to the following integer programming problem:

\text{minimize } -\sigma^T x \quad \text{subject to } Bx = k,\; x \in \{0,1\}^n,    (9)

where B = (1, \cdots, 1). Suppose that σ_k ≠ σ_{k+1}; then, by arguments similar to those given in [11], the k-WTA problem can be shown to be equivalent to the following linear programming problem:

\text{minimize } -\sigma^T x \quad \text{subject to } Bx = k,\; x \in [0,1]^n.    (10)

Then neural network (8) with M = 0, p = −σ, c = k, \underline{x} = 0, \bar{x} = 1 can be used to solve the problem. Explicitly, the neural network can be written in the following component form:
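Because the LP above has this box-plus-one-equality structure, its optimum (when σ_k > σ_{k+1}) is simply the 0/1 indicator of the k largest inputs. This gives a quick reference implementation to check a network's output against (the function name is ours):

```python
import numpy as np

def k_wta(sigma, k):
    """Optimum of: minimize -sigma^T x  s.t.  sum(x) = k, 0 <= x <= 1.

    The LP is integral whenever the k-th and (k+1)-th largest inputs differ,
    so the solution is the indicator of the k largest inputs."""
    x = np.zeros(len(sigma))
    x[np.argsort(sigma)[-k:]] = 1.0
    return x

print(k_wta(np.array([0.3, 0.9, 0.1, 0.7]), 2))   # -> [0. 1. 0. 1.]
```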
Solving the k-WTA Problem 707

– State equation

du_i/dt = λ { −u_i + Σ_{j=1}^{n−1} u_j + P_{U_i}(u_i − v − σ_1 + σ_{i+1}) + P_V(−Σ_{j=1}^{n−1} u_j − v) },  ∀i = 1, · · · , n − 1,

dv/dt = λ { 2 Σ_{j=1}^{n−1} u_j − Σ_{j=1}^{n−1} P_{U_j}(u_j − v − σ_1 + σ_{j+1}) + P_V(−Σ_{j=1}^{n−1} u_j − v) },

– Output equation

x_1 = −Σ_{j=1}^{n−1} u_j + k,
x_{i+1} = u_i,  ∀i = 1, · · · , n − 1,

where U = {u ∈ ℝ^{n−1} | 0 ≤ u ≤ 1} and V = {v ∈ ℝ | −k ≤ v ≤ 1 − k}. Note that, in contrast to [11], no parameter needs to be chosen.

Next, we consider another scenario. Suppose that there are in total s identical inputs and that some of them, but not all, should be selected as "winners": l inputs among them are needed, where l < s. Then the proposed neural network will definitely output k − l ones and n − (k − l) − s zeros, which correspond to σ_1 ≥ · · · ≥ σ_{k−l} and σ_{s−l+k+1} ≥ · · · ≥ σ_n, respectively. The other outputs, which correspond to the s identical inputs σ_{k−l+1} = · · · = σ_{s−l+k}, might be neither ones nor zeros, but the entire output x* still corresponds to the minimum of (10). Denote the minimum value of the objective function by p*; then

p* = −Σ_{i=1}^{k−l} σ_i − σ̄ Σ_{i=k−l+1}^{s−l+k} x*_i,

where σ̄ = σ_{k−l+1} = · · · = σ_{s−l+k}. From the constraints, we have

Σ_{i=k−l+1}^{s−l+k} x*_i = l,  x*_i ∈ [0, 1].

Therefore, the optimum value of the objective function does not change when the x*_i, i = k − l + 1, · · · , s − l + k, vary in [0, 1], as long as their sum remains equal to l. Clearly, we can let any l of these outputs equal 1 and the rest equal 0. According to this rule, the output x* will be an optimum of (10) and, consequently, an optimum of (9). So, for this particular application, a rectification module is needed, placed between the output layer of the GPNN and the desired output layer, as shown in Fig. 1.

Example 1. In the k-WTA problem, let k = 2, n = 4 and define the inputs σ_i as functions of t, where t increases from 0 to 1 continuously; this leads to four sinusoidal input signals σ_i for i = 1, · · · , 4 with σ_1 = σ_2 (see the top subfigure in Fig. 2). We use the rectified GPNN to accomplish the k-WTA operation. The other four subfigures in Fig. 2 record the final outputs of the proposed network at each time instant t. Note that when t is approximately in [0.05, 0.15] or [0.55, 0.65], either σ_1 or σ_2 may be selected as a winner, and the figure just shows one possible selection.
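The rectification step can be stated concretely. The sketch below (our illustration, not the paper's module) takes a possibly fractional GPNN output x*, keeps the entries that are already (numerically) 0 or 1, and promotes the first l fractional, tied entries to 1 so that exactly k winners result:

```python
def rectify(x_star, k, tol=1e-6):
    # Entries already near 0 or 1 are kept; among the fractional (tied)
    # entries, the first l = k - (#ones) are set to 1, the rest to 0.
    ones = [i for i, v in enumerate(x_star) if v > 1 - tol]
    frac = [i for i, v in enumerate(x_star) if tol < v < 1 - tol]
    l = k - len(ones)
    out = [0] * len(x_star)
    for i in ones:
        out[i] = 1
    for i in frac[:l]:
        out[i] = 1
    return out

# sigma = (5, 3, 3, 1), k = 2: the tie sigma_2 = sigma_3 may leave
# fractional outputs summing to l = 1 on the tied positions.
print(rectify([1.0, 0.4, 0.6, 0.0], 2))  # -> [1, 1, 0, 0]
```

Any rule for choosing which l of the tied entries to promote is equally valid, since all such choices attain the same objective value of (10); taking the first l is just one convention.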
Fig. 1. Diagram of the rectified GPNN for k-WTA operation in Example 1. x̃_i stands for the output of the GPNN and x_i stands for the final output.
Fig. 2. Inputs and outputs of the rectified GPNN for k-WTA operation in Example 1

4 Oligopoly Cournot-Nash Equilibrium Problem
In this section, we apply the GPNN to solving a spatial oligopoly model in economics formulated as a variational inequality [18, pp. 94-97]. Assume that a homogeneous commodity is produced by m firms and demanded by n markets that are generally spatially separated. Let Q_ij denote the commodity shipment from firm i to demand market j; q_i denote the commodity produced by firm i, satisfying q_i = Σ_{j=1}^n Q_ij; d_j denote the demand for the commodity at market j, satisfying d_j = Σ_{i=1}^m Q_ij; q denote a column vector with entries q_i, i = 1, · · · , m; d denote a column vector with entries d_j, j = 1, · · · , n; f_i denote the production cost associated with firm i, which is a function of the entire production pattern, i.e., f_i = f_i(q); p_j denote the demand price associated with market j, which is a function of the entire consumption pattern, i.e., p_j = p_j(d); c_ij denote the transaction cost associated with trading (shipping) the commodity from firm i to market j, which is a function of the entire shipment pattern, i.e., c_ij = c_ij(Q); and u_i denote the profit or utility of firm i, given by

u_i(Q) = Σ_{j=1}^n p_j Q_ij − f_i − Σ_{j=1}^n c_ij Q_ij.

Clearly, for a well-defined model, Q_ij must be nonnegative. In addition, other constraints can be added to make the model more practical. For instance, a constrained set can be defined as follows:

S = {Q ∈ ℝ^{m×n} | f_i ≥ 0, p̲_j ≤ p_j ≤ p̄_j, c_ij ≥ 0, 0 ≤ Q_ij ≤ Q̄_ij, ∀i = 1, · · · , m; j = 1, · · · , n},

where p̲_j and p̄_j are respectively the lower and upper bounds of the price p_j, and Q̄_ij is the capacity associated with the shipment Q_ij.

An important issue in oligopoly problems is to determine the so-called Cournot-Nash equilibrium [18]. According to Theorem 5.1 in [18, p. 97], assume that for each firm i the profit function u_i(Q) is concave with respect to the variables {Q_i1, · · · , Q_in} and continuously differentiable; then Q* is a Cournot-Nash equilibrium if and only if it satisfies the following VI:

−Σ_{i=1}^m Σ_{j=1}^n (∂u_i(Q*)/∂Q_ij)(Q_ij − Q*_ij) ≥ 0  ∀Q ∈ S.        (11)

Then, in the linear case, this equilibrium problem can be solved by using the GPNN designed in Section 2.

Example 2. In the above Cournot-Nash equilibrium problem, let m = 3, n = 5 and define the cost functions

f_1(q) = 2q_1² + 2q_2 q_3 + q_2 + 4q_3 + 1,
f_2(q) = 3q_2² + 2q_3² + 2q_2 q_3 + q_1 + 2q_2 + 5q_3 + 3,
f_3(q) = 2q_1² + q_3² + 6q_2 q_3 + 2q_1 + 2q_2 + q_3 + 5
and demand price functions

p_1(d) = −3d_1 − d_2 + 5,
p_2(d) = −2d_2 + d_3 + 8,
p_3(d) = −d_3 + 6,
p_4(d) = d_2 − d_3 − 2d_4 + 6,
p_5(d) = −d_3 − d_5 + 3.

For simplicity, let the transaction costs c_ij be constants, i.e.,

(c_ij) = [ 5 2 3 2 9 ]
         [ 4 8 2 1 5 ]
         [ 6 5 4 1 4 ].
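To make the model concrete, the following sketch (our direct transcription of the printed data, not code from the paper) evaluates the firms' profits u_i(Q) = Σ_j p_j Q_ij − f_i − Σ_j c_ij Q_ij for an arbitrary 3×5 shipment matrix:

```python
def profits(Q):
    # Q is a 3x5 list of lists: Q[i][j] = shipment from firm i to market j.
    m, n = 3, 5
    q = [sum(Q[i]) for i in range(m)]                        # production of firm i
    d = [sum(Q[i][j] for i in range(m)) for j in range(n)]   # demand at market j
    f = [2*q[0]**2 + 2*q[1]*q[2] + q[1] + 4*q[2] + 1,        # production costs
         3*q[1]**2 + 2*q[2]**2 + 2*q[1]*q[2] + q[0] + 2*q[1] + 5*q[2] + 3,
         2*q[0]**2 + q[2]**2 + 6*q[1]*q[2] + 2*q[0] + 2*q[1] + q[2] + 5]
    p = [-3*d[0] - d[1] + 5,                                 # demand prices
         -2*d[1] + d[2] + 8,
         -d[2] + 6,
         d[1] - d[2] - 2*d[3] + 6,
         -d[2] - d[4] + 3]
    c = [[5, 2, 3, 2, 9], [4, 8, 2, 1, 5], [6, 5, 4, 1, 4]]  # transaction costs
    return [sum((p[j] - c[i][j]) * Q[i][j] for j in range(n)) - f[i]
            for i in range(m)]

# With zero shipments each firm still pays the fixed part of its cost:
print(profits([[0.0]*5 for _ in range(3)]))  # -> [-1.0, -3.0, -5.0]
```

Such a routine is useful for spot-checking any candidate equilibrium (for instance, the Q* reported below): at an equilibrium, no unilateral feasible change of one firm's row of Q should increase that firm's entry of `profits(Q)`.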
In the constrained set S, let p̲_j = 1, p̄_j = 5, Q̄_ij = 1. By these definitions, the constraints f_i ≥ 0 and c_ij ≥ 0 in S are automatically satisfied. By splitting Q into a column vector x ∈ ℝ^15, we can write (11) in the form of LVI (1) with

M = [ 8  2  3  2  2  3  0  1  0  0  3  0  1  0  0 ]
    [ 2  6  1  1  2  0  2 −1  0  0  0  2 −1  0  0 ]
    [ 3  1  4  3  3  0  0  1  0  0  0  0  1  0  0 ]
    [ 2  1  3  6  2  0 −1  1  2  0  0 −1  1  2  0 ]
    [ 2  2  3  2  4  0  0  1  0  1  0  0  1  0  1 ]
    [ 3  0  1  0  0  9  3  4  3  3  4  1  2  1  1 ]
    [ 0  2 −1  0  0  3  7  2  2  3  1  3  0  1  1 ]
    [ 0  0  1  0  0  4  2  5  4  4  1  1  2  1  1 ]
    [ 0 −1  1  2  0  3  2  4  7  3  1  0  2  3  1 ]
    [ 0  0  1  0  1  3  3  4  3  5  1  1  2  1  2 ]
    [ 3  0  1  0  0  6  3  4  3  3  7  1  2  1  1 ]
    [ 0  2 −1  0  0  3  5  2  3  3  1  5  0  0  1 ]
    [ 0  0  1  0  0  3  3  4  3  3  2  0  3  2  2 ]
    [ 0 −1  1  2  0  3  2  4  5  3  1  0  2  5  1 ]
    [ 0  0  1  0  1  3  3  4  3  4  1  1  2  1  3 ],

p = (0, −6, −3, −4, 6, 1, 2, −2, −3, 4, 2, −2, −1, −4, 2)^T,  x̲ = (0, · · · , 0)^T ∈ ℝ^15,  x̄ = (1, · · · , 1)^T ∈ ℝ^15,

A = [ −3  0 −1  0  0  −3  0 −1  0  0  −3  0 −1  0  0 ]
    [  0 −2  1  0  0   0 −2  1  0  0   0 −2  1  0  0 ]
    [  0  0 −1  0  0   0  0 −1  0  0   0  0 −1  0  0 ]
    [  0  1 −1 −2  0   0  1 −1 −2  0   0  1 −1 −2  0 ]
    [  0  0 −1  0 −1   0  0 −1  0 −1   0  0 −1  0 −1 ],

y̲ = (−4, −7, −5, −5, −2)^T,  ȳ = (0, −3, −1, −1, 2)^T,

and B = 0, c = 0. For this LVI, there are no equality constraints. Consequently, there is no need to "eliminate" the equality constraints by reformulating the problem into another LVI as in (6), and a GPNN similar to (8) can be designed for solving the LVI directly. It can be checked that M > 0; the designed GPNN is therefore globally asymptotically stable. All simulation results show that with
any positive λ and any initial state the GPNN always converges to the unique solution

Q* = [  0.000 1.000 0.919 0.058 0.000 ]
     [ −0.000 0.000 0.081 0.221 0.000 ]
     [  0.000 1.000 0.000 0.721 0.000 ].

Fig. 3 depicts one simulation result of the neural network (state). Clearly, all trajectories reach steady states after a period of time. It is seen that, with this numerical setting, in the Cournot-Nash equilibrium state no firm ships any commodity to market 1 or market 5.

Fig. 3. State trajectories of the GPNN with a random initial point in Example 2

5 Concluding Remarks

In this paper, a general projection neural network (GPNN) is designed which is capable of solving a general class of linear variational inequalities (LVIs) with linear equality and two-sided linear inequality constraints. The GPNN is then applied to two problems formulated as LVIs: the k-winners-take-all problem and the oligopoly Cournot-Nash equilibrium problem. The designed GPNN is of lower structural complexity than most existing ones. Numerical examples are discussed to illustrate the good applicability and performance of the proposed method.

Acknowledgments. The work was supported by the National Natural Science Foundation of China under Grant 60621062 and by the National Key Foundation R&D Project under Grants 2003CB317007, 2004CB318108 and 2007CB311003.

References

1. Hopfield, J.J., Tank, D.W.: Computing with Neural Circuits: A Model. Science 233, 625–633 (1986)
2. Kennedy, M.P., Chua, L.O.: Neural Networks for Nonlinear Programming. IEEE Trans. Circuits Syst. 35, 554–562 (1988)
3. Rodríguez-Vázquez, A., Domínguez-Castro, R., Rueda, A., Huertas, J.L., Sánchez-Sinencio, E.: Nonlinear Switched-Capacitor Neural Networks for Optimization Problems. IEEE Trans. Circuits Syst. 37, 384–397 (1990)
4. Tao, Q., Cao, J., Xue, M., Qiao, H.: A High Performance Neural Network for Solving Nonlinear Programming Problems with Hybrid Constraints. Physics Letters A 288, 88–94 (2001)
5. Xia, Y., Wang, J.: A Projection Neural Network and Its Application to Constrained Optimization Problems. IEEE Trans. Circuits Syst. I 49, 447–458 (2002)
6. Xia, Y., Wang, J.: A General Projection Neural Network for Solving Monotone Variational Inequalities and Related Optimization Problems. IEEE Trans. Neural Netw. 15, 318–328 (2004)
7. Xia, Y.: On Convergence Conditions of an Extended Projection Neural Network. Neural Computation 17, 515–525 (2005)
8. Hu, X., Wang, J.: Solving Pseudomonotone Variational Inequalities and Pseudoconvex Optimization Problems Using the Projection Neural Network. IEEE Trans. Neural Netw. 17, 1487–1499 (2006)
9. Hu, X., Wang, J.: A Recurrent Neural Network for Solving a Class of General Variational Inequalities. IEEE Trans. Syst., Man, Cybern. B 37, 528–539 (2007)
10. Hu, X., Wang, J.: Solving Generally Constrained Generalized Linear Variational Inequalities Using the General Projection Neural Networks. IEEE Trans. Neural Netw. 18, 1697–1708 (2007)
11. Liu, S., Wang, J.: A Simplified Dual Neural Network for Quadratic Programming with Its KWTA Application. IEEE Trans. Neural Netw. 17, 1500–1510 (2006)
12. Pang, J.S., Yao, J.C.: On a Generalization of a Normal Map and Equations. SIAM J. Control Optim. 33, 168–184 (1995)
13. Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. I and II. Springer, New York (2003)
14. Maass, W.: On the Computational Power of Winner-Take-All. Neural Comput. 12, 2519–2535 (2000)
15. Wolfe, W.J.: K-Winner Networks. IEEE Trans. Neural Netw. 2, 310–315 (1991)
16.
Calvert, B.A., Marinov, C.: Another k-Winners-Take-All Analog Neural Network. IEEE Trans. Neural Netw. 11, 829–838 (2000)
17. Marinov, C.A., Hopfield, J.J.: Stable Computational Dynamics for a Class of Circuits with O(N) Interconnections Capable of KWTA and Rank Extractions. IEEE Trans. Circuits Syst. I 52, 949–959 (2005)
18. Nagurney, A., Zhang, D.: Projected Dynamical Systems and Variational Inequalities with Applications. Kluwer, Boston (1996)

Optimization of Parametric Companding Function for an Efficient Coding

Shin-ichi Maeda¹ and Shin Ishii²

¹ Nara Institute of Science and Technology, 630–0192 Nara, Japan
² Kyoto University, 611–0011 Kyoto, Japan

Abstract. Designing a lossy source code remains one of the important topics in information theory and has many applications. Although plain vector quantization (VQ) can realize any fixed-length lossy source coding, it has a serious drawback in computation cost. Companding vector quantization (CVQ) reduces the complexity by replacing vector quantization with a set of scalar quantizations. It can represent a wide class of practical VQs, while the structure in CVQ restricts it from representing every lossy source coding. In this article, we propose an optimization method for parametrized CVQ by utilizing a newly derived distortion formula. To test its validity, we applied the method in particular to transform coding. We found that our trained CVQ outperforms Karhunen-Loève transformation (KLT)-based coding not only in the case of linear mixtures of uniform sources, but also in the case of low-bit-rate coding of a Gaussian source.

1 Introduction

The objective of lossy source coding is to reduce the coding length while the distortion between the original data and the reproduction data is kept constant.
It is indispensable for compressing various kinds of sources, such as images and audio signals, to save memory, and it may also serve as a useful tool for extracting features from high-dimensional inputs, as the well-known Karhunen-Loève transformation (KLT) does. Plain vector quantization (VQ) has been known to be able to realize any fixed-length lossy source coding, implying that VQ can be optimal among all fixed-length lossy source codes. However, it has a serious drawback in computation cost (encoding, memory amount, and optimization of the coder), which increases exponentially with the input dimensionality. Companding vector quantization (CVQ) has the potential to reduce the complexity inherent in non-structured VQ by replacing vector quantization with a set of scalar quantizations, and it can represent a wide class of practically useful VQs such as transform coding, shape-gain VQ, and tree-structured VQ. On the other hand, the structure in CVQ restricts it from representing every lossy source coding. So far, the optimal CVQ has been calculated analytically only for a very limited class. KLT is one of them: it is optimal for encoding Gaussian sources among the class of transform coding, in the limit of high rate [1][2]. Although an analytical derivation of the optimal CVQ has been attempted, numerical optimization to obtain a suboptimal CVQ has hardly been done except when the CVQ has a limited structure. In this article, we propose a general optimization method for the parametrized CVQ, which includes a bit allocation algorithm based on a newly derived distortion formula, a generalization of Bennett's formula, and test its validity through numerical simulations.

M. Ishikawa et al. (Eds.): ICONIP 2007, Part I, LNCS 4984, pp. 713–722, 2008.
© Springer-Verlag Berlin Heidelberg 2008

2 Companding Vector Quantization

2.1 Architecture

The architecture of our CVQ is shown in Fig. 1.
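The companding idea is easiest to see in one dimension. The sketch below (a standard μ-law compander, given for illustration only; it is not the parametric compander optimized in this article) compresses the input nonlinearly, applies a uniform scalar quantizer, and expands back, so that small amplitudes receive finer quantization than large ones:

```python
import math

def mu_law_compress(x, mu=255.0):
    # Maps [-1, 1] onto [-1, 1], stretching small amplitudes.
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255.0):
    # Exact inverse of mu_law_compress.
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def companding_quantize(x, bits=8, mu=255.0):
    # Compress, quantize uniformly to 2**bits levels on [-1, 1], expand.
    levels = 2 ** bits
    y = mu_law_compress(x, mu)
    q = round((y + 1) / 2 * (levels - 1)) / (levels - 1) * 2 - 1
    return mu_law_expand(q, mu)
```

CVQ applies the same compress-quantize-expand pattern with a multidimensional, parametrized compander, replacing one vector quantization by a set of scalar quantizations of this kind.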