4 Simulation and Experiment

The first experiment is a 2-D 181x217-pixel T1-weighted MR image with its expert segmentation result, as shown in Figs. 3(a) and 3(b). In our method, the parameter μ is set to 0.2 and the iteration number n is set to only 2, because the MR image is clear and fewer iterations preserve the original gradient information. The number of update iterations in the IGSOM model is about 100 in this case. The learning rate and the standard deviation of the Gaussian function are fixed at 0.1 and 0.6, respectively. From the expert partition result, boundary pixels of the GM/WM (inner) and GM/CSF (outer) interfaces can be extracted, as shown in Figs. 3(c) and 3(d), respectively. In Fig. 3(d), many sulci lose their intervals. Figs. 3(e) and 3(f) show the experimental results using the SOM+LDM model [7] and the IGSOM model, respectively. Roughly, the results are similar to each other. However, some intervals inside the sulci are over-detected in the result of the earlier method; some of them are pointed out with arrows in Fig. 3(e). These defects cause extra and deeper sulci to spread over the cortex.
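The parameter μ and the iteration number n above control the gradient vector flow (GVF) field of Xu and Prince [6]. As a hedged illustration, the sketch below shows a standard 2-D GVF iteration with those settings; the function and variable names are assumptions, not the authors' implementation.

```python
import numpy as np

def gvf_2d(f, mu=0.2, n=2):
    """Minimal gradient vector flow (GVF) iteration in the style of Xu & Prince [6].

    f  : 2-D edge map (e.g. gradient magnitude of the MR image)
    mu : regularization weight (0.2 in the experiment above)
    n  : number of update iterations (2 in the experiment above)
    """
    fy, fx = np.gradient(f)          # spatial gradients of the edge map
    u, v = fx.copy(), fy.copy()      # initialize the GVF field with the gradient
    mag2 = fx**2 + fy**2             # squared gradient magnitude

    def laplacian(a):
        # simple 5-point Laplacian with wrap-around borders (adequate for a sketch)
        return (np.roll(a, 1, 0) + np.roll(a, -1, 0) +
                np.roll(a, 1, 1) + np.roll(a, -1, 1) - 4.0 * a)

    for _ in range(n):
        # diffusion term smooths the field; data term keeps it close to the
        # original gradient where the edges are strong
        u = u + mu * laplacian(u) - (u - fx) * mag2
        v = v + mu * laplacian(v) - (v - fy) * mag2
    return u, v
```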
Fig. 3. Experiments on a 2-D 181x217-pixel T1-weighted MR image: (a) the raw image, (b) the expert segmentation image, (c) the inner boundary image, (d) the outer boundary image, (e) the result by the SOM+LDM model, (f) the result by the IGSOM model
The experiment is then extended to 3-D 181x217x181-voxel T1-weighted MRI data. The cerebral cortex is reconstructed separately using the SOM+LDM model [7] and the IGSOM model. The GVF parameter μ and the iteration number n are the same as in the former experiment. The number of update iterations in the IGSOM model is about 200 in this experiment. The learning rate and the standard deviation of the Gaussian function are again fixed, at 0.1 and 1.0, respectively. The number of mesh network nodes is 35,851. Six views, i.e. top, bottom, left, right, front, and back, of the reconstructed 3-D cortical surface are shown in Fig. 4, where the figures in the left column show the result of the SOM+LDM model while those in the right column display that of the IGSOM model. The over-detection of sulci is revealed in the result of the SOM+LDM model, i.e. the left column of Fig. 4. The error is measured by calculating the average nearest distance between mesh vertices and points of the segmented GM outer surface. The maximal, minimal, and mean distances are 2.35, 0.0013, and 0.42 mm for the SOM+LDM model, while they are 2.32, 0.0011, and 0.42 mm for the IGSOM model. Although both mean distances are about the same, the maximal and minimal distances of the SOM+LDM result are larger than those of the IGSOM result.
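The error figures above are nearest distances between mesh vertices and points of the segmented GM outer surface. A minimal sketch of how such distance statistics could be computed is shown below; the array names are assumptions, and the KD-tree query is simply one convenient way to obtain nearest distances, not necessarily the authors' procedure.

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_distance_stats(mesh_vertices, surface_points):
    """Nearest-distance statistics between mesh vertices and a reference surface.

    mesh_vertices  : (M, 3) array of reconstructed cortical mesh vertices (mm)
    surface_points : (P, 3) array of points on the segmented GM outer surface (mm)
    Returns (max, min, mean) of the vertex-to-surface nearest distances.
    """
    tree = cKDTree(surface_points)              # index the reference surface points
    dists, _ = tree.query(mesh_vertices, k=1)   # nearest surface point per vertex
    return dists.max(), dists.min(), dists.mean()
```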
Fig. 4. Experiments of cortex reconstruction with 3-D T1-weighted MRI data: (a)(c)(e)(g)(i)(k) results by the SOM+LDM model, (b)(d)(f)(h)(j)(l) results by the IGSOM model, (a)(b) top view, (c)(d) bottom view, (e)(f) left view, (g)(h) right view, (i)(j) front view, (k)(l) back view
5 Conclusions

In this paper, the SOM model based on the image intensity gradient is applied to reconstruct the human cerebral cortex from MRI data. The method provides a good capability to deform the GM/WM surface outward to find the GM/CSF surface, so the asymmetric intervals inside the sulci can be detected automatically. The only requirement is that the intensity bias of the MRI data must be corrected beforehand. The gradient vectors computed from the image intensity help the SOM model deform the GM/WM surface to a proper position on the GM/CSF surface. With this method, even the highly folded 3-D GM/CSF surface can be reconstructed. However, in the 3-D case, one disadvantage is the heavy computational cost when the image size is huge or the number of mesh vertices is large. Improving the computational efficiency is therefore left as future work. In addition, how to systematically evaluate the accuracy of the reconstructed cortical surface remains an urgent subject of study.
Acknowledgments. The 3-D MRI data presented in this paper come from the McConnell Brain Imaging Centre at McGill University.
References

1. Miller, M.I., Hosakere, M., Barker, A.R., Priebe, C.E., Lee, N., Ratnanather, J.T., Wang, L., Gado, M., Morris, J.C., Csernansky, J.G.: Labeled Cortical Mantle Distance Maps of the Cingulate Quantify Differences between Dementia of the Alzheimer Type and Healthy Aging. Proc. Natl. Acad. Sci. 100, 15172–15177 (2003)
2. Chung, M.K., Robbins, S.M., Dalton, K.M., Davidson, R.J., Alexander, A.L., Evans, A.C.: Cortical Thickness Analysis in Autism with Heat Kernel Smoothing. NeuroImage 25, 1256–1265 (2005)
3. MacDonald, D., Kabani, N., Avis, D., Evans, A.C.: Automated 3-D Extraction of Inner and Outer Surfaces of Cerebral Cortex from MRI. NeuroImage 12, 340–356 (2000)
4. Barta, P., Miller, M.I., Qiu, A.: A Stochastic Model for Studying the Laminar Structure of Cortex from MRI. IEEE Trans. Med. Imaging 24, 728–742 (2005)
5. Xu, C., Pham, D.L., Rettmann, M.E., Yu, D.N., Prince, J.L.: Reconstruction of the Human Cerebral Cortex from Magnetic Resonance Images. IEEE Trans. Med. Imaging 18, 467–480 (1999)
6. Xu, C., Prince, J.L.: Snakes, Shapes, and Gradient Vector Flow. IEEE Trans. Image Processing 7, 359–369 (1998)
7. Chuang, C.H., Cheng, P.E., Liou, M., Liou, C.Y., Kuo, Y.T.: Application of Self-Organizing Map (SOM) for Cerebral Cortex Reconstruction. International Journal of Computational Intelligence Research 3, 26–30 (2007)
8. Kohonen, T.: Self-Organizing Maps, 3rd edn. Springer, Berlin (2001)
9. Liou, C.Y., Tai, W.P.: Conformal Self-Organization for Continuity on a Feature Map. Neural Networks 12, 893–905 (1999)
10. Liou, C.Y., Tai, W.P.: Conformality in the Self-Organization Network. Artificial Intelligence 116, 265–286 (2000)
11. Su, H.-R., Liou, M., Cheng, P.E., Aston, J.A.D., Chuang, C.H.: MR Image Segmentation Using Wavelet Analysis Techniques. NeuroImage 26(1) (2005), Human Brain Mapping 2005 Abstract

Feature Subset Selection Using Constructive Neural Nets with Minimal Computation by Measuring Contribution

Md. Monirul Kabir 1, Md. Shahjahan 3, and Kazuyuki Murase 1,2
1 kabir@synapse.his.fukui-u.ac.jp
2 Research and Education Program for Life Science, University of Fukui, Bunkyo 3-9-1, Fukui 910-8507, Japan
murase@synapse.his.fukui-u.ac.jp
3 Department of Electrical and Electronic Engineering, Khulna University of Engineering and Technology, Khulna 9203, Bangladesh
jahan@mail.kuet.ac.bd

Abstract. In this paper we propose a new approach to select a feature subset based on the contribution of the input attributes in a three-layered feedforward neural network (NN). Three techniques, constructive learning, contribution measurement, and backward elimination, are integrated in this method. Initially, the number of hidden neurons is determined by a constructive approach so as to obtain a minimal NN architecture. After that, one-by-one removal of the input attributes is performed to compute their contributions. Finally, a sequential backward elimination is used to generate a relevant feature subset from the original input space. The elimination process continues depending on a criterion. To evaluate the proposed method, we applied it to four real-world benchmark problems. Experimental results confirmed that the proposed method significantly reduces the irrelevant features without degrading the network performance and generates a feature subset with good generalization ability.
1 Introduction

Feature subset selection (FSS) from among a huge number of attributes in a dataset is a crucial task for the success of a neural network (NN). Selecting an effective model and then obtaining more descriptive data are essential for achieving good classification performance with an NN. As the wrapper model always outperforms the filter model [1][2], numerous attempts have been made to solve the FSS problem with wrapper approaches based on different criteria [3]-[9]. An extensive discussion of the wrapper model is given in Kohavi's work [3]. This approach is capable of producing a minimal set of input variables, but its cost grows exponentially in the face of many irrelevant variables. A better solution for FSS using an NN has also been reported by Setiono [4],
but it failed to reduce the computational cost because it uses a large fixed number of hidden neurons (HNs). The concept of contribution has been applied to FSS by Linda [5] and S. Guan [6]. Linda uses a procedure that computes the contribution of each attribute (feature) from the weights. In contrast, S. Guan considers separate training of each input attribute using a separate constructive NN model, though this also fails to reduce the computational cost. Two other solutions for FSS can be found in [7][8]. According to the above discussion, none of these methods except [5] and [6] has emphasized FSS by measuring contribution, either in a backward search model or in a constructive approach.

Since backward search always shows better performance than forward search [2], in this paper we introduce a method called the contribution-based feature subset selection method (CFSS), which is composed of three techniques: constructive learning, contribution measurement, and backward elimination. It is known that the constructive approach can achieve both a compact architecture and better generalization [10]. We therefore initially use a constructive algorithm, starting with a minimal network and then adding new hidden-layer neurons during training. After that, in order to find the contribution of the individual inputs, the input attributes are removed one by one. In the latter part, a heuristic search, namely sequential backward elimination (BE), is used with four new criteria: a) double elimination, b) backtracking, c) single elimination, and d) subset validation. The subset generation process is accelerated and accomplished accurately by incorporating these four steps in BE. To evaluate CFSS, we tested it on four real-world benchmark problems: the breast cancer, diabetes, glass, and iris problems. The experimental results show that CFSS works well on real-world classification problems and produces a compact NN architecture.

The remainder of this paper is organized as follows. Section 2 describes CFSS in detail. Section 3 presents the results of the experimental analysis. A short discussion and conclusions are given in Section 4 and Section 5, respectively.

2 The Proposed Method

In feature selection, wrapper approaches always suffer from two major problems: high computational cost [1] and instability due to randomly changing weights [9]. In order to remedy mainly the first one, we emphasize a compact network architecture: if the network becomes large, expensive computation takes place during the process and the performance of the network is degraded. Therefore, we implemented a constructive algorithm in CFSS, and the combination of the three techniques, constructive learning, contribution measurement, and backward elimination (BE), makes CFSS complete and efficient. Before describing the actual algorithm, we define some parameters used in the algorithm, such as the stopping criteria and the measurement of contribution.
We use in this method an NN trained by the backpropagation learning (BPL) algorithm [11]. New criteria are adopted for stopping the training process early in order to improve the generalization ability and to reduce the training cost. The three stopping criteria used in CFSS are summarized in this section. The details of the training process and the SC are available in [12].

2.1.1 Stopping Criterion for Adding Hidden Neurons (HNs)

During the course of adding HNs, the SC is adopted as $E_{min} > E_{p\text{-}min}$. That is, when the last HN is added, the minimum training error $E_{min}$ becomes greater than the previously achieved minimum training error $E_{p\text{-}min}$.
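A minimal sketch of how this criterion could drive the constructive phase is given below. The helper train_network is a hypothetical stand-in for the BPL training of Section 2.4 (its name and signature are assumptions); the loop stops adding hidden neurons as soon as the newly achieved minimum training error no longer improves on the previous one.

```python
def grow_hidden_layer(train_network, max_hidden=50):
    """Add hidden neurons one at a time until E_min stops improving (Sec. 2.1.1).

    train_network(n_hidden) is assumed to train an NN with n_hidden hidden
    neurons by BPL and return its minimum training error E_min.
    """
    e_p_min = float("inf")            # previously achieved minimum training error
    for n_hidden in range(1, max_hidden + 1):
        e_min = train_network(n_hidden)
        if e_min > e_p_min:           # SC of Sec. 2.1.1: the last HN did not help
            return n_hidden - 1, e_p_min
        e_p_min = e_min               # keep the best error achieved so far
    return max_hidden, e_p_min
```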
2.1.2 Stopping Criterion for Backward Elimination (BE)

2.1.2.1 Calculation of Error Reduction Rate (ERR)

During the BE process, we need to calculate the ERR as follows:

$ERR = \frac{E_{min} - E'_{min}}{E_{min}}$

Here, $E'_{min}$ is the minimum training error during backward elimination.

2.1.2.2 Stopping Criterion in Double Feature Deletion

During the course of training, the irrelevant features are deleted sequentially, two at a time, in BE; the SC is adopted as $ERR \leq th$ AND $CA' \geq CA$. Here, $th$ refers to a threshold value equal to 0.05, and $CA'$ and $CA$ refer to the classification accuracies during BE and before BE, respectively.

2.1.2.3 Stopping Criterion in Single Feature Deletion

During the course of training, the irrelevant features are deleted sequentially, one at a time, in BE; the SC is adopted as $ERR \leq th$ AND $CA' \geq CA$. Here, $th$ refers to a threshold value equal to 0.08.

2.2 Measurement of Contribution

Finding the contribution of the input attributes accurately ensures the effectiveness of CFSS. Therefore, during the one-by-one removal training, we measure the minimum training error $E^i_{min}$ obtained when the ith feature is removed. The contribution of each feature is then calculated as follows. The training error difference for removing the ith feature is

$E^i_{diff} = E^i_{min} - E_{min}$,

where $E_{min}$ is the minimum training error using all features. Next, we calculate the average training error difference over all features,

$E_{avg} = \frac{1}{N} \sum_{i=1}^{N} E^i_{diff}$,

where $N$ is the number of features. The percentage contribution of the ith feature is then

$Con_i = \frac{E^i_{diff} - E_{avg}}{E_{avg}} \times 100$.

Now we can decide the rank of the features according to their contributions, the best-ranked feature being the one with $Con_{max} = \max_i(Con_i)$. The worst-ranking features are treated as irrelevant to the network, since they provide comparatively little contribution to the network.
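A short sketch of this contribution calculation is given below, assuming the per-feature minimum training errors E^i_min have already been collected from the one-by-one removal training; the formula for Con_i follows the reconstruction given above, and the function name is illustrative.

```python
import numpy as np

def feature_contributions(e_min_removed, e_min_all):
    """Rank features by contribution as in Sec. 2.2.

    e_min_removed : array of E^i_min, the minimum training error with feature i removed
    e_min_all     : E_min, the minimum training error using all features
    Returns (contributions in percent, feature indices ranked best to worst).
    """
    e_diff = np.asarray(e_min_removed) - e_min_all   # E^i_diff
    e_avg = e_diff.mean()                            # E_avg over the N features
    con = (e_diff - e_avg) / e_avg * 100.0           # percentage contribution Con_i
    ranking = np.argsort(-con)                       # highest contribution first
    return con, ranking
```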
2.3 Calculation of Network Connection

The number of connections, $C$, of the final network architecture can be calculated in terms of the number of existing input attributes ($x$), the number of existing hidden units ($h$), and the number of outputs ($o$) as

$C = (x \times h) + (h \times o) + h + o$.

2.3.1 Reduction of Network Connection (RNC)

Here we estimate how much the number of connections has been reduced by feature selection. We first calculate the total number of connections of the achieved network before and after BE, $C_{before}$ and $C_{after}$, respectively. After that, we estimate RNC as

$RNC = \frac{C_{before} - C_{after}}{C_{before}} \times 100$.

2.3.2 Increment of Network Accuracy (INA)

Due to the reduction of network connections, we estimate INA, which shows how much the network accuracy is improved. We measure the classification accuracy before and after BE, $CA_{before}$ and $CA_{after}$, respectively. After that, we estimate INA as

$INA = \frac{CA_{after} - CA_{before}}{CA_{before}} \times 100$.
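These three quantities are simple ratios; a minimal sketch is shown below, with function and argument names chosen for readability rather than taken from the paper.

```python
def connection_count(x, h, o):
    """Number of connections C = (x*h) + (h*o) + h + o (weights plus biases)."""
    return x * h + h * o + h + o

def reduction_of_network_connection(c_before, c_after):
    """RNC: percentage reduction of connections due to feature selection."""
    return (c_before - c_after) / c_before * 100.0

def increment_of_network_accuracy(ca_before, ca_after):
    """INA: percentage increase of classification accuracy after BE."""
    return (ca_after - ca_before) / ca_before * 100.0
```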
2.4 The Algorithm

In this framework, CFSS comprises three main steps, which are summarized in Fig. 1: (a) architecture determination of the NN in a constructive fashion (steps 1-2), (b) measurement of the contribution of each feature (step 3), and (c) subset generation (steps 4-8). These steps depend on each other, and the entire process is accomplished one step after another according to particular criteria. The details of each step are as follows.

Step 1) Create a minimal NN architecture. Initially it has three layers, i.e. an input layer, an output layer, and a hidden layer with only one neuron. The numbers of input and output neurons are equal to the numbers of inputs and outputs of the problem. Randomly initialize the connection weights between the input and hidden layers and between the hidden and output layers within the range [-1.0, +1.0].

Step 2) Train the network using the BPL algorithm and try to achieve a minimum training error, $E_{min}$. Then add an HN, retrain the network from the beginning, and check the SC according to subsection 2.1.1. When it is satisfied, validate the NN on the test set to calculate the classification accuracy, CA, and go to step 3. Otherwise, continue the HN selection process.
Step 3) To find the contribution of the features, we perform one-by-one removal training of the network. Successively delete the ith feature and save the individual minimum training error, $E^i_{min}$. Then the rank of all features can be obtained in accordance with subsection 2.2. Then go to step 4.
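Step 3 can be sketched as a simple loop over the features; retrain_without is a hypothetical helper that retrains the fixed architecture with one input excluded and returns its minimum training error, and the collected values feed the contribution calculation of subsection 2.2.

```python
def one_by_one_removal(retrain_without, n_features):
    """Step 3: collect E^i_min by retraining with each feature removed in turn.

    retrain_without(i) is assumed to retrain the network with feature i
    excluded and return the minimum training error E^i_min.
    """
    return [retrain_without(i) for i in range(n_features)]
```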
Fig. 1. Overall flowchart of CFSS. Here, NN, HN and SC refer to neural network, hidden neuron and stopping criterion, respectively.
Step 4) This stage is the first step of BE to generate the feature subset. Initially, we attempt to delete the two worst-ranked features at a time in order to accelerate the process. Calculate the minimum training error, $E'_{min}$, during training. Validate the existing features using the test set and calculate the classification accuracy, $CA'$. After that, if the SC mentioned in subsection 2.1.2.2 is satisfied, continue. Otherwise, go to step 5.
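A minimal sketch of this double-elimination stage, including the stopping check of subsection 2.1.2.2 as reconstructed above, is shown below; retrain_and_validate is a hypothetical helper and all names are illustrative rather than the authors' code.

```python
def double_elimination(ranked_features, retrain_and_validate,
                       e_min_all, ca_all, th=0.05):
    """Step 4: delete the two worst-ranked features at a time while the SC holds.

    ranked_features      : feature indices ordered from best to worst contribution
    retrain_and_validate : assumed helper; given the remaining features it retrains
                           the network and returns (E'_min, CA') on the test set
    e_min_all, ca_all    : E_min and CA of the network using all features
    """
    remaining = list(ranked_features)
    while len(remaining) > 2:
        candidate = remaining[:-2]                    # drop the two worst-ranked features
        e_min_be, ca_be = retrain_and_validate(candidate)
        err = (e_min_all - e_min_be) / e_min_all      # ERR from Sec. 2.1.2.1
        if err <= th and ca_be >= ca_all:             # SC of Sec. 2.1.2.2
            remaining = candidate                     # accept the deletion and continue
        else:
            break                                     # otherwise go to step 5
    return remaining
```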