
4   Simulation and Experiment 

The first experiment uses a 2-D 181x217-pixel T1-weighted MR image together with its expert segmentation result, as shown in Figs. 3(a) and 3(b). In our method, the parameter μ is set to 0.2 and the iteration number n is set to only 2, because the MR image is clear and fewer iterations preserve the original gradient information. The number of update iterations in the IGSOM model is about 100 in this case. The learning rate and the standard deviation of the Gaussian function are fixed at 0.1 and 0.6, respectively. From the expert partition result, boundary pixels of the GM/WM (inner) and GM/CSF (outer) interfaces are extracted, as shown in Figs. 3(c) and 3(d), respectively. In Fig. 3(d), many sulci lose their intervals. Figs. 3(e) and 3(f) show the experimental results obtained with the SOM+LDM model [7] and the IGSOM model, respectively. Roughly, the two results are similar to each other. However, some intervals inside the sulci are over-detected by the earlier method; several of them are marked with arrows in Fig. 3(e). These defects cause extra and deeper sulci to spread over the cortex.

 

 



 


Fig. 3. Experiments on a 2-D 181x217-pixel T1-weighted MR image: (a) the raw image, (b) the 

expert segmentation image, (c) the inner boundary image, (d) the outer boundary image, (e) the 

result by the SOM+LDM model, (f) the result by the IGSOM model 


 


The experiment is then extended to 3-D 181x217x181-voxel T1-weighted MRI data. The cerebral cortex is reconstructed separately with the SOM+LDM model [7] and the IGSOM model. The GVF parameter μ and the iteration number n are the same as in the former experiment. The number of update iterations in the IGSOM model is about 200 in this experiment. The learning rate and the standard deviation of the Gaussian function are again fixed, at 0.1 and 1.0, respectively. The number of mesh network nodes is 35,851. Six views (top, bottom, left, right, front, and back) of the reconstructed 3-D cortical surface are shown in Fig. 4, where the left column shows the result of the SOM+LDM model and the right column that of the IGSOM model. Over-detection of sulci is visible in the result of the SOM+LDM model, i.e. the left column of Fig. 4. The error is measured by calculating the average nearest distance between mesh vertices and points of the segmented GM outer surface. The maximal, minimal, and mean distances are 2.35, 0.0013, and 0.42 mm for the SOM+LDM model, and 2.32, 0.0011, and 0.42 mm for the IGSOM model. Although the mean distances are about the same, the maximal and minimal distances of the SOM+LDM result are larger than those of the IGSOM result.
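For illustration, the following short sketch (not part of the original experiments) computes such nearest-distance statistics with SciPy's cKDTree; the mesh vertices and GM surface points are random placeholders rather than the actual reconstruction data.

```python
import numpy as np
from scipy.spatial import cKDTree

# Placeholder coordinates (in mm) standing in for the 35,851 reconstructed
# mesh vertices and the voxel positions of the segmented GM outer surface.
rng = np.random.default_rng(0)
mesh_vertices = rng.random((35851, 3)) * 180.0
gm_surface_points = rng.random((50000, 3)) * 180.0

# For every mesh vertex, find the distance to its nearest surface point.
tree = cKDTree(gm_surface_points)
nearest_dist, _ = tree.query(mesh_vertices, k=1)

print(f"max  distance: {nearest_dist.max():.4f} mm")
print(f"min  distance: {nearest_dist.min():.4f} mm")
print(f"mean distance: {nearest_dist.mean():.4f} mm")
```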

 

  

Fig. 4. Experiments of cortex reconstruction with 3-D T1-weighted MRI data: (a)(c)(e)(g)(i)(k) 

results by the SOM+LDM model, (b)(d)(f)(h)(j)(l) results by the IGSOM model, (a)(b) top 

view, (c)(d) bottom view, (e)(f) left view, (g)(h) right view, (i)(j) front view, (k)(l) back view 



5   Conclusions 

In this paper, the SOM model based on the image intensity gradient is applied to reconstruct the human cerebral cortex from MRI data. The method provides a good capability to deform the GM/WM surface outward to locate the GM/CSF surface, so the asymmetric intervals inside sulci can be detected automatically. The only requirement is that the intensity bias of the MRI data be corrected beforehand. The gradient vectors computed from the image intensity help the SOM model deform the GM/WM surface to a proper position on the GM/CSF surface. With this method, even the highly folded 3-D GM/CSF surface can be reconstructed. However, in the 3-D case, one disadvantage is the heavy computational cost when the image size or the number of mesh vertices is large. Improving the computational efficiency is therefore left for future work, as is a systematic evaluation of the accuracy of the reconstructed cortical surface.


 

Acknowledgments.

 The 3-D MRI data presented in this paper come from the 

McConnell Brain Imaging Centre at McGill University. 

References 

1. Miller, M.I., Hosakere, M., Barker, A.R., Priebe, C.E., Lee, N., Ratnanather, J.T., Wang, L., Gado, M., Morris, J.C., Csernansky, J.G.: Labeled Cortical Mantle Distance Maps of the Cingulate Quantify Differences between Dementia of the Alzheimer Type and Healthy Aging. Proc. Natl. Acad. Sci. 100, 15172–15177 (2003)
2. Chung, M.K., Robbins, S.M., Dalton, K.M., Davidson, R.J., Alexander, A.L., Evans, A.C.: Cortical Thickness Analysis in Autism with Heat Kernel Smoothing. NeuroImage 25, 1256–1265 (2005)
3. MacDonald, D., Kabani, N., Avis, D., Evans, A.C.: Automated 3-D Extraction of Inner and Outer Surfaces of Cerebral Cortex from MRI. NeuroImage 12, 340–356 (2000)
4. Barta, P., Miller, M.I., Qiu, A.: A Stochastic Model for Studying the Laminar Structure of Cortex from MRI. IEEE Trans. Med. Imaging 24, 728–742 (2005)
5. Xu, C., Pham, D.L., Rettmann, M.E., Yu, D.N., Prince, J.L.: Reconstruction of the Human Cerebral Cortex from Magnetic Resonance Images. IEEE Trans. Med. Imaging 18, 467–480 (1999)
6. Xu, C., Prince, J.L.: Snakes, Shapes, and Gradient Vector Flow. IEEE Trans. Image Processing 7, 359–369 (1998)
7. Chuang, C.H., Cheng, P.E., Liou, M., Liou, C.Y., Kuo, Y.T.: Application of Self-Organizing Map (SOM) for Cerebral Cortex Reconstruction. International Journal of Computational Intelligence Research 3, 26–30 (2007)
8. Kohonen, T.: Self-Organizing Maps, 3rd edn. Springer, Berlin (2001)
9. Liou, C.Y., Tai, W.P.: Conformal Self-Organization for Continuity on a Feature Map. Neural Networks 12, 893–905 (1999)
10. Liou, C.Y., Tai, W.P.: Conformality in the Self-Organization Network. Artificial Intelligence 116, 265–286 (2000)
11. Su, H.-R., Liou, M., Cheng, P.E., Aston, J.A.D., Chuang, C.H.: MR Image Segmentation Using Wavelet Analysis Techniques. NeuroImage 26(1) (2005), Human Brain Mapping 2005 Abstract






Feature Subset Selection Using Constructive Neural Nets with Minimal Computation by Measuring Contribution

Md. Monirul Kabir 1, Md. Shahjahan 3, and Kazuyuki Murase 1,2

1 Department of Human and Artificial Intelligence Systems, Graduate School of Engineering
2 Research and Education Program for Life Science
University of Fukui, Bunkyo 3-9-1, Fukui 910-8507, Japan
kabir@synapse.his.fukui-u.ac.jp, murase@synapse.his.fukui-u.ac.jp
3 Department of Electrical and Electronic Engineering, Khulna University of Engineering and Technology, Khulna 9203, Bangladesh
jahan@mail.kuet.ac.bd



Abstract. In this paper we propose a new approach to selecting a feature subset based on the contribution of input attributes in a three-layered feedforward neural network (NN). Three techniques are integrated in this method: constructive training, contribution measurement, and backward elimination. Initially, the number of hidden neurons is determined by a constructive approach to obtain a minimal NN architecture. After that, input attributes are removed one by one to compute their contributions. Finally, a sequential backward elimination is used to generate a relevant feature subset from the original input space; the elimination process continues according to a stopping criterion. To evaluate the proposed method, we applied it to four real-world benchmark problems. Experimental results confirm that the proposed method significantly reduces the number of irrelevant features without degrading network performance and generates feature subsets with good generalization ability.

Keywords: Feature subset selection, neural network, classification, contribution.

1   Introduction 

Feature subset selection (FSS) from datasets with a huge number of features is a crucial task for the success of a neural network (NN). Selecting an effective model and then obtaining more descriptive data is essential to achieve good classification performance with an NN.

Since the wrapper model generally outperforms the filter model [1][2], numerous attempts have been made to solve the FSS problem with wrapper approaches based on different criteria, e.g. [3]-[9]. An extensive discussion of the wrapper model is given by Kohavi [3]. This approach is capable of producing a minimal set of input variables, but its cost grows exponentially in the presence of many irrelevant variables. A better solution for FSS using an NN has been reported by Setiono [4], but it fails to reduce the computational cost because it uses a large, fixed number of hidden neurons (HNs). The concept of contribution is used for FSS by Linda [5] and S. Guan [6]. Linda computes the contribution of each attribute (feature) from the weights. In contrast, S. Guan trains each input attribute separately in a separate constructive NN model, which again fails to reduce the computational cost. Two other solutions for FSS can be found in [7][8].

According to the above discussion, none of these methods except [5] and [6] performs FSS by measuring contribution in either a backward search model or a constructive approach. Since backward search generally performs better than forward search [2], in this paper we introduce a method called contribution-based feature subset selection (CFSS) that combines three techniques: constructive training, contribution measurement, and backward elimination.

It is known that the constructive approach can achieve both a compact architecture and better generalization [10]. We therefore first use a constructive algorithm, starting with a minimal network and adding new hidden-layer neurons during training. After that, input attributes are removed one by one to find the contribution of each individual input. In the latter part, a heuristic search, namely sequential backward elimination (BE), is applied using four new criteria: a) double elimination, b) backtracking, c) single elimination, and d) subset validation. Incorporating these four steps in BE makes the subset generation process faster and more accurate. To evaluate CFSS, we tested it on four real-world benchmark problems: the breast cancer, diabetes, glass, and iris problems. The experimental results show that CFSS works well on real-world classification problems and produces a compact NN architecture.

The remainder of this paper is organized as follows. Section 2 describes CFSS in detail. Section 3 presents the experimental results. A short discussion and conclusions are given in Sections 4 and 5, respectively.



2   The Proposed Method 

In feature selection, wrapper approaches suffer from two major problems: high computational cost [1] and instability due to randomly changing weights [9]. To remedy mainly the first one, we emphasize a compact network architecture: as the network becomes larger, the computation becomes expensive and the performance of the network degrades. Therefore, we implement a constructive algorithm in CFSS, and the combination of the three techniques (constructive training, contribution measurement, and backward elimination (BE)) makes CFSS complete and efficient. Before describing the actual algorithm, we define some parameters used in it, such as the stopping criteria and the measurement of contribution.

2.1   Stopping Criteria (SC)  

In this method we use an NN trained by the back-propagation learning (BPL) algorithm [11]. New criteria are adopted for stopping the training process early in order to improve the generalization ability and reduce the training cost. The three stopping criteria used in CFSS are summarized in this section; details of the training process and the SC are available in [12].



2.1.1   Stopping Criterion for Adding Hidden Neurons (HNs)
During the course of adding HNs, the SC is adopted as E_min > E_p-min. That is, when the last HN is added, the minimum training error E_min should be larger than the previously achieved minimum training error E_p-min.

2.1.2   Stopping Criterion for Backward Elimination (BE)

2.1.2.1   Calculation of Error Reduction Rate (ERR)
During the BE process, we need to calculate the ERR as follows:

    ERR = (E'_min - E_min) / E_min

Here, E'_min is the minimum training error during backward elimination.



2.1.2.2   Stopping Criterion for Double Feature Deletion
During the course of training, irrelevant features are sequentially deleted two at a time in BE. The SC is adopted as ERR <= th AND CA' >= CA. Here, th refers to a threshold value equal to 0.05, and CA' and CA refer to the classification accuracies during BE and before BE, respectively.

2.1.2.3   Stopping Criterion for Single Feature Deletion
During the course of training, irrelevant features are sequentially deleted one at a time in BE. The SC is adopted as ERR <= th AND CA' >= CA. Here, th refers to a threshold value equal to 0.08.
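A small sketch of the ERR calculation and the resulting BE stopping checks, assuming the thresholds quoted above; all function and variable names are hypothetical:

```python
def error_reduction_rate(e_min_be, e_min):
    """ERR = (E'_min - E_min) / E_min, where E'_min is the minimum training
    error during backward elimination and E_min the error with all features."""
    return (e_min_be - e_min) / e_min

def continue_elimination(e_min_be, e_min, ca_be, ca, double_deletion=True):
    """SC of subsections 2.1.2.2 / 2.1.2.3: keep deleting while the error
    barely grows (ERR <= th) and the accuracy has not dropped (CA' >= CA)."""
    th = 0.05 if double_deletion else 0.08
    return error_reduction_rate(e_min_be, e_min) <= th and ca_be >= ca

# Example: the error rises from 0.040 to 0.041 and the accuracy does not drop.
print(continue_elimination(0.041, 0.040, ca_be=96.4, ca=96.1))  # True
```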



2.2   Measurement of Contribution 

Finding the contribution of input attributes accurately ensures the effectiveness of CFSS. Therefore, during the one-by-one removal training, we measure the minimum training error E^i_min obtained when the ith feature is removed, and then calculate the contribution of each feature as follows.

The training error difference for removing the ith feature is

    E^i_diff = E^i_min - E_min

where E_min is the minimum training error using all features. The average training error difference is

    E_avg = (1/N) * sum_{i=1}^{N} E^i_diff

where N is the number of features. The percentage contribution of the ith feature is then

    Con_i = ((E^i_diff - E_avg) / E_avg) * 100

The rank of the features can now be decided according to Con_max = max(Con_i). The worst-ranked features are treated as irrelevant to the network, since they provide comparatively little contribution.
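The ranking can be computed directly from these formulas. The sketch below assumes the per-feature minimum training errors have already been collected and uses illustrative numbers rather than real training results:

```python
import numpy as np

# Minimum training error with all features, and the minimum training errors
# obtained when each of the N features is removed in turn (illustrative values).
e_min_all = 0.050
e_min_removed = np.array([0.081, 0.052, 0.074, 0.051, 0.060])

e_diff = e_min_removed - e_min_all       # E^i_diff = E^i_min - E_min
e_avg = e_diff.mean()                    # E_avg = (1/N) * sum_i E^i_diff
con = (e_diff - e_avg) / e_avg * 100.0   # percentage contribution Con_i

# A larger contribution means removing the feature hurts more, i.e. it is
# more relevant; the worst-ranked features are the candidates for removal.
rank = np.argsort(-con)
print("contributions (%):", np.round(con, 1))
print("features from most to least relevant:", rank)
print("worst-ranked feature:", rank[-1])
```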



 

Feature Subset Selection Using Constructive Neural Nets 

377 

2.3   Calculation of Network Connection 

The number of connections (C) of the final network architecture can be calculated in terms of the number of existing input attributes (x), the number of existing hidden units (h), and the number of outputs (o) by

    C = (x × h) + (h × o) + h + o



2.3.1   Reduction of Network Connections (RNC)
Here we estimate how many connections have been removed due to feature selection. We first calculate the total number of connections of the achieved network before and after BE, C_before and C_after, respectively. After that, we estimate the RNC as

    RNC = ((C_before - C_after) / C_before) × 100

2.3.2   Increment of Network Accuracy (INA)
Following the reduction of network connections, we estimate the INA, which shows how much the network accuracy has improved. We measure the classification accuracy before and after BE, CA_before and CA_after, respectively. After that, we estimate the INA as

    INA = ((CA_after - CA_before) / CA_before) × 100
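For concreteness, the following sketch evaluates C, RNC, and INA on hypothetical numbers (9 features reduced to 4, 3 hidden neurons, 2 outputs); it only illustrates the formulas above:

```python
def connections(x, h, o):
    """C = (x * h) + (h * o) + h + o  (weights plus hidden and output biases)."""
    return x * h + h * o + h + o

# Hypothetical example: 9 input features reduced to 4 after BE,
# 3 hidden neurons, 2 outputs.
c_before = connections(9, 3, 2)
c_after = connections(4, 3, 2)
rnc = (c_before - c_after) / c_before * 100.0

ca_before, ca_after = 95.2, 96.8   # classification accuracy before/after BE (%)
ina = (ca_after - ca_before) / ca_before * 100.0

print(f"connections: {c_before} -> {c_after}, RNC = {rnc:.1f}%")
print(f"accuracy: {ca_before}% -> {ca_after}%, INA = {ina:.2f}%")
```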

2.4   The Algorithm 

In this framework, CFSS comprises three main steps, which are summarized in Fig. 1: (a) determination of the NN architecture in a constructive fashion (steps 1-2), (b) measurement of the contribution of each feature (step 3), and (c) subset generation (steps 4-8). These steps depend on each other, and the entire process is carried out step by step according to particular criteria. The details of each step are as follows.



Step 1) Create a minimal NN architecture. Initially it has three layers: an input layer, an output layer, and a hidden layer with only one neuron. The numbers of input and output neurons equal the numbers of inputs and outputs of the problem. Randomly initialize the connection weights between the input and hidden layers and between the hidden and output layers within the range [-1.0, +1.0].



Step 2) Train the network using the BPL algorithm to reach a minimum training error E_min. Then add an HN, retrain the network from the beginning, and check the SC of subsection 2.1.1. When it is satisfied, validate the NN on the test set to calculate the classification accuracy CA, and go to step 3. Otherwise, continue the HN addition process.
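The constructive phase of Steps 1-2 can be outlined as follows; train_until_min_error is a placeholder standing in for full BPL training, so the sketch shows only the control flow, not the authors' implementation:

```python
import random

def train_until_min_error(n_inputs, n_hidden, n_outputs):
    """Placeholder for BPL training of a three-layer NN whose weights are
    initialized in [-1.0, +1.0]; returns the minimum training error reached.
    Here it just produces a decreasing, slightly noisy dummy value."""
    random.seed(n_hidden)                     # deterministic demo values
    return 0.5 / n_hidden + random.uniform(0.0, 0.05)

def constructive_phase(n_inputs, n_outputs, max_hidden=20):
    """Add hidden neurons one at a time until the SC of subsection 2.1.1
    (E_min > E_p-min) is satisfied; keeping the previous, smaller network
    when the criterion fires is our assumption."""
    n_hidden = 1
    e_prev_min = train_until_min_error(n_inputs, n_hidden, n_outputs)
    while n_hidden < max_hidden:
        n_hidden += 1                         # add one hidden neuron and retrain
        e_min = train_until_min_error(n_inputs, n_hidden, n_outputs)
        if e_min > e_prev_min:                # SC satisfied: stop adding HNs
            return n_hidden - 1, e_prev_min
        e_prev_min = e_min
    return n_hidden, e_prev_min

h, e_min = constructive_phase(n_inputs=9, n_outputs=2)
print(f"selected hidden neurons: {h}, minimum training error: {e_min:.4f}")
```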

Step 3) To find the contribution of the features, perform one-by-one removal training of the network: delete the ith feature in turn and save the corresponding minimum training error E^i_min. The rank of all features can then be obtained according to subsection 2.2. Go to step 4.

 



Fig. 1. Overall flowchart of CFSS. Here, NN, HN, and SC refer to neural network, hidden neuron, and stopping criterion, respectively.
 

Step 4) This is the first step of BE for generating the feature subset. Initially, we attempt to delete the two worst-ranked features at a time to accelerate the process. Calculate the minimum training error E'_min during training, validate the remaining features on the test set, and calculate the classification accuracy CA'. After that, if the SC of subsection 2.1.2.2 is satisfied, continue; otherwise, go to step 5.
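A structural sketch of this double-deletion loop, assuming a placeholder retrain function that returns the minimum training error and test accuracy for a given feature subset (not the paper's code):

```python
def backward_elimination_double(features_by_rank, e_min, ca, retrain, th=0.05):
    """Delete the two worst-ranked features per pass while ERR <= th and the
    accuracy does not drop (SC of subsection 2.1.2.2).  `features_by_rank`
    lists feature indices from most to least relevant."""
    remaining = list(features_by_rank)
    while len(remaining) > 2:
        candidate = remaining[:-2]                 # drop the two worst features
        e_min_be, ca_be = retrain(candidate)       # E'_min and CA' for the subset
        err = (e_min_be - e_min) / e_min           # error reduction rate
        if err <= th and ca_be >= ca:
            remaining = candidate                  # accept the double deletion
        else:
            break                                  # SC violated: go to Step 5
    return remaining

# Toy usage with a fake retrain whose error grows slightly per removed feature.
fake = lambda subset: (0.04 + 0.0008 * (5 - len(subset)), 95.0 + 0.1 * (5 - len(subset)))
print(backward_elimination_double([0, 2, 4, 1, 3], e_min=0.04, ca=95.0, retrain=fake))
# -> [0, 2, 4]
```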

