
4   Simulation and Experiment 

The first experiment uses a 2-D 181x217-pixel T1-weighted MR image together with its expert segmentation result, as shown in Figs. 3(a) and 3(b). In our method, the parameter μ is set to 0.2 and the iteration number n is set to only 2, because the MR image is clear and fewer iterations preserve the original gradient information. The number of update iterations in the IGSOM model is about 100 in this case. The learning rate and the standard deviation of the Gaussian function are fixed at 0.1 and 0.6, respectively. From the expert partition result, boundary pixels of the GM/WM (inner) and GM/CSF (outer) interfaces are extracted, as shown in Figs. 3(c) and 3(d), respectively. In Fig. 3(d), many sulci lose their intervals. Figs. 3(e) and 3(f) show the experimental results obtained with the SOM+LDM model [7] and the IGSOM model, respectively. Roughly, the two results are similar to each other. However, some intervals inside the sulci are over-detected by the earlier method; several of them are marked with arrows in Fig. 3(e). These defects cause extra and deeper sulci to spread over the cortex.

 

 



 


Fig. 3. Experiments on a 2-D 181x217-pixel T1-weighted MR image: (a) the raw image, (b) the 

expert segmentation image, (c) the inner boundary image, (d) the outer boundary image, (e) the 

result by the SOM+LDM model, (f) the result by the IGSOM model 


 


The experiment is then extended to 3-D 181x217x181-voxel T1-weighted MRI data. The cerebral cortex is reconstructed separately with the SOM+LDM model [7] and the IGSOM model. The GVF parameter μ and the iteration number n are the same as in the former experiment. The number of update iterations in the IGSOM model is about 200 in this experiment. The learning rate and the standard deviation of the Gaussian function are again fixed, at 0.1 and 1.0, respectively. The number of mesh network nodes is 35,851. Six views (top, bottom, left, right, front, and back) of the reconstructed 3-D cortical surface are shown in Fig. 4, where the left column shows the result of the SOM+LDM model and the right column that of the IGSOM model. Over-detection of sulci is visible in the result of the SOM+LDM model, i.e. the left column of Fig. 4. The error is measured by calculating the average nearest distance between mesh vertices and points of the segmented GM outer surface. The maximal, minimal, and mean distances are 2.35, 0.0013, and 0.42 mm for the SOM+LDM model, and 2.32, 0.0011, and 0.42 mm for the IGSOM model. Although the mean distances are about the same, the maximal and minimal distances of the SOM+LDM result are larger than those of the IGSOM result.
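For illustration, the following short sketch (not part of the original experiments) computes such nearest-distance statistics with SciPy's cKDTree; the mesh vertices and GM surface points are random placeholders rather than the actual reconstruction data.

```python
import numpy as np
from scipy.spatial import cKDTree

# Placeholder coordinates (in mm) standing in for the 35,851 reconstructed
# mesh vertices and the voxel positions of the segmented GM outer surface.
rng = np.random.default_rng(0)
mesh_vertices = rng.random((35851, 3)) * 180.0
gm_surface_points = rng.random((50000, 3)) * 180.0

# For every mesh vertex, find the distance to its nearest surface point.
tree = cKDTree(gm_surface_points)
nearest_dist, _ = tree.query(mesh_vertices, k=1)

print(f"max  distance: {nearest_dist.max():.4f} mm")
print(f"min  distance: {nearest_dist.min():.4f} mm")
print(f"mean distance: {nearest_dist.mean():.4f} mm")
```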

 

  

Fig. 4. Experiments of cortex reconstruction with 3-D T1-weighted MRI data: (a)(c)(e)(g)(i)(k) 

results by the SOM+LDM model, (b)(d)(f)(h)(j)(l) results by the IGSOM model, (a)(b) top 

view, (c)(d) bottom view, (e)(f) left view, (g)(h) right view, (i)(j) front view, (k)(l) back view 



5   Conclusions 

In this paper, the SOM model based on the image intensity gradient is applied to reconstruct the human cerebral cortex from MRI data. The method provides a good capability to deform the GM/WM surface outward to locate the GM/CSF surface, so the asymmetric intervals inside sulci can be detected automatically. The only requirement is that the intensity bias of the MRI data be corrected beforehand. The gradient vectors computed from the image intensity help the SOM model deform the GM/WM surface to a proper position on the GM/CSF surface. With this method, even the highly folded 3-D GM/CSF surface can be reconstructed. However, in the 3-D case, one disadvantage is the heavy computational cost when the image size or the number of mesh vertices is large. Improving the computational efficiency is therefore left for future work, as is a systematic evaluation of the accuracy of the reconstructed cortical surface.


 

Acknowledgments.

 The 3-D MRI data presented in this paper come from the 

McConnell Brain Imaging Centre at McGill University. 

References 

1. Miller, M.I., Hosakere, M., Barker, A.R., Priebe, C.E., Lee, N., Ratnanather, J.T., Wang, L., Gado, M., Morris, J.C., Csernansky, J.G.: Labeled Cortical Mantle Distance Maps of the Cingulate Quantify Differences between Dementia of the Alzheimer Type and Healthy Aging. Proc. Natl. Acad. Sci. 100, 15172–15177 (2003)
2. Chung, M.K., Robbins, S.M., Dalton, K.M., Davidson, R.J., Alexander, A.L., Evans, A.C.: Cortical Thickness Analysis in Autism with Heat Kernel Smoothing. NeuroImage 25, 1256–1265 (2005)
3. MacDonald, D., Kabani, N., Avis, D., Evans, A.C.: Automated 3-D Extraction of Inner and Outer Surfaces of Cerebral Cortex from MRI. NeuroImage 12, 340–356 (2000)
4. Barta, P., Miller, M.I., Qiu, A.: A Stochastic Model for Studying the Laminar Structure of Cortex from MRI. IEEE Trans. Med. Imaging 24, 728–742 (2005)
5. Xu, C., Pham, D.L., Rettmann, M.E., Yu, D.N., Prince, J.L.: Reconstruction of the Human Cerebral Cortex from Magnetic Resonance Images. IEEE Trans. Med. Imaging 18, 467–480 (1999)
6. Xu, C., Prince, J.L.: Snakes, Shapes, and Gradient Vector Flow. IEEE Trans. Image Processing 7, 359–369 (1998)
7. Chuang, C.H., Cheng, P.E., Liou, M., Liou, C.Y., Kuo, Y.T.: Application of Self-Organizing Map (SOM) for Cerebral Cortex Reconstruction. International Journal of Computational Intelligence Research 3, 26–30 (2007)
8. Kohonen, T.: Self-Organizing Maps, 3rd edn. Springer, Berlin (2001)
9. Liou, C.Y., Tai, W.P.: Conformal Self-Organization for Continuity on a Feature Map. Neural Networks 12, 893–905 (1999)
10. Liou, C.Y., Tai, W.P.: Conformality in the Self-Organization Network. Artificial Intelligence 116, 265–286 (2000)
11. Su, H.-R., Liou, M., Cheng, P.E., Aston, J.A.D., Chuang, C.H.: MR Image Segmentation Using Wavelet Analysis Techniques. NeuroImage 26(1) (2005), Human Brain Mapping 2005 Abstract






Feature Subset Selection Using Constructive Neural Nets with Minimal Computation by Measuring Contribution

Md. Monirul Kabir 1, Md. Shahjahan 3, and Kazuyuki Murase 1,2

1 Department of Human and Artificial Intelligence Systems, Graduate School of Engineering
2 Research and Education Program for Life Science
University of Fukui, Bunkyo 3-9-1, Fukui 910-8507, Japan
kabir@synapse.his.fukui-u.ac.jp, murase@synapse.his.fukui-u.ac.jp
3 Department of Electrical and Electronic Engineering, Khulna University of Engineering and Technology, Khulna 9203, Bangladesh
jahan@mail.kuet.ac.bd



Abstract. In this paper we propose a new approach to selecting a feature subset based on the contribution of input attributes in a three-layered feedforward neural network (NN). Three techniques are integrated in this method: constructive training, contribution measurement, and backward elimination. Initially, the number of hidden neurons is determined by a constructive approach to obtain a minimal NN architecture. After that, input attributes are removed one by one to compute their contributions. Finally, a sequential backward elimination is used to generate a relevant feature subset from the original input space; the elimination process continues according to a stopping criterion. To evaluate the proposed method, we applied it to four real-world benchmark problems. Experimental results confirm that the proposed method significantly reduces the number of irrelevant features without degrading network performance and generates feature subsets with good generalization ability.

Keywords: Feature subset selection, neural network, classification, contribution.

1   Introduction 

Feature subset selection (FSS) from datasets with a huge number of features is a crucial task for the success of a neural network (NN). Selecting an effective model and then obtaining more descriptive data is essential to achieve good classification performance with an NN.

Since the wrapper model generally outperforms the filter model [1][2], numerous attempts have been made to solve the FSS problem with wrapper approaches based on different criteria, e.g. [3]-[9]. An extensive discussion of the wrapper model is given by Kohavi [3]. This approach is capable of producing a minimal set of input variables, but its cost grows exponentially in the presence of many irrelevant variables. A better solution for FSS using an NN has been reported by Setiono [4], but it fails to reduce the computational cost because it uses a large, fixed number of hidden neurons (HNs). The concept of contribution is used for FSS by Linda [5] and S. Guan [6]. Linda computes the contribution of each attribute (feature) from the weights. In contrast, S. Guan trains each input attribute separately in a separate constructive NN model, which again fails to reduce the computational cost. Two other solutions for FSS can be found in [7][8].

According to the above discussion, none of these methods except [5] and [6] performs FSS by measuring contribution in either a backward search model or a constructive approach. Since backward search generally performs better than forward search [2], in this paper we introduce a method called contribution-based feature subset selection (CFSS) that combines three techniques: constructive training, contribution measurement, and backward elimination.

It is known that the constructive approach can achieve both a compact architecture and better generalization [10]. We therefore first use a constructive algorithm, starting with a minimal network and adding new hidden-layer neurons during training. After that, input attributes are removed one by one to find the contribution of each individual input. In the latter part, a heuristic search, namely sequential backward elimination (BE), is applied using four new criteria: a) double elimination, b) backtracking, c) single elimination, and d) subset validation. Incorporating these four steps in BE makes the subset generation process faster and more accurate. To evaluate CFSS, we tested it on four real-world benchmark problems: the breast cancer, diabetes, glass, and iris problems. The experimental results show that CFSS works well on real-world classification problems and produces a compact NN architecture.

The remainder of this paper is organized as follows. Section 2 describes CFSS in detail. Section 3 presents the experimental results. A short discussion and conclusions are given in Sections 4 and 5, respectively.



2   The Proposed Method 

In feature selection, wrapper approaches suffer from two major problems: high computational cost [1] and instability due to randomly changing weights [9]. To remedy mainly the first one, we emphasize a compact network architecture: as the network becomes larger, the computation becomes expensive and the performance of the network degrades. Therefore, we implement a constructive algorithm in CFSS, and the combination of the three techniques (constructive training, contribution measurement, and backward elimination (BE)) makes CFSS complete and efficient. Before describing the actual algorithm, we define some parameters used in it, such as the stopping criteria and the measurement of contribution.

2.1   Stopping Criteria (SC)  

In this method we use an NN trained by the back-propagation learning (BPL) algorithm [11]. New criteria are adopted for stopping the training process early in order to improve the generalization ability and reduce the training cost. The three stopping criteria used in CFSS are summarized in this section; details of the training process and the SC are available in [12].



2.1.1   Stopping Criterion for Adding Hidden Neurons (HNs)
During the course of adding HNs, the SC is adopted as E_min > E_p-min. That is, when the last HN is added, the minimum training error E_min should be larger than the previously achieved minimum training error E_p-min.

2.1.2   Stopping Criterion for Backward Elimination (BE)

2.1.2.1   Calculation of Error Reduction Rate (ERR)
During the BE process, we need to calculate the ERR as follows:

    ERR = (E'_min - E_min) / E_min

Here, E'_min is the minimum training error during backward elimination.



2.1.2.2   Stopping Criterion for Double Feature Deletion
During the course of training, irrelevant features are sequentially deleted two at a time in BE. The SC is adopted as ERR <= th AND CA' >= CA. Here, th refers to a threshold value equal to 0.05, and CA' and CA refer to the classification accuracies during BE and before BE, respectively.

2.1.2.3   Stopping Criterion for Single Feature Deletion
During the course of training, irrelevant features are sequentially deleted one at a time in BE. The SC is adopted as ERR <= th AND CA' >= CA. Here, th refers to a threshold value equal to 0.08.
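A small sketch of the ERR calculation and the resulting BE stopping checks, assuming the thresholds quoted above; all function and variable names are hypothetical:

```python
def error_reduction_rate(e_min_be, e_min):
    """ERR = (E'_min - E_min) / E_min, where E'_min is the minimum training
    error during backward elimination and E_min the error with all features."""
    return (e_min_be - e_min) / e_min

def continue_elimination(e_min_be, e_min, ca_be, ca, double_deletion=True):
    """SC of subsections 2.1.2.2 / 2.1.2.3: keep deleting while the error
    barely grows (ERR <= th) and the accuracy has not dropped (CA' >= CA)."""
    th = 0.05 if double_deletion else 0.08
    return error_reduction_rate(e_min_be, e_min) <= th and ca_be >= ca

# Example: the error rises from 0.040 to 0.041 and the accuracy does not drop.
print(continue_elimination(0.041, 0.040, ca_be=96.4, ca=96.1))  # True
```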



2.2   Measurement of Contribution 

Finding the contribution of input attributes accurately ensures the effectiveness of CFSS. Therefore, during the one-by-one removal training, we measure the minimum training error E^i_min obtained when the ith feature is removed, and then calculate the contribution of each feature as follows.

The training error difference for removing the ith feature is

    E^i_diff = E^i_min - E_min

where E_min is the minimum training error using all features. The average training error difference is

    E_avg = (1/N) * sum_{i=1}^{N} E^i_diff

where N is the number of features. The percentage contribution of the ith feature is then

    Con_i = ((E^i_diff - E_avg) / E_avg) * 100

The rank of the features can now be decided according to Con_max = max(Con_i). The worst-ranked features are treated as irrelevant to the network, since they provide comparatively little contribution.
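The ranking can be computed directly from these formulas. The sketch below assumes the per-feature minimum training errors have already been collected and uses illustrative numbers rather than real training results:

```python
import numpy as np

# Minimum training error with all features, and the minimum training errors
# obtained when each of the N features is removed in turn (illustrative values).
e_min_all = 0.050
e_min_removed = np.array([0.081, 0.052, 0.074, 0.051, 0.060])

e_diff = e_min_removed - e_min_all       # E^i_diff = E^i_min - E_min
e_avg = e_diff.mean()                    # E_avg = (1/N) * sum_i E^i_diff
con = (e_diff - e_avg) / e_avg * 100.0   # percentage contribution Con_i

# A larger contribution means removing the feature hurts more, i.e. it is
# more relevant; the worst-ranked features are the candidates for removal.
rank = np.argsort(-con)
print("contributions (%):", np.round(con, 1))
print("features from most to least relevant:", rank)
print("worst-ranked feature:", rank[-1])
```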



 

Feature Subset Selection Using Constructive Neural Nets 

377 

2.3   Calculation of Network Connection 

The number of connections (C) of the final network architecture can be calculated in terms of the number of existing input attributes (x), the number of existing hidden units (h), and the number of outputs (o) by

    C = (x × h) + (h × o) + h + o



2.3.1   Reduction of Network Connections (RNC)
Here we estimate how many connections have been removed due to feature selection. We first calculate the total number of connections of the achieved network before and after BE, C_before and C_after, respectively. After that, we estimate the RNC as

    RNC = ((C_before - C_after) / C_before) × 100

2.3.2   Increment of Network Accuracy (INA)
Following the reduction of network connections, we estimate the INA, which shows how much the network accuracy has improved. We measure the classification accuracy before and after BE, CA_before and CA_after, respectively. After that, we estimate the INA as

    INA = ((CA_after - CA_before) / CA_before) × 100
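For concreteness, the following sketch evaluates C, RNC, and INA on hypothetical numbers (9 features reduced to 4, 3 hidden neurons, 2 outputs); it only illustrates the formulas above:

```python
def connections(x, h, o):
    """C = (x * h) + (h * o) + h + o  (weights plus hidden and output biases)."""
    return x * h + h * o + h + o

# Hypothetical example: 9 input features reduced to 4 after BE,
# 3 hidden neurons, 2 outputs.
c_before = connections(9, 3, 2)
c_after = connections(4, 3, 2)
rnc = (c_before - c_after) / c_before * 100.0

ca_before, ca_after = 95.2, 96.8   # classification accuracy before/after BE (%)
ina = (ca_after - ca_before) / ca_before * 100.0

print(f"connections: {c_before} -> {c_after}, RNC = {rnc:.1f}%")
print(f"accuracy: {ca_before}% -> {ca_after}%, INA = {ina:.2f}%")
```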

2.4   The Algorithm 

In this framework, CFSS comprises three main steps, which are summarized in Fig. 1: (a) determination of the NN architecture in a constructive fashion (steps 1-2), (b) measurement of the contribution of each feature (step 3), and (c) subset generation (steps 4-8). These steps depend on each other, and the entire process is carried out step by step according to particular criteria. The details of each step are as follows.



Step 1) Create a minimal NN architecture. Initially it has three layers: an input layer, an output layer, and a hidden layer with only one neuron. The numbers of input and output neurons equal the numbers of inputs and outputs of the problem. Randomly initialize the connection weights between the input and hidden layers and between the hidden and output layers within the range [-1.0, +1.0].



Step 2) Train the network using the BPL algorithm to reach a minimum training error E_min. Then add an HN, retrain the network from the beginning, and check the SC of subsection 2.1.1. When it is satisfied, validate the NN on the test set to calculate the classification accuracy CA, and go to step 3. Otherwise, continue the HN addition process.
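The constructive phase of Steps 1-2 can be outlined as follows; train_until_min_error is a placeholder standing in for full BPL training, so the sketch shows only the control flow, not the authors' implementation:

```python
import random

def train_until_min_error(n_inputs, n_hidden, n_outputs):
    """Placeholder for BPL training of a three-layer NN whose weights are
    initialized in [-1.0, +1.0]; returns the minimum training error reached.
    Here it just produces a decreasing, slightly noisy dummy value."""
    random.seed(n_hidden)                     # deterministic demo values
    return 0.5 / n_hidden + random.uniform(0.0, 0.05)

def constructive_phase(n_inputs, n_outputs, max_hidden=20):
    """Add hidden neurons one at a time until the SC of subsection 2.1.1
    (E_min > E_p-min) is satisfied; keeping the previous, smaller network
    when the criterion fires is our assumption."""
    n_hidden = 1
    e_prev_min = train_until_min_error(n_inputs, n_hidden, n_outputs)
    while n_hidden < max_hidden:
        n_hidden += 1                         # add one hidden neuron and retrain
        e_min = train_until_min_error(n_inputs, n_hidden, n_outputs)
        if e_min > e_prev_min:                # SC satisfied: stop adding HNs
            return n_hidden - 1, e_prev_min
        e_prev_min = e_min
    return n_hidden, e_prev_min

h, e_min = constructive_phase(n_inputs=9, n_outputs=2)
print(f"selected hidden neurons: {h}, minimum training error: {e_min:.4f}")
```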

Step 3) To find the contribution of the features, perform one-by-one removal training of the network: delete the ith feature in turn and save the corresponding minimum training error E^i_min. The rank of all features can then be obtained according to subsection 2.2. Go to step 4.

 



Fig. 1. Overall flowchart of CFSS. Here, NN, HN, and SC refer to neural network, hidden neuron, and stopping criterion, respectively.
 

Step 4) This is the first step of BE for generating the feature subset. Initially, we attempt to delete the two worst-ranked features at a time to accelerate the process. Calculate the minimum training error E'_min during training, validate the remaining features on the test set, and calculate the classification accuracy CA'. After that, if the SC of subsection 2.1.2.2 is satisfied, continue; otherwise, go to step 5.
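A structural sketch of this double-deletion loop, assuming a placeholder retrain function that returns the minimum training error and test accuracy for a given feature subset (not the paper's code):

```python
def backward_elimination_double(features_by_rank, e_min, ca, retrain, th=0.05):
    """Delete the two worst-ranked features per pass while ERR <= th and the
    accuracy does not drop (SC of subsection 2.1.2.2).  `features_by_rank`
    lists feature indices from most to least relevant."""
    remaining = list(features_by_rank)
    while len(remaining) > 2:
        candidate = remaining[:-2]                 # drop the two worst features
        e_min_be, ca_be = retrain(candidate)       # E'_min and CA' for the subset
        err = (e_min_be - e_min) / e_min           # error reduction rate
        if err <= th and ca_be >= ca:
            remaining = candidate                  # accept the double deletion
        else:
            break                                  # SC violated: go to Step 5
    return remaining

# Toy usage with a fake retrain whose error grows slightly per removed feature.
fake = lambda subset: (0.04 + 0.0008 * (5 - len(subset)), 95.0 + 0.1 * (5 - len(subset)))
print(backward_elimination_double([0, 2, 4, 1, 3], e_min=0.04, ca=95.0, retrain=fake))
# -> [0, 2, 4]
```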

