Lecture Notes in Computer Science
Fig. 1. Fuzzy ART (input (a, a^c); fields F_0, F_1, F_2; weight vectors D_j, U_j; vigilance parameter ρ)

Fig. 2. Fuzzy ARTMAP (FAM): the learning ART (ART_a), the supervising ART (ART_b), and the map field F^{ab} in MF, with map field weights W^{ab} and match tracking (MT)
until the input a changes. The above processes are iterated until ART finds an F_2 node with S_J ≥ ρ. If all F_2 nodes are reset in a learning period, a new node is added to F_2. Therefore, ρ can be regarded as the fineness of the classification.

Next, we explain the behavior of FAM. As shown in Fig. 2, FAM consists of a learning ART (ART_a), a supervising ART (ART_b) and a map field (MF). In the learning period, ART_a receives a sample a ∈ [0, 1]^{n_a}, which may contain noise, and ART_b receives the corresponding recognition code b ∈ [0, 1]^{n_b}. The vigilance parameter of ART_a (i.e., ρ_a) is set to the baseline value ρ_{a0} whenever ART_a receives a new sample. If the category of ART_a is designated by F^a_2 node J, ART_a provides F^{ab} in MF with W^{ab}_J ∈ [0, 1]^{m_b}, which is the same as the map field weight vector.
302 T. Kamio et al.
If the category of ART_b is designated by F^b_2 node K, ART_b provides F^{ab} with y^b ∈ [0, 1]^{m_b}, which satisfies y^b_K = 1 and y^b_k = 0 for k ≠ K. After receiving W^{ab}_J and y^b, F^{ab} checks the mapping from ART_a to ART_b by

x^{ab} ≡ y^b ∧ W^{ab}_J,  |x^{ab}| ≥ ρ_{ab} |y^b|,   (3)
where x^{ab} is the F^{ab} activity and ρ_{ab} ∈ [0, 1] is the vigilance parameter of MF. If Eq. (3) is true, MF judges that the mapping is correct. In the case of an erroneous mapping, MF executes the match tracking (MT). MT resets F^a_2 node J by increasing ρ_a as follows:

ρ_a = S^a_J + ε,   (4)

where S^a_J is the matching degree of F^a_2 node J and ε is an arbitrary small positive value. Therefore, ART_a selects another node or generates a new node after MT. The above processes are iterated until Eq. (3) is satisfied. This means that MT can correct an erroneous mapping. When the mapping is deemed correct, AL-SLMAP [2] updates the weight vectors as follows:

D^{a(new)}_J = (1 + c_J)^{-1} A + (1 − (1 + c_J)^{-1}) D^{a(old)}_J,
U^{a(new)}_J = β_a (A ∧ D^{a(new)}_J) + (1 − β_a) U^{a(old)}_J,   (5)

D^{b(new)}_K = β_b (B ∧ D^{b(old)}_K) + (1 − β_b) D^{b(old)}_K,
U^{b(new)}_K = β_b (B ∧ U^{b(old)}_K) + (1 − β_b) U^{b(old)}_K,   (6)

W^{ab(new)}_J = β_{ab} (y^b ∧ W^{ab(old)}_J) + (1 − β_{ab}) W^{ab(old)}_J,   (7)

where c_J is the count of selections of F^a_2 node J, and β_a, β_b and β_{ab} ∈ (0, 1] are learning rates. Note that β_b must be set to 1 whenever F^b_2 node K learns for the first time. All the components of the weight vectors are set to 1 before the first update. Eqs. (5), (6), and (7) are based on AL [2], FCSR [1] and SLMAP [3], respectively. The reason why FCSR updates D^b_K and U^b_K is that FCSR can optimally learn recognition codes, which are noiseless data.

After the supervised learning is finished, ART_a can be used as a recognition system. This is because W^{ab}_j gives a mapping from ART_a to ART_b. However, if W^{ab}_j has more than one component which satisfies W^{ab}_{jk} ≥ ρ_{ab}, F^a_2 node j must be deleted before using ART_a as a recognition system.

2.2 Characteristics

First, we discuss the characteristics of the weight vectors created by AL-SLMAP (i.e., Eqs. (5)-(7)). D^a_j averages the samples classified into F^a_2 node j. U^a_j approximates the distribution of the samples; it is expressed by a hyper-rectangle in the input space. D^b_k and U^b_k immediately correspond to the k-th kind of recognition code b_k. Recognition codes b_k denote noiseless data. Although the initialized W^{ab}_j relates F^a_2 node j to all the F^b_2 nodes, after learning W^{ab}_j relates F^a_2 node j to only one F^b_2 node.
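As a concrete illustration, the AL-SLMAP updates of Eqs. (5)-(7) can be sketched in Python. This is a minimal sketch, not the authors' implementation: the fuzzy AND ∧ is taken as the component-wise minimum, vectors are plain lists, and the function names (`al_update`, `fcsr_update`, `slmap_update`) are ours.

```python
# Minimal sketch of the AL-SLMAP weight updates, Eqs. (5)-(7).
# Fuzzy AND (∧) is the component-wise minimum.

def fuzzy_and(x, y):
    return [min(xi, yi) for xi, yi in zip(x, y)]

def al_update(A, D_old, U_old, c_J, beta_a):
    """Eq. (5): average learning of D^a_J, slow learning of U^a_J."""
    w = 1.0 / (1.0 + c_J)                 # (1 + c_J)^{-1}
    D_new = [w * a + (1.0 - w) * d for a, d in zip(A, D_old)]
    AandD = fuzzy_and(A, D_new)
    U_new = [beta_a * m + (1.0 - beta_a) * u for m, u in zip(AandD, U_old)]
    return D_new, U_new

def fcsr_update(B, D_old, U_old, beta_b):
    """Eq. (6): FCSR update of D^b_K and U^b_K toward the noiseless code B."""
    D_new = [beta_b * m + (1.0 - beta_b) * d
             for m, d in zip(fuzzy_and(B, D_old), D_old)]
    U_new = [beta_b * m + (1.0 - beta_b) * u
             for m, u in zip(fuzzy_and(B, U_old), U_old)]
    return D_new, U_new

def slmap_update(y_b, W_old, beta_ab):
    """Eq. (7): slow map-field learning of W^{ab}_J."""
    return [beta_ab * m + (1.0 - beta_ab) * w
            for m, w in zip(fuzzy_and(y_b, W_old), W_old)]
```

For example, with c_J = 0 the AL rule simply copies the sample A into D^a_J, and with β_b = 1 (the first learning of F^b_2 node K, weights initialized to all ones) the FCSR rule sets D^b_K = B.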
Fuzzy ARTMAP with Explicit and Implicit Weights 303
Next, let us consider the result of Ref. [2]. Ref. [2] shows that AL-SLMAP can inhibit the category proliferation for the character recognition problem, which consists of noisy samples and noiseless recognition codes. This result must be obtained by satisfying the following conditions:

(a) Since Eq. (7) makes W^{ab}_j learn slowly, the occurrence of MT is reduced. As a result, the generation of F^a_2 nodes is suppressed.
(b) When W^{ab}_j has just related F^a_2 node j to only F^b_2 node k, most of the samples learned by F^a_2 node j correspond to b_k.
(c) Since Eq. (5) averages samples, the influence of noise is eliminated from D^a_j and U^a_j.
(d) Since D^a_j and U^a_j become a good category, unnecessary MT can hardly occur.

However, we have found two disappointing facts about AL-SLMAP. One is the fact that AL-SLMAP cannot inhibit the category proliferation for the character recognition problem in a highly noisy environment [4]. The other is the fact that AL-SLMAP is less suitable for the region classification problem than FCSR. From these facts, we have noticed three problems. The first problem is as follows: when F^a_2 node j learns a certain set of samples corresponding to the identical recognition code, the presentation order of the samples influences U^a_j. The second problem is that the elimination of noise from D^a_j and U^a_j is incomplete, because condition (b) is not always satisfied. The third problem is a potential drawback of MT [4]. To solve these problems, we change the choice strength for ART_a and modify the weight update and the match tracking using implicit weights.

3 FAM with Explicit and Implicit Weights

3.1 Choice Strength for ART_a

As the distance between a sample and a category becomes smaller and the size of the category grows larger, the choice strength given by Eq. (1) becomes larger. However, we think that Eq. (1) is unsuitable as the choice strength for ART_a. To verify our opinion, we made F^a_2 node j learn a certain set of samples corresponding to the identical recognition code. In each experiment, the samples were given in a different order. The simulation results show the following: D^a_j was always the same vector, and although U^a_j varied with the presentation order, the variation of |U^a_j| was small. From these results and the characteristics of Eq. (1) mentioned above, we have judged that U^a_j should not be used as a vector, and that the distance between a sample and a category should be estimated by not A ∧ U^a_j but A ∧ D^a_j. In this case, the bottom-up weight should be defined by not the vector U^a_j but the scalar |U^a_j|, because only |U^a_j| is needed. Therefore, we change the definition of the choice strength for ART_a as follows:

T^a_j = |A ∧ D^a_j| / (α + |U^a_j|),  (j = 1, · · · , m_a).   (8)
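The modified choice strength above can be sketched in Python as follows. This is a minimal illustration under our own naming (`choice_strength`, `choose_category`); |·| is the L1 norm and ∧ the component-wise minimum, as in the paper.

```python
# Sketch of the modified choice strength for ART_a:
#   T^a_j = |A ∧ D^a_j| / (alpha + |U^a_j|)

def choice_strength(A, D_j, U_j_size, alpha):
    """Distance is measured by A ∧ D^a_j; the bottom-up weight is the
    scalar |U^a_j| (U_j_size), not the vector U^a_j."""
    AandD = [min(a, d) for a, d in zip(A, D_j)]
    return sum(AandD) / (alpha + U_j_size)

def choose_category(A, Ds, U_sizes, alpha):
    """Return the index j of the F^a_2 node maximizing T^a_j."""
    T = [choice_strength(A, D, s, alpha) for D, s in zip(Ds, U_sizes)]
    return max(range(len(T)), key=T.__getitem__)
```

For instance, a node whose D^a_j overlaps the sample more (larger |A ∧ D^a_j|) wins over a node with the same |U^a_j| but less overlap.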
Fig. 3. F^a_2 node j of our proposed FAM

3.2 Explicit and Implicit Weights

When W^{ab}_j has just related F^a_2 node j only to F^b_2 node k, D^a_j and U^a_j should be created by the samples corresponding only to b_k. To achieve this, we propose FAM with explicit and implicit weights. As shown in Fig. 3, F^a_2 node j has a weight set (d^a_{jk}, u^a_{jk}) for each F^b_2 node k. D^a_j and U^a_j are calculated by

D^a_j = Σ_{k=1}^{m_b} g_{jk} n_{jk} d^a_{jk} / Σ_{k=1}^{m_b} (g_{jk} n_{jk}),   (9)

U^a_j = Σ_{k=1}^{m_b} g_{jk} n_{jk} u^a_{jk} / Σ_{k=1}^{m_b} (g_{jk} n_{jk}),   (10)

where n_{jk} is the count of updates of (d^a_{jk}, u^a_{jk}). All the components of d^a_{jk} are set to 1 and |u^a_{jk}| is set to 2n_a before the first update. Also, g_{jk} is given by

g_{jk} = { 1, if W^{ab}_{jk} ≥ ρ_{ab};  0, otherwise }.   (11)

Eqs. (9)-(11) show that (D^a_j, U^a_j) is defined only by the weight sets (d^a_{jk}, u^a_{jk}) with g_{jk} = 1. Therefore, we call a weight set (d^a_{jk}, u^a_{jk}) with g_{jk} = 1 an explicit weight and a weight set (d^a_{jk}, u^a_{jk}) with g_{jk} = 0 an implicit weight. If MF judges that the mapping from F^a_2 node j to F^b_2 node k is correct, (d^a_{jk}, u^a_{jk}) is updated by

d^{a(new)}_{jk} = n^{-1}_{jk} A + (1 − n^{-1}_{jk}) d^{a(old)}_{jk},   (12)

u^{a(new)}_{jk} = n^{-1}_{jk} (d^{a(new)}_{jk} ∧ A) + (1 − n^{-1}_{jk}) u^{a(old)}_{jk}.   (13)

Eqs. (12) and (13) mean that (d^a_{jk}, u^a_{jk}) is updated by the samples corresponding only to b_k. That is to say, when W^{ab}_j has just related F^a_2 node j only to F^b_2 node k, D^a_j and U^a_j are created by the samples corresponding only to b_k. When ART_a is used as a recognition system, F^a_2 node j must be deleted if W^{ab}_j has more than one component which satisfies W^{ab}_{jk} ≥ ρ_{ab}. Furthermore, all the implicit weights should be deleted from the viewpoint of resource costs.

3.3
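The explicit/implicit weight machinery of Eqs. (9)-(13) can be sketched as below. This is our own minimal sketch, not the authors' code; the data layout (per-node lists of d, u, n and a 0/1 gate list) and the assumption that n_{jk} counts the current update inclusively (so the first update copies A into d^a_{jk}) are ours.

```python
# Sketch of the explicit/implicit weights, Eqs. (9)-(13).
# Each F^a_2 node j keeps one weight set (d_jk, u_jk, n_jk) per F^b_2 node k.

def gate(W_j, rho_ab):
    """Eq. (11): g_jk = 1 iff W^{ab}_{jk} >= rho_ab (explicit weight)."""
    return [1 if w >= rho_ab else 0 for w in W_j]

def aggregate(ds, us, ns, g):
    """Eqs. (9)-(10): D^a_j, U^a_j as n-weighted averages of the
    explicit weight sets only."""
    total = sum(gk * nk for gk, nk in zip(g, ns))
    dim = len(ds[0])
    D = [sum(g[k] * ns[k] * ds[k][i] for k in range(len(ds))) / total
         for i in range(dim)]
    U = [sum(g[k] * ns[k] * us[k][i] for k in range(len(us))) / total
         for i in range(dim)]
    return D, U

def update_set(A, d_old, u_old, n_old):
    """Eqs. (12)-(13): update (d_jk, u_jk) with sample A.
    Assumes n is incremented first, so the first update gives d_new = A."""
    n = n_old + 1
    w = 1.0 / n
    d_new = [w * a + (1.0 - w) * d for a, d in zip(A, d_old)]
    dA = [min(dn, a) for dn, a in zip(d_new, A)]      # d_new ∧ A
    u_new = [w * m + (1.0 - w) * u for m, u in zip(dA, u_old)]
    return d_new, u_new, n
```

Implicit weight sets (g_{jk} = 0) are still stored and updated, but they do not contribute to (D^a_j, U^a_j) until their map field component crosses ρ_{ab}.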
Match Tracking

After the original MT resets the activated F^a_2 node by increasing ρ_a, the erroneous mapping from ART_a to ART_b is corrected. As a result, even if there are F^a_2 nodes besides the reset ones, a new node may be forcibly generated. However, there is the possibility that the erroneous mapping is corrected by the restricted MT, which resets activated F^a_2 nodes without increasing ρ_a. These facts illustrate that the original MT may needlessly generate F^a_2 nodes. To solve this problem, we propose the modified MT using implicit weights. From now on, we call the original MT "MT_up" and the restricted MT "MT_fix".

The modified MT is the combination of MT_fix and the forcible node generation. The former is used to inhibit the increment of F^a_2 nodes. The latter is used to correct erroneous mappings which cannot be solved by MT_fix. MT_fix is the same as MT_up except that it uses ρ_a = ρ_{a0} instead of Eq. (4). The forcible node generation is executed as follows. It is assumed that F^a_2 node j and F^b_2 node k are activated when the first erroneous mapping happens for the t-th input set (a, b). At this point, the implicit weight (d^a_{jk}, u^a_{jk}) is updated by Eqs. (12) and (13), and this update increases z_{jk} by 1. That is to say, z_{jk} counts the updates of the weight set (d^a_{jk}, u^a_{jk}) after it becomes an implicit weight. Next, the occurrence rate of erroneous mappings L(t) is calculated by

L(t) = { r / P_R, if t ≥ P_R;  1, otherwise },   (14)

where r is the number of input sets which give rise to erroneous mappings in the period [t − P_R + 1, t], and L(t) is evaluated every P_L input sets. If L(t) − L(t − P_L) > 0, the modified MT judges that there are erroneous mappings which cannot be solved by MT_fix. In this case, the following condition is checked for the F^a_2 nodes with only one explicit weight:

Z_j / τ ≥ χ,   (15)

where Z_j is the maximal z_{jk} of the implicit weights in F^a_2 node j, τ is the summation of z_{jk} of all the implicit weights in the F^a_2 layer, and χ ∈ (0, 1] is the standard for the forcible node generation. If F^a_2 node J has the implicit weight (d^a_{JK}, u^a_{JK}) satisfying Eq. (15), then a new F^a_2 node J′ is generated as follows:

(d^a_{J′k}, u^a_{J′k}, n_{J′k}, W^{ab}_{J′k}, z_{J′k}) = { (d^a_{JK}, u^a_{JK}, 1, 1, 0), if k = K;  (1, 2n_a, 0, 1, 0), otherwise }.   (16)

However, if the same F^a_2 node J has satisfied Eq. (15) at the last check of Eq. (15), the node J is modified instead of generating a new F^a_2 node:

(d^a_{Jk}, u^a_{Jk}, n_{Jk}, W^{ab}_{Jk}, z_{Jk}) = { (d^a_{Jk}, u^a_{Jk}, 1, 1, 0), if g_{Jk} = 1;  (1, 2n_a, 0, 1, 0), otherwise }.   (17)

This is because such an F^a_2 node J may provide the categories around it with bad influences. After completing these processes, the modified MT executes MT_fix.
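The monitoring part of the modified MT, Eqs. (14)-(15), can be sketched as follows. This is a minimal sketch under assumptions of ours: `errors` is a 0/1 history of erroneous mappings per input set, `z[j][k]` holds the implicit-update counts, and `explicit[j][k]` is the gate g_{jk}; the function names are not from the paper.

```python
# Sketch of the modified match-tracking bookkeeping, Eqs. (14)-(15).

def error_rate(errors, t, P_R):
    """Eq. (14): L(t) = r / P_R over the last P_R input sets, else 1."""
    if t < P_R:
        return 1.0
    r = sum(errors[t - P_R:t])       # erroneous mappings in the window
    return r / P_R

def forcible_generation_candidates(z, explicit, chi):
    """Eq. (15): for nodes j with exactly one explicit weight, test
    Z_j / tau >= chi, where Z_j is the maximal implicit z_jk of node j
    and tau sums z_jk over all implicit weights in the F^a_2 layer."""
    tau = sum(z[j][k]
              for j in range(len(z)) for k in range(len(z[j]))
              if not explicit[j][k])
    out = []
    for j in range(len(z)):
        if sum(explicit[j]) != 1:    # only nodes with one explicit weight
            continue
        Z_j = max((z[j][k] for k in range(len(z[j])) if not explicit[j][k]),
                  default=0)
        if tau > 0 and Z_j / tau >= chi:
            out.append(j)
    return out
```

A node returned by `forcible_generation_candidates` would then trigger Eq. (16) (or Eq. (17) on a repeated hit) before MT_fix runs.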
If L(t) − L(t − P_L) ≤ 0, the modified MT executes only MT_fix.

4 Simulation Results

Simulations have been carried out to demonstrate the effectiveness of our proposed method (PM). For the alphabet character recognition problem, PM is compared with FCSR [1] and AL-SLMAP [2]. The main difference between FCSR and AL-SLMAP is the weight update method for D^a_J, U^a_J, and W^{ab}_J. In the case of FCSR, D^a_J and U^a_J are updated by Eq. (6); however, (A, D^a_J, U^a_J, β_a) must be given to Eq. (6) instead of (B, D^b_K, U^b_K, β_b). Also, W^{ab}_J is updated by Eq. (7) with β_{ab} = 1.

Fig. 4. Alphabet characters

The original patterns of the alphabet characters are shown in Fig. 4. Each pattern is illustrated by a (7 × 7)-pixel image. The pixel values are set to 0 for white pixels and 1 for black ones. In the learning period, ART_a receives noisy patterns (i.e., sample data) a ∈ {0, 1}^{7×7} and ART_b receives the corresponding recognition codes b ∈ {0, 1}^{26}. A noisy pattern a is constructed by inverting some pixels in a randomly selected original pattern. The number of inverted pixels depends on the Hamming distance (HD). In a recognition code b, one element is set to 1 and the others are set to 0. For instance, the code b corresponding to the character "A" is [1, 0, · · · , 0]. The quantity of learning data N_L is 20000. In the test period, ART_a receives noisy patterns (i.e., test data) a. The quantity of test data N_T is 50000. We estimate each learning method by the learning time T_L (sec.), the number of generated F^a_2 nodes m_a, and the recognition rate for test data R_T.

The parameters of each learning method are as follows. In the case of FCSR, α_a = 0.1, β_a = 0.2, ρ_{a0} = 0.5, α_b = 1, β_b = 1, ρ_b = 1, β_{ab} = 1, and ρ_{ab} = 1. In the case of AL-SLMAP, α_a = 0.1, β_a = 0.2, ρ_{a0} = 0.5, α_b = 1, β_b = 1, ρ_b = 1, β_{ab} = 0.02, and ρ_{ab} = 0.75. They are the same as in Ref. [2]. In the
Fig. 5. Simulation results for the alphabet character recognition problem: (a) the learning time; (b) the number of F^a_2 nodes; (c) the recognition rate for test data
case of PM, α_a = 0.1, ρ_{a0} = 0.5, α_b = 1, β_b = 1, ρ_b = 1, β_{ab} = 0.02, ρ_{ab} = 0.75, P_R = 1000, P_L = 800, and χ = 0.08.

Fig. 5 illustrates T_L, m_a, and R_T. Figs. 5(a) and 5(b) show that T_L of PM is the largest of the three methods and that each method has the same m_a when HD = 0. This is a result predicted before executing the simulations, because PM needs the highest calculation cost per F^a_2 node. When HD becomes large, PM keeps m_a constant, but m_a of the other learning methods increases rapidly. As a result, PM can finish the learning much faster than the others when HD is large. Moreover, Figs. 5(b) and 5(c) show that PM has better m_a and R_T. As HD becomes larger, this tendency becomes stronger. Therefore, we have concluded that PM can inhibit the category proliferation and keep high recognition performance in a highly noisy environment.

5 Conclusions

AL-SLMAP is one of the useful learning methods for fuzzy ARTMAP (FAM) from the viewpoint of simplicity and performance. However, AL-SLMAP has problems in category selection, weight update, and match tracking. To solve these problems, we have proposed FAM with explicit and implicit weights. Simulation results have shown that our proposed method (PM) is better than FCSR and AL-SLMAP in terms of category proliferation and recognition performance when the sample data contains a large amount of noise. In the future, we will try to reduce the learning calculation cost of our FAM. Moreover, we have to compare PM with other learning methods and apply PM to more practical problems.

References

1. Carpenter, G.A., Grossberg, S., Markuzon, N., Reynolds, J.H., Rosen, D.B.: Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Trans. Neural Networks 3(5), 698–713 (1992)
2. Lee, J.S., Yoon, C.G., Lee, C.W.: Improvement of recognition performance for the fuzzy ARTMAP using average learning and slow learning. IEICE Trans. Fundamentals E81-A(3), 514–516 (1998)
3. Carpenter, G.A., Grossberg, S., Reynolds, J.H.: A fuzzy ARTMAP nonparametric probability estimator for nonstationary pattern recognition problems. IEEE Trans. Neural Networks 6(6), 1330–1336 (1995)
4. Kamio, T., Nomura, K., Mori, K., Fujisaka, H., Haeiwa, K.: Improvement of fuzzy ARTMAP by controlling match tracking. In: Proc. International Symposium on Nonlinear Theory and its Applications, pp. 791–794 (2006)