C++ Neural Networks and Fuzzy Logic



Table 5.1 Conditions on Weights

Input    Activation    Output    Needed Condition
0, 0     0             0         0 < θ
1, 0     w1            1         w1 > θ
0, 1     w2            1         w2 > θ
1, 1     w1 + w2       0         w1 + w2 < θ

From the first three conditions, you can deduce that the sum of the two weights has to be greater than θ, which has to be positive itself. Line 4 is inconsistent with lines 1, 2, and 3, since line 4 requires the sum of the two weights to be less than θ. This affirms the contention that it is not possible to compute the XOR function with a simple perceptron.

Geometrically, the reason for this failure is that the inputs (0, 1) and (1, 0), for which you want output 1, are situated diagonally opposite each other when plotted as points in the plane, as shown below in a diagram of the output (1 = T, 0 = F):

 F     T

 T     F


You can’t separate the T’s and the F’s with a straight line. This means that you cannot draw a line in the plane in such a way that neither (1, 1) → F nor (0, 0) → F is on the same side of the line as (0, 1) → T and (1, 0) → T.
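As a quick check of this argument, here is a small C++ sketch (an illustration, not code from the book; the grid and step size are arbitrary choices) that evaluates a single threshold unit on all four input pairs and searches a coarse grid of weights and thresholds for a combination that reproduces XOR. The search comes up empty, in agreement with the conditions of Table 5.1.

#include <iostream>

// Output of a single perceptron with weights w1, w2 and threshold theta:
// it fires (outputs 1) when the activation w1*x1 + w2*x2 exceeds theta.
int perceptron(double w1, double w2, double theta, int x1, int x2) {
    return (w1 * x1 + w2 * x2 > theta) ? 1 : 0;
}

int main() {
    const int x1[4]     = {0, 1, 0, 1};
    const int x2[4]     = {0, 0, 1, 1};
    const int xorOut[4] = {0, 1, 1, 0};

    bool found = false;
    // Coarse grid search over weights and threshold in [-2, 2].
    for (double w1 = -2.0; w1 <= 2.0; w1 += 0.1)
        for (double w2 = -2.0; w2 <= 2.0; w2 += 0.1)
            for (double theta = -2.0; theta <= 2.0; theta += 0.1) {
                bool ok = true;
                for (int i = 0; i < 4; ++i)
                    if (perceptron(w1, w2, theta, x1[i], x2[i]) != xorOut[i])
                        ok = false;
                if (ok) found = true;
            }

    std::cout << (found ? "some weights compute XOR"
                        : "no weights on the grid compute XOR") << std::endl;
    return 0;
}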



Linear Separability

Linearly separable means that a linear barrier or separator—a line in the plane, a plane in three-dimensional space, or a hyperplane in higher dimensions—exists, so that the set of inputs that give rise to one value for the function all lie on one side of this barrier, while on the other side lie the inputs that do not yield that value for the function. A hyperplane is a surface in a higher dimension, defined by a linear equation in much the same way that a line in the plane and a plane in three-dimensional space are defined.

To make the concept a little bit clearer, consider a problem that is similar but, let us emphasize, not the same

as the XOR problem.

Imagine a cube of 1-unit length for each of its edges, lying in the positive octant of an xyz rectangular

coordinate system with one corner at the origin. The other corners or vertices are at points with coordinates (0,

0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), and (1, 1, 1). Call the origin O, and the seven points listed

as A, B, C, D, E, F, and G, respectively. Then any two faces opposite to each other are linearly separable

because you can define the separating plane as the plane halfway between these two faces and also parallel to

these two faces.

For example, consider the faces defined by the set of points O, A, B, and C and by the set of points D, E, F,

and G. They are parallel and 1 unit apart, as you can see in Figure 5.1. The separating plane for these two

faces can be seen to be one of many possible planes—any plane in between them and parallel to them. One

example, for simplicity, is the plane that passes through the points (1/2, 0, 0), (1/2, 0, 1), (1/2, 1, 0), and (1/2,

1, 1). Of course, you need only specify three of those four points because a plane is uniquely determined by

three points that are not all on the same line. So if the first set of points corresponds to a value of, say, +1 for

the function, and the second set to a value of –1, then a single−layer Perceptron can determine, through some

training algorithm, the correct weights for the connections, even if you start with the weights being initially all

0.

Figure 5.1

  Separating plane.
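As an illustration (not code from the book), the short C++ sketch below checks that this plane separates the two faces: every vertex of the face O, A, B, C lies on the negative side of x - 1/2, and every vertex of the face D, E, F, G on the positive side.

#include <iostream>

// Side of the plane x = 1/2 on which a point lies:
// negative for the face containing O, A, B, C; positive for D, E, F, G.
double side(double x, double /*y*/, double /*z*/) {
    return x - 0.5;
}

int main() {
    // Vertices O, A, B, C (x = 0) and D, E, F, G (x = 1) of the unit cube.
    double firstFace[4][3]  = {{0,0,0}, {0,0,1}, {0,1,0}, {0,1,1}};
    double secondFace[4][3] = {{1,0,0}, {1,0,1}, {1,1,0}, {1,1,1}};

    bool separated = true;
    for (int i = 0; i < 4; ++i) {
        if (side(firstFace[i][0],  firstFace[i][1],  firstFace[i][2])  >= 0.0)
            separated = false;
        if (side(secondFace[i][0], secondFace[i][1], secondFace[i][2]) <= 0.0)
            separated = false;
    }
    std::cout << (separated ? "the plane x = 1/2 separates the two faces"
                            : "not separated") << std::endl;
    return 0;
}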

Consider the set of points O, A, F, and G. This set of points cannot be linearly separated from the other

vertices of the cube. In this case, it would be impossible for the single−layer Perceptron to determine the


proper weights for the neurons in evaluating the type of function we have been discussing.




A Second Look at the XOR Function: Multilayer Perceptron

By introducing a set of cascaded Perceptrons, you have a Perceptron network, with an input layer, middle or

hidden layer, and an output layer. You will see that the multilayer Perceptron can evaluate the XOR function

as well as other logic functions (AND, OR, MAJORITY, etc.). The absence of the separability that we talked

about earlier is overcome by having a second stage, so to speak, of connection weights.

You need two neurons in the input layer and one in the output layer. Let us put a hidden layer with two neurons. Let w11, w12, w21, and w22 be the weights on connections from the input neurons to the hidden layer neurons. Let v1 and v2 be the weights on the connections from the hidden layer neurons to the output neuron.

We will select the w’s (weights) and the threshold values θ1 and θ2 at the hidden layer neurons, so that the input (0, 0) generates the output vector (0, 0), and the input vector (1, 1) generates (1, 1), while the inputs (1, 0) and (0, 1) generate (0, 1) as the hidden layer output. The inputs to the output layer neurons would be from the set {(0, 0), (1, 1), (0, 1)}. These three vectors are separable, with (0, 0) and (1, 1) on one side of the separating line, while (0, 1) is on the other side.

We will select the v’s (weights) and τ, the threshold value at the output neuron, so as to make the inputs (0, 0) and (1, 1) cause an output of 0 for the network, and the input (0, 1) cause an output of 1. The network layout, with the labels of weights and threshold values inside the nodes representing the hidden layer and output neurons, is shown in Figure 5.1a. Table 5.2 gives the results of operation of this network.



Figure 5.1a

  Example network.



Table 5.2 Results for the Perceptron with One Hidden Layer

Input     Hidden Layer Activations    Hidden Layer Outputs    Output Neuron Activation    Output of Network
(0, 0)    (0, 0)                      (0, 0)                  0                           0
(1, 1)    (0.3, 0.6)                  (1, 1)                  0                           0
(0, 1)    (0.15, 0.3)                 (0, 1)                  0.3                         1
(1, 0)    (0.15, 0.3)                 (0, 1)                  0.3                         1

It is clear from Table 5.2 that the above perceptron with a hidden layer does compute the XOR function

successfully.

Note:  The activation should exceed the threshold value for a neuron to fire. Where the

output of a neuron is shown to be 0, it is because the internal activation of that neuron fell

short of its threshold value.
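Figure 5.1a carries the actual weights and thresholds, which are not reproduced in the text here. The C++ sketch below therefore uses one set of values consistent with Table 5.2 (an assumption on our part, not necessarily the values in the figure): w11 = w21 = 0.15, w12 = w22 = 0.3, θ1 = 0.2, θ2 = 0.1, v1 = -0.3, v2 = 0.3, and τ = 0.2. Each neuron fires only when its activation exceeds its threshold, as stated in the note above.

#include <iostream>

// One consistent choice of weights and thresholds (see the text above);
// the book's own values appear in Figure 5.1a.
const double w[2][2]   = {{0.15, 0.3},   // weights from input 1 to hidden 1, 2
                          {0.15, 0.3}};  // weights from input 2 to hidden 1, 2
const double thetaH[2] = {0.2, 0.1};     // hidden-layer thresholds
const double v[2]      = {-0.3, 0.3};    // weights from hidden 1, 2 to output
const double tau       = 0.2;            // output-neuron threshold

// A neuron fires (outputs 1) only if its activation exceeds its threshold.
int fire(double activation, double threshold) {
    return (activation > threshold) ? 1 : 0;
}

int network(int x1, int x2) {
    int h[2];
    for (int j = 0; j < 2; ++j)
        h[j] = fire(w[0][j] * x1 + w[1][j] * x2, thetaH[j]);
    return fire(v[0] * h[0] + v[1] * h[1], tau);
}

int main() {
    for (int x1 = 0; x1 <= 1; ++x1)
        for (int x2 = 0; x2 <= 1; ++x2)
            std::cout << "(" << x1 << ", " << x2 << ") -> "
                      << network(x1, x2) << std::endl;   // reproduces XOR
    return 0;
}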



Example of the Cube Revisited

Let us return to the example of the cube with vertices at the origin O, and the points labeled A, B, C, D, E, F,

and G. Suppose the set of vertices O, A, F, and G give a value of 1 for the function to be evaluated, and the

other vertices give a –1. The two sets are not linearly separable as mentioned before. A simple Perceptron

cannot evaluate this function.

Can the addition of another layer of neurons help? The answer is yes. What would be the role of this

additional layer? The answer is that it will do the final processing for the problem after the previous layer has

done some preprocessing. This preprocessing amounts to two separations, in the sense that the set of eight vertices can be separated—or partitioned—into three separable subsets. If this partitioning also collects within each subset like vertices, meaning those that map onto the same value for the function, the network will succeed in its task of evaluating the function when the aggregation and thresholding are done at the output neuron.

Strategy

So the strategy is first to consider the set of vertices that give a value of +1 for the function and determine the

minimum number of subsets that can be identified to be each separable from the rest of the vertices. It is

evident that since the vertices O and A lie on one edge of the cube, they can form one subset that is separable.

The other two vertices, viz., F and G, which correspond to the value +1 for the function, can form a second subset that is separable, too. We need not bother with further partitioning the last four vertices, which form the remaining subset. It is clear that one new layer of three neurons, one of which fires for the inputs corresponding to the vertices O and A, one for F and G, and the third for the rest, will then facilitate the

correct evaluation of the function at the output neuron.




Details

Table 5.3 lists the vertices and their coordinates, together with a flag that indicates to which subset in the

partitioning the vertex belongs. Note that you can think of the action of the Multilayer Perceptron as that of

evaluating the intersection and union of linearly separable subsets.



Table 5.3 Partitioning of Vertices of a Cube

Vertex    Coordinates    Subset
O         (0, 0, 0)      1
A         (0, 0, 1)      1
B         (0, 1, 0)      2
C         (0, 1, 1)      2
D         (1, 0, 0)      2
E         (1, 0, 1)      2
F         (1, 1, 0)      3
G         (1, 1, 1)      3

The network, which is a two−layer Perceptron, meaning two layers of weights, has three neurons in the first

layer and one output neuron in the second layer. Remember that we are counting those layers in which the

neurons do the aggregation of the signals coming into them using the connection weights. The first layer with

the three neurons is what is generally described as the hidden layer, since the second layer is not hidden and is

at the extreme right in the layout of the neural network. Table 5.4 gives an example of the weights you can use

for the connections between the input neurons and the hidden layer neurons. There are three input neurons,

one for each coordinate of the vertex of the cube.



Table 5.4 Weights for Connections Between Input Neurons and Hidden Layer Neurons.

Input Neuron #    Hidden Layer Neuron #    Connection Weight
1                 1                        1
1                 2                        0.1
1                 3                        -1
2                 1                        1
2                 2                        -1
2                 3                        -1
3                 1                        0.2
3                 2                        0.3
3                 3                        0.6


Now we give, in Table 5.5, the weights for the connections between the three hidden−layer neurons and the

output neuron.



Table 5.5 Weights for Connections Between the Hidden-Layer Neurons and the Output Neuron

Hidden Layer Neuron #    Connection Weight
1                        0.6
2                        0.3
3                        0.6

It is not apparent whether or not these weights will do the job. To determine the activations of the

hidden−layer neurons, you need these weights, and you also need the threshold value at each neuron that does

processing. A hidden−layer neuron will fire, that is, will output a 1, if the weighted sum of the signals it

receives is greater than the threshold value. If the output neuron fires, the function value is taken as +1, and if

it does not fire, the function value is –1. Table 5.6 gives the threshold values. Figure 5.1b shows the neural

network with connection weights and threshold values.

Figure 5.1b

  Neural Network for Cube Example



Table 5.6 Threshold Values

Layer     Neuron    Threshold Value
hidden    1         1.8
hidden    2         0.05
hidden    3         -0.2
output    1         0.5




Performance of the Perceptron

When you input the coordinates of the vertex G, which has 1 for each coordinate, the first hidden−layer

neuron aggregates these inputs and gets a value of 2.2. Since 2.2 is more than the threshold value of the first

neuron in the hidden layer, that neuron fires, and its output of 1 becomes an input to the output neuron on the

connection with weight 0.6. But you need the activations of the other hidden−layer neurons as well. Let us

describe the performance with coordinates of G as the inputs to the network. Table 5.7 describes this.



Table 5.7 Results with Coordinates of Vertex G as Input

Vertex/Coordinates    Hidden Layer Neuron #    Weighted Sum    Comment    Activation    Contribution to Output    Sum
G: 1, 1, 1            1                        2.2             > 1.8      1             0.6
                      2                        -0.8            < 0.05     0             0
                      3                        -1.4            < -0.2     0             0                         0.6

The weighted sum at the output neuron is 0.6, and it is greater than the threshold value 0.5. Therefore, the

output neuron fires, and at the vertex G, the function is evaluated to have a value of +1.

Table 5.8 shows the performance of the network with the rest of the vertices of the cube. You will notice that

the network computes a value of +1 at the vertices O, A, F, and G, and a –1 at the rest.



Table 5.8 Results with Other Inputs

Vertex/Coordinates    Hidden Layer Neuron #    Weighted Sum    Comment    Activation    Contribution to Output    Sum
O: 0, 0, 0            1                        0               < 1.8      0             0
                      2                        0               < 0.05     0             0
                      3                        0               > -0.2     1             0.6                       0.6 *
A: 0, 0, 1            1                        0.2             < 1.8      0             0
                      2                        0.3             > 0.05     1             0.3
                      3                        0.6             > -0.2     1             0.6                       0.9 *
B: 0, 1, 0            1                        1               < 1.8      0             0
                      2                        -1              < 0.05     0             0
                      3                        -1              < -0.2     0             0                         0
C: 0, 1, 1            1                        1.2             < 1.8      0             0
                      2                        0.2             > 0.05     1             0.3
                      3                        -0.4            < -0.2     0             0                         0.3
D: 1, 0, 0            1                        1               < 1.8      0             0
                      2                        0.1             > 0.05     1             0.3
                      3                        -1              < -0.2     0             0                         0.3
E: 1, 0, 1            1                        1.2             < 1.8      0             0
                      2                        0.4             > 0.05     1             0.3
                      3                        -0.4            < -0.2     0             0                         0.3
F: 1, 1, 0            1                        2               > 1.8      1             0.6
                      2                        -0.9            < 0.05     0             0
                      3                        -2              < -0.2     0             0                         0.6 *

* The output neuron fires, as this value is greater than 0.5 (the threshold value); the function value is +1.
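The following C++ sketch (an illustration of the tables above, not code from the book) wires up the weights of Tables 5.4 and 5.5 and the thresholds of Table 5.6 and evaluates all eight vertices. It reports +1 at O, A, F, and G and -1 at the rest; a couple of the intermediate hidden-layer sums it computes differ slightly from the entries printed in Tables 5.7 and 5.8, but the firing pattern and the final function values agree.

#include <iostream>

// Weights from the three input neurons to the three hidden-layer neurons
// (Table 5.4): wHidden[i][j] connects input i+1 to hidden neuron j+1.
const double wHidden[3][3] = {{1.0,  0.1, -1.0},
                              {1.0, -1.0, -1.0},
                              {0.2,  0.3,  0.6}};
// Weights from the hidden-layer neurons to the output neuron (Table 5.5).
const double wOut[3] = {0.6, 0.3, 0.6};
// Threshold values (Table 5.6).
const double thetaHidden[3] = {1.8, 0.05, -0.2};
const double thetaOut = 0.5;

// Function value at a vertex: +1 if the output neuron fires, -1 otherwise.
int evaluate(const int x[3]) {
    double sum = 0.0;
    for (int j = 0; j < 3; ++j) {
        double act = 0.0;
        for (int i = 0; i < 3; ++i)
            act += wHidden[i][j] * x[i];
        if (act > thetaHidden[j])        // hidden neuron j+1 fires
            sum += wOut[j];
    }
    return (sum > thetaOut) ? 1 : -1;
}

int main() {
    const char* names = "OABCDEFG";
    const int vertices[8][3] = {{0,0,0}, {0,0,1}, {0,1,0}, {0,1,1},
                                {1,0,0}, {1,0,1}, {1,1,0}, {1,1,1}};
    for (int k = 0; k < 8; ++k)
        std::cout << names[k] << ": " << evaluate(vertices[k]) << std::endl;
    return 0;
}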



Other Two−layer Networks

Many important neural network models have two layers. The Feedforward backpropagation network, in its

simplest form, is one example. Grossberg and Carpenter’s ART1 paradigm uses a two−layer network. The

Counterpropagation network has a Kohonen layer followed by a Grossberg layer. Bidirectional Associative Memory (BAM), the Boltzmann Machine, Fuzzy Associative Memory, and Temporal Associative Memory are

other two−layer networks. For autoassociation, a single−layer network could do the job, but for

heteroassociation or other such mappings, you need at least a two−layer network. We will give more details

on these models shortly.



Many Layer Networks

Kunihiko Fukushima’s Neocognitron, noted for identifying handwritten characters, is an example of a

network with several layers. Some previously mentioned networks can also be made multilayered by the addition

of more hidden layers. It is also possible to combine two or more neural networks into one network by

creating appropriate connections between layers of one subnetwork to those of the others. This would

certainly create a multilayer network.



Connections Between Layers

You have already seen some difference in the way connections are made between neurons in a neural

network. In the Hopfield network, every neuron was connected to every other in the one layer that was present

in the network. In the Perceptron, neurons within the same layer were not connected with one another, but the

connections were between the neurons in one layer and those in the next layer. In the former case, the

connections are described as being lateral. In the latter case, the connections are forward and the signals are

fed forward within the network.

Two other possibilities also exist. All the neurons in any layer may have extra connections, with each neuron

connected to itself. The second possibility is that there are connections from the neurons in one layer to the

neurons in a previous layer, in which case there is both forward and backward signal feeding. This occurs if feedback is a feature of the network model. The type of layout for the network neurons and the type of

connections between the neurons constitute the architecture of the particular model of the neural network.




Instar and Outstar

Outstar and instar are terms defined by Stephen Grossberg for ways of looking at neurons in a network. A

neuron in a web of other neurons receives a large number of inputs from outside the neuron’s boundaries. This

is like an inwardly radiating star, hence, the term instar. Also, a neuron may be sending its output to many

other destinations in the network. In this way it is acting as an outstar. Every neuron is thus simultaneously

both an instar and an outstar. As an instar it receives stimuli from other parts of the network or from outside

the network. Note that the neurons in the input layer of a network primarily have connections away from them

to the neurons in the next layer, and thus behave mostly as outstars. Neurons in the output layer have many

connections coming to them and thus behave mostly as instars. A neural network performs its work through the

constant interaction of instars and outstars.

A layer of instars can constitute a competitive layer in a network. An outstar can also be described as a source

node with some associated sink nodes that the source feeds to. Grossberg identifies the source input with a

conditioned stimulus and the sink inputs with unconditioned stimuli. Robert Hecht−Nielsen’s

Counterpropagation network is a model built with instars and outstars.

Weights on Connections

Weight assignments on connections between neurons not only indicate the strength of the signal that is being

fed for aggregation but also the type of interaction between the two neurons. The type of interaction is one of

cooperation or of competition. The cooperative type is suggested by a positive weight, and the competition by

a negative weight, on the connection. The positive weight connection is meant for what is called excitation,

while the negative weight connection is termed an inhibition.



Initialization of Weights

Initializing the network weight structure is part of what is called the encoding phase of a network operation.

The encoding algorithms are several, differing by model and by application. You may have gotten the

impression that the weight matrices used in the examples discussed in detail thus far have been arbitrarily

determined; or if there is a method of setting them up, you are not told what it is.

It is possible to start with randomly chosen values for the weights and to let the weights be adjusted

appropriately as the network is run through successive iterations. This also makes the initial setup easier. For example,

under supervised training, if the error between the desired and computed output is used as a criterion in

adjusting weights, then one may as well set the initial weights to zero and let the training process take care of

the rest. The small example that follows illustrates this point.



A Small Example

Suppose you have a network with two input neurons and one output neuron, with forward connections

between the input neurons and the output neuron, as shown in Figure 5.2. The network is required to output a



1 for the input patterns (1, 0) and (1, 1), and the value 0 for (0, 1) and (0, 0). There are only two connection weights w1 and w2.

Figure 5.2

  Neural network with forward connections.

Let us set initially both weights to 0, but you need a threshold function also. Let us use the following

threshold function, which is slightly different from the one used in a previous example:

         1 if x > 0
f(x) = {
         0 if x ≤ 0

The reason for modifying this function is that if f(x) has value 1 when x = 0, then no matter what the weights

are, the output will work out to 1 with input (0, 0). This makes it impossible to get a correct computation of

any function that takes the value 0 for the arguments (0, 0).

Now we need to know by what procedure we adjust the weights. The procedure we would apply for this

example is as follows.



  If the output with input pattern (a, b) is as desired, then do not adjust the weights.

  If the output with input pattern (a, b) is smaller than what it should be, then increment each of w1 and w2 by 1.

  If the output with input pattern (a, b) is greater than what it should be, then subtract 1 from w1 if the product aw1 is smaller than 1, and adjust w2 similarly.

Table 5.9 shows what takes place when we follow these procedures, and at what values the weights settle.
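A minimal C++ sketch of this training loop follows; it is an illustration, not the book's program. The decrement step is implemented with the condition a*w1 >= 1 (and likewise b*w2 >= 1), that is, a weight is lowered only when its input is active and the weight is at least 1; this is an interpretation of the rule above rather than a literal transcription. With the patterns presented in the order (1, 0), (1, 1), (0, 1), (0, 0), the weights settle at w1 = 1 and w2 = 0, which produce the required outputs; the intermediate values can be compared with Table 5.9.

#include <iostream>

// Threshold function from the text: f(x) = 1 if x > 0, and 0 if x <= 0.
int f(double x) {
    return (x > 0.0) ? 1 : 0;
}

int main() {
    // Training patterns (a, b) and their desired outputs.
    const int a[4]       = {1, 1, 0, 0};
    const int b[4]       = {0, 1, 1, 0};
    const int desired[4] = {1, 1, 0, 0};

    double w1 = 0.0, w2 = 0.0;           // initial weights are both 0

    for (int epoch = 0; epoch < 10; ++epoch) {
        bool changed = false;
        for (int i = 0; i < 4; ++i) {
            int out = f(a[i] * w1 + b[i] * w2);
            if (out < desired[i]) {       // output too small: raise both weights
                w1 += 1.0;  w2 += 1.0;  changed = true;
            } else if (out > desired[i]) {
                // Output too large: lower a weight; the condition used here
                // (input times weight at least 1) is one reading of the rule
                // in the text, not a literal transcription.
                if (a[i] * w1 >= 1.0) { w1 -= 1.0; changed = true; }
                if (b[i] * w2 >= 1.0) { w2 -= 1.0; changed = true; }
            }
            std::cout << "pattern (" << a[i] << ", " << b[i] << "): "
                      << "w1 = " << w1 << ", w2 = " << w2 << std::endl;
        }
        if (!changed) break;              // weights have settled
    }
    return 0;
}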

