C++ Neural Networks and Fuzzy Logic
Table 5.1 Conditions on Weights

Input    Activation    Output    Needed Condition
0, 0     0             0         0 < θ
1, 0     w1            1         w1 > θ
0, 1     w2            1         w2 > θ
1, 1     w1 + w2       0         w1 + w2 < θ
From the first three conditions, you can deduce that the sum of the two weights has to be greater than θ, which has to be positive itself. Line 4 is inconsistent with lines 1, 2, and 3, since line 4 requires the sum of the two weights to be less than θ. This affirms the contention that it is not possible to compute the XOR function with a simple perceptron. Geometrically, the reason for this failure is that the inputs (0, 1) and (1, 0), with which you want output 1, are situated diagonally opposite each other when plotted as points in the plane, as shown below in a diagram of the output (1 = T, 0 = F):

    T    F
    F    T
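As a numerical complement to the argument from Table 5.1, here is a small sketch (not from the book) that sweeps a grid of candidate weights and thresholds and confirms that no single-layer perceptron with this step-threshold rule reproduces XOR. The grid range and step size are arbitrary choices for illustration.

#include <cstdio>

// A single-layer perceptron fires (outputs 1) when w1*x1 + w2*x2 exceeds theta.
static int perceptron(double w1, double w2, double theta, int x1, int x2) {
    return (w1 * x1 + w2 * x2 > theta) ? 1 : 0;
}

int main() {
    const int want[4][3] = {{0,0,0}, {0,1,1}, {1,0,1}, {1,1,0}};  // XOR truth table
    int found = 0;
    // Sweep w1, w2, theta over a coarse grid from -2 to 2 in steps of 0.1.
    for (int i = -20; i <= 20 && !found; ++i)
        for (int j = -20; j <= 20 && !found; ++j)
            for (int k = -20; k <= 20 && !found; ++k) {
                double w1 = i / 10.0, w2 = j / 10.0, theta = k / 10.0;
                int ok = 1;
                for (const auto& row : want)
                    if (perceptron(w1, w2, theta, row[0], row[1]) != row[2]) { ok = 0; break; }
                found = ok;
            }
    std::printf(found ? "found weights that compute XOR\n"
                      : "no (w1, w2, theta) on the grid computes XOR\n");
    return 0;
}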
You can't separate the T's and the F's with a straight line. This means that you cannot draw a line in the plane in such a way that neither (1, 1) -> F nor (0, 0) -> F is on the same side of the line as (0, 1) -> T and (1, 0) -> T.

Linear Separability

Linearly separable means that a linear barrier or separator—a line in the plane, a plane in three-dimensional space, or a hyperplane in higher dimensions—exists so that the set of inputs that give rise to one value for the function all lie on one side of this barrier, while the inputs that do not yield that value lie on the other side. A hyperplane is a surface in a higher dimension, but with a linear equation defining it, much the same way a line in the plane and a plane in three-dimensional space are defined.

To make the concept a little clearer, consider a problem that is similar but, let us emphasize, not the same as the XOR problem. Imagine a cube with edges of 1 unit length, lying in the positive octant of an xyz rectangular coordinate system with one corner at the origin. The other corners or vertices are at the points with coordinates (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), and (1, 1, 1). Call the origin O, and the seven points listed A, B, C, D, E, F, and G, respectively. Then any two faces opposite to each other are linearly separable, because you can define the separating plane as the plane halfway between these two faces and parallel to them.

For example, consider the face defined by the points O, A, B, and C and the face defined by the points D, E, F, and G. They are parallel and 1 unit apart, as you can see in Figure 5.1. The separating plane can be one of many possible planes—any plane between them and parallel to them. One example, for simplicity, is the plane that passes through the points (1/2, 0, 0), (1/2, 0, 1), (1/2, 1, 0), and (1/2, 1, 1). Of course, you need only specify three of those four points, because a plane is uniquely determined by three points that are not all on the same line. So if the first set of points corresponds to a value of, say, +1 for the function, and the second set to a value of –1, then a single-layer Perceptron can determine, through some training algorithm, the correct weights for the connections, even if you start with the weights being initially all 0.
Figure 5.1  Separating plane.

Now consider the set of points O, A, F, and G. This set of points cannot be linearly separated from the other vertices of the cube. In this case, it would be impossible for the single-layer Perceptron to determine the proper weights for the neurons in evaluating the type of function we have been discussing.
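To make the separability of the two opposite faces concrete, here is a small sketch (not from the book) that classifies each vertex of the unit cube by which side of the plane x = 1/2 it lies on; the face O, A, B, C lands on one side and D, E, F, G on the other. The vertex labels follow the assignment given above.

#include <cstdio>

// Vertices of the unit cube, labeled O, A, ..., G as in the text.
struct Vertex { const char* name; double x, y, z; };

int main() {
    const Vertex cube[8] = {
        {"O", 0, 0, 0}, {"A", 0, 0, 1}, {"B", 0, 1, 0}, {"C", 0, 1, 1},
        {"D", 1, 0, 0}, {"E", 1, 0, 1}, {"F", 1, 1, 0}, {"G", 1, 1, 1}
    };
    // The separating plane x = 1/2: every vertex with x < 1/2 lies on one side,
    // every vertex with x > 1/2 on the other.
    for (const Vertex& v : cube) {
        const char* side = (v.x < 0.5) ? "side of face O, A, B, C" : "side of face D, E, F, G";
        std::printf("%s (%g, %g, %g): %s\n", v.name, v.x, v.y, v.z, side);
    }
    return 0;
}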
A Second Look at the XOR Function: Multilayer Perceptron

By introducing a set of cascaded Perceptrons, you have a Perceptron network, with an input layer, a middle or hidden layer, and an output layer. You will see that the multilayer Perceptron can evaluate the XOR function as well as other logic functions (AND, OR, MAJORITY, etc.). The absence of the separability that we talked about earlier is overcome by having a second stage, so to speak, of connection weights.

You need two neurons in the input layer and one in the output layer. Let us put a hidden layer with two neurons. Let w11, w12, w21, and w22 be the weights on connections from the input neurons to the hidden layer neurons. Let v1, v2 be the weights on the connections from the hidden layer neurons to the output neuron.

We will select the w's (weights) and the threshold values θ1 and θ2 at the hidden layer neurons so that the input (0, 0) generates the hidden layer output (0, 0), the input (1, 1) generates (1, 1), and the inputs (1, 0) and (0, 1) generate (0, 1). The inputs to the output layer neuron would then come from the set {(0, 0), (1, 1), (0, 1)}. These three vectors are separable, with (0, 0) and (1, 1) on one side of the separating line, while (0, 1) is on the other side. We will select the v's (weights) and τ, the threshold value at the output neuron, so as to make the inputs (0, 0) and (1, 1) cause an output of 0 for the network, while an output of 1 is caused by the input (0, 1).

The network layout, with the labels of weights and threshold values inside the nodes representing hidden layer and output neurons, is shown in Figure 5.1a. Table 5.2 gives the results of operation of this network.

Figure 5.1a  Example network.

Table 5.2 Results for the Perceptron with One Hidden Layer

Input     Hidden Layer Activations    Hidden Layer Outputs    Output Neuron Activation    Output of Network
(0, 0)    (0, 0)                      (0, 0)                  0                           0
(1, 1)    (0.3, 0.6)                  (1, 1)                  0                           0
(0, 1)    (0.15, 0.3)                 (0, 1)                  0.3                         1
(1, 0)    (0.15, 0.3)                 (0, 1)                  0.3                         1

It is clear from Table 5.2 that the above perceptron with a hidden layer does compute the XOR function successfully.
Note that wherever the output of a neuron is shown to be 0, it is because the internal activation of that neuron fell short of its threshold value.
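The book leaves the exact weights and thresholds of this network to Figure 5.1a, but one consistent choice can be read off from the activations in Table 5.2. The sketch below (not from the book) uses such a choice—w11 = w21 = 0.15, w12 = w22 = 0.3, θ1 = θ2 = 0.2, v1 = -0.3, v2 = 0.3, τ = 0.15—which is my assumption, picked only so that the run reproduces the rows of Table 5.2.

#include <cstdio>

// Step threshold: fire (output 1) only when the activation exceeds the threshold.
static int fire(double activation, double threshold) {
    return activation > threshold ? 1 : 0;
}

int main() {
    // Hypothetical weights and thresholds consistent with Table 5.2 (see lead-in).
    const double w11 = 0.15, w12 = 0.3;       // input neuron 1 to hidden neurons 1, 2
    const double w21 = 0.15, w22 = 0.3;       // input neuron 2 to hidden neurons 1, 2
    const double theta1 = 0.2, theta2 = 0.2;  // hidden layer thresholds
    const double v1 = -0.3, v2 = 0.3;         // hidden to output weights
    const double tau = 0.15;                  // output neuron threshold

    const int patterns[4][2] = {{0, 0}, {1, 1}, {0, 1}, {1, 0}};
    for (const auto& p : patterns) {
        double a1 = p[0] * w11 + p[1] * w21;  // hidden neuron 1 activation
        double a2 = p[0] * w12 + p[1] * w22;  // hidden neuron 2 activation
        int h1 = fire(a1, theta1), h2 = fire(a2, theta2);
        double aout = h1 * v1 + h2 * v2;      // output neuron activation
        int out = fire(aout, tau);
        std::printf("(%d, %d) -> hidden activations (%g, %g), hidden outputs (%d, %d), "
                    "output activation %g, network output %d\n",
                    p[0], p[1], a1, a2, h1, h2, aout, out);
    }
    return 0;
}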
Example of the Cube Revisited

Let us return to the example of the cube with vertices at the origin O and the points labeled A, B, C, D, E, F, and G. Suppose the set of vertices O, A, F, and G gives a value of 1 for the function to be evaluated, and the other vertices give a –1. The two sets are not linearly separable, as mentioned before. A simple Perceptron cannot evaluate this function.

Can the addition of another layer of neurons help? The answer is yes. What would be the role of this additional layer? The answer is that it will do the final processing for the problem after the previous layer has done some preprocessing. The preprocessing can perform two separations, in the sense that the set of eight vertices can be separated—or partitioned—into three separable subsets. If this partitioning can also help collect within each subset like vertices, meaning those that map onto the same value for the function, the network will succeed in its task of evaluating the function when the aggregation and thresholding is done at the output neuron.
So the strategy is first to consider the set of vertices that give a value of +1 for the function and determine the minimum number of subsets that can be identified to be each separable from the rest of the vertices. It is evident that since the vertices O and A lie on one edge of the cube, they can form one subset that is separable. The other two vertices, viz., F and G, which correspond to the value +1 for the function, can form a second subset that is separable, too. We need not bother with the last four vertices from the point of view of further partitioning that subset. It is clear that one new layer of three neurons, one of which fires for the inputs corresponding to the vertices O and A, one for F and G, and the third for the rest, will then facilitate the correct evaluation of the function at the output neuron.
Details

Table 5.3 lists the vertices and their coordinates, together with a flag that indicates to which subset in the partitioning the vertex belongs. Note that you can think of the action of the Multilayer Perceptron as that of evaluating the intersection and union of linearly separable subsets.

Table 5.3 Partitioning of Vertices of a Cube

Vertex    Coordinates    Subset
O         (0, 0, 0)      1
A         (0, 0, 1)      1
B         (0, 1, 0)      2
C         (0, 1, 1)      2
D         (1, 0, 0)      2
E         (1, 0, 1)      2
F         (1, 1, 0)      3
G         (1, 1, 1)      3

The network, which is a two-layer Perceptron, meaning two layers of weights, has three neurons in the first layer and one output neuron in the second layer. Remember that we are counting those layers in which the neurons do the aggregation of the signals coming into them using the connection weights. The first layer with the three neurons is what is generally described as the hidden layer, since the second layer is not hidden and is at the extreme right in the layout of the neural network. Table 5.4 gives an example of the weights you can use for the connections between the input neurons and the hidden layer neurons. There are three input neurons, one for each coordinate of the vertex of the cube.

Table 5.4 Weights for Connections Between Input Neurons and Hidden Layer Neurons

Input Neuron #    Hidden Layer Neuron #    Connection Weight
1                 1                        1
1                 2                        0.1
1                 3                        -1
2                 1                        1
2                 2                        -1
2                 3                        -1
3                 1                        0.2
3                 2                        0.3
3                 3                        0.6
Now we give, in Table 5.5, the weights for the connections between the three hidden-layer neurons and the output neuron.

Table 5.5 Weights for Connections Between the Hidden-Layer Neurons and the Output Neuron

Hidden Layer Neuron #    Connection Weight
1                        0.6
2                        0.3
3                        0.6

It is not apparent whether or not these weights will do the job. To determine the activations of the hidden-layer neurons, you need these weights, and you also need the threshold value at each neuron that does processing. A hidden-layer neuron will fire, that is, will output a 1, if the weighted sum of the signals it receives is greater than the threshold value. If the output neuron fires, the function value is taken as +1, and if it does not fire, the function value is –1. Table 5.6 gives the threshold values. Figure 5.1b shows the neural network with connection weights and threshold values.
Figure 5.1b  Neural Network for Cube Example

Table 5.6 Threshold Values

Layer     Neuron    Threshold Value
hidden    1         1.8
hidden    2         0.05
hidden    3         -0.2
output    1         0.5
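Before walking through the hand calculation in the next section, here is a short sketch (not from the book) that evaluates this 3-3-1 network at all eight vertices, using the connection weights from Tables 5.4 and 5.5 and the thresholds from Table 5.6. It reports a function value of +1 at O, A, F, and G, and –1 at the rest.

#include <cstdio>

int main() {
    // Weights from input neuron i to hidden neuron j (Table 5.4), indexed w[i][j].
    const double w[3][3] = {{1.0,  0.1, -1.0},
                            {1.0, -1.0, -1.0},
                            {0.2,  0.3,  0.6}};
    const double hiddenThreshold[3] = {1.8, 0.05, -0.2};  // Table 5.6
    const double v[3] = {0.6, 0.3, 0.6};                   // hidden-to-output weights (Table 5.5)
    const double outputThreshold = 0.5;                    // Table 5.6

    const char* names = "OABCDEFG";
    const int vertices[8][3] = {{0,0,0}, {0,0,1}, {0,1,0}, {0,1,1},
                                {1,0,0}, {1,0,1}, {1,1,0}, {1,1,1}};

    for (int k = 0; k < 8; ++k) {
        double outputSum = 0.0;
        for (int j = 0; j < 3; ++j) {
            double sum = 0.0;                 // weighted sum at hidden neuron j
            for (int i = 0; i < 3; ++i)
                sum += vertices[k][i] * w[i][j];
            if (sum > hiddenThreshold[j])     // hidden neuron j fires
                outputSum += v[j];
        }
        int value = (outputSum > outputThreshold) ? +1 : -1;
        std::printf("%c (%d, %d, %d): output sum %.2f, function value %+d\n",
                    names[k], vertices[k][0], vertices[k][1], vertices[k][2],
                    outputSum, value);
    }
    return 0;
}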
Performance of the Perceptron

When you input the coordinates of the vertex G, which has 1 for each coordinate, the first hidden-layer neuron aggregates these inputs and gets a value of 2.2. Since 2.2 is more than the threshold value of the first neuron in the hidden layer, that neuron fires, and its output of 1 becomes an input to the output neuron on the connection with weight 0.6. But you need the activations of the other hidden-layer neurons as well. Let us describe the performance with the coordinates of G as the inputs to the network. Table 5.7 describes this.

Table 5.7 Results with Coordinates of Vertex G as Input

Vertex/Coordinates    Hidden Layer Neuron #    Weighted Sum    Comment    Activation    Contribution to Output    Sum
G: 1, 1, 1            1                        2.2             > 1.8      1             0.6
                      2                        -0.8            < 0.05     0             0
                      3                        -1.4            < -0.2     0             0                         0.6

The weighted sum at the output neuron is 0.6, and it is greater than the threshold value 0.5. Therefore, the output neuron fires, and at the vertex G, the function is evaluated to have a value of +1. Table 5.8 shows the performance of the network with the rest of the vertices of the cube. You will notice that the network computes a value of +1 at the vertices O, A, F, and G, and a –1 at the rest.

Table 5.8 Results with Other Inputs

Vertex/Coordinates    Hidden Layer Neuron #    Weighted Sum    Comment    Activation    Contribution to Output    Sum
O: 0, 0, 0            1                        0               < 1.8      0             0
                      2                        0               < 0.05     0             0
                      3                        0               > -0.2     1             0.6                       0.6 *
A: 0, 0, 1            1                        0.2             < 1.8      0             0
                      2                        0.3             > 0.05     1             0.3
                      3                        0.6             > -0.2     1             0.6                       0.9 *
B: 0, 1, 0            1                        1               < 1.8      0             0
                      2                        -1              < 0.05     0             0
                      3                        -1              < -0.2     0             0                         0
C: 0, 1, 1            1                        1.2             < 1.8      0             0
                      2                        0.2             > 0.05     1             0.3
                      3                        -0.4            < -0.2     0             0                         0.3
D: 1, 0, 0            1                        1               < 1.8      0             0
                      2                        0.1             > 0.05     1             0.3
                      3                        -1              < -0.2     0             0                         0.3
E: 1, 0, 1            1                        1.2             < 1.8      0             0
                      2                        0.4             > 0.05     1             0.3
                      3                        -0.4            < -0.2     0             0                         0.3
F: 1, 1, 0            1                        2               > 1.8      1             0.6
                      2                        -0.9            < 0.05     0             0
                      3                        -2              < -0.2     0             0                         0.6 *

* The output neuron fires, as this value is greater than 0.5 (the threshold value); the function value is +1.

Other Two-layer Networks

Many important neural network models have two layers. The Feedforward backpropagation network, in its simplest form, is one example. Grossberg and Carpenter's ART1 paradigm uses a two-layer network. The Counterpropagation network has a Kohonen layer followed by a Grossberg layer. Bidirectional Associative Memory (BAM), Boltzmann Machine, Fuzzy Associative Memory, and Temporal Associative Memory are other two-layer networks. For autoassociation, a single-layer network could do the job, but for heteroassociation or other such mappings, you need at least a two-layer network. We will give more details on these models shortly.

Many Layer Networks

Kunihiko Fukushima's Neocognitron, noted for identifying handwritten characters, is an example of a network with several layers. Some previously mentioned networks can also be multilayered through the addition of more hidden layers. It is also possible to combine two or more neural networks into one network by creating appropriate connections between layers of one subnetwork and those of the others. This would certainly create a multilayer network.

Connections Between Layers

You have already seen some difference in the way connections are made between neurons in a neural network. In the Hopfield network, every neuron was connected to every other in the one layer that was present in the network. In the Perceptron, neurons within the same layer were not connected with one another, but the connections were between the neurons in one layer and those in the next layer. In the former case, the connections are described as being lateral. In the latter case, the connections are forward and the signals are fed forward within the network.

Two other possibilities also exist. All the neurons in any layer may have extra connections, with each neuron connected to itself. The second possibility is that there are connections from the neurons in one layer to the neurons in a previous layer, in which case there is both forward and backward signal feeding. This occurs if feedback is a feature of the network model. The type of layout for the network neurons and the type of connections between the neurons constitute the architecture of the particular model of the neural network.
Instar and Outstar

Outstar and instar are terms defined by Stephen Grossberg for ways of looking at neurons in a network. A neuron in a web of other neurons receives a large number of inputs from outside the neuron's boundaries. This is like an inwardly radiating star, hence the term instar. Also, a neuron may be sending its output to many other destinations in the network. In this way it is acting as an outstar. Every neuron is thus simultaneously both an instar and an outstar. As an instar it receives stimuli from other parts of the network or from outside the network. Note that the neurons in the input layer of a network primarily have connections away from them to the neurons in the next layer, and thus behave mostly as outstars. Neurons in the output layer have many connections coming to them and thus behave mostly as instars. A neural network performs its work through the constant interaction of instars and outstars.

A layer of instars can constitute a competitive layer in a network. An outstar can also be described as a source node with some associated sink nodes that the source feeds to. Grossberg identifies the source input with a conditioned stimulus and the sink inputs with unconditioned stimuli. Robert Hecht-Nielsen's Counterpropagation network is a model built with instars and outstars.
Weight assignments on connections between neurons not only indicate the strength of the signal that is being fed for aggregation but also the type of interaction between the two neurons. The type of interaction is one of cooperation or of competition. The cooperative type is suggested by a positive weight, and the competition by a negative weight, on the connection. The positive weight connection is meant for what is called excitation, while the negative weight connection is termed an inhibition.

Initialization of Weights

Initializing the network weight structure is part of what is called the encoding phase of a network operation. There are several encoding algorithms, differing by model and by application. You may have gotten the impression that the weight matrices used in the examples discussed in detail thus far have been arbitrarily determined; or if there is a method of setting them up, you are not told what it is. It is possible to start with randomly chosen values for the weights and to let the weights be adjusted appropriately as the network is run through successive iterations. This also makes things easier. For example, under supervised training, if the error between the desired and computed output is used as a criterion in adjusting weights, then one may as well set the initial weights to zero and let the training process take care of the rest. The small example that follows illustrates this point.

A Small Example

Suppose you have a network with two input neurons and one output neuron, with forward connections between the input neurons and the output neuron, as shown in Figure 5.2. The network is required to output a 1 for the input patterns (1, 0) and (1, 1), and the value 0 for (0, 1) and (0, 0). There are only two connection weights, w1 and w2.
Figure 5.2  Neural network with forward connections.

Let us initially set both weights to 0, but you need a threshold function also. Let us use the following threshold function, which is slightly different from the one used in a previous example:

        1 if x > 0
f(x) =
        0 if x ≤ 0

The reason for modifying this function is that if f(x) has value 1 when x = 0, then no matter what the weights are, the output will work out to 1 with input (0, 0). This makes it impossible to get a correct computation of any function that takes the value 0 for the arguments (0, 0).

Now we need to know by what procedure we adjust the weights. The procedure we would apply for this example is as follows.

• If the output with input pattern (a, b) is as desired, then do not adjust the weights.
• If the output with input pattern (a, b) is smaller than what it should be, then increment each of w1 and w2 by 1.
• If the output with input pattern (a, b) is greater than what it should be, then subtract 1 from w1 if the product a·w1 is smaller than 1, and adjust w2 similarly.

Table 5.9 shows what takes place when we follow these procedures, and at what values the weights settle.
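The third rule is ambiguous when an input is 0, since the product a·w1 is then trivially smaller than 1. The sketch below (not the book's code) adopts a common perceptron-style reading: when the output is too large, decrement only the weights whose inputs in that pattern are 1. Under that assumption, starting from zero weights the run settles at w1 = 1, w2 = 0 for the pattern order used here; the exact trajectory recorded in Table 5.9 may differ.

#include <cstdio>

// Step threshold from the text: 1 if x > 0, 0 if x <= 0.
static int f(int x) { return x > 0 ? 1 : 0; }

int main() {
    // Desired mapping: output 1 for (1, 0) and (1, 1); output 0 for (0, 1) and (0, 0).
    const int patterns[4][3] = {{0,0,0}, {0,1,0}, {1,0,1}, {1,1,1}};  // {a, b, desired}
    int w1 = 0, w2 = 0;   // start with zero weights

    for (int pass = 0; pass < 10; ++pass) {
        bool changed = false;
        for (const auto& p : patterns) {
            int a = p[0], b = p[1], desired = p[2];
            int out = f(a * w1 + b * w2);
            if (out < desired) {            // output too small: increment both weights
                ++w1; ++w2; changed = true;
            } else if (out > desired) {     // output too large: decrement (assumed reading)
                if (a == 1) { --w1; changed = true; }
                if (b == 1) { --w2; changed = true; }
            }
        }
        std::printf("after pass %d: w1 = %d, w2 = %d\n", pass + 1, w1, w2);
        if (!changed) break;   // weights have settled
    }
    return 0;
}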