C++ Neural Networks and Fuzzy Logic
by Valluru B. Rao
MTBooks, IDG Books Worldwide, Inc.
ISBN: 1558515526   Pub Date: 06/01/95

Adjustments to Threshold Values or Biases

The bias or threshold value we added to the activation, before applying the threshold function to get the output of a neuron, is also adjusted based on the error being propagated back. The values needed for this are in the previous discussion. The adjustment for the threshold value of a neuron in the output layer is obtained by multiplying the calculated error (not just the difference) in the output at that neuron by the learning rate parameter used in the weight adjustments for this layer. In our previous example, the learning rate parameter is 0.2 and the error vector is (−0.02, −0.04, 0.04, 0.11), so the adjustments to the threshold values of the four output neurons are given by the vector (−0.004, −0.008, 0.008, 0.022). These adjustments are added to the current threshold values at the output neurons.

The adjustment to the threshold value of a neuron in the hidden layer is obtained similarly, by multiplying the learning rate by the computed error in the output of that hidden layer neuron. Therefore, for the second neuron in the hidden layer, the adjustment to its threshold value is calculated as 0.15 * −0.0041, which is −0.0006. Adding this to the current threshold value of 0.679 gives 0.6784, which is used for this neuron with the next training pattern for the neural network.
Another Example of Backpropagation Calculations

You have seen, in the preceding sections, the details of calculations for one particular neuron in the hidden layer in a feedforward backpropagation network with five input neurons, two neurons in the hidden layer, and four neurons in the output layer. You will see all the calculations in the C++ implementation later in this chapter. Right now, though, we present another example and give the complete picture of the calculations done in one completed iteration, or cycle, of backpropagation.

Consider a feedforward backpropagation network with three input neurons, two neurons in the hidden layer, and three output neurons. The weights on connections from the input neurons to the neurons in the hidden layer are given in Matrix M−1, and those from the neurons in the hidden layer to the output neurons are given in Matrix M−2. We calculate the output of each neuron in the hidden and output layers as follows: we add a bias or threshold value to the activation of a neuron (call this result x) and use the sigmoid function below to get the output.

    f(x) = 1 / (1 + e^−x)
The learning rate parameters are 0.2 for the connections between the hidden layer neurons and the output neurons, and 0.15 for the connections between the input neurons and the neurons in the hidden layer. These values, as you recall, are the same as in the previous illustration, to make it easy for you to follow the calculations by comparing them with similar calculations in the preceding sections. The input pattern is (0.52, 0.75, 0.97), and the desired output pattern is (0.24, 0.17, 0.65). The initial weight matrices are as follows:
M−1, matrix of weights from input layer to hidden layer:

     0.6   −0.4
     0.2    0.8
    −0.5    0.3

M−2, matrix of weights from hidden layer to output layer:

    −0.90   0.43   0.25
     0.11  −0.67  −0.75

The threshold values (or biases) for the neurons in the hidden layer are 0.2 and 0.3, while those for the output neurons are 0.15, 0.25, and 0.05, respectively. Table 7.1 presents all the results of calculations done in the first iteration. You will see modified or new weight matrices and threshold values. You will use these, together with the original input vector and the desired output vector, to carry out the next iteration.

Table 7.1 Backpropagation Calculations

Item                         I−1    I−2    I−3    H−1      H−2       O−1       O−2       O−3
Input                        0.52   0.75   0.97
Desired Output                                                       0.24      0.17      0.65
M−1 Row 1                                         0.6      −0.4
M−1 Row 2                                         0.2       0.8
M−1 Row 3                                        −0.5       0.3
M−2 Row 1                                                           −0.90      0.43      0.25
M−2 Row 2                                                            0.11     −0.67     −0.75
Threshold                                         0.2       0.3      0.15      0.25      0.05
Activation −H                                    −0.023     0.683
Activation + Threshold −H                         0.177     0.983
Output −H                                         0.544     0.728
Complement                                        0.456     0.272
Activation −O                                                       −0.410    −0.254    −0.410
Activation + Threshold −O                                           −0.260    −0.004    −0.360
Output −O                                                            0.435     0.499     0.411
Complement                                                           0.565     0.501     0.589
Diff. from Target                                                   −0.195    −0.329     0.239
Computed Error −O                                                   −0.048    −0.082     0.058
Computed Error −H                                 0.0056    0.0012
Adjustment to Threshold                           0.0008    0.0002  −0.0096   −0.0164    0.0116
Adjustment to M−2 Column 1                       −0.0005   −0.0070
Adjustment to M−2 Column 2                        0.0007    0.0008
Adjustment to M−2 Column 3                        0.0008    0.0011
New Matrix M−2 Row 1                                                −0.91      0.412     0.262
New Matrix M−2 Row 2                                                 0.096    −0.694    −0.734
New Threshold Values −O                                              0.1404    0.2336    0.0616
Adjustment to M−1 Row 1                           0.0004   −0.0001
Adjustment to M−1 Row 2                           0.0006    0.0001
Adjustment to M−1 Row 3                           0.0008    0.0002
New Matrix M−1 Row 1                              0.6004   −0.4
New Matrix M−1 Row 2                              0.2006    0.8001
New Matrix M−1 Row 3                             −0.4992    0.3002
New Threshold Values −H                           0.2008    0.3002
The top row in the table gives headings for the columns: Item; I−1, I−2, I−3 (I−k being for input layer neuron k); H−1, H−2 (for hidden layer neurons); and O−1, O−2, O−3 (for output layer neurons). In the first column of the table, M−1 and M−2 refer to the weight matrices above. Where an entry is appended with −H, as in Output −H, the information refers to the hidden layer; similarly, −O refers to the output layer, as in Activation + Threshold −O.

The next iteration uses the following information from the previous iteration, which you can identify from Table 7.1. The input pattern is (0.52, 0.75, 0.97), and the desired output pattern is (0.24, 0.17, 0.65). The current weight matrices are as follows:

M−1, matrix of weights from input layer to hidden layer:

     0.6004   −0.4
     0.2006    0.8001
    −0.4992    0.3002

M−2, matrix of weights from hidden layer to output layer:

    −0.910    0.412    0.262
     0.096   −0.694   −0.734

The threshold values (or biases) for the neurons in the hidden layer are 0.2008 and 0.3002, while those for the output neurons are 0.1404, 0.2336, and 0.0616, respectively. You can keep the learning rate parameters at 0.15 for connections between input and hidden layer neurons and 0.2 for connections between hidden layer and output neurons, or you can modify them slightly. Whether or not to change these two parameters is a decision that can perhaps be made at a later iteration, after obtaining a sense of how the process is converging. If you are satisfied with the rate at which the computed output pattern is approaching the target output pattern, you would not change these learning rates. If you feel the convergence is much slower than you would like, then the learning rate parameters can be adjusted slightly upward. It is a subjective decision, both in terms of when (if at all) and to what new levels these parameters should be revised.
You have just seen an example of the process of training in the feedforward backpropagation network, described in relation to one hidden layer neuron and one input neuron. A few vectors were shown and used, but perhaps not made easily identifiable. We therefore introduce some notation and describe the equations that were implicitly used in the example.

Notation and Equations
Notation

Let us talk about two matrices whose elements are the weights on connections. One matrix refers to the interface between the input and hidden layers, and the second refers to that between the hidden layer and the output layer. Since connections exist from each neuron in one layer to every neuron in the next layer, there is a vector of weights on the connections going out from any one neuron. Putting this vector into a row of the matrix, we get as many rows as there are neurons from which connections are established. Let M1 and M2 be these matrices of weights. Then what does M1[i][j] represent? It is the weight on the connection from the ith input neuron to the jth neuron in the hidden layer. Similarly, M2[i][j] denotes the weight on the connection from the ith neuron in the hidden layer to the jth output neuron.

Next, we will use x, y, z for the outputs of neurons in the input layer, hidden layer, and output layer, respectively, with a subscript attached to denote which neuron in a given layer we are referring to. Let P denote the desired output pattern, with components p_i. Let m be the number of input neurons, so that according to our notation, (x_1, x_2, ..., x_m) denotes the input pattern. If P has, say, r components, the output layer needs r neurons. Let the number of hidden layer neurons be n. Let β_h be the learning rate parameter for the hidden layer, and β_o that for the output layer. Let θ with the appropriate subscript represent the threshold value or bias for a hidden layer neuron, and τ with an appropriate subscript refer to the threshold value of an output neuron. Let the errors in output at the output layer be denoted by e_j's and those at the hidden layer by t_i's. A Δ prefix on any parameter denotes the change in, or adjustment to, that parameter. Also, the thresholding function we use is the sigmoid function, f(x) = 1 / (1 + exp(−x)).
Equations

Output of the jth hidden layer neuron:

    y_j = f( (Σ_i x_i M1[i][j]) + θ_j )                    (7.1)

Output of the jth output layer neuron:

    z_j = f( (Σ_i y_i M2[i][j]) + τ_j )                    (7.2)

ith component of the vector of output differences:

    desired value − computed value = P_i − z_i

ith component of the output error at the output layer:

    e_i = z_i (1 − z_i) (P_i − z_i)                        (7.3)

ith component of the output error at the hidden layer:

    t_i = y_i (1 − y_i) (Σ_j M2[i][j] e_j)                 (7.4)

Adjustment for the weight between the ith neuron in the hidden layer and the jth output neuron:

    ΔM2[i][j] = β_o y_i e_j                                (7.5)

Adjustment for the weight between the ith input neuron and the jth neuron in the hidden layer:

    ΔM1[i][j] = β_h x_i t_j                                (7.6)

Adjustment to the threshold value or bias for the jth output neuron:

    Δτ_j = β_o e_j

Adjustment to the threshold value or bias for the jth hidden layer neuron:

    Δθ_j = β_h t_j

For use of the momentum parameter α (more on this parameter in Chapter 13), instead of equations 7.5 and 7.6, use:

    ΔM2[i][j](t) = β_o y_i e_j + α ΔM2[i][j](t − 1)        (7.7)

and

    ΔM1[i][j](t) = β_h x_i t_j + α ΔM1[i][j](t − 1)        (7.8)
C++ Implementation of a Backpropagation Simulator

The backpropagation simulator of this chapter has the following design objectives:

1. Allow the user to specify the number and size of all layers.
2. Allow the use of one or more hidden layers.
3. Be able to save and restore the state of the network.
4. Run from an arbitrarily large training data set or test data set.
5. Query the user for key network and simulation parameters.
6. Display key information at the end of the simulation.
7. Demonstrate the use of some C++ features.

A Brief Tour of How to Use the Simulator

In order to understand the C++ code, let us have an overview of the functioning of the program. There are two modes of operation in the simulator; the user is queried first for which mode of operation is desired. The modes are Training mode and Nontraining mode (Test mode).

Training Mode

Here, the user provides a training file in the current directory called training.dat. This file contains exemplar pairs, or patterns. Each pattern has a set of inputs followed by a set of outputs. Each value is separated by one or more spaces. As a convention, you can use a few extra spaces to separate the inputs from the outputs. Here is an example of a training.dat file that contains two patterns:

    0.4 0.5 0.89   −0.4 −0.8
    0.23 0.8 −0.3   0.6 0.34
In this example, the first pattern has inputs 0.4, 0.5, and 0.89, with expected outputs of −0.4 and −0.8. The second pattern has inputs of 0.23, 0.8, and −0.3, and outputs of 0.6 and 0.34. Since there are three inputs and two outputs, the input layer size for the network must be three neurons, and the output layer size must be two neurons.

Another file used in training is the weights file. Once the simulator reaches the error tolerance specified by the user, or the maximum number of iterations, it saves the state of the network by saving all of its weights in a file called weights.dat. This file can then be used subsequently in another run of the simulator in Nontraining mode. To provide some idea of how the network has done, information about the total and average error is presented at the end of the simulation. In addition, the output generated by the network for the last pattern vector is provided in an output file called output.dat.
Nontraining Mode (Test Mode)

In this mode, the user provides test data to the simulator in a file called test.dat. This file contains only input patterns. When this file is applied to an already trained network, an output.dat file is generated, which contains the outputs from the network for all of the input patterns. The network goes through one cycle of operation in this mode, covering all the patterns in the test data file. To start up the network, the weights file, weights.dat, is read to initialize the state of the network. The user must provide the same network size parameters that were used to train the network.
Operation

The first thing to do with your simulator is to train a network with an architecture you choose. You can select the number of layers and the number of hidden layers for your network. Keep in mind that the input and output layer sizes are dictated by the input patterns you are presenting to the network and the outputs you seek from the network. Once you decide on an architecture, perhaps a simple three−layer network with one hidden layer, you prepare training data for it and save the data in the training.dat file. After this you are ready to train. You provide the simulator with the following information:

• The number of layers (a five−layer network has three hidden layers)
• The size for each layer, from the input to the output

The simulator then begins training and reports the current cycle number and the average error for each cycle. You should watch the error to see that it is, on the whole, decreasing with time. If it is not, you should restart the simulation; this will start with a brand-new set of random weights and give you another, possibly better, solution. Note that there will be legitimate periods where the error may increase for some time. Once the simulation is done, you will see information about the number of cycles and patterns used, and the total and average error that resulted. The weights are saved in the weights.dat file. You can rename this file to use this particular state of the network later. You can infer the size and number of layers from the information in this file, as will be shown in the next section on the weights.dat file format. You can have a peek at the output.dat file to see the response to the last pattern. To see each pattern and the match to that pattern, copy the training file to the test file and delete the output information from it. You can then run Test mode to get a full list of all the input stimuli and responses in the output.dat file.
Summary of Files Used in the Backpropagation Simulator

Here is a list of the files for your reference, as well as what they are used for.

• weights.dat

You can look at this file to see the weights for the network. It shows the layer number followed by the weights that feed into that layer. The first layer, or input layer, layer zero, does not have any weights associated with it. An example of the weights.dat file is shown as follows for a network with three layers of sizes 3, 5, and 2. Note that the row width for layer n matches the column length for layer n + 1:

    1 −0.199660 −0.859660 −0.339660 −0.259660 0.520340
    1 0.292860 −0.487140 0.212860 −0.967140 −0.427140
    1 0.542106 −0.177894 0.322106 −0.977894 0.562106
    2 −0.175350 −0.835350
    2 −0.330167 −0.250167
    2 0.503317 0.283317
    2 −0.477158 0.222842
    2 −0.928322 −0.388322

In this weights file, the row width for layer 1 is 5, corresponding to the output size of that (middle) layer. The input size for the layer is the column length, which is 3, just as specified. For layer 2, the output size is the row width, 2, and the input size is the column length, 5, which is the same as the output size of the middle layer. You can read the weights file to find out how things look.
• training.dat

This file contains the training patterns. You can use as large a file as you'd like without degrading the performance of the simulator. The simulator caches data in memory for processing; this improves the speed of the simulation, since disk accesses are expensive in time. A data buffer, which has a maximum size specified in a #define statement in the program, is filled with data from the training.dat file whenever data is needed. The format for the training.dat file has been shown in the Training mode section.

• test.dat

The test.dat file is just like the training.dat file, but without the expected outputs. You use this file with a trained neural network in Test mode to see what responses you get for untrained data.

• output.dat

The output.dat file contains the results of the simulation. In Test mode, the input and output vectors are shown for all pattern vectors. In Training mode, the expected output is also shown, but only the last vector in the training set is presented, since the training set is usually quite large.
Shown here is an example of an output file in Training mode:

    for input vector:
    0.400000 −0.400000
    output vector is:
    0.880095
    expected output vector is:
    0.900000