C++ Neural Networks and Fuzzy Logic
by Valluru B. Rao
MTBooks, IDG Books Worldwide, Inc.
ISBN: 1558515526   Pub Date: 06/01/95

Comments on Your C++ Program

Notice the use of the input stream operator cin>> in the C++ program, instead of the C function scanf, in several places. The iostream class in C++ was discussed earlier in this chapter.

The program works like this: First, the network input neurons are given their connection weights, and then an input vector is presented to the input layer. A threshold value is specified, and the output neuron computes the weighted sum of its inputs, which are the outputs of the input layer neurons. This weighted sum is the activation of the output neuron, and it is compared with the threshold value. The output neuron fires (output is 1) if the threshold value is not greater than its activation; it does not fire (output is 0) if its activation is smaller than the threshold value. In this implementation, neither supervised nor unsupervised training is incorporated.
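The firing rule just described can be sketched in a few lines of C++. This is a minimal sketch of the computation, not the percept.cpp listing itself; the function name and the use of std::vector are assumptions:

```cpp
#include <vector>

// Weighted sum of the input layer's outputs, compared against the threshold.
// The output neuron fires (returns 1) when the threshold value is not greater
// than the activation, and stays off (returns 0) otherwise.
int perceptron_output(const std::vector<double>& weights,
                      const std::vector<double>& inputs,
                      double threshold) {
    double activation = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i)
        activation += weights[i] * inputs[i];
    return (activation >= threshold) ? 1 : 0;
}
```

With weights (2, 3, 3, 2), inputs (1.95, 0.27, 0.69, 1.25), and threshold 7.0, the activation works out to 9.28, so the function returns 1.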
There are two data files used in this program. One is for setting up the weights, and the other for setting up the input vectors. On the command line, you enter the program name followed by the weight file name and the input file name. For this discussion (the files are also on the accompanying disk for this book), create a file called weight.dat, which contains the following two weight vectors:

2.0 3.0 3.0 2.0
3.0 0.0 6.0 2.0

Also create an input file called input.dat with the two data vectors below:

1.95 0.27 0.69 1.25
0.30 1.05 0.75 0.19

During the execution of the program, you are first prompted for the number of vectors that are used (in this case, 2), then for a threshold value for each input/weight vector pair (use 7.0 in both cases). You will then see the following output; your responses appear after each prompt.

percept weight.dat input.dat

THIS PROGRAM IS FOR A PERCEPTRON NETWORK WITH AN INPUT LAYER OF
4 NEURONS, EACH CONNECTED TO THE OUTPUT NEURON.

THIS EXAMPLE TAKES REAL NUMBERS AS INPUT SIGNALS

please enter the number of weights/vectors
2
this is vector # 1
please enter a threshold value, eg 7.0
7.0
weight for neuron 1 is 2       activation is 3.9
weight for neuron 2 is 3       activation is 0.81
weight for neuron 3 is 3       activation is 2.07
weight for neuron 4 is 2       activation is 2.5

activation is 9.28
the output neuron activation exceeds the threshold value of 7
output value is 1
this is vector # 2
please enter a threshold value, eg 7.0
7.0
weight for neuron 1 is 3       activation is 0.9
weight for neuron 2 is 0       activation is 0
weight for neuron 3 is 6       activation is 4.5
weight for neuron 4 is 2       activation is 0.38

activation is 5.78
the output neuron activation is smaller than the threshold value of 7
output value is 0

Finally, try adding a data vector of (1.4, 0.6, 0.35, 0.99) to the data file. Add a weight vector of (2, 6, 8, 3) to the weight file and use a threshold value of 8.25 to see the result. You can experiment with other values as well.
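For the suggested experiment, you can check the arithmetic by hand: with the weight vector (2, 6, 8, 3) and the data vector (1.4, 0.6, 0.35, 0.99), the activation is 2.8 + 3.6 + 2.8 + 2.97 = 12.17, which exceeds the 8.25 threshold, so the output value is 1. A one-function C++ sketch of the weighted sum (the function name is an assumption, not part of percept.cpp):

```cpp
#include <numeric>  // std::inner_product
#include <vector>

// Activation of the output neuron: the weighted sum of the input signals.
double activation(const std::vector<double>& weights,
                  const std::vector<double>& inputs) {
    return std::inner_product(weights.begin(), weights.end(),
                              inputs.begin(), 0.0);
}
```

Note the 0.0 (not 0) as the initial value: std::inner_product accumulates in the type of its initial value, so an integer 0 would truncate the products.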
Network Modeling

So far, we have considered the construction of two networks, the Hopfield memory and the Perceptron. What other considerations (discussed in more depth in the chapters to follow) should you keep in mind? Some of the considerations that go into the modeling of a neural network for an application are:

  nature of inputs
      fuzzy: binary or analog
      crisp: binary or analog
  number of inputs
  nature of outputs
      fuzzy: binary or analog
      crisp: binary or analog
  number of outputs
  nature of the application
      to complete patterns (recognize corrupted patterns)
      to classify patterns
      to do an optimization
      to do approximation
      to perform data clustering
      to compute functions
  dynamics
      learning: with exemplars or without exemplars
      training: with exemplars or without exemplars
  hidden layers
      number: fixed or variable
      sizes: fixed or variable
  processing
      additive
      multiplicative
      hybrid: additive and multiplicative, or combining other approaches such as expert systems or genetic algorithms
Hybrid models, as indicated above, could combine the neural network approach with expert system methods, or combine additive and multiplicative processing paradigms. Decision support systems are amenable to approaches that combine neural networks with expert systems. An example of a hybrid model that combines different modes of processing by neurons is the Sigma Pi neural network, wherein one layer of neurons uses summation in aggregation and the next layer of neurons uses multiplicative processing.

A hidden layer in a neural network is a layer of neurons that operates between the input layer and the output layer of the network. Neurons in this layer receive inputs from those in the input layer and supply their outputs as the inputs to the neurons in the output layer. When a hidden layer comes between other hidden layers, it receives input from, and supplies input to, the respective hidden layers.

In modeling a network, it is often not easy to determine how many hidden layers, if any, and of what sizes, are needed in the model. Some approaches, like genetic algorithms, are at times used to determine the needed or optimum number of hidden layers and/or the neurons in those hidden layers. (Genetic algorithms are paradigms that compete with neural network approaches in many situations, but they can nevertheless be cooperative, as here.) In what follows, we outline one such application.
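The two modes of aggregation in a Sigma Pi network can be illustrated with two tiny functions. This is an illustrative sketch under assumed names, not a full network implementation:

```cpp
#include <vector>

// Additive (sigma) aggregation: the usual weighted sum of inputs.
double sigma_unit(const std::vector<double>& weights,
                  const std::vector<double>& inputs) {
    double sum = 0.0;
    for (std::size_t i = 0; i < weights.size(); ++i)
        sum += weights[i] * inputs[i];
    return sum;
}

// Multiplicative (pi) aggregation: the unit forms the product of its inputs,
// so a single zero input suppresses the whole group.
double pi_unit(const std::vector<double>& inputs) {
    double product = 1.0;
    for (double x : inputs)
        product *= x;
    return product;
}
```

In a Sigma Pi arrangement, the outputs of a layer of sigma units feed a following layer of pi units (or the other way around), giving the network a mixed additive and multiplicative character.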
Tic-Tac-Toe Anyone?

David Fogel describes evolutionary general problem solving and uses the familiar game of Tic-Tac-Toe as an example. The idea is to come up with optimal strategies for playing this game. The first player's marker is an X, and the second player's marker is an O. Whoever gets three of his or her markers in a row, a column, or a diagonal before the other player does, wins. Shrewd players manage a draw position if their equally shrewd opponent thwarts their attempts to win. A draw position is one where neither player has three of his or her markers in a row, a column, or a diagonal.

The board can be described by a vector of nine components, each of which is a three-valued number. Imagine the squares of the board for the game as taken in sequence row by row from top to bottom. Allow a 1 to show the presence of an X in a square, a 0 to indicate a blank there, and a -1 to correspond to an O. This is an example of a coding for the status of the board. For example, (-1, 0, 1, 0, -1, 0, 1, 1, -1) is a winning position for the second player, because it corresponds to the board looking as below.

O     X
   O
X  X  O

A neural network for this problem will have an input layer with nine neurons, as each input pattern has nine components. There could be one or more hidden layers, but the example uses just one. The output layer also contains nine neurons, so that one cycle of operation of the network shows what the best configuration of the board is to be, given a particular input. Of course, during this cycle of operation, all that needs to be determined is which blank space, indicated by a 0 in the input, should be changed to 1, if a strategy is being worked out for player 1. None of the 1's and -1's is to be changed.

In this particular example, the neural network architecture itself is dynamic. The network expands or contracts according to some rules, which are described next. Fogel describes the network as an evolving network in the sense that the number of neurons in the hidden layer changed with a probability of 0.5; a node was equally likely to be added or deleted. Since the number of unmarked squares dwindles after each play, this kind of approach, with a varying number of neurons in the network, seems reasonable and interesting. The initial set of weights consists of random values between -0.5 and 0.5, inclusive, according to a uniform distribution. Bias and threshold values also come from this distribution. The sigmoid function 1/(1 + e^(-x)) is used as the activation function.
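A winning position under this coding can be checked mechanically. The sketch below is illustrative (the function name is an assumption); it tests whether a given marker occupies three squares in a row, a column, or a diagonal of the nine-component vector:

```cpp
#include <array>

// Board coding from the text: squares taken row by row from top to bottom,
// with 1 for X, 0 for a blank, and -1 for O.
bool has_won(const std::array<int, 9>& board, int marker) {
    static const int lines[8][3] = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},  // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},  // columns
        {0, 4, 8}, {2, 4, 6}              // diagonals
    };
    for (const auto& line : lines)
        if (board[line[0]] == marker &&
            board[line[1]] == marker &&
            board[line[2]] == marker)
            return true;
    return false;
}
```

For the position (-1, 0, 1, 0, -1, 0, 1, 1, -1), has_won reports a win for marker -1: the O's occupy the main diagonal.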
Weights and biases were changed during the network operation training cycles. Thus, the network had a learning phase. (You will read more on learning in Chapter 6.) This network is adaptive, since it changes its architecture. Other forms of adaptation in neural networks involve changing parameter values for a fixed architecture. (See Chapter 6.)

The results of the experiment Fogel describes show that you need nine neurons in the hidden layer as well for the network to perform best on this problem. The procedure also purged any strategy that was likely to lose. Fogel's emphasis is on the evolutionary aspect of an adaptive process or experiment. Our interest in this example is primarily due to the fact that an adaptive neural network is used. The choice of Tic-Tac-Toe, while a simple and all too familiar game, is in the genre of much more complicated games. These games ask a player to place a marker in some position in a given array, and as players take turns doing so, some criterion determines if it is a draw, or who won. Unlike in Tic-Tac-Toe, the criterion by which one wins may not be known to the players.

Stability and Plasticity

We now discuss a few other considerations in neural network modeling by introducing the concepts of short-term memory and long-term memory. Neural network training is usually done in an iterative way, meaning that the procedure is repeated a certain number of times. These iterations are referred to as cycles. After each cycle, the input used may remain the same or change, and the weights may remain the same or change. Such change is based on the output of a completed cycle. If the number of cycles is not preset, and the network is allowed to go through cycles until some other criterion is met, the question naturally arises of whether the iterative process eventually terminates.
Stability for a Neural Network

Stability refers to the kind of convergence that brings an end to the iterative process. For example, if any two consecutive cycles result in the same output for the network, then there may be no need for further iterations. In this case, convergence has occurred, and the network has stabilized in its operation. If weights are being modified after each cycle, then convergence of the weights would constitute stability for the network.

In some situations, it takes many more iterations than you would like for the outputs of two consecutive cycles to be the same. Then a tolerance level on the convergence criterion can be used. With a tolerance level, you accomplish an early but satisfactory termination of the operation of the network.
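The tolerance idea can be put in code form. This is a sketch under assumed names, showing one way to test convergence when exact equality of consecutive outputs is too strict:

```cpp
#include <cmath>
#include <vector>

// Convergence test with a tolerance level: the network is considered to have
// stabilized when no component of the output changed by more than eps between
// two consecutive cycles.
bool has_converged(const std::vector<double>& previous,
                   const std::vector<double>& current,
                   double eps) {
    for (std::size_t i = 0; i < previous.size(); ++i)
        if (std::fabs(previous[i] - current[i]) > eps)
            return false;
    return true;
}
```

A training loop would call this after each cycle and stop once it returns true (or once a preset cycle limit is reached, guarding against the case where the process does not terminate).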
Plasticity for a Neural Network

Suppose a network is trained to learn some patterns, and in this process the weights are adjusted according to an algorithm. After learning these patterns and encountering a new pattern, the network may modify the weights in order to learn the new pattern. But what if the new weight structure is not responsive to the new pattern? Then the network does not possess plasticity: the ability to deal satisfactorily with new short-term memory (STM) while retaining long-term memory (LTM). Attempts to endow a network with plasticity may have adverse effects on the stability of the network.
Short-Term Memory and Long-Term Memory

We alluded to short-term memory (STM) and long-term memory (LTM) in the previous paragraph. STM is basically the information that is currently, and perhaps only temporarily, being processed; it is manifested in the patterns that the network encounters. LTM, on the other hand, is information that is already stored and is not being currently processed. In a neural network, STM is usually characterized by the patterns, and LTM is characterized by the connection weights. The weights determine how an input is processed in the network to yield the output. During the cycles of operation of a network, the weights may change. After convergence, they represent LTM, as the weight levels achieved are stable.

Summary

You saw in this chapter the C++ implementations of a simple Hopfield network and of a simple Perceptron network. What has not been included in them is automatic iteration and a learning algorithm. They were not necessary for the examples used in this chapter to show C++ implementation; the emphasis was on the method of implementation. In a later chapter, you will read about learning algorithms and examples of how to implement some of them. Considerations in modeling a neural network are presented in this chapter, along with an outline of how Tic-Tac-Toe is used as an example of an adaptive neural network model.
You were also introduced to the following concepts: stability, plasticity, short-term memory, and long-term memory (discussed further in later chapters). Much more can be said about them in terms of the so-called noise-saturation dilemma, or stability-plasticity dilemma, and what research has developed to address them (for further reading, see References).
Chapter 5
A Survey of Neural Network Models

Neural Network Models

You were introduced in the preceding pages to the Perceptron model, the Feedforward network, and the Hopfield network. You learned that the differences between the models lie in their architecture, encoding, and recall. We aim now to give you a comprehensive picture of these and other neural network models. We will show details and implementations of some networks in later chapters.

The models we briefly review in this chapter are the Perceptron, Hopfield, Adaline, Feed-Forward Backpropagation, Bidirectional Associative Memory, Brain-State-in-a-Box, Neocognitron, Fuzzy Associative Memory, ART1, and ART2. C++ implementations of some of these, and the role of fuzzy logic in some, will be treated in subsequent chapters. For now, our discussion will be about the distinguishing characteristics of a neural network. We will follow it with descriptions of some of the models.

Layers in a Neural Network

A neural network has its neurons divided into subgroups, or fields, and elements in each subgroup are placed in a row, or a column, in the diagram depicting the network. Each subgroup is then referred to as a layer of neurons in the network. A great many models of neural networks have two layers, quite a few have one layer, and some have three or more layers. A number of additional, so-called hidden layers are possible in some networks, such as the Feed-forward backpropagation network. When the network has a single layer, the input signals are received at that layer, processing is done by its neurons, and output is generated at that layer. When more than one layer is present, the first field is for the neurons that supply the input signals for the neurons in the next layer.
Every network has a layer of input neurons, but in most networks, the sole purpose of these neurons is to feed the input to the next layer of neurons. However, some networks have feedback connections, or recurrent connections, so that the neurons in the input layer may also do some processing. In the Hopfield network you saw earlier, the input and output layers are the same. If any layer is present between the input and output layers, it may be referred to as a hidden layer in general, or as a layer with a special name taken after the researcher who proposed its inclusion to achieve certain performance from the network; examples are the Grossberg and the Kohonen layers. The number of hidden layers is not limited except by the scope of the problem being addressed by the neural network. A layer is also referred to as a field; the different layers can then be designated as field A, field B, and so on, or F_A, F_B for short.
Single-Layer Network

A neural network with a single layer is also capable of processing for some important applications, such as integrated circuit implementations or assembly line control. The most common capability of the different models of neural networks is pattern recognition. But one network, called the Brain-State-in-a-Box, which is a single-layer neural network, can do pattern completion. Adaline is a network with A and B fields of neurons, but aggregation or processing of input signals is done only by the field B neurons.

The Hopfield network is a single-layer neural network. It makes an association between different patterns (heteroassociation) or associates a pattern with itself (autoassociation). You may characterize this as being able to recognize a given pattern. Viewing it as a case of pattern recognition becomes more relevant if a pattern is presented with some noise, meaning that there is some slight deformation in the pattern, and the network is able to relate it to the correct pattern.

The Perceptron technically has two layers but only one group of weights, so we still refer to it as a single-layer network. The second layer consists solely of the output neuron, and the first layer consists of the neurons that receive the inputs. Also, the neurons in the same layer, the input layer in this case, are not interconnected; that is, no connections are made between two neurons in that same layer. In the Hopfield network, on the other hand, there is no separate output layer, and hence it is strictly a single-layer network; in addition, its neurons are all fully connected with one another.

Let us spend more time on the single-layer Perceptron model and discuss its limitations, and thereby motivate the study of multilayer networks.
XOR Function and the Perceptron

The ability of a Perceptron to evaluate functions was brought into question when Minsky and Papert proved that a simple function like XOR (the logical function exclusive or) could not be correctly evaluated by a Perceptron. The XOR logical function, f(A,B), is as follows:

A    B    f(A,B) = XOR(A,B)
0    0    0
0    1    1
1    0    1
1    1    0

To summarize the behavior of the XOR: if both inputs have the same value, the output is 0; otherwise, the output is 1. Minsky and Papert showed that it is impossible to come up with the proper set of weights for the neurons in the single layer of a simple Perceptron to evaluate the XOR function. The reason is that such a Perceptron, one with a single layer of neurons, requires the function to be evaluated to be linearly separable by means of the function values. The concept of linear separability is explained next. But let us first show you why the simple Perceptron fails to compute this function.

Since there are two arguments for the XOR function, there would be two neurons in the input layer, and since the function's value is one number, there would be one output neuron. Therefore, you need two weights w1 and w2, and a threshold value θ. Let us now look at the conditions to be satisfied by the w's and θ so that the outputs corresponding to given inputs would be as for the XOR function.

First, the output should be 0 if the inputs are 0 and 0. The activation works out to 0. To get an output of 0, you need 0 < θ. This is your first condition. Table 5.1 shows this and two other conditions you need, and why.
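The contradiction can also be verified by brute force. The XOR conditions require θ > 0, w1 ≥ θ, w2 ≥ θ, and w1 + w2 < θ, which is impossible because w1 + w2 ≥ 2θ > θ. The sketch below (names assumed, grid bounds arbitrary) searches a grid of weights and thresholds and finds no combination that reproduces the XOR truth table:

```cpp
// Output of a two-input threshold unit, using the firing rule from the text:
// the neuron fires when the weighted sum is not smaller than the threshold.
int fire(double w1, double w2, double theta, int a, int b) {
    return (w1 * a + w2 * b >= theta) ? 1 : 0;
}

// Exhaustive search over a grid of weights and thresholds for a single
// threshold unit that computes XOR. Returns true if any grid point succeeds.
bool xor_solvable_on_grid() {
    for (int i = -20; i <= 20; ++i)
        for (int j = -20; j <= 20; ++j)
            for (int k = -20; k <= 20; ++k) {
                double w1 = i * 0.1, w2 = j * 0.1, theta = k * 0.1;
                if (fire(w1, w2, theta, 0, 0) == 0 &&
                    fire(w1, w2, theta, 0, 1) == 1 &&
                    fire(w1, w2, theta, 1, 0) == 1 &&
                    fire(w1, w2, theta, 1, 1) == 0)
                    return true;
            }
    return false;  // no single-layer solution exists
}
```

By contrast, a linearly separable function such as AND is easily handled: w1 = w2 = 1 with θ = 2 reproduces the AND truth table, because only the input (1, 1) reaches the threshold.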