Minds and Computers : An Introduction to the Philosophy of Artificial Intelligence
Figure 19.2 Computing XOR.
If we want our network to be contextually sensitive, we’re obviously going to need context at the input layer to be sensitive to. We’re going to achieve this by organising the input layer into five pools of nodes. Each pool will contain twenty-seven nodes, representing a full set of letter detectors – one for each letter of the alphabet plus one for the space (we’re going to ignore punctuation here to keep things simple).
These input pools are going to be organised such that each pool is directed on a different letter position of the text string input. At each time step, one pool will be directed at a target letter position. The other four pools will be organised such that one pool is directed at each of the two letter positions on either side of the target position. This will allow the network to make a contextual determination of which phoneme a particular letter stands for given the surrounding orthographic context (the two letters either side of the target letter). At the first time step, the first letter in the text string is placed in the target position. At each subsequent time step, the text string is advanced such that the next letter in the string is in the target position.
The output layer of our speech synthesising network will consist of an output node for each phoneme, so if we are considering Australian English there will be forty-four output nodes. To keep things simple here, we’re going to consider just one phoneme whose pronunciation is invariant across English dialects: /s/ – the word final sound in ‘kiss’ and ‘this’. We can help ourselves to as many hidden units as we require in order to match inputs to outputs correctly. We’re going to set threshold values and connection weights such that our network makes correct determinations concerning whether or not /s/ should be pronounced for the following test set of words: this, gas, wish, shy, kiss, passive, asia, asian, asiatic, is, as, ice, justice, service.
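The input arrangement just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ALPHABET`, `encode_window` and `time_steps` are ours, the text is taken to be lower case, and positions that fall off either end of the string are treated as spaces (the text does not say how the edges are handled).

```python
import string

# 26 letter detectors plus one detector for the space: 27 nodes per pool.
ALPHABET = string.ascii_lowercase + " "
POOLS = 5            # the target position plus two positions on either side
HALF = POOLS // 2


def encode_window(text, target):
    """Return five pools of 27 nodes for the window centred on `target`."""
    pools = []
    for offset in range(-HALF, HALF + 1):
        pos = target + offset
        # Assumption: positions outside the string are read as spaces.
        char = text[pos] if 0 <= pos < len(text) else " "
        # Exactly one detector in the pool is active (none for punctuation).
        pools.append([1 if char == letter else 0 for letter in ALPHABET])
    return pools


def time_steps(text):
    """Yield one encoded window per time step, advancing one letter each step."""
    for target in range(len(text)):
        yield encode_window(text, target)
```

For the word ‘this’ the sketch produces four time steps; at the first, the middle pool has its ‘t’ detector active and the two pools to its left have their space detectors active.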
The first thing to do is to accommodate the standard case. When we see the letter ‘s’, it is usually the case that the phoneme /s/ should be produced. So the first thing we’ll do is to connect the ‘s’ detector in the input pool for the target letter position directly to the output unit representing the phoneme /s/ such that if the letter ‘s’ is detected in the target position, the /s/ unit will fire unless otherwise inhibited (see Figure 19.3). Our network will now perform correctly with respect to the first two words in our test set – ‘this’ and ‘gas’. When the ‘s’ in each word reaches the target position, the /s/ unit will fire, as it should.
There are, however, numerous words in which the letter ‘s’ does not represent the phoneme /s/. Our nascent speech synthesising network will currently make incorrect determinations with respect to the remainder of the words in our test list. Our next task then is to design hidden units which detect contexts in which the letter ‘s’ appears but the phoneme /s/ should not be produced and use these hidden units to inhibit the activation of the output unit accordingly. We want the contexts represented in the hidden layer to be as general as possible so as to accommodate the maximum number of cases.
It is almost always the case in English – with the exception of some proper names and compound words – that when a letter ‘s’ is followed by a letter ‘h’ it is not pronounced as /s/. The first hidden unit we will add to the network will detect just such contexts and inhibit the output unit (see Figure 19.4). Now, when our network is presented with either of the next two words in our test set – ‘wish’ or ‘shy’ – the hidden unit we added will inhibit the /s/ unit such that it will not fire, as required. As should be clear, the network now considers any orthographic context in which ‘s’ is followed by ‘h’ to be a context in which /s/ is not produced.
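The two connections introduced so far can be sketched with simple threshold units. The weights and thresholds below are illustrative assumptions, not values given in the text, and the function names are ours: the ‘s’ detector excites the /s/ unit, and a hidden unit that fires only when ‘s’ is in the target position and ‘h’ is in the next position inhibits it.

```python
def step(total, threshold):
    """A threshold unit: fire (1) when the summed input reaches the threshold."""
    return 1 if total >= threshold else 0


def s_unit_fires(text, target):
    """Return 1 if the /s/ output unit fires with `target` in the target position."""
    s_at_target = 1 if text[target] == "s" else 0
    # Assumption: the position after the last letter is read as a space.
    following = text[target + 1] if target + 1 < len(text) else " "
    h_after = 1 if following == "h" else 0

    # Hidden unit: needs both the 's' detector and the 'h' detector to fire.
    # Weights of 1.0 and a threshold of 2.0 are assumed for illustration.
    hidden = step(1.0 * s_at_target + 1.0 * h_after, threshold=2.0)

    # Output unit: excited by the 's' detector, inhibited by the hidden unit.
    return step(1.0 * s_at_target - 1.0 * hidden, threshold=1.0)


def s_firings(word):
    """List the /s/ unit's activity at each time step for one word."""
    return [s_unit_fires(word, i) for i in range(len(word))]
```

With these assumed values, `s_firings("this")` gives `[0, 0, 0, 1]` and `s_firings("wish")` gives `[0, 0, 0, 0]`, matching the behaviour described for the first four test words.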
Another context which exhibits similar regularity is one in which ‘s’ is followed by another ‘s’, such as our test words ‘kiss’ and ‘passive’. In such cases the phoneme /s/ is produced, but only once. As it stands, our network will determine that /s/ should be pronounced twice as the output unit will fire when each ‘s’ is in the target position.
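The double-firing problem is easy to see if we count how often the /s/ unit fires under the two rules in place so far (fire on ‘s’ in the target position, unless the next letter is ‘h’). The helper `count_s_firings` is a hypothetical name, and the rules are written directly as a condition rather than as weighted units, purely to keep the sketch short.

```python
def count_s_firings(word):
    """Count the time steps at which the /s/ unit fires under the rules so far."""
    padded = word + " "  # assumption: a space follows the final letter
    return sum(
        1
        for i, letter in enumerate(word)
        if letter == "s" and padded[i + 1] != "h"
    )


for word in ("kiss", "passive"):
    print(word, count_s_firings(word))
# prints:
# kiss 2
# passive 2
```

Both words yield two firings where only one /s/ should be produced.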