Minds and Computers : An Introduction to the Philosophy of Artificial Intelligence

Figure 19.2
Computing XOR.

If we want our network to be contextually sensitive, we’re obviously
going to need context at the input layer to be sensitive to. We’re
going to achieve this by organising the input layer into five pools of
nodes. Each pool will contain twenty-seven nodes, representing a full
set of letter detectors – one for each letter of the alphabet plus one for
the space (we’re going to ignore punctuation here to keep things
simple).
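
As a minimal sketch of one such pool, each character can be mapped to twenty-seven detector values, exactly one of which is active. The names `ALPHABET` and `encode_pool` are illustrative and do not come from the text.

```python
import string

# 26 letter detectors plus one space detector = 27 nodes per pool.
ALPHABET = string.ascii_lowercase + " "

def encode_pool(ch):
    """Return the 27 detector activations for a single character."""
    ch = ch.lower()
    if len(ch) != 1 or ch not in ALPHABET:
        # Punctuation and other symbols are ignored in this simplified setup.
        raise ValueError(f"no detector for {ch!r}")
    return [1 if ch == letter else 0 for letter in ALPHABET]

pool = encode_pool("s")
print(len(pool), sum(pool), ALPHABET[pool.index(1)])
```

Five such pools side by side would give the input layer 5 × 27 = 135 nodes.
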
These input pools are going to be organised such that each pool is
directed on a different letter position of the text string input. At each
time step, one pool will be directed at a target letter position. The
other four pools will be organised such that one pool is directed at
each of the two letter positions on either side of the target position.
This will allow the network to make a contextual determination of
which phoneme a particular letter stands for given the surrounding
orthographic context (the two letters either side of the target letter).
At the first time step, the first letter in the text string is placed in the
target position. At each subsequent time step, the text string is
advanced such that the next letter in the string is in the target position.
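
The stepping of the text string past the five pools can be sketched as a sliding window. This is only an illustration: the function name `windows` is hypothetical, and treating positions beyond either end of the string as spaces is an assumption, since the text does not say what the outer pools see at the first and last time steps.

```python
def windows(text, context=2):
    """Yield one five-character window per time step.

    The target letter sits at index `context` of each window, with
    `context` letters of orthographic context on either side.
    """
    # Assumption: positions beyond the ends of the string count as spaces.
    padded = " " * context + text.lower() + " " * context
    for step in range(len(text)):
        yield padded[step:step + 2 * context + 1]

for w in windows("this"):
    print(repr(w), "target:", w[2])
```

At the first time step the target is the first letter of the string, and each later step advances the string by one position.
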
The output layer of our speech synthesising network will consist of
an output node for each phoneme, so if we are considering Australian
English there will be forty-four output nodes. To keep things simple
here, we’re going to consider just one phoneme whose pronunciation
is invariant across English dialects: /s/ – the word final sound in ‘kiss’
and ‘this’.
We can help ourselves to as many hidden units as we require in
order to match inputs to outputs correctly. We’re going to set threshold
values and connection weights such that our network makes
correct determinations concerning whether or not /s/ should be pronounced
for the following test set of words: this, gas, wish, shy, kiss,
passive, asia, asian, asiatic, is, as, ice, justice, service.
The first thing to do is to accommodate the standard case. When
we see the letter ‘s’, it is usually the case that the phoneme /s/ should
be produced. So the first thing we’ll do is to connect the ‘s’ detector
in the input pool for the target letter position directly to the output
unit representing the phoneme /s/ such that if the letter ‘s’ is detected
in the target position, the /s/ unit will fire unless otherwise inhibited
(see Figure 19.3).
Our network will now perform correctly with respect to the first
two words in our test set – ‘this’ and ‘gas’. When the ‘s’ in each word
reaches the target position, the /s/ unit will fire, as it should.
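
A rough sketch of this first version of the network follows. The weight of 1 and threshold of 1 are assumed values chosen for illustration (Figure 19.3 is not reproduced here), and `windows` and `s_unit_fires` are hypothetical names.

```python
def windows(text, context=2):
    """Five-character windows, one per time step (ends padded with spaces)."""
    padded = " " * context + text.lower() + " " * context
    return [padded[i:i + 2 * context + 1] for i in range(len(text))]

def s_unit_fires(window, weight=1, threshold=1):
    """Output unit for /s/, driven only by the 's' detector in the target pool."""
    s_detector = 1 if window[2] == "s" else 0
    # The unit fires when its weighted input reaches the (assumed) threshold.
    return weight * s_detector >= threshold

for word in ("this", "gas"):
    print(word, [s_unit_fires(w) for w in windows(word)])
```

For both words the unit fires only at the time step where ‘s’ occupies the target position.
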
There are, however, numerous words in which the letter ‘s’ does not
represent the phoneme /s/. Our nascent speech synthesising network
will currently make incorrect determinations with respect to the
remainder of the words in our test list. Our next task then is to design
hidden units which detect contexts in which the letter ‘s’ appears but
the phoneme /s/ should not be produced and use these hidden units
to inhibit the activation of the output unit accordingly.
We want the contexts represented in the hidden layer to be as
general as possible so as to accommodate the maximum number of
cases. It is almost always the case in English – with the exception of
some proper names and compound words – that when a letter ‘s’ is
followed by a letter ‘h’ it is not pronounced as /s/. The first hidden unit
we will add to the network will detect just such contexts and inhibit
the output unit (see Figure 19.4).
Now, when our network is presented with either of the next two
words in our test set – ‘wish’ or ‘shy’ – the hidden unit we added will
inhibit the /s/ unit such that it will not fire, as required. As should be
clear, the network now considers any orthographic context in which
‘s’ is followed by ‘h’ to be a context in which /s/ is not produced.
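
One way to sketch the inhibiting hidden unit is shown below. The specific numbers (excitatory weights of 1, a hidden threshold of 2, an inhibitory weight of -1 and an output threshold of 1) are assumptions for illustration, as Figure 19.4 is not reproduced here, and all function names are hypothetical.

```python
def windows(text, context=2):
    """Five-character windows, one per time step (ends padded with spaces)."""
    padded = " " * context + text.lower() + " " * context
    return [padded[i:i + 2 * context + 1] for i in range(len(text))]

def sh_hidden_fires(window):
    """Hidden unit: fires only when 's' is the target and 'h' follows it."""
    s_target = 1 if window[2] == "s" else 0
    h_next = 1 if window[3] == "h" else 0
    # Two weight-1 inputs and a threshold of 2 mean both must be active.
    return s_target + h_next >= 2

def s_unit_fires(window):
    """Output unit for /s/: excited by 's', inhibited by the 'sh' hidden unit."""
    s_target = 1 if window[2] == "s" else 0
    inhibition = -1 if sh_hidden_fires(window) else 0
    return s_target + inhibition >= 1

for word in ("this", "gas", "wish", "shy"):
    print(word, any(s_unit_fires(w) for w in windows(word)))
```

With these values the /s/ unit still fires for ‘this’ and ‘gas’ but stays silent throughout ‘wish’ and ‘shy’.
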
Another context which exhibits similar regularity is one in which
‘s’ is followed by another ‘s’, such as our test words ‘kiss’ and ‘passive’.
In such cases the phoneme /s/ is produced, but only once. As it stands,
our network will determine that /s/ should be pronounced twice as
the output unit will fire when each ‘s’ is in the target position.
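
The double firing can be checked with a small sketch that counts how often the /s/ unit fires per word. For brevity it collapses the threshold arithmetic into equivalent boolean tests; the function name `s_firings` and the space padding at the ends of the string are assumptions, not details from the text.

```python
def s_firings(word):
    """Count the time steps at which the /s/ output unit fires for a word."""
    padded = "  " + word.lower() + "  "  # assumed: spaces beyond the string ends
    count = 0
    for step in range(len(word)):
        window = padded[step:step + 5]
        s_target = window[2] == "s"                  # direct excitatory input
        sh_context = s_target and window[3] == "h"   # inhibiting hidden unit
        if s_target and not sh_context:
            count += 1
    return count

for word in ("kiss", "passive"):
    print(word, s_firings(word))
```

Both words yield a count of 2, one firing for each ‘s’, where a single /s/ is wanted.
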
  