Applied Speech and Audio Processing: With MATLAB Examples


4.5. Auditory scene analysis
Figure 4.8: Illustration of the feature of common fate by comparing three sections of audio tones.
The first section reproduces a pleasing note generated from three related sinewaves of different
frequencies. The second section extends each of the three fundamentals of the original sound
with their first harmonics. The third section modulates each of the three tones plus their first
harmonic. The three modulating frequencies are unrelated.
4.5.3 Common fate
When groups of tones or noises start, stop or fluctuate together, they are more likely to
be interpreted as being either part of a combined sound, or at least having a common
source. As an example, a human voice actually consists of many component tones which
are modulated by lung power, vocal tract shape, lip closure and so on, to create speech.
It is likely that the ability to cluster sounds together by ‘common fate’ is the mechanism
by which we interpret many sounds heard simultaneously as being speech from a single
mouth. In noisy environments, or in the case of multi-speaker babble, common fate may
well be the dominant mechanism by which we can separate out the sounds of particular
individuals who are speaking simultaneously.
The same effect, and outcome, can be heard in an orchestra, where several musical
instruments, even when playing the same fundamental note simultaneously, can be
interpreted by the HAS as being played separately. The more the instruments differ, the
more distinctive their modulation, and thus the easier they are to tell apart: a violin
and a bassoon are easier to distinguish than two bassoons playing together.
Let us attempt to demonstrate this effect in MATLAB by creating three sets of combined
audio notes for playback, as illustrated in Figure 4.8. First of all we will define three
sinewave frequencies a, b and c, related by powers of 2^(1/12) so that they sound
pleasant (see Infobox 2.5 on page 33, which describes the frequency relationship of
musical notes). We will then use the tonegen function of Section 2.7.1 to create three
sinewaves with these frequencies, and also three sinewaves of double the frequency:
dur=1.2;                    % duration of each tone in seconds
Fs=8000;                    % sample rate in Hz
a=220;                      % frequency of the lowest note in Hz

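b=a*2^(3/12);               % three semitones above a
c=a*2^(7/12);               % seven semitones above a
sa=tonegen(a, Fs, dur);     % the three fundamentals
sb=tonegen(b, Fs, dur);
sc=tonegen(c, Fs, dur);
sa2=tonegen(a*2, Fs, dur);  % the same notes at double the frequency
sb2=tonegen(b*2, Fs, dur);
sc2=tonegen(c*2, Fs, dur);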
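The tonegen listing from Section 2.7.1 is not reproduced on this page. As a minimal stand-in, assuming it takes a tone frequency, a sample rate and a duration and returns a row vector of sine samples, the following sketch defines an anonymous function of the same name and prints the three note frequencies. The anonymous-function form and the use of round() are assumptions made here for illustration, not necessarily the book's own definition.

```matlab
% Assumed stand-in for tonegen (the Section 2.7.1 listing is not shown here):
% returns round(Fs*Td) samples of a sinewave at Ft Hz as a row vector.
tonegen = @(Ft, Fs, Td) sin(2*pi*Ft*(1:round(Fs*Td))/Fs);

dur = 1.2;  Fs = 8000;  a = 220;
b = a*2^(3/12);      % three semitones above a
c = a*2^(7/12);      % seven semitones above a
fprintf('a=%.2f Hz, b=%.2f Hz, c=%.2f Hz\n', a, b, c);

sa = tonegen(a, Fs, dur);
fprintf('samples per tone: %d\n', length(sa));
```

With a = 220 Hz, b and c come out at roughly 261.63 Hz and 329.63 Hz, and each tone holds 9600 samples at this sample rate and duration.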
Next, two sounds are defined which in turn mix together these notes:
sound1=sa+sb+sc;
sound2=sound1+sa2+sb2+sc2;
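One way to check what these mixtures contain is to look at the magnitude spectrum of sound2. The sketch below is an illustration rather than part of the original listing: it uses an assumed stand-in for tonegen, and a simple neighbour comparison in place of a dedicated peak-finding function. The 0.25 threshold is an arbitrary choice made here.

```matlab
% Assumed stand-in for tonegen (hypothetical; the book's listing is not shown here).
tonegen = @(Ft, Fs, Td) sin(2*pi*Ft*(1:round(Fs*Td))/Fs);

dur = 1.2;  Fs = 8000;  a = 220;
b = a*2^(3/12);  c = a*2^(7/12);
sound1 = tonegen(a, Fs, dur) + tonegen(b, Fs, dur) + tonegen(c, Fs, dur);
sound2 = sound1 + tonegen(2*a, Fs, dur) + tonegen(2*b, Fs, dur) + tonegen(2*c, Fs, dur);

N   = length(sound2);
mag = abs(fft(sound2));
mag = mag(1:N/2);                       % keep the positive-frequency half
mid = mag(2:end-1);                     % interior bins, each with two neighbours
% A bin counts as a peak if it exceeds both neighbours and a quarter of the largest bin.
isPeak = [false, mid > mag(1:end-2) & mid > mag(3:end), false] & (mag > 0.25*max(mag));
peakFreqs = (find(isPeak) - 1)*Fs/N;    % convert bin index to frequency in Hz
disp(round(peakFreqs));
```

The listed peaks should fall close to the three fundamentals and their doubled frequencies, to within the bin spacing of Fs/N (about 0.83 Hz here).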
Now, three different modulation patterns are created, again using the tonegen function,
but at much lower frequencies of 7, 27 and 51 Hz, chosen arbitrarily so that they are not
harmonically related to either the original tones or to each other. These are then used to
modulate the various original tones:
mod1=tonegen(7, Fs, dur);   % three slow modulators
mod2=tonegen(27, Fs, dur);
mod3=tonegen(51, Fs, dur);
am=mod1.*(sa+sa2);          % each note and its doubled frequency share one modulator
bm=mod2.*(sb+sb2);
cm=mod3.*(sc+sc2);
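Because the modulators are plain sinewaves that swing both positive and negative, each product is a multiplication of two sinusoids. By the standard product-to-sum identity, sin(x)sin(y) = [cos(x-y) - cos(x+y)]/2, so modulating a 220 Hz tone at 7 Hz yields components at 213 Hz and 227 Hz rather than at 220 Hz itself. The following sketch checks this numerically for one tone. The time vector t is written out explicitly instead of calling tonegen, which assumes tonegen indexes its samples from 1 to Fs*dur.

```matlab
Fs = 8000;  dur = 1.2;
t = (1:round(Fs*dur))/Fs;                 % assumed sample times in seconds

carrier   = sin(2*pi*220*t);              % stands in for sa
modulator = sin(2*pi*7*t);                % stands in for mod1
product   = modulator.*carrier;           % same operation as mod1.*sa

% Product-to-sum identity: two sidebands at 220-7 and 220+7 Hz.
sidebands = 0.5*(cos(2*pi*(220-7)*t) - cos(2*pi*(220+7)*t));

maxErr = max(abs(product - sidebands));   % should sit at rounding-error level
fprintf('largest difference: %g\n', maxErr);
```

The same reasoning applies to every modulated component, so each of the three groups occupies its own set of sideband frequencies as well as having its own fluctuation rate.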
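Finally, a short gap is defined for the third sound. Padding each modulated component with
a different arrangement of gaps staggers the three onsets by 50 ms, while each tone still
starts together with its own harmonic. The components are combined into sound3, and finally
placed into a composite vector for replaying: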
gap=zeros(1,round(Fs*0.05));                     % 50 ms of silence
sound3=[am,gap,gap]+[gap,bm,gap]+[gap,gap,cm];   % onsets at 0, 50 and 100 ms
soundsc([sound1,sound2,sound3], Fs)              % play the three segments in turn
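The timing of the composite vector follows directly from these definitions. As a small worked check, assuming tonegen returns Fs*dur samples per tone, the segment lengths and the onset times inside sound3 can be computed without generating any audio. The variable names below are introduced here for illustration.

```matlab
Fs = 8000;  dur = 1.2;
toneLen   = round(Fs*dur);        % samples in sound1 and in sound2
gapLen    = round(Fs*0.05);       % samples in one gap
sound3Len = toneLen + 2*gapLen;   % each padded component is one tone plus two gaps
onsets    = (0:2)*gapLen/Fs;      % start times of am, bm and cm within sound3

fprintf('sound1, sound2: %.2f s each; sound3: %.2f s\n', toneLen/Fs, sound3Len/Fs);
fprintf('component onsets in sound3: %.2f, %.2f, %.2f s\n', onsets);
```

Under that assumption the third segment is slightly longer than the first two, at 1.3 s rather than 1.2 s, because of the two extra gaps.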
A listener exposed to the sounds created should hear, in turn, three segments of audio
lasting approximately 1.2 s each. Firstly, a pleasing musical chord consisting of three
notes is heard. Then this musical chord is augmented with some harmonically
related higher frequencies. For both of these segments, the perception is of a single
musical chord being produced. However, the final segment is rather discordant, and
appears to consist of three separate audible components.
Because each of the three harmonic notes from the second segment is now modulated
differently, the brain no longer considers them as being from the same source, but
rather from different sources. It thus separates them in its perceptual space.
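As a hedged follow-up that is not part of the original listing, one can apply a single modulator to all six components, so that they again share a common fate. If the explanation above holds, this version would be expected to fuse back into one fluctuating chord, although that is a listening prediction rather than a measured result. The name sound4 and the tonegen stand-in are assumptions introduced here for illustration.

```matlab
% Assumed stand-in for tonegen (hypothetical; the book's listing is not shown here).
tonegen = @(Ft, Fs, Td) sin(2*pi*Ft*(1:round(Fs*Td))/Fs);

dur = 1.2;  Fs = 8000;  a = 220;
b = a*2^(3/12);  c = a*2^(7/12);

% Rebuild the second segment: three notes plus their doubled frequencies.
sound2 = tonegen(a, Fs, dur) + tonegen(b, Fs, dur) + tonegen(c, Fs, dur) ...
       + tonegen(2*a, Fs, dur) + tonegen(2*b, Fs, dur) + tonegen(2*c, Fs, dur);

mod1   = tonegen(7, Fs, dur);     % one modulator shared by every component
sound4 = mod1.*sound2;            % hypothetical control segment with common modulation

% soundsc(sound4, Fs)             % uncomment to listen and compare with sound3
```

Comparing this segment by ear with the third one isolates the effect of the modulators being different, since the notes themselves are unchanged.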
