Applied Speech and Audio Processing: With MATLAB Examples
Table 4.2. Binaural masking conditions and features (constructed from data presented in [18]).

                        Percentage of words correctly identified
                        Phasic    Random    Antiphasic
stereo  phasic            18        27         35
        antiphasic        43        27         16
mono    right ear         30        13         20
        left ear          18         8         15

4.2.13 Binaural masking

The human brain can correlate the signals received by two ears to provide a processing gain. This is measured by presenting a signal of noise plus tone to both ears, and adjusting the tone level until it is just masked by the noise. From this situation a change is made, such as inverting the tone phase to one ear, and the tone amplitude is again adjusted until it is just barely audible. The difference in tone amplitude between the two cases reveals the processing gain. Table 4.1 (constructed from data presented in [4,9]) summarises the results under different conditions. When the signal is speech, intelligibility is reduced in antiphasic conditions (uncorrelated noise, out of phase speech) over phasic conditions (uncorrelated noise, speech in phase), as revealed in Table 4.2. The effects of binaural masking are greatest at lower frequencies, but depend upon the frequency distribution of both the test signal and the noise.

Note that binaural is a term used by auditory professionals and researchers. It means 'pertaining to both ears', and is frequently confused with stereo. Stereo actually refers to sound recorded for multiple loudspeakers, which is designed to be replayed several metres in front of a listener, and generally enjoyed by both ears. Users of headphones are thus exposed to a significantly different sound field from that experienced by those listening with loudspeakers, a fact which is not normally exploited by audio equipment manufacturers.

4.2.14 Mistuning of harmonics

Two complex musical tones are perceived as separate when they have different fundamental frequencies; however, the hearing process is capable of dealing with slight mistuning of certain components, so almost-equal fundamentals can sometimes be perceived as being equal.
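As a rough illustration of what a slightly mistuned component looks like in practice, the following sketch builds a complex tone from the harmonics of a single fundamental and then shifts one harmonic by a small percentage. It does not use the book's tonegen function. The fundamental, the number of harmonics, the duration and the two mistuning amounts are assumptions chosen for illustration only, and what is actually heard will vary with listener and playback conditions.

```matlab
% Sketch: harmonic complex with one mistuned component (illustrative values only)
fs  = 8000;                     % sample rate in Hz, as in the text's examples
dur = 0.4;                      % tone duration in seconds (assumed)
f0  = 220;                      % fundamental frequency in Hz (assumed)
nh  = 6;                        % number of harmonics (assumed)
t   = (0:round(dur*fs)-1)/fs;   % time axis, one row vector

% build(m): sum of nh harmonics, where m(k) is the fractional shift
% applied to harmonic k (0 means perfectly in tune)
build = @(m) sum(sin(2*pi*((1:nh)'.*f0.*(1+m(:)))*t), 1)/nh;

intune = build(zeros(nh,1));            % all harmonics exact multiples of f0
slight = zeros(nh,1); slight(3) = 0.01; % third harmonic 1% sharp (assumed)
large  = zeros(nh,1); large(3)  = 0.08; % third harmonic 8% sharp (assumed)
xs = build(slight);
xl = build(large);

play = false;                   % set to true to listen to the three tones in turn
if play
    soundsc(intune, fs); pause(dur+0.3);
    soundsc(xs, fs);     pause(dur+0.3);
    soundsc(xl, fs);
end
```

Setting play to true replays the in-tune tone, then the two mistuned versions, so that they can be compared by ear.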
Again, MATLAB can be used to demonstrate this effect. The following section of code creates a complex musical chord, in this case an A4 plus the notes three and eight semitones (3/12 and 8/12 of an octave) above it (refer to Infobox 2.5: Musical notes on page 33 for an explanation of the one-twelfth power used below):

    note=440;                             % A4 fundamental in Hz
    t1=tonegen(note, 8000, 1);            % 1 s tone at 8 kHz sample rate
    t2=tonegen(note*2^(3/12), 8000, 1);   % three semitones above A4
    t3=tonegen(note*2^(8/12), 8000, 1);   % eight semitones above A4

When replayed, this makes an obvious musical chord:

    soundsc(t1+t2+t3, 8000);

Next, we will mistune the centre note in the chord by 5% and then listen to the result:

    m2=tonegen(note*1.05*2^(3/12), 8000, 1);   % centre note, 5% sharp
    soundsc(t1+m2+t3, 8000);

The resulting sound should still be perceived as a fairly pleasant-sounding musical chord, with a different quality to the correctly tuned chord, but musically still compatible, to the relief of piano tuners everywhere. In general, a slight mistuning of one harmonic will result in a perception of reduced amplitude, until the degree of mistuning becomes such that the harmonic is perceived as a tone in its own right. Again, the effects depend on duration, amplitude and absolute frequency (as well as person-to-person differences), but a rule of thumb is that 400 ms long tones must be mistuned by over 3% for them to be heard separately [19]. Note that this is not at all the same effect as the beat frequency caused by two interfering tones.

4.2.15 The precedence effect

The Haas or precedence effect ensures that if similar versions of a sound reach an observer at slightly delayed times, then the observer will hear the first signal but suppress the subsequent versions [3]. This effect, which acts only on signals reaching the ear within 50 ms of each other, explains why we can still understand speech in an environment containing multiple echoes (such as a small room). The first sound to reach the observer will be heard in preference to further sounds even if the secondary sounds are up to 10 dB louder.
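A minimal sketch of how the precedence effect might be auditioned over headphones: a short tone burst is placed in the left channel, and a delayed, louder copy in the right. The burst shape, the 20 ms delay, the 6 dB boost and the nominal 343 m/s speed of sound are all assumptions chosen to sit inside the limits quoted above. Whether the first arrival dominates will depend on the listener and the playback setup.

```matlab
% Sketch: leading sound in one ear, delayed louder copy in the other
fs      = 8000;     % sample rate in Hz, as in the text's examples
delayms = 20;       % delay of the second arrival in ms (assumed, under 50 ms)
gaindB  = 6;        % boost of the delayed copy in dB (assumed, under 10 dB)

% Source: 50 ms burst of a 440 Hz tone with a raised-cosine envelope
n     = round(0.05*fs);
k     = 0:n-1;
burst = sin(2*pi*440*k/fs) .* (0.5 - 0.5*cos(2*pi*k/(n-1)));

d     = round(delayms*1e-3*fs);     % delay in samples
g     = 10^(gaindB/20);             % dB converted to a linear amplitude ratio
first = [burst, zeros(1,d)];        % first arrival
late  = g*[zeros(1,d), burst];      % delayed, louder arrival

stereo = [first(:), late(:)];       % column 1 = left, column 2 = right
stereo = stereo/max(abs(stereo(:)));% normalise peak to 1

% Extra path length travelled by the delayed copy, assuming 343 m/s in air
pathdiff = delayms*1e-3*343;        % in metres

play = false;                       % set to true to listen
if play
    soundsc(stereo, fs);
end
```

Increasing delayms well beyond 50 ms should make the second arrival increasingly audible as a separate event.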
Once echoes reach an observer with time delays of more than about 65 ms (corresponding to a distance of approximately 20 m in air), they will be perceived as being