Applied Speech and Audio Processing: With MATLAB Examples
4.2. Psychoacoustics
4.2.2 Cochlea echoes

When the ear is stimulated with tones, active processing produces components with differing frequencies and amplitudes from the stimulating tones [4]. For example, if only two tones (f1 and f2, with f2 > f1) are present, then one may perceive a tone at a frequency of 2f1 − f2. This effect is well known to musicians, who exploit it to induce musical harmony. Note that a similar effect is found when a complex harmonic structure with its fundamental frequency removed is perceived as having a frequency equal to the fundamental (see Section 4.2.12).

4.2.3 Phase locking

As noted previously, neural excitation by the hair cells only occurs at the rarefaction part of the sound wave, at approximately fixed phase, although cells can vary considerably as to their individual phase positions. The average firing frequency for all cells will be of the correct frequency and correct phase. Some cells only fire every two, four or six cycles, but this does not alter the overall firing rate [6].

Because of this cycle averaging process, which the ear seems to use to distinguish between frequencies, it is possible for the ear to become attuned to a particular frequency, with hair cells firing in time with a rarefaction and not recovering until a short time later. Another rarefaction in this recovery period may be missed. The result is that a greater than usual amplitude will be required in order for a second tone (causing the additional rarefaction) to be heard [4]. This phase locking, as it is called, may well be part of the explanation for why a single tone will suppress tones of similar frequency and lower amplitude, part of the simultaneous masking phenomenon discussed in Section 4.2.8.

4.2.4 Signal processing

Cells in the auditory cortex are excited by a variety of acoustic signals; however, cells do not in general respond to single tones, but require more complex sounds [6].
Natural processing in the human brain can detect tone start, tone stop, tone pulse, frequency slide up, frequency slide down, amplitude variation and noise burst conditions. One experiment has even determined the position of a group of brain cells that specialises in detecting ‘kissing’ sounds [4]. One wonders at the experimental methods employed.

4.2.5 Temporal integration

The ear’s response with respect to time is highly nonlinear. For a tone duration below 200 ms, the intensity required for detection increases with decreasing duration, linearly proportional to the duration multiplied by the intensity required for detection of a constant tone of that frequency. Tones of longer duration, above about 500 ms, are detected irrespective of their duration, complexity and pattern [4]. In a similar way, periods of silence introduced into a constant tone are detectable to an extent that depends upon their duration, until the duration exceeds about 200 ms.

4.2.6 Post-stimulatory auditory fatigue

After an abnormally loud sound, the ear’s response is reduced during a recovery period, after which it returns to normal [3]. This is termed temporary threshold shift (TTS) [4]. The degree of TTS depends upon the intensity of the fatiguing stimulus, its duration, its frequency and the recovery interval. It is also frequency specific, in that the TTS effect is distributed symmetrically about the frequency of the fatiguing tone and is limited to its immediate neighbourhood, but the actual frequency spread of the TTS curve is related to the absolute frequency. It is also related to the amplitude of the tone, and to the logarithm of the duration of the fatiguing tone (although the middle ear reflex muscle action reduces TTS for low frequencies, and a tone duration of over five minutes will produce no appreciable increase in TTS).
When the fatiguing noise is broadband, TTS occurs mostly between 4 and 6 kHz, begins immediately and may still be noticeable up to 16 hours after the noise onset [9]. Tones louder than about 110 or 120 dB SPL cause permanent hearing loss, but TTS is most prominent for amplitudes of 90 to 100 dB SPL.

4.2.7 Auditory adaptation

The response of a subject to steady-state tones will decline to a minimum over time, although an amplitude of about 30 dB SPL is needed to trigger the effect. For the user this might be noted as a particular interfering sound becoming less noticeable with time. However, it is worth noting that this effect appears to be highly subjective, with some subjects experiencing tones disappearing totally whilst others only experience a 3 dB or smaller reduction. The auditory system cannot adapt to truly broadband noise, and the literature reports that high frequency tones are easier to adapt to than low frequency tones [4]. Despite these reports concerning broadband noise, it appears anecdotally that on long aeroplane journeys, an initial high level of cabin noise (which is subjectively quite broadband) becomes almost unnoticed by the end of a flight.

4.2.8 Masking

Masking deserves a chapter by itself, and indeed will form the basis of psychoacoustic modelling later in this book, but for now a basic introduction is sufficient. Masking in general is defined by the American standards agency as ‘the process by which the threshold of audibility for one sound is raised by the presence of another sound’ and ‘the amount by which the threshold of audibility of sound is raised by the presence of another sound’.

The frequency selectivity of the basilar membrane may be considered similar to the functionality provided by a bank of bandpass filters, with the threshold of audibility in each filter being dependent upon the noise energy falling within its passband [10].
The filters each have similarly shaped responses, with bandwidths of approximately 100 Hz up to frequencies of about 1 kHz. Above this, bandwidth increases in a linear fashion with frequency, up to a 3 kHz bandwidth at 10 kHz. Each ‘filter’ in the array is termed a critical band filter [11]. We will return to these filters later in Section 4.3.2.
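
The bandwidth figures quoted above can be turned into a rough rule of thumb. The following is a minimal Python sketch, not the critical band model of Section 4.3.2: the function name critical_bandwidth_hz is hypothetical, and the straight line joining 100 Hz at 1 kHz to 3 kHz at 10 kHz is an assumption made here to give 'increases in a linear fashion' a concrete form. Since the text gives no figures above 10 kHz, the sketch rejects such inputs instead of extrapolating.

```python
def critical_bandwidth_hz(freq_hz: float) -> float:
    """Approximate critical bandwidth (Hz) at centre frequency freq_hz (Hz).

    Assumption: piecewise-linear reading of the figures in the text,
    i.e. 100 Hz wide up to 1 kHz, then a straight line rising to
    3 kHz wide at 10 kHz.
    """
    if not 0.0 < freq_hz <= 10_000.0:
        # The text quotes no values outside this range.
        raise ValueError("approximation only covers 0 < f <= 10 kHz")
    if freq_hz <= 1_000.0:
        return 100.0
    # Slope of the assumed line between (1 kHz, 100 Hz) and (10 kHz, 3 kHz).
    slope = (3_000.0 - 100.0) / (10_000.0 - 1_000.0)
    return 100.0 + slope * (freq_hz - 1_000.0)


if __name__ == "__main__":
    # Print the approximate bandwidth at a few centre frequencies.
    for f in (250, 1_000, 4_000, 10_000):
        print(f"{f:>6} Hz -> {critical_bandwidth_hz(f):7.1f} Hz wide")
```

Under this reading, a masker near 4 kHz falls within a band roughly ten times wider than one near 500 Hz, which is why the same noise bandwidth can be far more effective at masking low frequency tones than high frequency ones.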