Applied Speech and Audio Processing: With MATLAB Examples
Psychoacoustic processing
The use of psychoacoustic criteria to improve communications systems, or rather to target the available resources to subjectively more important areas, is now common. Many telephone communications systems use A-law compression. Philips and Sony have respectively produced the DCC (digital compact cassette) and the MiniDisc formats which both make extensive use of equal loudness contours, and masking information to compress high quality audio [22]. Whilst neither of these were runaway market successes, they introduced psychoacoustics to the music industry, and paved the way for solid state music players such as the Creative Technologies Zen micro, Apple iPod and various devices from iRiver, Philips, and others too numerous to mention. Most of these devices use the popular MP3 compression format (although its successor, MP4, and several proprietary alternatives exist). It is important to note that all of these music compression methods have something in common: they all use psychoacoustic criteria. All take account of masking thresholds, some consider harmonic relationships, and others exploit binaural masking.

In the remainder of this section, several promising psychoacoustic techniques with potential for use in speech compression are discussed. Not all of these have yet been exploited commercially, and few are in mainstream research.

4.4.1 Tone induction

Knowledge of the residue effect (Section 4.2.12) allows one to induce a low frequency tone by the addition of higher frequency tones. Only three or so high frequency tones are required, and the technique is useful where it would otherwise be necessary to directly add an extremely loud low frequency tone in order to be heard above a low frequency noise. One application of this is in the construction of an artificial speech formant (formants apparently being regarded by Klatt, a leading audio and speech researcher, as being the most important aspect of speech recognition [4]).
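The basic idea of tone induction can be illustrated with a minimal MATLAB sketch. It builds a signal from three high frequency tones that are adjacent harmonics of a low fundamental and then checks that the signal contains no energy at the fundamental itself. The sample rate, the 200 Hz fundamental, and the choice of harmonics 9 to 11 are illustrative assumptions and are not taken from the text. Whether a listener actually perceives a pitch at the fundamental can only be judged by listening.

```matlab
% Sketch of tone induction using the residue effect.
% All numeric values are illustrative assumptions, not from the text.
Fs = 8000;               % sample rate in Hz (assumed)
dur = 1;                 % duration in seconds (assumed)
f0 = 200;                % low frequency tone to be induced, in Hz (assumed)
harmonics = [9 10 11];   % three adjacent high harmonics of f0 (assumed)

t = (0:round(dur*Fs)-1) / Fs;    % time axis
x = zeros(size(t));
for h = harmonics
    x = x + sin(2*pi*h*f0*t);    % add one high frequency tone per harmonic
end
x = x / max(abs(x));             % normalise to avoid clipping on playback

% Check the spectrum: there should be no component at f0 itself.
X = abs(fft(x)) / length(x);     % magnitude spectrum
binHz = Fs / length(x);          % frequency resolution (1 Hz here)
magAtF0 = X(round(f0/binHz) + 1);                  % magnitude at 200 Hz
magAtTone = X(round(harmonics(1)*f0/binHz) + 1);   % magnitude at 1800 Hz

% soundsc(x, Fs);   % uncomment to listen to the result
```

Here magAtF0 is effectively zero while magAtTone is not, so any low pitch heard on playback is induced by the spacing of the high frequency tones and is not physically present in the signal.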
The formant induction application has not yet been substantiated. Critical to its success is the location within the auditory cortex of the mechanisms causing the residue effect. If these mechanisms occur posterior to the mechanisms of speech recognition, then formant induction is unlikely to be successful, and the listener would simply hear speech degraded by high frequency tones.

Another application is in the induction of bass notes using small speaker cones. Generally, for low frequency bass notes to be reproduced, a large loudspeaker cone is needed. However, such items are expensive, unwieldy, and power hungry. This method could potentially be used to allow smaller speakers to induce the kind of sound normally requiring larger, more costly, speakers.

4.4.2 Sound strengthening

As stated in Section 4.2.12, adding harmonics to a complex tone, or adding geometrically related tones, does not change the perceived frequency of a sound, but