Applied Speech and Audio Processing: With MATLAB Examples
Psychoacoustic modelling
7.1 Psychoacoustic modelling

Remember back in Section 4.2 we claimed that this marriage of the art of psychology and the science of acoustics was important in forming a link between the purely physical domain of sound and the experience of a listener? In this section we will examine this further to see why and how that happens.

It follows that a recording of a physical sound wave, which is a physical representation of the audio, contains elements which are very relevant to a listener, and elements which are not. At one extreme, some of the recorded sound may be inaudible to a listener. Equally, it is possible that a recording of sound does not contain some of the original audible features. This may be one reason why many listeners would prefer to spend a sum of money listening to live music rather than an equivalent sum to purchase a compact disc (CD) which allows them to listen to the music again and again. Still, psychoacoustics as a study is predominantly computer-based: it rarely considers information which has not been recorded to computer, but is often used to identify parts of recorded information which are not useful to listeners.

One use of psychoacoustics is illustrated simply in Figure 7.1, showing a time domain waveform with a sudden loud pop or similar sound. A listening human ear will generally suffer from post-stimulatory masking (see Section 4.2.10) due to the sudden onset of the loud sound: quieter sounds following soon after this will tend to be inaudible. Figure 7.1(a) shows a detected masking region (with sound falling in the shaded region being inaudible).

Figure 7.1 Illustration of post-stimulatory masking in the time domain detected (a) and inaudible information stripped out (b).
Any speech processing system capable of detecting this region could practically strip out all audio within the region, as in Figure 7.1(b), resulting in something that sounds identical to the original, but is smaller in storage requirements.

For speech compression systems, this is an enormous advantage: why code/process/represent/store or transmit anything which is inaudible? The popular MP3 format detects many such inaudible regions in original music, and strips them out, compressing only the remaining audio. This mechanism allows for very low bitrate representation of music (and indeed speech): every compressed bit is fairly sure to contribute to overall sound quality.

Readers should, at this point, understand the rationale behind psychoacoustics: to obtain some advantage in coding or compression, through the consideration of the difference between what is physically stored and what is heard by a listener. We have illustrated this with a post-stimulatory masking example, but be aware that potentially most of the hearing characteristics described in Chapter 4 can lead to a psychoacoustic effect. They are most definitely not all used in practice, but at least the potential exists.

Psychoacoustics became a ‘hot’ topic during the 1990s, and has since been applied to almost every sphere in electronic audio systems. A comprehensive list of systems using psychoacoustic-based algorithms would be extremely long, but the following few categories encompass the vast majority:

• compression of high-fidelity audio;¹
• compression of speech;
• audio steganography (data hiding, or ‘watermarking’, in audio) [1];
• active noise cancellation systems;
• speech intelligibility improvement in noise [2].

¹ Many purists maintain that high-fidelity audio can’t be compressed, and some others argue that the long play (LP) record already represents sufficient compression! Here we take the view that such things are possible.
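
As a rough illustration of the Figure 7.1 idea, the following Python sketch detects a post-stimulatory masking region after a loud onset and zeroes the samples that fall inside it. It is not the book's method, nor any codec's actual algorithm: the function name `strip_post_masked`, the frame length, the onset level and the linear-in-dB decay of the masking threshold are all assumptions chosen for illustration, and real masking also depends on frequency, which this sketch ignores.

```python
import math

def strip_post_masked(samples, rate, frame_ms=5.0, onset_db=-10.0,
                      decay_db_per_ms=0.5, floor_db=-90.0):
    """Zero frames lying under a decaying post-masking threshold.

    All numeric defaults are illustrative assumptions, not measured values.
    Returns (processed samples, list of (start, end) stripped regions).
    """
    frame_len = max(1, int(rate * frame_ms / 1000.0))
    out = list(samples)
    regions = []
    threshold_db = floor_db          # floor_db means "no masker active"
    for start in range(0, len(samples), frame_len):
        end = min(start + frame_len, len(samples))
        frame = samples[start:end]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        level_db = 20.0 * math.log10(rms) if rms > 0.0 else floor_db
        if level_db >= onset_db:
            # Loud frame: treat it as a masker and reset the threshold.
            threshold_db = level_db
            continue
        # Quiet frame: the assumed masking threshold decays linearly in dB.
        threshold_db = max(floor_db, threshold_db - decay_db_per_ms * frame_ms)
        if threshold_db > floor_db and level_db < threshold_db:
            out[start:end] = [0.0] * (end - start)
            if regions and regions[-1][1] == start:
                regions[-1] = (regions[-1][0], end)   # extend the region
            else:
                regions.append((start, end))
    return out, regions

# Synthetic test signal: a quiet 1 kHz tone with a 10 ms loud "pop".
rate = 8000

def tone(n, amp):
    return amp * math.sin(2.0 * math.pi * 1000.0 * n / rate)

signal = [tone(n, 1.0 if 400 <= n < 480 else 0.02) for n in range(2080)]
stripped, regions = strip_post_masked(signal, rate)
print(regions)  # → [(480, 1000)]
```

With these assumed parameters the quiet tone is removed for 65 ms after the pop and left untouched elsewhere. Zeroing the region does not shorten the signal by itself; the storage saving comes from a later coding stage that can represent the silent run very cheaply.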