Applied Speech and Audio Processing: With MATLAB Examples
3.3 Speech understanding

Up to now, this chapter has investigated the production of speech and the characteristics of the produced speech. Setting aside whichever communications mechanism is used, this section considers some of the non-auditory factors involved in the understanding of speech by humans: that is, the structure of speech and how it relates to understanding, rather than the nature of human hearing and perception of speech (which will be covered in Chapter 4).

3.3.1 Intelligibility and quality

It is important to distinguish between these two terms. They are often used interchangeably, but their measurement and dependencies are actually very different.

In very simple terms, quality is a measure of the fidelity of speech. This includes how well the speech under examination resembles some original speech, but extends beyond that to how nice the speech sounds. It is a highly subjective measure, but can be approximated objectively.

Intelligibility is a measure of how understandable the speech is. In other words, it concentrates on the information-carrying content of speech. Some examples should clarify the difference:

1. A string of nonsense syllables, similar to baby speech, spoken by someone with a good speaking voice can sound very pleasant and of extremely high quality, yet it contains no verbal information and so has no intelligibility at all.

2. A recording of speech with a high-frequency buzzing sound in the background will be rated as having low quality even though the words themselves may be perfectly understandable. In this case the intelligibility is high.

3. When choosing a car audio system, you might tune to a favourite radio station to test it. Generally the audio system that sounds nicest (of highest quality) would be the one purchased.

4. When the military are in a combat situation, it is usually extremely important to understand the speech from a radio, whereas the quality of the sound is almost totally unimportant. In World War II, crude speech processing (clipping, or filter-clipping) was applied to radios used in aircraft. This made the speech sound shrill and screechy, but significantly improved its intelligibility in a noisy cockpit [5]. The effect can often be heard in films and documentaries of the period.

Despite stressing the difference between quality and intelligibility in this section, it is useful to note that under most circumstances excellent intelligibility implies excellent quality, and very poor intelligibility implies very poor quality. These are the extremes; between these points the relationship between the two is not straightforward.

3.3.2 Measurement of speech quality

Speech quality is normally measured subjectively, in terms of a mean opinion score (MOS). This involves a panel of several listeners, usually placed in a soundproofed room, having the audio recording under evaluation played to them. They then rate it according to the scale shown in Table 3.3.

Table 3.3. Mean opinion score (MOS) rating scale.

Score  Description  Impairment
5      excellent    imperceptible
4      good         perceptible but not annoying
3      fair         slightly annoying
2      poor         annoying
1      bad          very annoying

The MOS of a particular recording is the mean of the scores reported by the individual listeners. In general, the more listeners, the more accurate (and repeatable) the result will be. This test is standardised by the International Telecommunication Union (ITU) in Recommendation P.800, which is widely used in the audio community.
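
As a minimal sketch of the calculation described above, the MATLAB fragment below averages a set of listener ratings to give a MOS. The ratings are invented purely for illustration. The standard error is an addition that the text does not specify; it is included only to indicate how the uncertainty of the mean shrinks as listeners are added, under the assumption that listeners rate independently.

```matlab
% Hypothetical ratings from a panel of eight listeners for one recording,
% each on the 1 (bad) to 5 (excellent) scale of Table 3.3.
% These values are invented for illustration only.
scores = [4 3 4 5 3 4 4 2];

n   = numel(scores);     % number of listeners on the panel
mos = mean(scores);      % mean opinion score for this recording
s   = std(scores);       % sample standard deviation of the ratings
sem = s / sqrt(n);       % standard error of the mean (assumes independent listeners)

fprintf('MOS = %.2f from %d listeners (standard error %.2f)\n', mos, n, sem);
```

For a fixed spread of ratings, the standard error falls in proportion to the square root of the panel size, so quadrupling the number of listeners roughly halves it. This is one way to see why larger panels give more repeatable scores.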