Applied Speech and Audio Processing: With MATLAB Examples
Speech communications
are then transmitted from coder to decoder, where they are used to recreate a similar (but not identical) waveform. Apart from the likelihood of the transmitted parameters requiring fewer bits to represent than a directly coded waveform, parameterisation can hold two other benefits. Firstly, if the parameters are chosen to be particularly relevant to the underlying sound (i.e. a better match to the speech signal), then the difference between the original and the coded-decoded signal can be reduced, leading to better fidelity. Secondly, the method of quantising the parameters themselves – or rather the number of bits assigned to each parameter – can be carefully chosen to improve quality. In simpler terms, given a ‘pool’ of bits available to represent the parameters transmitted from encoder to decoder, it is possible to ‘spend’ more bits on the parameters that matter most to overall quality (the measure of quality itself will be considered alongside intelligibility in Chapter 6). In the more advanced speech coding algorithms, parameters are chosen that match the component signals within the speech, or that match the important aspects of the human auditory system, or perhaps cater a little to both speech production and speech understanding.

Figure 5.6 shows the process used in a great many modern speech coding systems, where the original speech signal is split into components that describe the overall gain or amplitude of the speech vector being coded, the pitch information, vocal tract resonances, and lung excitation. Each of these parameters is quantised and transmitted from encoder to decoder. At the decoder the parameters are used together to recreate synthesised speech.

Infobox 5.2 PCM-based speech coding standards
As you might expect, there is a plethora of different speech coding techniques based around PCM.
The main standardisation body in this area is the International Telecommunication Union (ITU), since the primary driver in voice communications worldwide has been telecommunications. In the past the ITU's standardisation committee was known as the Comité consultatif international téléphonique et télégraphique, or CCITT (it is now called ITU-T). Even in English the name is a mouthful: ‘International Telegraph and Telephone Consultative Committee’, hence the move to the far simpler title. Several of the more common ITU standards, all beginning with the prefix G, are shown in the following table:

Name    Description
G.711   8 kHz sampling, A-law and µ-law compression
G.721   32 kbits/s ADPCM standard (replaced by G.726)
G.723   24 and 40 kbits/s ADPCM (replaced by G.726)
G.722   64 kbits/s SB-ADPCM sampled at 16 kHz
G.726   24, 32 and 40 kbits/s ADPCM sampled at 8 kHz

Several other ITU speech standards are shown on page 131.
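The µ-law compression listed against G.711 in the table can be illustrated with a minimal Python sketch. It applies the continuous µ-law curve with µ = 255 and then quantises uniformly to 8 bits. This is an assumption-laden illustration: the G.711 standard itself specifies a segmented, piecewise-linear approximation and a particular bit layout, neither of which is reproduced here, and all function names are hypothetical.

```python
import math

MU = 255.0  # mu-law constant assumed for this sketch (A-law uses a different curve)

def compress(x, mu=MU):
    """Continuous mu-law compression of a sample x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def to_code(v, bits=8):
    """Uniformly quantise v in [-1, 1] to an integer code of the given width."""
    top = 2 ** bits - 1
    return round((v + 1.0) / 2.0 * top)

def from_code(code, bits=8):
    """Map an integer code back to a value in [-1, 1]."""
    top = 2 ** bits - 1
    return code / top * 2.0 - 1.0

def mulaw_roundtrip(x, bits=8):
    """Compress, quantise, dequantise and expand one sample."""
    return expand(from_code(to_code(compress(x), bits), bits))

def linear_roundtrip(x, bits=8):
    """Plain uniform quantisation with no companding, for comparison."""
    return from_code(to_code(x, bits), bits)

# A quiet sample: companding spends more of the 8-bit range on small amplitudes.
quiet = 0.01
err_mulaw = abs(mulaw_roundtrip(quiet) - quiet)
err_linear = abs(linear_roundtrip(quiet) - quiet)
print("mu-law error:", err_mulaw, "linear error:", err_linear)

# 8 kHz sampling at 8 bits per sample gives the 64 kbits/s rate of G.711.
bit_rate = 8000 * 8
print("bit rate:", bit_rate, "bits/s")
```

For the quiet sample chosen here, the companded round trip lands closer to the original than plain 8-bit uniform quantisation does, which is the motivation for companding speech before coding it with few bits.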
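The idea of ‘spending’ a pool of bits on the parameters that matter most, described before the infobox, can be sketched as a simple greedy allocation. This is only one possible scheme and is not taken from the book: the importance weights below are invented for illustration, and the sketch assumes that each extra bit divides a parameter's quantisation error power by four (roughly 6 dB per bit).

```python
def allocate_bits(weights, pool):
    """Greedily hand out `pool` bits, one at a time.

    `weights` maps a parameter name to a relative importance (hypothetical
    values). At each step the bit goes to the parameter whose weighted error
    is currently largest, assuming each bit divides error power by 4.
    """
    bits = {name: 0 for name in weights}
    for _ in range(pool):
        worst = max(weights, key=lambda name: weights[name] / 4 ** bits[name])
        bits[worst] += 1
    return bits

# Invented importance weights for the four components named in the text.
weights = {
    "gain": 1.0,
    "pitch": 4.0,
    "vocal_tract": 16.0,
    "excitation": 2.0,
}

allocation = allocate_bits(weights, pool=8)
print(allocation)
```

With these made-up weights the vocal tract parameters receive the most bits and the gain the fewest, while the total always equals the size of the pool. Real coders choose such allocations from listening tests and perceptual criteria rather than from a fixed weight table.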