Applied Speech and Audio Processing: With MATLAB Examples
5.1. Quantisation
Figure 5.3 Illustration of an audio waveform being represented using adaptive delta modulation, where a 1 indicates a stepwise increase in signal amplitude and a 0 indicates a stepwise decrease in signal amplitude. A run of three like sample values triggers a doubling of the stepsize, and neighbouring samples that are unlike trigger a halving of the stepsize. The predominant stepsize in the figure is the minimum limit, and there would likewise be a maximum stepsize limit imposed.

Figure 5.4 Illustration of an audio waveform being quantised to 16 levels of PCM.

... calculated, quantised and then added to the accumulator. Evidently the difference can be either positive or negative, and it is thus the quantised difference value that is transmitted at each sample instant. The adaptive nature of the system comes into play during the quantisation stage, by adjusting the size of the quantisation steps. Typically, a 16-bit sample vector would be encoded to 3-, 4- or 5-bit ADPCM values, one per sample.

As an example, consider the artificial PCM sample stream in Figure 5.4, with each sample being quantised to 16 levels (in reality a sample stream would be quantised to 65 536 levels in a 16-bit system, but for the sake of clarity the illustration is simplified). At each sampled time instant the value of the quantised level represents the waveform amplitude. In this case the PCM sample vector would be {07 07 07 08 09 12 15 15 13 08 04 03 07 12 13 14 14}.

If we were to use a differential PCM scheme, then we would calculate the difference between each sample and the one before it. Assuming we start with an accumulator of zero, the differences for the sample vector shown would be {07 00 00 01 01 03 03 00 −02 −05 −04 −01 04 05 01 01 00}. Apart from the first sample, we can see that the differential vector values are much smaller (i.e. we could use fewer bits to represent them).
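The differential vector above can be reproduced in a few lines of MATLAB. This is only a minimal sketch of the idea, not code from the text: the variable names are illustrative, and it assumes plain sample-to-sample differencing with the accumulator starting at zero, exactly as in the worked example.

```matlab
% PCM sample vector from the 16-level example of Figure 5.4
pcm = [7 7 7 8 9 12 15 15 13 8 4 3 7 12 13 14 14];

% Differential coding: prepending the initial accumulator value (zero)
% makes the first difference equal to the first sample itself
d = diff([0 pcm]);

% Decoding: a running sum of the differences rebuilds the samples
rec = cumsum(d);

disp(d);                  % the differential vector
disp(isequal(rec, pcm));  % displays 1 if the reconstruction is exact
```

Because no quantisation is applied to the differences here, the running sum recovers the original samples exactly. In a real differential coder the differences are quantised, so the encoder must accumulate the quantised values (as the decoder will) to stop errors from building up.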
ADPCM uses this methodology but takes it one step further by changing the quantisation stepsize at each sampling instant based on past history. As an example, the same waveform is coded, using the same number of bits per sample, in Figure 5.5. Starting with the same initial quantisation levels, the rule used here is that if the sample value is in the middle four quantisation levels then, for the next sample, the quantisation stepsize is halved; otherwise it is doubled. This allows the quantisation to zoom in to areas of the waveform where only small sample changes occur, but to zoom out sufficiently to capture large changes in amplitude also.

In this case the adaptively quantised PCM sample vector would be {07 08 09 10 10 11 11 09 07 05 04 05 14 12 08 10 08} and once we have used differential coding on this it would become {07 01 01 01 00 01 01 −02 −02 −02 −02 01 09 −02 −04 02 −02}. The diagram shows how the quantisation step zooms in on slowly changing waveform amplitudes, and then zooms out to capture large changes in amplitude. This reduces quantisation noise while still achieving a high slew rate (except where a flat waveform is followed by big changes and, conversely, where large waveform features are followed immediately by much smaller ones).

In reality, the adaptation rule would use the last few samples rather than just the last one, and would be used to predict the next waveform value at the encoder, which then compares the predicted value to the actual value. ADPCM is actually quite good at tracking a waveform, especially where the waveform values can be predicted reasonably accurately, such as in harmonic music.

5.1.5 SB-ADPCM

The SB-ADPCM coder, introduced through ITU standard G.722, includes two separate ADPCM coding units. Each one operates on half of the frequency range (0–4 kHz and 4–8 kHz respectively), but the bit weighting and quantisation differ by a factor of 4 in favour of the lower band, which is thus capable of better audio fidelity, particularly because it conveys frequencies that are more important to speech (see Section 3.2.4).
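Returning to the single ADPCM unit on which each band relies, the stepsize rule described for Figure 5.5 (halve the stepsize when the code falls in the middle four quantisation levels, otherwise double it) can be sketched in MATLAB. This is a simplified illustration under several assumptions that are not in the text: the test signal, the initial stepsize and the stepsize limits are hypothetical, the rule is applied to a 4-bit quantised difference rather than to absolute levels, and the predictor is simply the previous reconstructed value. It is neither the scheme of Figure 5.5 nor the G.722 algorithm.

```matlab
% Hypothetical test signal: a slow ramp followed by a sudden jump
x = [0 1 2 3 4 5 40 80 82 83 84 84];

initStep = 4;     % initial stepsize (assumed)
minStep  = 1;     % lower stepsize limit (assumed)
maxStep  = 32;    % upper stepsize limit (assumed)

% ---- Encoder ----
step  = initStep;
acc   = 0;                   % accumulator (previous reconstructed value)
codes = zeros(size(x));      % 4-bit codes in the range -8..7 (16 levels)
steps = zeros(size(x));      % stepsize in force at each sample
y     = zeros(size(x));      % encoder's local reconstruction
for n = 1:numel(x)
    steps(n) = step;
    % Quantise the difference between input and accumulator to 16 levels
    c = round((x(n) - acc) / step);
    c = max(-8, min(7, c));
    codes(n) = c;
    % Accumulate the quantised difference, as the decoder will
    acc  = acc + c * step;
    y(n) = acc;
    % Adaptation: middle four levels (-2..1) halve the stepsize,
    % any other level doubles it, within the assumed limits
    if c >= -2 && c <= 1
        step = max(minStep, step / 2);
    else
        step = min(maxStep, step * 2);
    end
end

% ---- Decoder ----
% Works from the codes alone: repeating the same adaptation rule keeps
% its stepsize in lockstep with the encoder's
step = initStep;
acc  = 0;
yDec = zeros(size(codes));
for n = 1:numel(codes)
    acc     = acc + codes(n) * step;
    yDec(n) = acc;
    if codes(n) >= -2 && codes(n) <= 1
        step = max(minStep, step / 2);
    else
        step = min(maxStep, step * 2);
    end
end

disp([x; y; steps]);   % input, reconstruction and stepsize per sample
```

With this input the stepsize shrinks to its minimum along the ramp and then doubles repeatedly after the jump, so the reconstruction lags the input for a few samples before catching up. This is the limited slew rate noted above for a flat waveform followed by a large change. Because the adaptation depends only on the transmitted codes, the decoder needs no side information about the stepsize.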