Applied Speech and Audio Processing: With MATLAB Examples
Speech communications
Table 5.1. SEGSNR resulting from different degrees of uniform quantisation of LPC and LSP parameters.

Bits/parameter    LPC      LSP
4                 –        −6.26
5                 −535     −2.14
6                 −303     1.24
7                 −6.04    8.28
8                 −10.8    15.9
10                19.7     20.5
12                22.2     22.2
16                22.4     22.4

5.2.5 Quantisation issues

Since LSPs are most often used in speech coders, where they are quantised prior to transmission, it is useful to explore the quantisation properties of the representation. To do this, we can take some representative recorded speech, quantise it by different amounts in turn, dequantise to recreate the speech, and in each case compare the original with the dequantised speech.

In fact, this has been done using a large pre-recorded speech database called TIMIT [6]. Several thousand utterances were analysed, LPCs and then LSPs were derived, and these were quantised by different amounts. The quantisation scheme used in this case was uniform quantisation, where each parameter is represented by an equal number of bits, ranging in turn from 4 to 16 bits per parameter.

Tenth-order analysis was used, and a segmental signal-to-noise ratio (SEGSNR, discussed in Section 3.3.2) was determined between the original and dequantised speech. Both LSP and LPC quantisation were tested.

Table 5.1 lists the results, where the more positive the SEGSNR value, the more closely the dequantised speech matches the original speech. These results clearly indicate that LPCs are far more susceptible to quantisation effects than are LSPs: at around 5 bits per parameter, the recreated speech resulting from LPC quantisation exhibits sharp spikes of noise and oscillation, totally unlike the original speech. Hence the huge difference between original and recreated speech evidenced by the SEGSNR value of −535. By contrast, the LSP representation, with a SEGSNR of −2.14, yields quite easily understandable speech. This substantiates the assertion in Section 5.2.4 that LSPs are favoured for their superior quantisation characteristics.
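The quantise, dequantise and compare procedure described above can be illustrated with a minimal sketch in Python (standard library only). It is not the code used to produce Table 5.1. The function names, the mid-point reconstruction rule, the frame length of 160 samples and the sine-wave test signal are all assumptions made for illustration, and the SEGSNR shown is one common definition (the mean of per-frame SNRs in dB), which may differ in detail from that of Section 3.3.2. The quantiser here is applied directly to waveform samples rather than to LPC or LSP parameters, so the figures it prints are not comparable with those in the table.

```python
import math


def uniform_quantise(value, n_bits, lo, hi):
    """Map a value in [lo, hi] to an integer index in 0 .. 2**n_bits - 1."""
    levels = 2 ** n_bits
    step = (hi - lo) / levels
    index = math.floor((value - lo) / step)
    return min(max(index, 0), levels - 1)  # clamp out-of-range inputs


def uniform_dequantise(index, n_bits, lo, hi):
    """Return the mid-point of the quantisation cell identified by index."""
    step = (hi - lo) / (2 ** n_bits)
    return lo + (index + 0.5) * step


def segsnr(original, processed, frame_len):
    """Mean of per-frame SNRs in dB, taken over complete frames only.

    Assumption: frames with zero signal or zero noise energy are skipped.
    """
    snrs = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        ref = original[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal = sum(x * x for x in ref)
        noise = sum((x - y) ** 2 for x, y in zip(ref, out))
        if signal > 0 and noise > 0:
            snrs.append(10 * math.log10(signal / noise))
    return sum(snrs) / len(snrs) if snrs else float("nan")


# Hypothetical test signal: five cycles of a unit-amplitude sine wave.
SIGNAL = [math.sin(2 * math.pi * 5 * n / 800) for n in range(800)]


def demo_segsnr(n_bits, frame_len=160):
    """Quantise SIGNAL to n_bits per sample, dequantise, and score it."""
    recreated = [
        uniform_dequantise(uniform_quantise(x, n_bits, -1.0, 1.0),
                           n_bits, -1.0, 1.0)
        for x in SIGNAL
    ]
    return segsnr(SIGNAL, recreated, frame_len)


for bits in (4, 8, 12):
    print(bits, "bits per sample: SEGSNR =", round(demo_segsnr(bits), 1))
```

As with the results in the table, the score rises as more bits are allocated, since the quantisation step (and with it the worst-case error of half a step) halves with every extra bit.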
Finally, note that both approaches in the table achieve a SEGSNR level of 22.4 when more bits are used for quantisation, plainly indicating the limit of achievable SEGSNR for the analysis process used (i.e. the window function, autocorrelation, number of parameters, and so on).

LSPs may be available in either the frequency domain or the cosine domain (depending on the method of solving the polynomial roots). Each line's value can be quantised independently (scalar quantisation) on either a uniform or a non-uniform scale [7] which can also be dynamically adapted. Alternatively, lines can be grouped together and vector quantised with either static or adaptive codebooks [8]. Vector quantisation groups sets of