Applied Speech and Audio Processing: With MATLAB Examples

Speech communications
Table 5.1. SEGSNR resulting from different degrees of uniform quantisation of LPC and
LSP parameters.

Bits/parameter    LPC       LSP
4                 –         −6.26
5                 −535      −2.14
6                 −303      1.24
7                 −6.04     8.28
8                 −10.8     15.9
10                19.7      20.5
12                22.2      22.2
16                22.4      22.4

5.2.5 Quantisation issues
Since LSPs are most often used in speech coders, where they are quantised prior to
transmission, it is useful to explore the quantisation properties of the representation. In
order to do this, we can use some representative recorded speech, quantise in turn by
different amounts, dequantise to recreate speech, and in each case compare the original
and dequantised speech.
In fact, this has been done using a large pre-recorded speech database called TIMIT
[6]. Several thousand utterances were analysed: LPC parameters were derived, converted
to LSPs, and quantised by different amounts. The quantisation scheme used in this case
was uniform quantisation, where each parameter is represented by an equal number of
bits, ranging in turn from 4 to 16 bits per parameter.
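As an illustration of the uniform scheme described above, the following minimal Python
sketch quantises a single parameter to a chosen number of bits and then dequantises it.
The function names, the mid-point reconstruction rule, the parameter range (0 to π
radians) and the ten LSP values are assumptions made for illustration; they are not
taken from the experiment reported here.

```python
import math


def uniform_quantise(value, n_bits, lo, hi):
    """Map a value in [lo, hi] to an integer index in 0 .. 2**n_bits - 1."""
    levels = 2 ** n_bits
    step = (hi - lo) / levels
    index = int((value - lo) / step)
    return min(max(index, 0), levels - 1)  # clamp out-of-range inputs


def uniform_dequantise(index, n_bits, lo, hi):
    """Return the mid-point of the quantisation cell selected by index."""
    step = (hi - lo) / (2 ** n_bits)
    return lo + (index + 0.5) * step


# Hypothetical tenth-order LSP vector (radians), made up for this example.
lsp = [0.30, 0.45, 0.78, 1.02, 1.35, 1.60, 1.95, 2.20, 2.55, 2.80]

for n_bits in (4, 8, 16):
    recovered = [
        uniform_dequantise(uniform_quantise(w, n_bits, 0.0, math.pi),
                           n_bits, 0.0, math.pi)
        for w in lsp
    ]
    worst = max(abs(a - b) for a, b in zip(lsp, recovered))
    print(n_bits, "bits: worst-case error", worst)
```

With mid-point reconstruction the error on any in-range parameter is at most half a
step, so each extra bit halves the worst-case error.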
Tenth-order analysis was used, and a segmental signal-to-noise ratio (SEGSNR,
discussed in Section 3.3.2) determined between the original and dequantised speech.
Both LSP and LPC quantisation were tested.
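The comparison measure can be sketched in Python as follows. This uses the common
definition of SEGSNR as the mean of per-frame signal-to-noise ratios in dB; the function
name, the 160-sample frame length and the rule of skipping frames with zero signal or
zero error energy are assumptions, and the definition in Section 3.3.2 may differ in
such details.

```python
import math


def segsnr(original, processed, frame_len=160):
    """Mean per-frame SNR in dB between two equal-length sample sequences."""
    if len(original) != len(processed):
        raise ValueError("signals must be the same length")
    frame_snrs = []
    # Step through whole, non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(original) - frame_len + 1, frame_len):
        x = original[start:start + frame_len]
        y = processed[start:start + frame_len]
        signal = sum(a * a for a in x)
        noise = sum((a - b) ** 2 for a, b in zip(x, y))
        if signal == 0.0 or noise == 0.0:
            continue  # assumption: skip frames where the ratio is undefined
        frame_snrs.append(10.0 * math.log10(signal / noise))
    if not frame_snrs:
        raise ValueError("no frames with a measurable SNR")
    return sum(frame_snrs) / len(frame_snrs)


# Synthetic test tone standing in for speech (illustration only).
tone = [math.sin(2 * math.pi * 0.01 * n) for n in range(1600)]
scaled = [0.9 * s for s in tone]    # error energy is 1% of the signal energy
tripled = [3.0 * s for s in tone]   # error energy is 4 times the signal energy

print(round(segsnr(tone, scaled), 2))   # about 20 dB
print(round(segsnr(tone, tripled), 2))  # negative: error exceeds signal
```

A negative result means that the error energy in a typical frame exceeds the energy of
the original signal, which is the situation in the low bit-rate rows of Table 5.1.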
Table 5.1 lists the results, where the more positive the SEGSNR value, the more
closely the dequantised speech matches the original speech. These results clearly
indicate that LPCs are far more susceptible to quantisation effects than are LSPs: down
around 5 bits per parameter, the recreated speech resulting from LPC quantisation
exhibits sharp spikes of noise and oscillation, totally unlike the original speech. This
accounts for the huge difference between original and recreated speech evidenced by the
SEGSNR value of −535. By contrast, the LSP representation, with a SEGSNR of −2.14 at
the same bit allocation, still yields quite easily understandable speech. This
substantiates the assertion in Section 5.2.4 that LSPs are favoured for their superior
quantisation characteristics. Finally, note that both approaches in the table reach a
SEGSNR level of 22.4 at 16 bits per parameter, plainly indicating the limit of
achievable SEGSNR for the analysis process used (i.e. the window function,
autocorrelation, number of parameters, and so on).
LSPs may be available in either the frequency domain or the cosine domain (depending
on the method of solving the polynomial roots). Each line’s value can be quantised
independently (scalar quantisation) on either a uniform or a non-uniform scale [7], which
can also be dynamically adapted. Alternatively, lines can be grouped together and vector
quantised with either static or adaptive codebooks [8]. Vector quantisation groups sets of