Applied Speech and Audio Processing: With MATLAB Examples

6.1. Analysis toolkit
This is illustrated in Figure 6.1(a) where the zero crossings of a sinewave are counted
over a certain analysis time. In this case the fundamental frequency of the sinewave causes
nine crossings across the plot. However, in the presence of additive noise, the ‘wobble’
in the signal as it crosses the zero-axis causes several false counts. In Figure 6.1(b) this
leads to an erroneous estimate of signal fundamental frequency – in fact an estimate that
would be three times too high.
In Matlab, determining the ZCR is relatively easy, although not particularly elegant:
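function [zcr]=zcr(segment)
% ZCR  Zero-crossing rate of a vector of samples.
zc=0;
for m=1:length(segment)-1
    if segment(m)*segment(m+1) > 0
        zc=zc+0;    % same sign: no crossing
    else
        zc=zc+1;    % sign change (or a zero sample): count a crossing
    end
end
zcr=zc/length(segment);    % crossings per sample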
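The same count can be obtained without an explicit loop. The following is a minimal sketch rather than code from the text: the name zcr_vec, the 1000-sample test sinewave and the noise level are all assumptions chosen for illustration. It applies the same rule as zcr(), counting a pair of adjacent samples as a crossing when their product is not greater than zero.

```matlab
% Vectorised zero-crossing rate (sketch; zcr_vec is a hypothetical name).
% A pair of adjacent samples counts as a crossing when their product is
% not greater than zero, the same rule as the loop-based zcr().
zcr_vec = @(segment) sum(segment(1:end-1) .* segment(2:end) <= 0) / length(segment);

% Assumed test signal: 4.75 cycles of a sinewave over 1000 samples.
% The small phase offset keeps samples away from exactly zero.
n = 0:999;
clean = sin(2*pi*4.75*n/1000 + 0.1);
noisy = clean + 0.2*randn(size(clean));     % additive noise, level assumed

% Convert each rate back into a raw count of crossings over the span.
clean_crossings = round(zcr_vec(clean) * length(clean));
noisy_crossings = round(zcr_vec(noisy) * length(noisy));

% Two crossings per cycle gives a rough estimate of cycles in the span.
clean_cycles = clean_crossings / 2;
noisy_cycles = noisy_crossings / 2;
fprintf('clean: %d crossings, noisy: %d crossings\n', ...
        clean_crossings, noisy_crossings);
```

For the clean sinewave the count is nine. The noisy count changes from run to run, but it is typically much larger because the noise causes false crossings each time the signal passes slowly through the zero-axis.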
To illustrate, the Matlab zcr() function above was applied to a recording of speech.
The speech was segmented into non-overlapping analysis windows of size 128 samples,
and the ZCR determined for each window. The results, plotted in Figure 6.2, show a
good correspondence between the ZCR measure and the frequencies present in the speech
– higher frequency regions of the recorded speech, such as the /ch/ sound, have a higher
ZCR measure.
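The windowed analysis described above might be sketched as follows. No recording is available here, so a synthetic signal stands in for the speech: a low-frequency tone followed by a high-frequency tone. The sample rate, the tone frequencies and all variable names are assumptions made for illustration only.

```matlab
% Window-by-window ZCR over non-overlapping 128-sample windows (sketch).
fs = 8000;                           % assumed sample rate in Hz
t = (0:fs/2-1)/fs;                   % 0.5 s per section
low  = sin(2*pi*200*t + 0.1);        % stand-in for a low-frequency region
high = sin(2*pi*3000*t + 0.1);       % stand-in for a high-frequency region
speech = [low high];                 % synthetic substitute for a recording

W = 128;                             % analysis window size from the text
nwin = floor(length(speech)/W);      % whole windows only, remainder dropped
frames = reshape(speech(1:nwin*W), W, nwin);   % one window per column

% Count sign changes down each column, using the same rule as zcr().
zcr_vals = sum(frames(1:end-1,:) .* frames(2:end,:) <= 0, 1) / W;

fprintf('mean ZCR, first half:  %.3f\n', mean(zcr_vals(1:floor(nwin/2))));
fprintf('mean ZCR, second half: %.3f\n', mean(zcr_vals(floor(nwin/2)+2:end)));
```

Plotting zcr_vals against the window index gives a curve of the kind described for Figure 6.2, with the windows covering the high-frequency tone showing the higher ZCR.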
A pragmatic solution to the problem of noise is to apply a threshold about the
zero-axis. In essence, this introduces a region of hysteresis whereby a single count is
made only when the signal drops below the maximum threshold and then emerges below
the minimum threshold, or vice versa. Small excursions that stay between the two
thresholds are therefore not counted. This is called threshold-crossing rate (TCR), and
is illustrated in Figure 6.3.
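One way to realise this hysteresis is a small state machine, sketched below. This is an illustrative implementation under stated assumptions, not the author's code: the threshold value thr, the test signal with its alternating ‘wobble’, and all variable names are hypothetical.

```matlab
% Threshold-crossing count with hysteresis (sketch under assumptions).
n = 0:999;
clean = sin(2*pi*4.75*n/1000 + 0.1);    % assumed test sinewave
x = clean + 0.05*(-1).^n;               % deterministic high-frequency wobble

thr = 0.2;     % assumed threshold: the band is -thr to +thr about zero
state = 0;     % 0 = not yet outside the band, +1 = above, -1 = below
tc = 0;        % number of threshold crossings counted
for m = 1:length(x)
    if x(m) > thr
        new_state = 1;
    elseif x(m) < -thr
        new_state = -1;
    else
        new_state = state;              % inside the band: hold the state
    end
    if state ~= 0 && new_state ~= state
        tc = tc + 1;                    % moved from one side to the other
    end
    state = new_state;
end
tcr = tc / length(x);                   % threshold crossings per sample

% Plain zero-crossing count on the same signal, for comparison.
zc = sum(x(1:end-1) .* x(2:end) <= 0);
fprintf('zero crossings: %d, threshold crossings: %d\n', zc, tc);
```

On this signal the threshold-crossing count is nine, one per true crossing of the sinewave, while the plain zero-crossing count is higher because the wobble flips the sign several times near each crossing. The choice of thr is a trade-off: it must exceed the noise amplitude but stay well below the signal amplitude.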
In practice, the advantage of TCR for noise reduction is often achieved by low-pass
filtering the speech before a ZCR is calculated. This knocks out the high frequency
noise or ‘bounce’ on the signal. Since ZCR is used as a rough approximation of the
fundamental pitch of an audio signal, bounds for filtering can be established through
knowing the extent of the expected maximum. In speech, it has to be stressed that the
filtered ZCR (or TCR) measure provides an approximate indication of the content of the
speech signal, with unvoiced speech tending to result in high ZCR values, and voiced
speech tending to result in low ZCR values. Noise also tends to produce high ZCR
values, and thus it is difficult to use ZCR for analysis of noisy speech, significantly
limiting its use in practical applications.
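The effect of low-pass filtering before the ZCR calculation can be sketched as follows. The text does not specify a filter, so a five-point moving average is assumed here purely because it needs no toolbox; the test signal, the interfering tone and the variable names are likewise assumptions.

```matlab
% Low-pass filtering before counting zero crossings (sketch).
n = 0:999;
clean = sin(2*pi*4.75*n/1000 + 0.1);    % assumed low-frequency component
bounce = 0.1*sin(2*pi*0.4*n);           % assumed high-frequency 'bounce'
x = clean + bounce;

% Five-point moving average: a crude low-pass filter built from the
% base filter() function. Its response has a null at 0.4 cycles per
% sample, which is why this particular interferer is removed so cleanly.
M = 5;
y = filter(ones(1,M)/M, 1, x);

% Zero-crossing counts before and after filtering (same rule as zcr()).
zc_raw  = sum(x(1:end-1) .* x(2:end) <= 0);
zc_filt = sum(y(1:end-1) .* y(2:end) <= 0);
fprintf('raw: %d crossings, filtered: %d crossings\n', zc_raw, zc_filt);
```

Here the filtered count falls to nine, matching the underlying sinewave, while the unfiltered count is inflated by the bounce. With real speech and broadband noise the improvement is less clear-cut, and the filter cut-off would be chosen from the expected maximum pitch.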
