Applied Speech and Audio Processing: With MATLAB Examples
Basic audio processing
on the previous frame. We thus ensure a smooth filtering operation across frames. The resulting output speech should be free of the clicks and discontinuities evident with the previous code.

2.5 Analysis window sizing

We have discussed at the end of Section 2.3 several motivations for splitting audio into segments for processing, but we did not consider how big those segments, frames or analysis windows, should be. Generally, most audio algorithms (and definitely MATLAB-based processing) will operate more efficiently on larger blocks of data. There would therefore be a natural tendency toward using larger analysis frames, tempered by issues such as latency, which is a critical consideration in telephony processing and similar applications.

Another major reason for limiting analysis window size is where the characteristics of a signal change during that analysis window. This is perhaps best illustrated in the Infobox Visualisation of signals on page 32 where a complex frequency-time pattern is present, but an analysis window which is large enough to span across that pattern will hide the detail when an FFT is performed.

There are two important points to be explained here. The first is that of signal stationarity and the second is time-frequency resolution. We will consider each in turn.

2.5.1 Signal stationarity

Most signals requiring analysis are continually changing. A single sustained note played on a musical instrument is stationary, but quite clearly when one note is replaced by the next one, the signal characteristics have changed in some way (at least in frequency, but possibly also in amplitude, tone, timbre, and so on).

For an application analysing recorded music to determine which note is currently being played, it would make sense to segment the recording roughly into analysis windows of length equal to the duration of a single note, or less. For each analysis window we could perform an FFT, and look for peaks in the spectrum.
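The note-detection idea can be sketched in a few lines of MATLAB. This is only an illustration under stated assumptions: the sample rate, the note duration and the two note frequencies are hypothetical values chosen so that each note falls exactly on an FFT bin, and a real recording would additionally need note boundaries to be estimated rather than assumed.

```matlab
% Hypothetical test signal: two consecutive notes of equal duration
fs = 8000;                     % sample rate in Hz (assumed)
noteDur = 0.25;                % duration of each note in seconds (assumed)
noteFreqs = [440 660];         % note frequencies in Hz (assumed)
N = round(noteDur * fs);       % samples per note = analysis window length
t = (0:N-1) / fs;
x = [sin(2*pi*noteFreqs(1)*t), sin(2*pi*noteFreqs(2)*t)];

numWin = floor(length(x) / N); % number of whole analysis windows
detected = zeros(1, numWin);
for k = 1:numWin
    seg = x((k-1)*N + (1:N));          % k-th note-length window
    mag = abs(fft(seg));               % magnitude spectrum of the window
    [~, idx] = max(mag(1:floor(N/2))); % strongest bin below Nyquist
    detected(k) = (idx - 1) * fs / N;  % convert bin index to Hz
end
disp(detected)                 % one frequency estimate per window
```

With these particular values each window contains exactly one note, so the strongest spectral peak in each window lies at that note's frequency (440 Hz, then 660 Hz).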
However, if we analysed longer duration windows, we may end up performing an FFT that spans across two notes, and be unable to identify either note. At the very least we would have a confused 'picture' of the sound being analysed, just as the example FFT in the Infobox did not reveal the full detail of the sound being analysed.

More importantly, the theory that gives rise to the FFT assumes that the frequency components of the signal are unchanging across the analysis window of interest. Any deviation from this assumption would result in an inaccurate determination of the frequency components. These points together reveal the importance of ensuring that an analysis window leading to FFT be sized so that the signal is stationary across the period of analysis. In practice many audio signals do not tend to remain stationary for long, and thus smaller analysis windows are necessary to capture the rapidly changing details.
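A small experiment makes the problem with an over-long window concrete. The sketch below uses hypothetical values (an 8 kHz sample rate and two 0.25 s notes at 440 Hz and 660 Hz) and analyses a single window spanning both notes, once with the notes in one order and once with the order reversed.

```matlab
fs = 8000;                       % sample rate in Hz (assumed)
N  = 2000;                       % samples per note, i.e. 0.25 s (assumed)
t  = (0:N-1) / fs;
noteA = sin(2*pi*440*t);         % hypothetical first note
noteB = sin(2*pi*660*t);         % hypothetical second note

xAB = [noteA, noteB];            % A followed by B
xBA = [noteB, noteA];            % B followed by A

L = 2*N;                         % one window spanning both notes
magAB = abs(fft(xAB));
magBA = abs(fft(xBA));

half = 1:L/2;                    % bins below the Nyquist frequency
f = (half - 1) * fs / L;         % bin centre frequencies in Hz

[~, order] = sort(magAB(half), 'descend');
peaksLong = sort(f(order(1:2))); % two strongest components, ascending

maxDiff = max(abs(magAB - magBA));
fprintf('Peaks in long window: %g Hz and %g Hz\n', peaksLong(1), peaksLong(2));
fprintf('Largest magnitude difference between orderings: %g\n', maxDiff);
```

Both orderings give essentially the same magnitude spectrum, because swapping the two halves is a circular shift of the window and a circular shift alters only the phase of the FFT. A peak search on the magnitude therefore reports that 440 Hz and 660 Hz are both present somewhere in the window, but cannot say which note was played first or when the change occurred.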