Applied Speech and Audio Processing: With MATLAB Examples
Figure 6.11
LSP tracks for a few seconds of a blackbird’s song sampled at 16 kHz (top) compared to a spectrogram of the same recording (bottom).

6.3.2 Analysis of animal noises

A recording of birdsong from the BBC was analysed in a similar fashion to the analysis of the violin note in Section 6.3.1 [10]. After some experimentation, an 18th-order LPC analysis was found to be optimal for LSP track plotting, as shown in Figure 6.11. The analysis covers approximately 4.5 seconds of a blackbird’s warble, and shows a clear visual correlation between the birdsong syllables shown in the spectrogram and the plotted LSP trajectories. The spectrogram used 256-sample windows with 50% overlap, whereas non-overlapped 256-sample windows were used for the LSP tracking.

Other authors report that harmonic analysis has been shown to be useful in the classification of birdsong [11], and since both plots in Figure 6.11 so clearly show harmonic effects (look for LSP pairs shifting together), the potential for such analysis using these methods is clear.

As a final analysis of animal noise, a recording of a very large and angry dog barking was obtained, at some risk, using a 16 kHz sample rate. Again this was analysed by first obtaining LPC coefficients, and then a set of LSPs, for a succession of 64-sample analysis frames. The analysis order was set to eight in this instance since the pitch of the dog bark was rather low, and strongly resonant. Figure 6.12 plots the time-domain waveform above the time-evolution plot of the eight LSP tracks. A bar graph overlaid upon this depicts frame power, and both power and LSP frequencies were scaled to be between 0 and 1 for ease of plotting.
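The frame-by-frame procedure described for the dog bark (LPC coefficients, then LSPs, then frame power, all scaled to lie between 0 and 1) might be sketched as follows. This is a minimal illustration in plain Python and not the author's code: the function names `lpc`, `lsp_from_lpc` and `analyse` are hypothetical, the LPC step assumes the autocorrelation method with no window, the LSPs are located by a simple grid search with bisection on the unit circle, and the scaling (LSPs divided by pi, power divided by its peak) is an assumption, since the text does not say how the scaling was done. The input is a synthetic decaying resonance with a little noise, standing in for a real recording.

```python
import math
import random

def lpc(frame, order):
    """Return [1, a1, ..., ap] via autocorrelation and Levinson-Durbin."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:                      # silent or degenerate frame
            break
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a

def lsp_from_lpc(a, grid=1024):
    """Return line spectral frequencies in radians, ascending, in (0, pi)."""
    n = len(a)                              # n = p + 1
    ext = list(a) + [0.0]                   # a_0 .. a_p, with a_{p+1} = 0
    sym = [ext[i] + ext[n - i] for i in range(n + 1)]    # P(z), symmetric
    asym = [ext[i] - ext[n - i] for i in range(n + 1)]   # Q(z), antisymmetric

    # On the unit circle each polynomial reduces to a real function of w.
    def f_p(w):
        return sum(c * math.cos(w * (n / 2 - i)) for i, c in enumerate(sym))

    def f_q(w):
        return sum(c * math.sin(w * (n / 2 - i)) for i, c in enumerate(asym))

    roots = []
    for f in (f_p, f_q):
        lo = math.pi * 0.5 / grid           # grid avoids the roots at 0 and pi
        f_lo = f(lo)
        for k in range(1, grid):
            hi = math.pi * (k + 0.5) / grid
            f_hi = f(hi)
            if f_lo * f_hi < 0.0:           # sign change: refine by bisection
                x0, x1, y0 = lo, hi, f_lo
                for _ in range(40):
                    mid = 0.5 * (x0 + x1)
                    ym = f(mid)
                    if y0 * ym <= 0.0:
                        x1 = mid
                    else:
                        x0, y0 = mid, ym
                roots.append(0.5 * (x0 + x1))
            lo, f_lo = hi, f_hi
    return sorted(roots)

def analyse(signal, frame_len=64, order=8):
    """Non-overlapped frames -> (LSP tracks scaled to 0..1, power scaled to 0..1)."""
    tracks, powers = [], []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        powers.append(sum(x * x for x in frame) / frame_len)
        tracks.append([w / math.pi for w in lsp_from_lpc(lpc(frame, order))])
    peak = max(powers) if powers and max(powers) > 0.0 else 1.0
    return tracks, [p / peak for p in powers]

# Synthetic stand-in signal (NOT the dog recording): 700 Hz decaying resonance.
fs = 16000
rng = random.Random(0)
sig = [math.exp(-i / 400.0) * math.sin(2 * math.pi * 700 * i / fs)
       + 0.05 * rng.uniform(-1.0, 1.0) for i in range(640)]
tracks, power = analyse(sig)
print(len(tracks), "frames analysed")
```

Each entry of `tracks` holds the LSPs of one frame, so plotting the i-th element of every entry against frame index gives the i-th LSP line, and `power` can be overlaid as bars. The grid search can miss two roots of the same polynomial that fall within one grid cell, so a finer `grid` or a polynomial root finder would be a safer choice for production use.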
Figure 6.12
Waveform plot (top) and LSP tracks (below) with overlaid frame power plot (bars on lower graph) for a double dog bark, recorded at 16 kHz and analysed with an order of eight. [Top panel: Amplitude against Time, ms. Lower panel: Relative measure against Analysis Frame, showing LSP lines 1 to 8 and frame power.]

The dog bark waveform of Figure 6.12 resembles human speech; in fact, a recording of a human shouting ‘woof woof’ may look similar. The narrow pairs of LSPs, observable during the periods of highest power, indicate the resonance of strong formants during the loudest part of the bark. The similarity with human speech would indicate that methods of analysis (and processing) used for human speech could well be applicable to animal noises. In particular, vocally generated animal noises, using the same mechanisms as human speech, would be more amenable to speech processing methods than those produced in other ways, such as the abrasive sounds of grasshoppers, dolphin squeaks, pig snorts, snake rattles, and so on.
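The link between narrow LSP pairs and strong resonance can be checked on an idealised case. The sketch below assumes a single second-order all-pole resonator, A(z) = 1 - 2r cos(theta) z^-1 + r^2 z^-2, as a stand-in for one formant; this is a simplification chosen for illustration, not the eighth-order analysis of the recording, and the function name `second_order_lsps` is hypothetical. For this A(z), factoring out the fixed roots at z = -1 and z = +1 from P(z) = A(z) + z^-3 A(1/z) and Q(z) = A(z) - z^-3 A(1/z) leaves one LSP each, with cosines (1 - a1 - a2)/2 and (a2 - a1 - 1)/2. Their cosines differ by exactly 1 - r^2, so the pair closes up as the pole approaches the unit circle, that is, as the resonance sharpens.

```python
import math

def second_order_lsps(r, theta):
    """LSP pair (radians) of A(z) = 1 - 2 r cos(theta) z^-1 + r^2 z^-2."""
    a1, a2 = -2.0 * r * math.cos(theta), r * r
    w_p = math.acos((1.0 - a1 - a2) / 2.0)   # remaining root of P(z)
    w_q = math.acos((a2 - a1 - 1.0) / 2.0)   # remaining root of Q(z)
    return w_p, w_q

theta = math.pi / 4                          # illustrative pole angle
for r in (0.7, 0.9, 0.99):                   # pole radius: larger = sharper
    w_p, w_q = second_order_lsps(r, theta)
    print(f"r={r:.2f}  LSPs={w_p:.3f}, {w_q:.3f}  gap={w_q - w_p:.3f}")
```

Running the loop shows the gap between the two LSPs shrinking as `r` increases, which matches the narrow pairs seen during the loudest, most resonant part of the bark.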