Applied Speech and Audio Processing: With MATLAB Examples
7.3 Speaker classification
Speaker classification is the automated determination of who is speaking. It is related to, and overlaps with, two very similar research areas: speaker verification and speaker identification. The verification task uses a priori information to determine whether a given speaker is who he claims to be, with a true or false result. The identification task also uses a priori information, but in this case determines which speaker, from a set of possible speakers, is the one currently talking. The classification task, by contrast, is a far higher-level task. It presupposes little, and is simply the act of placing the current speaker into one or more classes, whether a priori information is available or not. In a real system, a priori information would normally be available, probably captured when candidate users are registered with that system.
Within such constraints, there are two further main branches to this research area: one in which the material being spoken is fixed, and the other in which the material being spoken is unrestricted. The unrestricted case is the more difficult problem, and accuracy may well depend more on the amount of captured data that can be analysed than on the accuracy of the analysis system employed.
Whichever branch is being considered, the methodology relies upon the way in which the speech of two speakers differs. If there are instances of both speakers saying the same words, either through restricting the words being spoken or through fortuitous sample capture, then the analysis becomes easier. In the unrestricted case, two speakers are far more likely to utter the same phoneme at some time than the same word. Exploiting this, however, would require an accurate method of identifying and isolating different phonemes, which is itself a difficult task to perform automatically.
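The difference between the verification and identification tasks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the text: each speaker is assumed to be represented by a single non-zero feature vector captured at registration, cosine similarity stands in for whatever comparison a real system would use, and the names `enrolled`, `verify`, `identify`, the example vectors and the 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(sample, enrolled, claimed_id, threshold=0.9):
    """Verification: does the sample match the claimed speaker? True or false.

    The threshold is an arbitrary illustrative value, not a recommended one.
    """
    return cosine_similarity(sample, enrolled[claimed_id]) >= threshold


def identify(sample, enrolled):
    """Identification: which speaker in the enrolled set best matches the sample?"""
    return max(enrolled, key=lambda spk: cosine_similarity(sample, enrolled[spk]))


# Hypothetical a priori information, as if captured when users registered.
enrolled = {
    "alice": [1.0, 0.2, 0.1],
    "bob": [0.1, 1.0, 0.3],
}

# Hypothetical feature vector extracted from the current speech.
sample = [0.9, 0.3, 0.1]

print(identify(sample, enrolled))         # closest enrolled speaker
print(verify(sample, enrolled, "alice"))  # identity claim accepted or rejected
print(verify(sample, enrolled, "bob"))
```

Verification answers a yes/no question about one claimed identity, so it needs a decision threshold; identification compares against every enrolled speaker and returns the best match, so it always names someone, even when the true speaker was never enrolled.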
In the unrestricted case, it would be possible to use information pertaining to what is being said as part of the analysis. At a higher language level, grammar, pronunciation and phraseology can each help to differentiate among speakers. Some of the progress in this field has been tracked by A. P. A. Broeders of Maastricht University in two papers reviewing the periods 1998 to 2001 [10] and 2001 to 2004 [11]. The restricted and unrestricted text cases mentioned above are also known in the research literature as text-dependent speaker recognition and text-independent speaker recognition. In one of the classical reviews of this field, S. Furui not only subdivides the research field in a similar way to that discussed above, but also separately discusses the ability of several processing methods [12]: