Applied Speech and Audio Processing: With MATLAB Examples

Date: 18.10.2023

Advanced topics

7.3 Speaker classification
Speaker classification is the automated determination of who is speaking. It is related
to, and overlaps with, two very similar research areas: speaker verification and speaker
identification. The verification task uses a priori information to determine whether a
given speaker is who he or she claims to be, giving a true or false result. The
identification task also uses a priori information, but determines which speaker, from a
set of possible speakers, is the one currently talking. The classification task, by
contrast, is a far higher-level task. It presupposes little, and is simply the act of
placing the current speaker into one or more classes, whether a priori information is
available or not.
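The difference between the three tasks can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than a method taken from the text: each speaker is reduced to a single made-up feature vector, similarity is measured with cosine similarity, the acceptance threshold of 0.9 is arbitrary, and the names `verify`, `identify`, `classify`, `enrolled` and `class_rules` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(sample, claimed_id, enrolled, threshold=0.9):
    """Verification: does the sample match the claimed speaker? True or False.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    return cosine_similarity(sample, enrolled[claimed_id]) >= threshold


def identify(sample, enrolled):
    """Identification: which enrolled speaker is most similar to the sample?"""
    return max(enrolled, key=lambda name: cosine_similarity(sample, enrolled[name]))


def classify(sample, class_rules):
    """Classification: return every class whose rule accepts the sample.

    No enrolled identities are needed; the rules are hypothetical predicates.
    """
    return [label for label, rule in class_rules.items() if rule(sample)]


# Hypothetical a priori data captured when users registered with the system.
enrolled = {"alice": [1.0, 0.2, 0.1], "bob": [0.1, 0.9, 0.4]}

# Hypothetical class definitions that do not depend on any enrolled speaker.
class_rules = {
    "strong_first_component": lambda v: v[0] > 0.5,
    "strong_second_component": lambda v: v[1] > 0.5,
}

sample = [0.9, 0.25, 0.1]  # made-up features from the current speaker

print(verify(sample, "alice", enrolled))  # claim accepted or rejected
print(identify(sample, enrolled))         # best match among enrolled speakers
print(classify(sample, class_rules))      # zero or more class labels
```

Note that `verify` and `identify` both depend on the `enrolled` dictionary (the a priori information), whereas `classify` can run without it and may return several labels or none.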
In practice, a priori information would normally be available in a real system, probably
captured when candidate users are registered with that system. Within such constraints,
there are two further main branches to this research area: one in which the material being
spoken is fixed, and the other in which the material being spoken is unrestricted. In the
unrestricted case the problem is more difficult, and accuracy may well be more closely
related to the amount of captured data that can be analysed than to the accuracy of
the analysis system employed.
Whichever branch is being considered, the methodology relies upon the way in which
the speech of two speakers differs. If there are instances of both speakers saying the same
words, either through restricting the words being spoken or through fortuitous
sample capture, then the analysis becomes easier. In the unrestricted case, two speakers
are far more likely to utter the same phoneme at some point than the same word.
However, exploiting this would require an accurate method of identifying and isolating
different phonemes, which is itself a difficult task to perform automatically. In the
unrestricted case, it would also be possible to use information pertaining to what is being
said as part of the analysis. At a higher language level, grammar, pronunciation and
phraseology can each help to differentiate among speakers. Some of the progress in
this field has been tracked by A. P. A. Broeders of Maastricht University in two papers
reviewing the periods 1998 to 2001 [10] and 2001 to 2004 [11].
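The idea of comparing two speakers only on the phonemes they have both uttered can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure of any cited work: the phoneme labels are assumed to come from some external segmentation step (which, as noted, is difficult to automate), the two-element feature vectors are invented, and the helper names `phoneme_means` and `shared_phoneme_distance` are hypothetical.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def phoneme_means(labelled_frames):
    """Group (phoneme, feature_vector) pairs by phoneme and average each group."""
    groups = defaultdict(list)
    for phoneme, vector in labelled_frames:
        groups[phoneme].append(vector)
    return {phoneme: mean_vector(vectors) for phoneme, vectors in groups.items()}


def shared_phoneme_distance(frames_a, frames_b):
    """Average Euclidean distance over the phonemes spoken by both speakers.

    Returns (distance, shared_phonemes); distance is None when the two
    speakers have no phoneme in common.
    """
    means_a = phoneme_means(frames_a)
    means_b = phoneme_means(frames_b)
    shared = sorted(set(means_a) & set(means_b))
    if not shared:
        return None, shared
    distances = [math.dist(means_a[p], means_b[p]) for p in shared]
    return sum(distances) / len(distances), shared


# Made-up labelled frames: (phoneme label, feature vector).
speaker_a = [("a", [1.0, 2.0]), ("a", [3.0, 2.0]), ("s", [0.0, 1.0])]
speaker_b = [("a", [2.0, 5.0]), ("t", [9.0, 9.0])]

distance, shared = shared_phoneme_distance(speaker_a, speaker_b)
print(shared, distance)
```

Only the phoneme "a" is common to both speakers here, so the unmatched material ("s" and "t") is ignored in the comparison. In the unrestricted-text case, capturing more speech makes such overlaps more likely.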
The restricted and unrestricted text cases mentioned above are also known in the research
literature as text-dependent speaker recognition and text-independent speaker recognition,
respectively. In one of the classical reviews of this field, S. Furui not only subdivides the
research field similarly to the way we have discussed above, but also separately discusses
the ability of several processing methods [12]:
