Applied Speech and Audio Processing: With MATLAB Examples


Advanced topics
• Auditory accommodation – coming from a loud machine room into a quiet area
often causes speakers to initially misjudge the volume of their voice. Imagine a user
walking up to a microphone, removing the in-ear headphones of their iPod and
then speaking their pass phrase.
• Coffee, curry, and many other beverages or foods seem to adhere to the lining of
the mouth and throat, affecting the qualities of a person’s voice for some time after
consumption.
To highlight the problems that these intra-voice changes can cause, accomplished
speaker recognition researcher S. Furui admits that so far no system has succeeded in
modelling these changes [12]. The problem is that relaxing the accuracy requirements
of a speaker recognition system to allow for variations in a user’s voice will naturally
tend to increase the percentage of incorrect classifications. In the vocabulary of the
researchers, allowing more ‘sheep’ (valid users that are correctly identified) and fewer
‘goats’ (valid users that are not correctly identified) also causes more ‘wolves’ (invalid
users that can impersonate the sheep). In addition, researchers sometimes refer to ‘lambs’
– the innocent sheep who are often impersonated by the big bad wolves [14].
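The trade-off between goats and wolves can be illustrated with a minimal sketch. It assumes a system that produces a single match score per claimant and accepts anyone whose score reaches a threshold. The function name `error_rates` and all of the scores are invented for illustration; they are not measured data and are not taken from the cited work.

```python
# Toy illustration only: every score below is invented, not measured data.

def error_rates(valid_scores, impostor_scores, threshold):
    """Return (goat_rate, wolf_rate) for a given acceptance threshold.

    A claimant is accepted when their match score is >= threshold.
    goat_rate: fraction of valid users wrongly rejected.
    wolf_rate: fraction of invalid users wrongly accepted.
    """
    goats = sum(1 for s in valid_scores if s < threshold)
    wolves = sum(1 for s in impostor_scores if s >= threshold)
    return goats / len(valid_scores), wolves / len(impostor_scores)


# Hypothetical match scores (higher means a closer match to the enrolled voice).
valid_scores = [0.91, 0.85, 0.78, 0.66, 0.59, 0.88, 0.72, 0.95]
impostor_scores = [0.30, 0.42, 0.55, 0.61, 0.25, 0.70, 0.48, 0.35]

# Relax the threshold step by step and watch both error rates.
for threshold in (0.8, 0.7, 0.6, 0.5):
    goat_rate, wolf_rate = error_rates(valid_scores, impostor_scores, threshold)
    print(f"threshold={threshold:.1f}  goats={goat_rate:.3f}  wolves={wolf_rate:.3f}")
```

With these made-up scores, lowering the threshold from 0.8 to 0.5 reduces the goat rate from 0.5 to 0 while the wolf rate rises from 0 to 0.375, which is the behaviour described above: tolerance for variation in a valid user's voice is bought at the cost of admitting more impostors.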
7.4 Language classification
Automatic language classification, by analysis of recorded speech, has much in common
with the automatic speaker classification task of Section 7.3. It can be subdivided in a
similar way, namely by whether there is any constraint upon what is being said and
whether there is any constraint upon the identities and number of persons speaking. More
importantly, the base set of analysis features and techniques is common to both tasks.
In an extensive review of the language identification research field, Zissman and
Berkling [15] cite four auditory cues that can distinguish between languages:
• Phonology (see Section 3.2) generally differs in that not all languages comprise the
same set of phonemes, and undoubtedly they are used in different sequences and
arrangements between languages.
• Morphology differs in that languages tend to have different, but often similar, lexicons.
By and large, languages derived from the same root will share more of their
morphology. However, imported or shared words blur this distinction.
• Syntax differs in style, sequence and choice of framing words. For example, some
languages tend to prefix nouns with prepositions, others do not. Some languages,
such as Malay, have far more word repetitions than others, such as English.
• Prosody is the rate, spacing and duration of language features.
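As an illustration of the phonology cue, the following minimal sketch scores a phoneme sequence against per-language models of phoneme-to-phoneme transitions. It assumes the speech has already been transcribed into phoneme labels. The two 'languages' A and B, their training sequences and the function names are all hypothetical, chosen only to show how differing phoneme sequences can separate languages. This is not the method of the review cited above.

```python
import math
from collections import Counter

# Toy illustration only: languages "A" and "B" and their phoneme strings are invented.
training = {
    "A": [["s", "t", "r", "a"], ["s", "p", "r", "i"], ["t", "r", "a", "s"]],
    "B": [["k", "a", "t", "a"], ["t", "a", "k", "i"], ["s", "a", "k", "a"]],
}


def train_bigram_model(sequences):
    """Count phoneme bigrams and how often each phoneme starts a bigram."""
    bigrams, contexts = Counter(), Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    return bigrams, contexts


def log_likelihood(seq, model, vocab_size):
    """Add-one smoothed log-probability of the phoneme transitions in seq."""
    bigrams, contexts = model
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        total += math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
    return total


# Shared phoneme inventory across both toy languages (used for smoothing).
vocab = {p for seqs in training.values() for seq in seqs for p in seq}
models = {lang: train_bigram_model(seqs) for lang, seqs in training.items()}


def classify(seq):
    """Return the language whose transition model best explains seq."""
    return max(models, key=lambda lang: log_likelihood(seq, models[lang], len(vocab)))


print(classify(["s", "t", "r", "i"]))  # consonant clusters, as in the toy language A
print(classify(["t", "a", "k", "a"]))  # consonant-vowel alternation, as in toy language B
```

The add-one smoothing is a design choice made here so that a transition never seen in training lowers the score without reducing it to zero probability. A practical system would need far more training data and a real phoneme recogniser in front of it.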
Although the researchers do not normally discuss the issue, the research field is
confronted by a rather difficult problem: those brought up speaking one language may retain
its prosody and even syntax when speaking another language. Anyone who has travelled
to Pakistan or India and heard locals speaking English would be struck by how the prosody
