Applied Speech and Audio Processing: With MATLAB Examples
Advanced topics
• Auditory accommodation – coming from a loud machine room into a quiet area often causes speakers to initially misjudge the volume of their voice. Imagine a user walking up to a microphone, extracting the in-ear headphones from their iPod and then speaking their pass phrase.
• Coffee, curry, and many other beverages or foods seem to adhere to the lining of the mouth and throat, affecting the qualities of a person’s voice for some time after consumption.

To highlight the problems that these intra-voice changes can cause, accomplished speaker recognition researcher S. Furui admits that so far no system has succeeded in modelling these changes [12]. The problem is that relaxing the accuracy requirements of a speaker recognition system to allow for variations in a user’s voice will naturally tend to increase the percentage of incorrect classifications. In the vocabulary of the researchers, allowing more ‘sheep’ (valid users that are correctly identified) and fewer ‘goats’ (valid users that are not correctly identified) also causes more ‘wolves’ (invalid users that can impersonate the sheep). In addition, researchers sometimes refer to ‘lambs’ – the innocent sheep who are often impersonated by the big bad wolves [14].

7.4 Language classification

Automatic language classification, by analysis of recorded speech, has much in common with the automatic speaker classification task of Section 7.3. It can be subdivided in a similar way, namely by whether there is any constraint upon what is being said, and by whether there is any constraint upon the identities and number of persons speaking. More importantly, the base set of analysis features and techniques is common.
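The sheep, goats and wolves trade-off noted at the end of Section 7.3 can be illustrated numerically. The following is a minimal Python sketch, not a method taken from the text: it assumes a recogniser that outputs a single similarity score per attempt and accepts the speaker when the score reaches a threshold. The score lists, the threshold values and the function name `error_rates` are all hypothetical, invented purely for illustration.

```python
# Hypothetical similarity scores (higher = closer match to the enrolled voice).
# These numbers are invented for illustration; they are not measured data.
genuine_scores = [0.91, 0.85, 0.78, 0.66, 0.59, 0.52]   # attempts by valid users
impostor_scores = [0.61, 0.48, 0.40, 0.33, 0.25, 0.12]  # attempts by invalid users


def error_rates(threshold, genuine, impostor):
    """Return (false_rejection_rate, false_acceptance_rate) at a threshold.

    A score greater than or equal to the threshold is accepted.
    False rejections correspond to 'goats', false acceptances to 'wolves'.
    """
    false_rejections = sum(1 for s in genuine if s < threshold)
    false_acceptances = sum(1 for s in impostor if s >= threshold)
    return false_rejections / len(genuine), false_acceptances / len(impostor)


# Lowering the threshold relaxes the accuracy requirement.
for threshold in (0.7, 0.55, 0.4):
    frr, far = error_rates(threshold, genuine_scores, impostor_scores)
    print(f"threshold={threshold:.2f}  "
          f"valid users rejected={frr:.2f}  impostors accepted={far:.2f}")
```

With these made-up scores, lowering the threshold from 0.7 to 0.4 removes every false rejection but lets half of the impostor attempts through, which is the behaviour the sheep and wolves vocabulary describes.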
In an extensive review of the language identification research field, Zissman and Berkling [15] cite four auditory cues that can distinguish between languages:

Phonology (see Section 3.2) generally differs in that not all languages comprise the same set of phonemes, and undoubtedly they are used in different sequences and arrangements between languages.

Morphology differs in that languages tend to have different, but often similar, lexicons. By and large, languages derived from the same root will share a more common morphology. However, imported or shared words blur this distinction.

Syntax differs in style, sequence and choice of framing words. For example, some languages tend to prefix nouns with prepositions, others do not. Some languages, such as Malay, have far more word repetitions than others, such as English.

Prosody is the rate, spacing and duration of language features.

Although the researchers do not normally discuss the issue, the research field is confronted by a rather difficult problem: those brought up speaking one language may retain the prosody and even syntax when speaking another language. Anyone who has travelled to Pakistan or India and heard locals speaking English would be struck by how the prosody