A hidden Markov model (HMM) based speaker identification system using a mobile phone database of North Atlantic Treaty Organization (NATO) words
Agrawal et al.
© 2013 Acoustical Society of America [DOI: 10.1121/1.4800721]. Received 29 Jan 2013; published 2 Jun 2013. Proceedings of Meetings on Acoustics, Vol. 19, 060019 (2013).

Introduction: Speaker recognition, which can be classified into identification and verification, is the process of automatically recognizing a speaker on the basis of individual information embedded in speech waves. This technique makes it possible to use a speaker's voice to verify their identity and to control access to services such as voice dialing, banking by telephone, telephone shopping, database access services, information services, voice mail, security control for confidential information areas, and remote access to computers. It is useful to distinguish between text-dependent speaker verification, where the decision is made using speech corresponding to known text, and text-independent speaker verification, where the speech is unconstrained [1].

In this work, a text-dependent speaker identification technique is considered, with the Hidden Markov Model (HMM) as the classification technique. The HTK toolkit [2], with its HMM toolbox, was used to build the HMMs. Twenty-three individual NATO words (Appendix [1]) spoken by a corpus of 100 speakers were used to identify the speakers.

Data Preparation: Training and testing a speaker recognition system requires a collection of utterances from different speakers. The present system uses a data set of 23 North Atlantic Treaty Organization (NATO) words [3]. The data were recorded from 100 speakers over three channels: a lapel microphone, a head-held microphone, and a cell phone. Recording was carried out in a sound-treated room with a signal-to-noise ratio (S/N) of 40 dB. The 100 speakers, both male and female, were between 23 and 60 years of age, and their speech was sampled at 16 kHz. Each speaker uttered each word twenty times. In total, 46,000 (23 × 20 × 100) word utterances were used in the experiment. Seventy speakers were used for training the system and the other 30 speakers for testing it.
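The corpus arithmetic in the Data Preparation paragraph can be checked with a short Python sketch. The counts (23 words, 20 repetitions, 100 speakers, 70 training speakers) come from the text. The speaker labels and the choice of which speakers fall into the training group are hypothetical, since the text gives neither. The text also does not say whether the 46,000 figure is per channel or across all three channels, so the sketch only reproduces the stated product.

```python
# Corpus figures stated in the Data Preparation paragraph.
NUM_WORDS = 23             # NATO words
REPETITIONS_PER_WORD = 20  # each speaker utters each word twenty times
NUM_SPEAKERS = 100
NUM_TRAIN_SPEAKERS = 70    # the remaining 30 speakers are used for testing

# Hypothetical speaker labels; the source does not give an ID scheme.
speaker_ids = [f"spk{i:03d}" for i in range(1, NUM_SPEAKERS + 1)]

# Assumption for illustration only: the first 70 labels form the training
# group. The source does not say how speakers were assigned to groups.
train_speakers = speaker_ids[:NUM_TRAIN_SPEAKERS]
test_speakers = speaker_ids[NUM_TRAIN_SPEAKERS:]

utterances_per_speaker = NUM_WORDS * REPETITIONS_PER_WORD
total_utterances = utterances_per_speaker * NUM_SPEAKERS
train_utterances = utterances_per_speaker * len(train_speakers)
test_utterances = utterances_per_speaker * len(test_speakers)

print(total_utterances, train_utterances, test_utterances)
# prints: 46000 32200 13800
```

Under these assumptions, each speaker contributes 460 utterances, so the 70/30 speaker split corresponds to 32,200 and 13,800 utterances respectively.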