In the mid to late 1990s, personal computers started to become powerful enough to enable users to speak to them and for the computers to speak back
In the mid to late 1990s, personal computers became powerful enough for users to speak to them and for the computers to speak back. While speech technology is still far from delivering natural, unstructured conversations with computers, it is already delivering real benefits in real applications. For example:
The two key underlying technologies behind speech-enabling computer applications are speech recognition (SR) and speech synthesis. These technologies are introduced in the following sections.
Introduction to Computer Speech Recognition
Speech recognition (SR) is the process of converting spoken language into printed text. Speech recognition, also called speech-to-text recognition, involves:
The figure below illustrates a general overview of the process. Recognizers (also known as speech recognition engines) are the software drivers that convert the acoustical signal to a digital signal and deliver recognized speech as text to an application. Most recognizers support continuous speech recognition, meaning that users can speak naturally into a microphone at the speed of most conversations. Isolated or discrete speech recognizers require the user to pause after each word, and are currently being replaced by continuous speech engines. Continuous speech recognition engines currently support two modes of speech recognition: dictation and command and control.
Using dictation mode, users can dictate memos, letters, and e-mail messages, as well as enter data. The size of the recognizer's grammar limits the possibilities of what can be recognized. Most recognizers that support dictation mode are speaker-dependent, meaning that accuracy varies depending on the user's speaking patterns and accent. To ensure the most accurate recognition, the application must create or access a speaker profile that contains information about the user's speech patterns.
Using command and control mode, users can speak commands that control the functions of an application. Implementing command and control mode is the easiest way for developers to integrate a speech interface into an existing application because developers can limit the content of the recognition grammar to the available commands. This limitation has several advantages:
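To illustrate what limiting the recognition grammar to the available commands can look like, here is a minimal sketch in Python. It is not tied to any real speech engine: the command phrases, the action names, and the functions `build_grammar`, `normalize`, and `match_command` are all hypothetical, and a plain text string stands in for the recognizer's output. A real engine would typically apply the grammar during recognition itself rather than filter text afterwards.

```python
from typing import Dict, Optional


def build_grammar() -> Dict[str, str]:
    """Return a hypothetical command grammar: spoken phrase -> application action."""
    return {
        "open file": "OPEN",
        "save file": "SAVE",
        "close window": "CLOSE",
    }


def normalize(utterance: str) -> str:
    """Lowercase the text and collapse runs of whitespace to single spaces."""
    return " ".join(utterance.lower().split())


def match_command(utterance: str, grammar: Dict[str, str]) -> Optional[str]:
    """Return the action for an in-grammar phrase, or None if it is not a command."""
    return grammar.get(normalize(utterance))


if __name__ == "__main__":
    grammar = build_grammar()
    # Text stands in for what a recognizer would deliver to the application.
    for spoken in ["Open file", "save   FILE", "write a letter to my bank"]:
        print(repr(spoken), "->", match_command(spoken, grammar))
```

Because the grammar holds only three phrases, anything outside it is rejected instead of being guessed at, which is the kind of restriction the paragraph above describes.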