Applied Speech and Audio Processing: With MATLAB Examples

Page	170/170
Date	18.10.2023
Size	2.66 MB


Document Outline

Cover
Half-title
Title
Copyright
Contents
Preface
Acknowledgements
1 Introduction
- 1.1 Digital audio
- 1.2 Capturing and converting sound
- 1.3 Sampling
- 1.4 Summary
- Bibliography
2 Basic audio processing
- 2.1 Handling audio in MATLAB
  - 2.1.1 Recording sound
  - 2.1.2 Storing and replaying sound
  - 2.1.3 Audio file handling
  - 2.1.4 Audio conversion problems
- 2.2 Normalisation
- 2.3 Audio processing
- 2.4 Segmentation
  - 2.4.1 Overlap
  - 2.4.2 Windowing
  - 2.4.3 Continuous filtering
- 2.5 Analysis window sizing
  - 2.5.1 Signal stationarity
  - 2.5.2 Time-frequency resolution
- 2.6 Visualisation
  - 2.6.1 A brief note on axes
  - 2.6.2 Other visualisation methods
    - 2.6.2.1 Correlogram
    - 2.6.2.2 Cepstrum
- 2.7 Sound generation
  - 2.7.1 Pure tone generation
  - 2.7.2 White noise generation
  - 2.7.3 Variable tone generation
  - 2.7.4 Mixing sounds
- 2.8 Summary
- Bibliography
  - MATLAB
  - Signal Processing
  - Audio
3 Speech
- 3.1 Speech production
- 3.2 Characteristics of speech
  - 3.2.1 Speech classification
  - 3.2.2 Amplitude distribution of speech
  - 3.2.3 Types of speech
  - 3.2.4 Frequency distribution
  - 3.2.5 Temporal distribution
- 3.3 Speech understanding
  - 3.3.1 Intelligibility and quality
  - 3.3.2 Measurement of speech quality
  - 3.3.3 Measurement of speech intelligibility
  - 3.3.4 Contextual information, redundancy and vocabulary size
- 3.4 Summary
- Bibliography
4 Hearing
- 4.1 Physical processes
- 4.2 Psychoacoustics
  - 4.2.1 Equal loudness contours
  - 4.2.2 Cochlea echoes
  - 4.2.3 Phase locking
  - 4.2.4 Signal processing
  - 4.2.5 Temporal integration
  - 4.2.6 Post-stimulatory auditory fatigue
  - 4.2.7 Auditory adaptation
  - 4.2.8 Masking
  - 4.2.9 Co-modulation masking release
  - 4.2.10 Non-simultaneous masking
  - 4.2.11 Frequency discrimination
  - 4.2.12 Pitch of complex tones
  - 4.2.13 Binaural masking
  - 4.2.14 Mistuning of harmonics
  - 4.2.15 The precedence effect
  - 4.2.16 Speech perception
- 4.3 Amplitude and frequency models
  - 4.3.1 Loudness
  - 4.3.2 The Bark scale
- 4.4 Psychoacoustic processing
  - 4.4.1 Tone induction
  - 4.4.2 Sound strengthening
  - 4.4.3 Temporal masking release
  - 4.4.4 Masking and two-tone suppression
  - 4.4.5 Use of correlated noise
  - 4.4.6 Binaural masking
- 4.5 Auditory scene analysis
  - 4.5.1 Proximity
  - 4.5.2 Closure
  - 4.5.3 Common fate
  - 4.5.4 Good continuation
- 4.6 Summary
- Bibliography
5 Speech communications
- 5.1 Quantisation
  - 5.1.1 Pulse coded modulation
  - 5.1.2 Delta modulation
  - 5.1.3 Adaptive delta modulation
  - 5.1.4 ADPCM
  - 5.1.5 SB-ADPCM
- 5.2 Parameterisation
  - 5.2.1 Linear prediction
    - 5.2.1.1 The LPC filter
    - 5.2.1.2 LPC stability issues
    - 5.2.1.3 Pre-emphasis of the speech signal
  - 5.2.2 Reflection coefficients
  - 5.2.3 Converting between reflection coefficients and LPCs
  - 5.2.4 Line spectral pairs
    - 5.2.4.1 Derivation of LSPs
    - 5.2.4.2 Generation of LPC coefficients from LSPs
    - 5.2.4.3 Visualisation of line spectral pairs
  - 5.2.5 Quantisation issues
- 5.3 Pitch models
  - 5.3.1 Regular pulse excitation
  - 5.3.2 LTP pitch extraction
    - 5.3.2.1 Pitch extraction
  - 5.3.3 Pitch issues
- 5.4 Analysis-by-synthesis
  - 5.4.1 Basic CELP
    - 5.4.1.1 CELP codebooks
  - 5.4.2 Algebraic CELP
  - 5.4.3 Split codebook schemes
  - 5.4.4 Forward–backward CELP
- 5.5 Summary
- Bibliography
6 Audio analysis
- 6.1 Analysis toolkit
  - 6.1.1 Zero-crossing rate
  - 6.1.2 Frame power
  - 6.1.3 Average magnitude difference function
  - 6.1.4 Spectral measures
  - 6.1.5 Cepstral analysis
  - 6.1.6 LSP-based measures
    - 6.1.6.1 Instantaneous LSP analysis
    - 6.1.6.2 Time-evolved LSP analysis
- 6.2 Speech analysis and classification
  - 6.2.1 Pitch analysis
  - 6.2.2 Joint time-frequency distribution
- 6.3 Analysis of other signals
  - 6.3.1 Analysis of music
  - 6.3.2 Analysis of animal noises
- 6.4 Higher order statistics
- 6.5 Summary
- Bibliography
7 Advanced topics
- 7.1 Psychoacoustic modelling
  - 7.1.1 Spectral analysis
  - 7.1.2 Critical band warping
  - 7.1.3 Critical band function convolution
  - 7.1.4 Equal-loudness pre-emphasis
  - 7.1.5 Intensity-loudness conversion
  - 7.1.6 Masking effect of speech
  - 7.1.7 Other critical-band spreading functions
- 7.2 Perceptual weighting
- 7.3 Speaker classification
- 7.4 Language classification
- 7.5 Speech recognition
  - 7.5.1 Types of speech recogniser
  - 7.5.2 Speech recognition performance
  - 7.5.3 Practical speech recognition
  - 7.5.4 Some basic difficulties
- 7.6 Speech synthesis
  - 7.6.1 Voice playback systems
  - 7.6.2 Text-to-speech systems
  - 7.6.3 Linguistic transcription systems
  - 7.6.4 Practical speech synthesis
- 7.7 Stereo encoding
  - 7.7.1 Stereo and noise
  - 7.7.2 Stereo placement
  - 7.7.3 Stereo encoding
- 7.8 Formant strengthening and steering
  - 7.8.1 Perceptual formant steering
  - 7.8.2 Processing complexity
- 7.9 Voice and pitch changer
  - 7.9.1 PSOLA
  - 7.9.2 LSP-based method
- 7.10 Summary
- Bibliography
References
Index
