Document Outline
- Cover
- Half-title
- Title
- Copyright
- Contents
- Preface
- Acknowledgements
- 1 Introduction
- 1.1 Digital audio
- 1.2 Capturing and converting sound
- 1.3 Sampling
- 1.4 Summary
- Bibliography
- 2 Basic audio processing
- 2.1 Handling audio in MATLAB
- 2.1.1 Recording sound
- 2.1.2 Storing and replaying sound
- 2.1.3 Audio file handling
- 2.1.4 Audio conversion problems
- 2.2 Normalisation
- 2.3 Audio processing
- 2.4 Segmentation
- 2.4.1 Overlap
- 2.4.2 Windowing
- 2.4.3 Continuous filtering
- 2.5 Analysis window sizing
- 2.5.1 Signal stationarity
- 2.5.2 Time-frequency resolution
- 2.6 Visualisation
- 2.6.1 A brief note on axes
- 2.6.2 Other visualisation methods
- 2.6.2.1 Correlogram
- 2.6.2.2 Cepstrum
- 2.7 Sound generation
- 2.7.1 Pure tone generation
- 2.7.2 White noise generation
- 2.7.3 Variable tone generation
- 2.7.4 Mixing sounds
- 2.8 Summary
- Bibliography
- MATLAB
- Signal Processing
- Audio
- 3 Speech
- 3.1 Speech production
- 3.2 Characteristics of speech
- 3.2.1 Speech classification
- 3.2.2 Amplitude distribution of speech
- 3.2.3 Types of speech
- 3.2.4 Frequency distribution
- 3.2.5 Temporal distribution
- 3.3 Speech understanding
- 3.3.1 Intelligibility and quality
- 3.3.2 Measurement of speech quality
- 3.3.3 Measurement of speech intelligibility
- 3.3.4 Contextual information, redundancy and vocabulary size
- 3.4 Summary
- Bibliography
- 4 Hearing
- 4.1 Physical processes
- 4.2 Psychoacoustics
- 4.2.1 Equal loudness contours
- 4.2.2 Cochlea echoes
- 4.2.3 Phase locking
- 4.2.4 Signal processing
- 4.2.5 Temporal integration
- 4.2.6 Post-stimulatory auditory fatigue
- 4.2.7 Auditory adaptation
- 4.2.8 Masking
- 4.2.9 Co-modulation masking release
- 4.2.10 Non-simultaneous masking
- 4.2.11 Frequency discrimination
- 4.2.12 Pitch of complex tones
- 4.2.13 Binaural masking
- 4.2.14 Mistuning of harmonics
- 4.2.15 The precedence effect
- 4.2.16 Speech perception
- 4.3 Amplitude and frequency models
- 4.3.1 Loudness
- 4.3.2 The Bark scale
- 4.4 Psychoacoustic processing
- 4.4.1 Tone induction
- 4.4.2 Sound strengthening
- 4.4.3 Temporal masking release
- 4.4.4 Masking and two-tone suppression
- 4.4.5 Use of correlated noise
- 4.4.6 Binaural masking
- 4.5 Auditory scene analysis
- 4.5.1 Proximity
- 4.5.2 Closure
- 4.5.3 Common fate
- 4.5.4 Good continuation
- 4.6 Summary
- Bibliography
- 5 Speech communications
- 5.1 Quantisation
- 5.1.1 Pulse coded modulation
- 5.1.2 Delta modulation
- 5.1.3 Adaptive delta modulation
- 5.1.4 ADPCM
- 5.1.5 SB-ADPCM
- 5.2 Parameterisation
- 5.2.1 Linear prediction
- 5.2.1.1 The LPC filter
- 5.2.1.2 LPC stability issues
- 5.2.1.3 Pre-emphasis of the speech signal
- 5.2.2 Reflection coefficients
- 5.2.3 Converting between reflection coefficients and LPCs
- 5.2.4 Line spectral pairs
- 5.2.4.1 Derivation of LSPs
- 5.2.4.2 Generation of LPC coefficients from LSPs
- 5.2.4.3 Visualisation of line spectral pairs
- 5.2.5 Quantisation issues
- 5.3 Pitch models
- 5.3.1 Regular pulse excitation
- 5.3.2 LTP pitch extraction
- 5.3.3 Pitch issues
- 5.4 Analysis-by-synthesis
- 5.4.1 Basic CELP
- 5.4.2 Algebraic CELP
- 5.4.3 Split codebook schemes
- 5.4.4 Forward–backward CELP
- 5.5 Summary
- Bibliography
- 6 Audio analysis
- 6.1 Analysis toolkit
- 6.1.1 Zero-crossing rate
- 6.1.2 Frame power
- 6.1.3 Average magnitude difference function
- 6.1.4 Spectral measures
- 6.1.5 Cepstral analysis
- 6.1.6 LSP-based measures
- 6.1.6.1 Instantaneous LSP analysis
- 6.1.6.2 Time-evolved LSP analysis
- 6.2 Speech analysis and classification
- 6.2.1 Pitch analysis
- 6.2.2 Joint time-frequency distribution
- 6.3 Analysis of other signals
- 6.3.1 Analysis of music
- 6.3.2 Analysis of animal noises
- 6.4 Higher order statistics
- 6.5 Summary
- Bibliography
- 7 Advanced topics
- 7.1 Psychoacoustic modelling
- 7.1.1 Spectral analysis
- 7.1.2 Critical band warping
- 7.1.3 Critical band function convolution
- 7.1.4 Equal-loudness pre-emphasis
- 7.1.5 Intensity-loudness conversion
- 7.1.6 Masking effect of speech
- 7.1.7 Other critical-band spreading functions
- 7.2 Perceptual weighting
- 7.3 Speaker classification
- 7.4 Language classification
- 7.5 Speech recognition
- 7.5.1 Types of speech recogniser
- 7.5.2 Speech recognition performance
- 7.5.3 Practical speech recognition
- 7.5.4 Some basic difficulties
- 7.6 Speech synthesis
- 7.6.1 Voice playback systems
- 7.6.2 Text-to-speech systems
- 7.6.3 Linguistic transcription systems
- 7.6.4 Practical speech synthesis
- 7.7 Stereo encoding
- 7.7.1 Stereo and noise
- 7.7.2 Stereo placement
- 7.7.3 Stereo encoding
- 7.8 Formant strengthening and steering
- 7.8.1 Perceptual formant steering
- 7.8.2 Processing complexity
- 7.9 Voice and pitch changer
- 7.9.1 PSOLA
- 7.9.2 LSP-based method
- 7.10 Summary
- Bibliography
- References
- Index