Audio Signal Processing

Preeti Rao

doi:10.1007/978-3-540-75398-8_8

2008, Studies in Computational Intelligence

https://doi.org/10.1007/978-3-540-75398-8_8

21 pages

Abstract

This paper explores the field of audio signal processing, emphasizing the importance of auditory scene analysis and its applications in various domains such as speech recognition, music transcription, and multimedia data retrieval. It highlights key aspects of digital audio signal processing, including audio classification, compression methods, and the characteristics of audio signals crucial for effective processing. The discussion also covers time-frequency representations and feature extraction pertinent to audio classification systems.

Figures (4)

Fig. 1. The auditory field in the frequency-intensity plane. The sound pressure level is measured in dB relative to the standard reference pressure of 20 micropascals.

Table 1. A description of the audio events corresponding to Figure 1.

Through its time-frequency analysis, the spectrogram displays the spectro-temporal properties of acoustic events that may overlap in time and frequency. The choice of analysis window duration dictates the trade-off between the frequency resolution of steady-state content and the time resolution of rapidly time-varying events, or transients.
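The window-duration trade-off can be made concrete with a small sketch. It uses only the Python standard library and a naive DFT; the 8 kHz sample rate, the 1 kHz test tone, the Hann window, and all function names are assumptions chosen for illustration and do not come from the chapter.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz, assumed for illustration
TONE_HZ = 1000      # steady-state test tone, assumed


def analysis_resolution(sample_rate, window_length):
    """Return (frequency bin spacing in Hz, window duration in ms)."""
    return sample_rate / window_length, 1000.0 * window_length / sample_rate


def frame_magnitude_spectrum(frame):
    """Magnitude DFT of one Hann-windowed frame, non-negative frequencies only."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * t / n))
                for t, s in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def peak_frequency(window_length):
    """Frequency (Hz) of the strongest bin for one frame of the test tone."""
    frame = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE)
             for t in range(window_length)]
    spectrum = frame_magnitude_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * SAMPLE_RATE / window_length


if __name__ == "__main__":
    for window_length in (64, 512):
        spacing_hz, duration_ms = analysis_resolution(SAMPLE_RATE, window_length)
        print(f"{window_length:4d} samples: {duration_ms:5.1f} ms window, "
              f"{spacing_hz:7.3f} Hz bin spacing, "
              f"peak at {peak_frequency(window_length):.1f} Hz")
```

Under these assumptions, the 64-sample window spans 8 ms but spaces its frequency bins 125 Hz apart, whereas the 512-sample window narrows the spacing to 15.625 Hz at the cost of averaging over 64 ms, which would smear a short transient.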

Fig. 2. (a) Waveform and (b) spectrogram of the audio segment described in Table 1. The vertical dotted lines indicate the starting instants of new events. The spectrogram relative intensity scale appears at lower right.

While the spectrogram and auditory signal representations discussed in the previous section are good for visualizing audio content, their high dimensionality makes them unsuitable for direct application to classification. Ideally, we would like to extract low-dimensional features from these representations (or even directly from the acoustical signal) that retain only the important distinctive characteristics of the intended audio classes. Reduced-dimension, decorrelated spectral vectors obtained using a linear transformation of a spectrogram have been proposed in MPEG-7, the audiovisual content description standard [5], [6].
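As a rough illustration of reduced-dimension, decorrelated spectral vectors, the sketch below projects spectrogram frames onto their leading principal directions using an SVD. This is a generic PCA stand-in and not the MPEG-7 procedure itself; NumPy, the synthetic random spectrogram, the choice of 8 components, and the function name `reduce_spectral_vectors` are all assumptions made for the example.

```python
import numpy as np


def reduce_spectral_vectors(spectrogram, n_components):
    """Project spectrogram frames (rows) onto their top principal directions.

    Returns (features, basis): features has shape (n_frames, n_components),
    basis has shape (n_bins, n_components) with orthonormal columns.
    """
    # Remove the mean spectrum so the SVD captures variance around it.
    centered = spectrogram - spectrogram.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T
    # The linear transformation: one matrix product per set of frames.
    return centered @ basis, basis


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a spectrogram: 200 frames of 64 frequency bins.
    spectrogram = rng.random((200, 64))
    features, basis = reduce_spectral_vectors(spectrogram, n_components=8)
    print(features.shape, basis.shape)
```

Each 64-bin frame is reduced to 8 numbers, and because the basis vectors come from the SVD of the centered data, the resulting feature dimensions are mutually uncorrelated over the frames used to compute the basis.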

Fig. 3. Audio feature extraction procedure (adapted from [8]).

References (27)

[1] Oppenheim A V, Lim J S (1981) The Importance of Phase in Signals. Proc of the IEEE 69(5):529-550
[2] Moore B C J (2003) An Introduction to the Psychology of Hearing. Academic Press, San Diego
[3] Patterson R D (2000) Auditory Images: How Complex Sounds Are Represented in the Auditory System. J Acoust Soc Japan (E) 21(4)
[4] Lyon R F, Dyer L (1986) Experiments with a Computational Model of the Cochlea. Proc of the Intl Conf on Acoustics, Speech and Signal Processing (ICASSP)
[5] Martinez J M (2002) Standards - MPEG-7 overview of MPEG-7 description tools, part 2. IEEE Multimedia 9(3):83-93
[6] Xiong Z, Radhakrishnan R, Divakaran A, Huang T (2003) Comparing MFCC and MPEG-7 Audio Features for Feature Extraction, Maximum Likelihood HMM and Entropic Prior HMM for Sports Audio Classification. Proc of the Intl Conf on Multimedia and Expo (ICME)
[7] Wang D, Brown G (2006) Computational Auditory Scene Analysis: Principles, Algorithms and Applications. Wiley-IEEE Press, New York
[8] McKinney M F, Breebaart J (2003) Features for Audio and Music Classification. Proc of the Intl Symp on Music Information Retrieval (ISMIR)
[9] Tzanetakis G, Cook P (2002) Musical Genre Classification of Audio Signals. IEEE Trans on Speech and Audio Processing 10(5):293-302
[10] Burred J J, Lerch A (2004) Hierarchical Automatic Audio Signal Classification. J Audio Engineering Society 52(7/8):724-739
[11] Logan B (2000) Mel Frequency Cepstral Coefficients for Music Modeling. Proc of the Intl Symp on Music Information Retrieval (ISMIR)
[12] Zwicker E, Scharf B (1965) A Model of Loudness Summation. Psychological Review 72:3-26
[13] Klapuri A P (2005) A Perceptually Motivated Multiple-F0 Estimation Method for Polyphonic Music Signals. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
[14] Duda R, Hart P, Stork D (2000) Pattern Classification. Wiley, New York
[15] El-Maleh K, Klein M, Petrucci G, Kabal P (2000) Speech/Music Discrimination for Multimedia Applications. Proc of the Intl Conf on Acoustics, Speech and Signal Processing (ICASSP)
[16] Williams G, Ellis D (1999) Speech/Music Discrimination Based on Posterior Probability Features. Proc of Eurospeech
[17] Scheirer E, Slaney M (1997) Construction and Evaluation of a Robust Multifeature Speech/Music Discriminator. Proc of the Intl Conf on Acoustics, Speech and Signal Processing (ICASSP)
[18] Chou W, Gu L (2001) Robust Singing Detection in Speech/Music Discriminator Design. Proc of the Intl Conf on Acoustics, Speech and Signal Processing (ICASSP)
[19] Zhang T, Kuo C C J (2001) Audio Content Analysis for Online AudioVisual Data Segmentation and Classification. IEEE Trans on Speech and Audio Processing 9(4):441-457
[20] Wold E, Blum T, Keisler D, Wheaton J (1996) Content-based Classification, Search and Retrieval of Audio. IEEE Multimedia 3(3):27-36
[21] Peeters G, McAdams S, Herrera P (2000) Instrument Sound Description in the Context of MPEG-7. Proc of the Intl Computer Music Conference (ICMC)
[22] Dowling W J (1978) Scale and Contour: Two Components of a Theory of Memory for Melodies. Psychological Review 85:342-389
[23] Pradeep P, Joshi M, Hariharan S, Dutta-Roy S, Rao P (2007) Sung Note Segmentation for a Query-By-Humming System. Proc of the Intl Workshop on Artificial Intelligence and Music (Music-AI) in IJCAI
[24] Klapuri A P (1999) Sound Onset Detection by Applying Psychoacoustic Knowledge. Proc of the Intl Conf on Acoustics, Speech and Signal Processing (ICASSP)
[25] de Cheveigne A, Kawahara H (1999) Multiple Period Estimation and Pitch Perception Model. Speech Communication 27:175-185
[26] Uitdenbogerd A, Zobel J (1999) Melodic Matching Techniques for Large Music Databases. Proc of the 7th ACM Intl Conference on Multimedia (part 1)
[27] Aucouturier J J, Pachet F (2004) Improving Timbre Similarity: How High is the Sky. J Negative Results in Speech and Audio Sciences 1(1)
