Audio Signal Processing
2008, Studies in Computational Intelligence
https://doi.org/10.1007/978-3-540-75398-8_8
Abstract
This paper explores the field of audio signal processing, emphasizing the importance of auditory scene analysis and its applications in various domains such as speech recognition, music transcription, and multimedia data retrieval. It highlights key aspects of digital audio signal processing, including audio classification, compression methods, and the characteristics of audio signals crucial for effective processing. The discussion also covers time-frequency representations and feature extraction pertinent to audio classification systems.