Towards Cross-Version Harmonic Analysis of Music

2012, IEEE Transactions on Multimedia

https://doi.org/10.1109/TMM.2012.2190047

Abstract

For a given piece of music, multiple versions often exist in the symbolic (e.g., MIDI representations), acoustic (audio recordings), or visual (sheet music) domain. Each type of information allows for applying specialized, domain-specific approaches to music analysis tasks. In this paper, we formulate the idea of a cross-version analysis for comparing or combining analysis results from different representations. As an example, we realize this idea in the context of harmonic analysis to automatically evaluate MIDI-based chord labeling procedures using annotations given for corresponding audio recordings. To this end, one needs reliable synchronization procedures that automatically establish the musical relationship between the multiple versions of a given piece. This becomes a hard problem when the versions deviate significantly from one another locally. We introduce a novel late-fusion approach that combines different alignment procedures in order to identify reliable parts in synchronization results. The cross-version comparison of the various chord labeling results is then performed only on the basis of these reliable parts. Finally, we show how inconsistencies in these results across the different versions allow for a quantitative and qualitative evaluation, which not only indicates limitations of the employed chord labeling strategies but also deepens the understanding of the underlying music material.
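
The abstract does not spell out how the late fusion works, so the following Python sketch is only an illustration under stated assumptions, not the authors' procedure. It assumes each alignment procedure returns a path of (MIDI frame, audio frame) index pairs, treats a cell as reliable when a second path passes within a tolerance of it, and then measures chord-label agreement only on those cells. The function names `reliable_cells` and `agreement`, the tolerance parameter `tol`, and the toy data are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# An alignment path: pairs (midi_frame, audio_frame). Assumed representation.
Path = List[Tuple[int, int]]


def reliable_cells(path_a: Path, path_b: Path, tol: int = 0) -> Path:
    """Keep the cells of path_a that lie within `tol` frames
    (Chebyshev distance) of at least one cell of path_b."""
    kept = []
    for i, j in path_a:
        if any(max(abs(i - k), abs(j - l)) <= tol for k, l in path_b):
            kept.append((i, j))
    return kept


def agreement(
    labels_midi: Sequence[str],
    labels_audio: Sequence[str],
    cells: Path,
) -> Optional[float]:
    """Fraction of aligned cells whose chord labels match.
    Returns None when there are no cells to compare."""
    if not cells:
        return None
    hits = sum(labels_midi[i] == labels_audio[j] for i, j in cells)
    return hits / len(cells)


# Toy data (hypothetical): two alignment paths that disagree in the middle.
path_a = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
path_b = [(0, 0), (1, 1), (2, 3), (3, 4), (4, 4)]
midi_chords = ["C", "C", "G", "G", "C"]
audio_chords = ["C", "C", "C", "G", "C"]

reliable = reliable_cells(path_a, path_b)
score_all = agreement(midi_chords, audio_chords, path_a)
score_reliable = agreement(midi_chords, audio_chords, reliable)
print(reliable, score_all, score_reliable)
```

In this toy case, the only chord-label mismatch along the first path falls in the region where the two paths disagree, so restricting the comparison to reliable cells separates likely synchronization errors from genuine chord-labeling differences.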
