Papers by George Tzanetakis

Zenodo (CERN European Organization for Nuclear Research), Oct 24, 2022
Sonification can provide valuable insights about data, but most existing approaches are not designed to be controlled by the user in an interactive fashion. Interaction enables the designer of the sonification to experiment with sound design more rapidly and allows the sonification to be modified in real time through various control parameters. In this paper, we describe two case studies of interactive sonification that utilize publicly available datasets recently described at the International Conference on Auditory Display (ICAD). They are from the health and energy domains: electroencephalogram (EEG) alpha wave data and air pollutant data consisting of nitrogen dioxide, sulfur dioxide, carbon monoxide, and ozone. We show how these sonifications can be recreated to support interaction using a general interactive sonification framework built with ChucK, Unity, and Chunity. In addition to the typical sonification methods common in existing sonification toolkits, our framework introduces novel capabilities such as support for discrete events, interleaved playback of multiple data streams for comparison, and frequency modulation (FM) synthesis in which one data attribute modulates another. We also describe how these new functionalities can be used to improve the sonification experience of the two datasets we investigated.
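
The framework described above is built with ChucK, Unity, and Chunity; as a rough illustration of the FM idea, where one data attribute drives the carrier frequency and a second one drives the modulation depth, here is a minimal NumPy sketch. The mapping, parameter names, and scaling are assumptions for illustration, not the framework's actual design.

import numpy as np

def fm_sonify(attr_carrier, attr_mod, sr=44100, dur=5.0,
              base_freq=220.0, freq_span=440.0, max_index=5.0):
    """One data attribute sets the carrier frequency contour, a second one
    scales the FM modulation index (a hypothetical mapping for illustration)."""
    n = int(sr * dur)
    t = np.arange(n) / sr
    grid = lambda x: np.interp(t, np.linspace(0, dur, len(x)), x)
    norm = lambda x: (x - x.min()) / (np.ptp(x) + 1e-12)
    # Resample both attributes onto the audio time grid and normalize to [0, 1].
    c = norm(grid(np.asarray(attr_carrier, dtype=float)))
    m = norm(grid(np.asarray(attr_mod, dtype=float)))
    fc = base_freq + freq_span * c          # carrier frequency from attribute 1
    fm = fc / 2.0                           # modulator tracks the carrier (2:1 ratio)
    phase_c = 2 * np.pi * np.cumsum(fc) / sr
    phase_m = 2 * np.pi * np.cumsum(fm) / sr
    return 0.5 * np.sin(phase_c + max_index * m * np.sin(phase_m))

# Hypothetical usage: sonify hourly NO2 readings with O3 controlling brightness.
# audio = fm_sonify(no2_values, o3_values)
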

Traditionally, work on multimedia structuring has centered on the creation of indices and their use for searching. Although searching is important, there are many cases where the user just wants to browse through the data to find something interesting without any particular search goal. Multimedia data exhibits hierarchical structure that can be exploited for more natural user interaction with the content. In order to handle the large amounts of multimedia data, more structure than what is currently available is required. In this paper, we focus on structuring multimedia data using trees to describe both temporal and categorical relations. The pervasive use of trees to express hierarchies facilitates browsing, profiling, and authoring. Our main target application is the implementation of a personalized TV guide. The constraints imposed by this application led to the development of a new, simple, compact graphical user interface for tree browsing.
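
As a small illustration of the tree representation mentioned above, the same programmes can be reached through either a temporal or a categorical hierarchy. Node and label names here are hypothetical, not taken from the paper.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class GuideNode:
    """A node in a browsing tree; children can group items temporally
    (day, hour) or categorically (genre, series)."""
    label: str
    item: Optional[str] = None                      # leaf payload, e.g. a programme id
    children: List["GuideNode"] = field(default_factory=list)

    def browse(self, depth: int = 0) -> None:
        """Depth-first listing: the basic traversal a tree browser exposes."""
        print("  " * depth + self.label)
        for child in self.children:
            child.browse(depth + 1)

# Hypothetical guide: one temporal and one categorical view over the same data.
guide = GuideNode("TV guide", children=[
    GuideNode("By time", children=[
        GuideNode("Monday", children=[GuideNode("20:00 Evening News", item="p42")])]),
    GuideNode("By genre", children=[
        GuideNode("News", children=[GuideNode("Evening News", item="p42")])]),
])
guide.browse()
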

Proceedings of the International Symposium on Computational Aesthetics in Graphics, Visualization, and Imaging, 2011
The creation of expressive styles for digital art is one of the primary goals in non-photorealistic rendering. In this paper, we introduce a swarm-based multi-agent system that is capable of producing expressive imagery through the use of multiple digital images. At birth, agents in our system are assigned a digital image that represents their 'aesthetic ideal'. As agents move throughout a digital canvas they try to 'realize' their ideal by modifying the pixels in the digital canvas to be closer to the pixels in their aesthetic ideal. When groups of agents with different aesthetic ideals occupy the same canvas, a new image is created through the convergence of their conflicting aesthetic goals. We use our system to explore concepts and techniques from a number of Modern Art movements. The simple implementation and effective results produced by our system make a compelling argument for more research using swarm-based multi-agent systems for non-photorealistic rendering.
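
A minimal sketch of the agent behaviour described above, assuming a shared canvas, a random-walk movement rule, and a fixed nudge strength; all parameter names and the movement rule are assumptions for illustration, not the paper's implementation.

import numpy as np

def swarm_paint(ideals, canvas_shape=(128, 128, 3), n_agents=200,
                steps=2000, strength=0.2, seed=0):
    """Each agent is born with one 'aesthetic ideal' image and, as it wanders
    over the shared canvas, nudges the pixel under it toward the corresponding
    pixel of its ideal. Conflicting ideals blend where agents overlap."""
    rng = np.random.default_rng(seed)
    h, w, _ = canvas_shape
    canvas = rng.random(canvas_shape)                      # start from noise
    pos = rng.integers(0, [h, w], size=(n_agents, 2))      # agent positions (y, x)
    which = rng.integers(0, len(ideals), size=n_agents)    # each agent's ideal image
    for _ in range(steps):
        for a in range(n_agents):
            y, x = pos[a]
            target = ideals[which[a]][y, x]
            canvas[y, x] += strength * (target - canvas[y, x])
            # Random walk with wrap-around at the canvas edges.
            pos[a] = (pos[a] + rng.integers(-1, 2, size=2)) % (h, w)
    return canvas
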
Tutorial: MIR for Audio Signals

2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
Recently there has been an increasing amount of work in the area of automatic genre classification of music in audio format. Such systems can be used as a way to evaluate features describing musical content as well as a way to structure large collections of music. However, the evaluation and comparison of genre classification systems is hindered by the subjective perception of genre definitions by users. In this work we describe a set of experiments in automatic musical genre classification. An important contribution of this work is the comparison of the automatic results with human genre classification on the same dataset. The results show that, although there is significant room for improvement, genre classification is inherently subjective and therefore perfect results cannot be expected from either automatic algorithms or human annotation. The experiments also show that features derived from an auditory model perform similarly to features based on Mel-Frequency Cepstral Coefficients (MFCC).
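
For readers unfamiliar with this style of system, the sketch below shows a common baseline of the kind discussed above: summarize each track by MFCC statistics and cross-validate a classifier over genre labels. It uses librosa and scikit-learn and is not the paper's exact feature set or classifier; the variables `paths` and `genres` are assumed inputs.

import numpy as np
import librosa
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def mfcc_features(path, n_mfcc=13):
    """Summarize a track by the mean and standard deviation of its MFCCs
    over time (a simple 'bag of frames' representation)."""
    y, sr = librosa.load(path, mono=True, duration=30.0)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Assumed inputs: `paths` is a list of audio files, `genres` the matching labels.
# X = np.vstack([mfcc_features(p) for p in paths])
# clf = make_pipeline(StandardScaler(), SVC())
# print("accuracy:", cross_val_score(clf, X, genres, cv=5).mean())
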

Lecture Notes in Computer Science, 2003
Increases in hard disk capacity and audio compression technology have enabled the storage of large collections of music on personal computers and portable devices. As an example, a portable device with 20 gigabytes of storage can hold up to 4000 songs in compressed audio format. Currently the only way of structuring these collections is a file system hierarchy, which allows very limited forms of searching and retrieval. These limitations are even more pronounced in the case of portable devices, where there is less screen real estate and user attention is limited compared to a personal computer. Musescape is a prototype tool for organizing and interacting with large music collections in audio format, with specific emphasis on portable devices. It provides a variety of automatic and manual ways to organize and interact with large music collections using a consistent, continuous audio feedback user interface for browsing, searching and annotating. Using this system a user can convert an unstructured or partially structured collection of music with limited retrieval capabilities into a music library with enhanced functionality.
Intelligent Music Information Systems
Marsyas is an open source audio processing framework with specific emphasis on building Music Information Retrieval systems. It has been under development since 1998 and has been used for a variety of projects in both academia and industry. In this chapter, the software architecture of Marsyas will be described. The goal is to highlight design challenges and solutions that are relevant to any MIR software. Keywords: Information Processing, Music, Information Retrieval, System Design, Evaluation, Fast Fourier Transform (FFT), Feature Extraction, MFCC

PACRIM. 2005 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 2005
Annotation of audio content is an important component of modern multimedia information retrieval systems. Automatic gender identification is used for video indexing and can improve speech recognition results by using gender-specific classifiers. Gender identification in large datasets is difficult because of the large variability in speaker characteristics. Bootstrapping is an approach that attempts to combine minimal user annotation with automatic techniques for audio classification. In bootstrapping, a small random sample of the training data is annotated by the user, and this annotation is used to train a classifier that annotates the remaining data. This technique is useful when the training set is too large to be fully annotated by the user. Experimental results showing that bootstrapping is effective for automatic audio-based gender identification are provided.
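
A minimal sketch of the bootstrapping loop described above, assuming the data arrive as an array of fixed-length feature vectors; the SVM classifier, the `annotate` callback standing in for the human, and the seed-sample size are illustrative assumptions, not the paper's setup.

import numpy as np
from sklearn.svm import SVC

def bootstrap_annotate(features, annotate, n_seed=50, seed=0):
    """Ask the user to label a small random sample, train a classifier on
    those labels, and let the classifier annotate the remaining data."""
    rng = np.random.default_rng(seed)
    n = len(features)
    seed_idx = rng.choice(n, size=min(n_seed, n), replace=False)
    seed_labels = np.array([annotate(i) for i in seed_idx])   # human annotation
    clf = SVC().fit(features[seed_idx], seed_labels)
    labels = clf.predict(features)                            # machine annotation
    labels[seed_idx] = seed_labels                            # keep the human labels
    return labels
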
Proceedings of the Fifth International Conference on the Foundations of Digital Games, 2010
Music can significantly affect game play and help players understand underlying patterns in the game, or the effects of their actions on the characters. Conversely, inappropriate music can have a negative effect on players by creating additional difficulties. While game makers recognize the effects of music on game play, solutions that provide users with a choice of personal music are not forthcoming. We design, implement and evaluate an algorithm for automatically adapting an arbitrary music track from a personal library and synchronizing playback to the user, without requiring any access to the video game source code.
Music Mining
Academic Press Library in Signal Processing, 2014
The multi-faceted nature of music information requires sophisticated algorithms and systems that combine signal processing and machine learning techniques in order to extract useful information from the large collections of music available today. This chapter overviews work in music mining, which is the application of data mining techniques for the purposes of music processing. Topics covered include content-based similarity retrieval, genre classification, emotion/mood classification, music clustering, automatic tag annotation, audio fingerprinting, cover song detection, as well as self-organizing maps and visualization. Open problems and future trends, as well as pointers for further reading, are also provided.

Combining prior-knowledge and grouping cues using a spectral clustering approach
The Journal of the Acoustical Society of America, 2008
Learning happens at the boundary interactions between prior knowledge and incoming data. The same interplay takes place when trying to analyze and separate complex mixtures of sound sources such as music. Many approaches to this problem can be broadly categorized as either model based or grouping based. Although it is known that our perceptual system utilizes both of these types of processing, building such systems computationally has been challenging. As a result, most existing systems either rely on prior source models or are solely based on grouping cues. In this work it is argued that formulating this integration problem as clustering based on similarities between time-frequency atoms provides an expressive but disciplined approach to building sound source characterization and separation systems and to evaluating their performance. After describing the main components of such an architecture, we describe a concrete realization that is based on spectral clustering of a sinusoidal re...
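
A compact sketch of the clustering formulation described above, assuming each time-frequency atom (for example a sinusoidal peak) is summarized by a small feature vector such as time, frequency and amplitude, possibly augmented with model-based cues; the RBF similarity and scikit-learn's SpectralClustering are illustrative choices, not the paper's exact realization.

import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_atoms(atom_features, n_sources=2, gamma=1.0):
    """Group time-frequency atoms into sources by spectral clustering of a
    pairwise similarity matrix built from their feature vectors."""
    diffs = atom_features[:, None, :] - atom_features[None, :, :]
    similarity = np.exp(-gamma * np.sum(diffs ** 2, axis=-1))   # RBF affinity
    labels = SpectralClustering(n_clusters=n_sources,
                                affinity="precomputed").fit_predict(similarity)
    return labels   # one source label per atom; each group can be resynthesized
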
Early Experiences and Challenges in Building and Using A Scalable Display Wall
MIREX 2007—Music Information Retrieval Evaluation …, 2007
Marsyas is an open source software framework for audio analysis, synthesis and retrieval with specific emphasis on Music Information Retrieval. It is developed by an international team of programmers and researchers led by George Tzanetakis. In MIREX 2007 we participated ...
Marsyas is an open source software framework for audio analysis, synthesis and retrieval with specific emphasis on Music Information Retrieval. It is developed by an international team of programmers and researchers led by George Tzanetakis. In MIREX 2012 the Marsyas team participated in the following tasks, in which we have also participated in the past: Audio Classical Composer Identification, Audio Genre Classification (Latin and Mixed), Audio Music Mood Classification, Audio Music Similarity and Retrieval, and Audio ...
Proceedings of the 1st Music Information Retrieval Evaluation eXchange (MIREX 2005), 2005
This abstract describes the tempo extraction algorithm used for the University of Victoria submission to the MIREX (Music Information Retrieval Evaluation eXchange) 2005. The algorithm is mostly based on self-similarity rather than onset detection. However, an onset detection component is used to calculate the phase of the dominant periodicities. Multiple frequency bands are calculated using a Discrete Wavelet Transform. Subsequently, the envelope of each band is extracted and autocorrelation is used to find the dominant ...
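
A rough sketch of the band-envelope-autocorrelation idea outlined above, using PyWavelets and NumPy. The wavelet choice, the summation of band envelopes before a single autocorrelation, the envelope rate, and the tempo range are assumptions rather than the submission's exact parameters, and the onset/phase step is omitted.

import numpy as np
import pywt

def estimate_tempo(y, sr, levels=4, bpm_range=(40, 200), env_sr=200.0):
    """Split the signal into wavelet bands, extract each band's envelope,
    sum the envelopes, and pick the strongest autocorrelation lag inside a
    plausible tempo range."""
    bands = pywt.wavedec(y, "db4", level=levels)
    onset = np.zeros(int(len(y) * env_sr / sr))             # low-rate envelope sum
    grid = np.linspace(0.0, 1.0, len(onset))
    for band in bands:
        env = np.abs(band)                                   # full-wave rectification
        env -= env.mean()
        # Bands are decimated to different lengths; resample to a common grid.
        onset += np.interp(grid, np.linspace(0.0, 1.0, len(env)), env)
    ac = np.correlate(onset, onset, mode="full")[len(onset) - 1:]
    lag_min = int(env_sr * 60.0 / bpm_range[1])              # fastest tempo, shortest lag
    lag_max = int(env_sr * 60.0 / bpm_range[0])
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return 60.0 * env_sr / lag
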
MUSESCAPE: An interactive content-aware music browser
Proc. Conference on Digital Audio Effects (DAFX), Sep 8, 2003
Advances in hardware performance, network bandwidth and audio compression have made possible the creation of large personal digital music collections. Although there is a significant body of work in image and video browsing, there has been little work that directly addresses the problem of audio and especially music browsing. In this paper, Musescape, a prototype music browsing system, is described and evaluated. The main characteristics of the system are automatic configuration based on Computer ...
Method and system for analyzing digital audio files
A method and system for analyzing audio files is provided. Plural audio file feature vector values based on an audio file's content are determined, and the audio file feature vectors are stored in a database that also stores other pre-computed audio file features. The process determines whether the audio file's feature vectors match the stored audio file vectors. The process also associates a plurality of known attributes with the audio file.
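
An illustrative sketch of the matching step described above: compare a file's feature vector against pre-computed vectors in a database and, if a sufficiently close match is found, attach the stored attributes to the file. The cosine measure, the threshold, and all names are assumptions; the patent does not prescribe a particular metric.

import numpy as np

def match_audio_file(feature_vec, db_vectors, db_attributes, threshold=0.95):
    """Return the stored attributes (e.g. title, artist, genre) of the closest
    database entry if its cosine similarity exceeds the threshold, else None."""
    q = feature_vec / (np.linalg.norm(feature_vec) + 1e-12)
    D = db_vectors / (np.linalg.norm(db_vectors, axis=1, keepdims=True) + 1e-12)
    sims = D @ q                                  # cosine similarity to every entry
    best = int(np.argmax(sims))
    return db_attributes[best] if sims[best] >= threshold else None
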
Proceedings of the Music Information Retrieval Evaluation EXchange, 2009
Marsyas is an open source software framework for audio analysis, synthesis and retrieval with specific emphasis on Music Information Retrieval. It is developed by an international team of programmers and researchers led by George Tzanetakis. In MIREX 2009 the Marsyas team participated in the following tasks: Audio Classical Composer Identification, Audio Genre Classification (Latin and Mixed), Audio Music Mood Classification, Audio Beat Tracking, Audio Onset Detection, Audio Music Similarity and ...
Proc. COSTG6 Conference on Digital Audio …
Most of the current tools for working with sound operate on single soundfiles, use 2D graphics and offer limited interaction to the user. In this paper we describe a set of tools for working with collections of sounds that are based on interactive 3D graphics. These tools form two ...