A Distributed Model For Multiple-Viewpoint Melodic Prediction

Tillman Weyde

Outline

Natural Language Processing

A Distributed Model For Multiple-Viewpoint Melodic Prediction

Tillman Weyde

2013, International Symposium/Conference on Music Information Retrieval

Sign up for access to the world's latest research

checkGet notified about relevant papers

checkSave papers to use in your research

checkJoin the discussion with peers

checkTrack your impact

Abstract
AI

A distributed model utilizing a Restricted Boltzmann Machine (RBM) has been proposed for melodic prediction in music analysis. This approach leverages multiple viewpoints to enhance predictive performance compared to traditional Markov models, focusing on the conditional distributions of melodic sequences. Evaluation is conducted on a diverse corpus of folk and chorale melodies, demonstrating the effectiveness of this method in knowledge extraction from musical data.

Tillman Weyde

2015 International Joint Conference on Neural Networks (IJCNN), 2015

We are interested in modelling musical pitch sequences in melodies in the symbolic form. The task here is to learn a model to predict the probability distribution over the various possible values of pitch of the next note in a melody, given those leading up to it. For this task, we propose the Recurrent Temporal Discriminative Restricted Boltzmann Machine (RTDRBM). It is obtained by carrying out discriminative learning and inference as put forward in the Discriminative RBM (DRBM), in a temporal setting by incorporating the recurrent structure of the Recurrent Temporal RBM (RTRBM). The model is evaluated on the cross entropy of its predictions using a corpus containing 8 datasets of folk and chorale melodies, and compared with n-grams and other standard connectionist models. Results show that the RTDRBM has a better predictive performance than the rest of the models, and that the improvement is statistically significant.

downloadDownload free PDF View PDFchevron_right

Comparing Probabilistic Models for Melodic Sequences

Athina Spiliopoulou

Lecture Notes in Computer Science, 2011

Modelling the real world complexity of music is a challenge for machine learning. We address the task of modeling melodic sequences from the same music genre. We perform a comparative analysis of two probabilistic models; a Dirichlet Variable Length Markov Model (Dirichlet-VMM) and a Time Convolutional Restricted Boltzmann Machine (TC-RBM). We show that the TC-RBM learns descriptive music features, such as underlying chords and typical melody transitions and dynamics. We assess the models for future prediction and compare their performance to a VMM, which is the current state of the art in melody generation. We show that both models perform significantly better than the VMM, with the Dirichlet-VMM marginally outperforming the TC-RBM. Finally, we evaluate the short order statistics of the models, using the Kullback-Leibler divergence between test sequences and model samples, and show that our proposed methods match the statistics of the music genre significantly better than the VMM.

downloadDownload free PDF View PDFchevron_right

A deep learning method for melody extraction from a polyphonic symbolic music representation

Pierre Chanquion

CERN European Organization for Nuclear Research - Zenodo, 2022

downloadDownload free PDF View PDFchevron_right

A Study On Lstm Networks For Polyphonic Music Sequence Modelling

Emmanouil Benetos

2017

Neural networks, and especially long short-term memory networks (LSTM), have become increasingly popular for sequence modelling, be it in text, speech, or music. In this paper, we investigate the predictive power of simple LSTM networks for polyphonic MIDI sequences, using an empirical approach. Such systems can then be used as a music language model which, combined with an acoustic model, can improve automatic music transcription (AMT) performance. As a first step, we experiment with synthetic MIDI data, and we compare the results obtained in various settings, throughout the training process. In particular, we compare the use of a fixed sample rate against a musically-relevant sample rate. We test this system both on synthetic and real MIDI data. Results are compared in terms of note prediction accuracy. We show that the higher the sample rate is, the better the prediction is, because self transitions are more frequent. We suggest that for AMT, a musically-relevant sample rate is crucial in order to model note transitions, beyond a simple smoothing effect.

downloadDownload free PDF View PDFchevron_right

RESEARCH ARTICLE Predictive Models for Music

egrvbvfd fdbrfdd

2008

to achieve with traditional machine learning methods. This problem occurs when considering music data. In this paper, we introduce predictive models for melodies. We decompose melodic modeling into two subtasks. We first propose a rhythm model based on the distributions of distances between subsequences. Then, we define a generative model for melodies given chords and rhythms based on modeling sequences of Narmour features. The rhythm model consistently outperforms a standard Hidden Markov Model in terms of conditional prediction accuracy on two dierent music databases. Using a similar evaluation procedure, the proposed melodic model consistently outperforms an Input/Output Hidden Markov Model. Furthermore, these models are able to generate realistic melodies given appropriate musical contexts.

downloadDownload free PDF View PDFchevron_right

Musical Deep Learning: Stylistic Melodic Generation with Complexity Based Similarity

Ben D Smith

The wide-ranging impact of deep learning models implies significant application in music analysis, retrieval , and generation. Initial findings from musical application of a conditional restricted Boltzmann machine (CRBM) show promise towards informing creative computation. Taking advantage of the CRBM's ability to model temporal dependencies full reconstructions of pieces are achievable given a few starting seed notes. The generation of new material using figuration from the training corpus requires restrictions on the size and memory space of the CRBM, forcing associative rather than perfect recall. Musical analysis and information complexity measures show the musical encoding to be the primary determinant of the nature of the generated results.

downloadDownload free PDF View PDFchevron_right

Multiple viewpoint systems for music prediction

Ian Witten

Journal of New Music Research, 1995

This paper examines the prediction and generation of music using a multiple viewpoint system, a collection of independent views of the musical surface each of which models a specific type of musical phenomena. Both the general style and a particular piece are modeled using dual short-term and long-term theories, and the model is created using machine learning techniques on a corpus of musical examples.

downloadDownload free PDF View PDFchevron_right

Evaluating a Collection of Sound-Tracing Data of Melodic Phrases

Alexander R Jensenius

2018

Melodic contour, the ‘shape’ of a melody, is a common way to visualize and remember a musical piece. The purpose of this paper is to explore the building blocks of a future ‘gesture-based’ melody retrieval system. We present a dataset containing 16 melodic phrases from four musical styles and with a large range of contour variability. This is accompanied by full-body motion capture data of 26 participants performing sound-tracing to the melodies. The dataset is analyzed using canonical correlation analysis (CCA), and its neural network variant (Deep CCA), to understand how melodic contours and sound tracings relate to each other. The analyses reveal non-linear relationships between sound and motion. The link between pitch and verticality does not appear strong enough for complex melodies. We also find that descending melodic contours have the least correlation with tracings.

downloadDownload free PDF View PDFchevron_right

Modeling Temporal Tonal Relations in Polyphonic Music Through Deep Networks with a Novel Image-Based Representation

Ching-Hua Chuan, Dorien Herremans

The Thirty-Second AAAI Conference on Artificial Intelligence., 2018

We propose an end-to-end approach for modeling polyphonic music with a novel graphical representation, based on music theory, in a deep neural network. Despite the success of deep learning in various applications, it remains a challenge to incorporate existing domain knowledge in a network without affecting its training routines. In this paper we present a novel approach for predictive music modeling and music generation that incorporates domain knowledge in its representation. In this work, music is transformed into a 2D representation, inspired by tonnetz from music theory, which graphically encodes musical relationships between pitches. This representation is incorporated in a deep network structure consisting of multilayered convolutional neural networks (CNN, for learning an efficient abstract encoding of the representation) and recurrent neural networks with long short-term memory cells (LSTM, for capturing temporal dependencies in music sequences). We empirically evaluate the nature and the effectiveness of the network by using a dataset of classical music from various composers. We investigate the effect of parameters including the number of convolution feature maps, pooling strategies, and three configurations of the network: LSTM without CNN, LSTM with CNN (pre-trained vs. not pre-trained). Visualizations of the feature maps and filters in the CNN are explored, and a comparison is made between the proposed tonnetz-inspired representation and pianoroll, a commonly used representation of music in computational systems. Experimental results show that the tonnetz representation produces musical sequences that are more tonally stable and contain more repeated patterns than sequences generated by pianoroll-based models, a finding that is directly useful for tackling current challenges in music and AI such as smart music generation.

downloadDownload free PDF View PDFchevron_right

A Holistic Approach to Polyphonic Music Transcription with Neural Networks

Miguel Roman

2019

We present a framework based on neural networks to extract music scores directly from polyphonic audio in an end-to-end fashion. Most previous Automatic Music Transcription (AMT) methods seek a piano-roll representation of the pitches, that can be further transformed into a score by incorporating tempo estimation, beat tracking, key estimation or rhythm quantization. Unlike these methods, our approach generates music notation directly from the input audio in a single stage. For this, we use a Convolutional Recurrent Neural Network (CRNN) with Connectionist Temporal Classification (CTC) loss function which does not require annotated alignments of audio frames with the score rhythmic information. We trained our model using as input Haydn, Mozart, and Beethoven string quartets and Bach chorales synthesized with different tempos and expressive performances. The output is a textual representation of four-voice music scores based on **kern format. Although the proposed approach is evaluat...

downloadDownload free PDF View PDFchevron_right

Loading Preview

Sorry, preview is currently unavailable. You can download the paper by clicking the button above.

Tillman Weyde

International Symposium/Conference on Music Information Retrieval, 2014

The multiple viewpoints representation is an event-based representation of symbolic music data which offers a means for the analysis and generation of notated music. Previous work using this representation has predominantly relied on n-gram and variable order Markov models for music sequence modelling. Recently the efficacy of a class of distributed models, namely restricted Boltzmann machines, was demonstrated for this purpose. In this paper, we demonstrate the use of two neural network models which use fixed-length sequences of various viewpoint types as input to predict the pitch of the next note in the sequence. The predictive performance of each of these models is comparable to that of models previously evaluated on the same task. We then combine the predictions of individual models using an entropy-weighted combination scheme to improve the overall prediction performance, and compare this with the predictions of a single equivalent model which takes as input all the viewpoint types of each of the individual models in the combination.

downloadDownload free PDF View PDFchevron_right

Learning Distributed Representations for Multiple-Viewpoint Melodic Prediction

Tillman Weyde

2013

The analysis of sequences is important for extracting information from music owing to its fundamentally temporal nature. In this paper, we present a distributed model based on the Restricted Boltzmann Machine (RBM) for melodic sequences. The model is similar to a previous successful neural network model for natural language [2]. It is first trained to predict the next pitch in a given pitch sequence, and then extended to also make use of information in sequences of note-durations in monophonic melodies on the same task. In doing so, we also propose an efficient way of representing this additional information that takes advantage of the RBM's structure. Results show that this RBM-based prediction model performs better than previously evaluated n-gram models in many cases. It is able to make use of information present in longer contexts more effectively than n-gram models, while scaling linearly in the number of free parameters required.

downloadDownload free PDF View PDFchevron_right

A Neural Probabilistic Model for Predicting Melodic Sequences

Tillman Weyde

We present an approach for modelling melodic sequences using Restricted Boltzmann Machines, with an application to folk melody classification. 1Results show that this model's predictive performance is slightly better in our experiment than that of previously evaluated ngram models . The model has a simple structure and in our evaluation it scaled linearly in the number of free parameters with length of the modelled context. A set of these models is used to classify 7 different styles of folk melodies with an accuracy of 61.74%.

downloadDownload free PDF View PDFchevron_right

Hybrid Long- And Short-Term Models Of Folk Melodies

Tillman Weyde

2015

In this paper, we present the results of a study on dynamic models for predicting sequences of musical pitch in melodies. Such models predict a probability distribution over the possible values of the next pitch in a sequence, which is obtained by combining the prediction of two components (1) a long-term model (LTM) learned offline on a corpus of melodies, as well as (2) a short-term model (STM) which incorporates context-specific information available during prediction. Both the LTM and the STM learn regularities in pitch sequences solely from data. The models are combined in an ensemble, wherein they are weighted by the relative entropies of their respective predictions. Going by previous work that demonstrates the success of Connectionist LTMs, we employ the recently proposed Recurrent Temporal Discriminative Restricted Boltzmann Machine (RTDRBM) as the LTM here. While it is indeed possible for the same model to also serve as an STM, our experiments showed that n-gram models tended to learn faster than the RTDRBM in an online setting and that the hybrid of an RTDRBM LTM and an n-gram STM gives us the best predictive performance yet on a corpus of monophonic chorale and folk melodies.

downloadDownload free PDF View PDFchevron_right

Probabilistic models for melodic prediction

egrvbvfd fdbrfdd

Artificial Intelligence, 2009

Chord progressions are the building blocks from which tonal music is constructed. The choice of a particular representation for chords has a strong impact on statistical modeling of the dependence between chord symbols and the actual sequences of notes in polyphonic music. Melodic prediction is used in this paper as a benchmark task to evaluate the quality of four chord representations using two probabilistic model architectures derived from Input/Output Hidden Markov Models (IOHMMs). Likelihoods and conditional and unconditional prediction error rates are used as complementary measures of the quality of each of the proposed chord representations. We observe empirically that different chord representations are optimal depending on the chosen evaluation metric. Also, representing chords only by their roots appears to be a good compromise in most of the reported experiments.

downloadDownload free PDF View PDFchevron_right

A Comparative Study of Neural Models for Polyphonic Music Sequence Transduction

Emmanouil Benetos

2019

Automatic transcription of polyphonic music remains a challenging task in the field of Music Information Retrieval. One under-investigated point is the post-processing of time-pitch posteriograms into binary piano rolls. In this study, we investigate this task using a variety of neural network models and training procedures. We introduce an adversarial framework, that we compare against more traditional training losses. We also propose the use of binary neuron outputs and compare them to the usual real-valued outputs in both training frameworks. This allows us to train networks directly using the F-measure as training objective. We evaluate these methods using two kinds of transduction networks and two different multi-pitch detection systems, and compare the results against baseline note-tracking methods on a dataset of classical piano music. Analysis of results indicates that (1) convolutional models improve results over baseline models, but no improvement is reported for recurrent...

downloadDownload free PDF View PDFchevron_right

Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music

Juan Lalinde

ArXiv, 2020

Automatic generation of sequences has been a highly explored field in the last years. In particular, natural language processing and automatic music composition have gained importance due to the recent advances in machine learning and Neural Networks with intrinsic memory mechanisms such as Recurrent Neural Networks. This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition. The proposed approach considers music theory concepts such as transposition, and uses data transformations (embeddings) to introduce semantic meaning and improve the quality of the generated melodies. A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically, measuring the tonality of the musical compositions.

downloadDownload free PDF View PDFchevron_right

Melody Extraction from Polyphonic Music by Deep Learning Approaches: A Review

Parthapratim Das

arXiv (Cornell University), 2022

downloadDownload free PDF View PDFchevron_right

9. The Musical Object in Deep Learning

Odd Torleiv Furnes

Open Book Publishers, 2024

of the images and other media included in this publication differ from the above. This information is provided in the captions and in the list of illustrations and media examples. Every effort has been made to identify and contact copyright holders and any omission or error will be corrected if notification is made to the publisher. All external links were active at the time of publication unless otherwise stated and have been archived via the Internet Archive Wayback Machine at .

downloadDownload free PDF View PDFchevron_right

Melodic models for polyphonic music classification

Bernard Manderick

2009

Abstract. The classification of polyphonic music still presents challenges for current music data mining methods. In this paper we explore the performance of classifiers specifically created for melody on the polyphonic classification task. On a small dataset of string quartet movements of Haydn and Mozart, the melodic n-gram model outperforms the melodic global feature model for composer recognition.

downloadDownload free PDF View PDFchevron_right

A Distributed Model For Multiple-Viewpoint Melodic Prediction

Sign up for access to the world's latest research

AbstractAI

Related papers

Related papers

Related topics

Abstract
AI