Academia.edu

Music Emotion Classification

72 papers
69 followers
About this topic
Music Emotion Classification is the interdisciplinary study of identifying and categorizing the emotional content of music using computational methods, psychological theories, and music theory. It involves analyzing audio features, lyrics, and contextual factors to determine the emotions conveyed by musical pieces, facilitating applications in areas such as music recommendation systems and affective computing.

Key research themes

1. How can dynamic, dimensionally-annotated datasets and benchmarks advance the evaluation and development of music emotion recognition systems?

This research area focuses on creating large, publicly available datasets that provide continuous, time-dependent emotional annotations (primarily in valence-arousal dimensions) for musical excerpts, enabling standardized benchmarking of music emotion recognition (MER) methods. Such datasets tackle challenges of data scarcity, copyright restrictions, and inconsistent annotation schemes, offering a foundation for systematic comparison of feature sets and algorithms in MER. The theme is crucial for developing robust MER systems that capture temporal emotion variations in music and for fostering reproducibility and comparability across studies.

Key finding: Introduced the MediaEval Database for Emotional Analysis in Music (DEAM), the largest dataset with continuous valence and arousal annotations at 2 Hz resolution over 1,802 Creative Commons songs, supporting dynamic music...
Key finding: Created a sizable dataset of 903 audio clips labeled across five mood clusters aligned with MIREX standards, while using multiple audio feature extraction frameworks and support vector machine classifiers. Achieved an...
Key finding: Proposed a biologically inspired cochlear modeling approach combined with convolutional neural networks to extract features from cochleogram images, aligning with human auditory perception. Evaluated on a public 1000-song...
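
To make the idea of time-continuous valence-arousal annotation concrete, the following minimal Python sketch averages per-annotator arousal curves sampled at 2 Hz and scores a prediction against the averaged curve. It is an illustration only, not the evaluation protocol of any dataset named above: the function names, the toy ratings, and the choice of RMSE and Pearson correlation as scores are all assumptions.

```python
from math import sqrt
from statistics import mean

RATE_HZ = 2  # annotation rate quoted above for DEAM; everything else here is hypothetical


def frame_times(n_frames, rate_hz=RATE_HZ):
    """Timestamps in seconds for n_frames annotation frames."""
    return [i / rate_hz for i in range(n_frames)]


def average_annotators(ratings):
    """Average several per-annotator curves frame by frame."""
    return [mean(frame) for frame in zip(*ratings)]


def rmse(pred, target):
    """Root-mean-square error between two equally long curves."""
    return sqrt(mean((p - t) ** 2 for p, t in zip(pred, target)))


def pearson(x, y):
    """Pearson correlation between two equally long curves."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / norm


# Toy data: two annotators rating arousal over four frames (two seconds of audio).
arousal_by_annotator = [[0.0, 0.2, 0.4, 0.6], [0.2, 0.4, 0.6, 0.8]]
target = average_annotators(arousal_by_annotator)

# A made-up prediction that follows the trend but is offset by a constant.
predicted = [value + 0.1 for value in target]

print(frame_times(len(target)))
print(rmse(predicted, target), pearson(predicted, target))
```

A constant offset leaves the correlation essentially perfect while the RMSE stays nonzero, which is one reason to look at both kinds of score when comparing dynamic predictions.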

2. What are the effective machine learning approaches for multilabel and multimodal classification of emotions induced or perceived in music?

This theme investigates machine learning methods capable of recognizing multiple simultaneous emotions in music, reflecting the complexity of human emotional responses. It encompasses multilabel classification paradigms, multimodal integration of audio with lyrics or video data, and deep learning architectures such as CNNs, LSTMs, and transformer models (e.g., XLNet). Multilabel and multimodal approaches can improve recognition accuracy and capture emotional nuance, better reflecting real-world scenarios and enhancing applications such as music recommendation and emotion-based interaction.

Key finding: Analyzed Geneva Emotional Music Scale 9 annotations in the Emotify dataset using several machine learning algorithms for multilabel and multiclass classification, emphasizing simultaneous emotions. Findings informed...
Key finding: Developed a multimodal MER system using mel spectrograms with CNN-LSTM for audio and XLNet transformers for lyrics, combining outputs via stacking ensemble and ANN meta-classifier. Achieved state-of-the-art 80.56% accuracy on...
Key finding: Presented a hybrid emotion classification framework combining audio and video features extracted from the SAVEE database, using SVM for classification. The hybrid approach significantly improved accuracy to 99.26% compared to...
Key finding: Reviewed and applied AI algorithms including SVM, RNN, and CNN on audio features such as pitch and Mel-frequency cepstral coefficients, illustrating deep learning’s superiority in modeling sequential emotional patterns from...
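
The multilabel and multimodal ideas in this theme can be sketched in a few lines of Python. The example below fuses per-emotion scores from an audio model and a lyrics model by weighted averaging, thresholds the fused scores into a set of simultaneous labels, and computes the Hamming loss. Weighted averaging is a deliberately simpler stand-in for the stacking ensemble mentioned above, and the label set, the scores, the 0.5 threshold, and all function names are assumptions made for illustration.

```python
LABELS = ["joy", "sadness", "tension"]  # illustrative label set, not taken from any dataset


def late_fusion(audio_scores, lyric_scores, audio_weight=0.5):
    """Weighted average of per-label scores from two modalities."""
    return [
        audio_weight * a + (1 - audio_weight) * l
        for a, l in zip(audio_scores, lyric_scores)
    ]


def to_label_set(scores, threshold=0.5):
    """Multilabel decision: keep every label whose score reaches the threshold."""
    return {label for label, score in zip(LABELS, scores) if score >= threshold}


def hamming_loss(true_sets, pred_sets):
    """Fraction of (clip, label) decisions that are wrong."""
    errors = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * len(LABELS))


# Hypothetical scores for one clip from an audio model and a lyrics model.
fused = late_fusion([0.9, 0.2, 0.6], [0.7, 0.4, 0.2])
predicted = to_label_set(fused)
print(predicted)

# Two clips: the first prediction misses "tension", the second is exact.
truth = [{"joy", "tension"}, {"sadness"}]
preds = [predicted, {"sadness"}]
print(hamming_loss(truth, preds))
```

Because each label is decided independently, a clip can receive zero, one, or several emotions, which is the property that distinguishes multilabel from multiclass classification.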

3. How do music structural elements and compositional eras influence perceived musical emotions, and how can computational models incorporate these insights?

This theme explores how intrinsic musical features (e.g., tempo, mode, pitch patterns) and historical changes across musical eras affect emotional perception. It investigates score-based analyses combined with perceptual evaluations to reveal changing cue associations (e.g., between major/minor modes and emotional valence/arousal) from Classical to Romantic periods. Integrating these musicological insights with computational models enhances generation of emotionally expressive music and improves classification by accounting for temporal and cultural factors shaping emotional meaning.

Key finding: Combined score-based acoustic cue analyses with behavioral classification of Bach and Chopin excerpts, revealing that Romantic era compositions alter associations between musical mode and affective meanings compared to...
Key finding: Developed EmotionBox, a deep neural network system generating symbolic music guided by the musical elements of tempo and mode derived from music psychology, mapped onto emotional valence-arousal dimensions without requiring labeled...
Key finding: Investigated various feature sets including audio signal processing, chord features, and EEG data to classify music emotion in valence-activation space. Found that combining music-inspired features, frequency modulation...

All papers in Music Emotion Classification

This paper examines whether the Phrygian mode is always associated with perceived emotional responses of negative valence. To this end, we carried out a series of experiments. Music from two musical traditions...
The perceptual attributes of timbre have inspired a considerable amount of multidisciplinary research, but because of the complexity of the phenomena, the approach has traditionally been confined to laboratory conditions, much to the...
Background: The automatic prediction of emotional content in music is a growing area of interest. Several algorithms have been developed to retrieve music features and computational models using these features are continuously...
Detecting emotion features in a song remains a challenge in various areas of research, especially in Music Emotion Classification (MEC). In order to classify a selected song with a certain mood or emotion, the algorithms of the machine...
Facial expressions are becoming more and more important in today's computer systems with humanoid user interfaces. Avatars have become popular; however, their facial communication is usually limited. This is partly due to the fact that...
This paper proposes a dynamic Bayesian network (DBN) based MPEG-4 compliant 3D facial animation synthesis method driven by the (Evaluation, Activation) values in the continuous emotion space. For each emotion, a state synchronous DBN...
This paper addresses speech signal processing and analysis for the recognition of emotional states. An experiment was conducted to perform both objective and subjective emotional states recognition tests...
Facial skin is skin that protects the inside of the face such as the eyes, nose, mouth, and others. Facial skin consists of several types, including normal skin, oily skin, dry skin, and combination skin. This is a problem for women...
In this study, classical machine learning methods were used to perform emotion recognition on databases made up of different genres of music from different cultures. To extract features from the music in these databases...
The ubiquity of digital music consumption has made it possible to extract information about modern music that allows us to perform large scale analysis of stylistic change over time. In order to uncover underlying patterns in cultural...
A modern development in technology is Speech Emotion Recognition (SER). SER in partnership with Human-Machine Interaction (HMI) has advanced machine intelligence. An emotion-precise HMI is designed by integrating speech processing and...
Music appears to deeply affect emotional, cerebral and physiological states, and its effect on stress and anxiety has been established using a variety of self-report, physiological, and observational means. Yet, the relationship between...
As computer use becomes more widespread, innovative work on human-computer interaction has gained momentum. One of these innovations is the detection of people's emotional states by computerized systems. This...
This paper presents the principal phase of extraction and recognition of the basic emotions in Arabic speech, applied to five emotional states: neutral, sadness, fear, anger and happiness. Emotional speech...
Depression is the most prevalent mood disorder and a leading cause of disability worldwide. Automated video-based analyses may afford objective measures to support clinical judgments. In the present paper, categorical depression...
Real time user independent facial expression recognition is important for virtual agents but challenging. However, since in real time recognition users are not necessarily presenting all the emotions, some proposed methods are not...
Almost everyone pays attention to the impression made by the clothing they wear, including clothing with batik motifs. However, combinations of different batik motifs and colors give varied impressions. Therefore, determining the impression of a single batik cloth...
In recent decades, computer technology has developed considerably in the use of intelligent systems for classification. The development of HCI systems is highly dependent on an accurate understanding of emotions. However, facial expressions are...
EEG signals contain a great deal of information about brain functions. The waveforms of EEG signals resemble those of other brain signals. The method presented in this study starts from the EEG signals, to which autoregressive...
Highlights: MUAP (Motor Unit Action Potential) clustering with a hybrid structure; use of multiple attribute vectors; classification of neuromuscular diseases by artificial intelligence methods. In this study, a classification structure consisting...
Music has grown into an important part of people's daily lives. As we move further into the digital age, a large collection of music is created daily and becomes easily accessible, leading people to spend more time on...
It is widely acknowledged that music can communicate and induce a wide range of emotions in the listener. However, music is a highly-complex audio signal composed of a wide range of complex time- and frequency-varying components....
Despite theoretical claims that emotions are primarily communicated through prototypic facial expressions, empirical evidence is surprisingly scarce. This study aimed to: (1) test whether children produced more components of a prototypic...
Abstract: In recent years, the Salp Swarm Algorithm, one of the nature-inspired swarm-based algorithms, has become quite popular. In this study, feature selection was performed on different datasets using the Salp Swarm Algorithm, different...
Provision of a new dataset for this field. Showing the effects of noise normalization and preprocessing on classification accuracy. Representation of texts in vector form in various ways to be able to work on them. ...
Abstract: Measuring the weights of objects has a very important place in industrial and academic work. Therefore, in this study, by using artificial neural networks (ANN), based on image processing, from distance and...
Gender recognition from unconstrained face images is a challenging task due to the high degree of misalignment, pose, expression, and illumination variation. In previous works, the recognition of gender from unconstrained face images is...
Music can be used to express a wide range of human emotions, from basic (e.g., pleasantness or unpleasantness dichotomies) to more complex emotions (e.g., transcendence or nostalgia). These emotions can be quantified by examining...
Our lives are being significantly impacted today by the rapid development of wireless technology and mobile gadgets. The digital economy demands that services be developed almost instantly while also paying close attention to client...
Author identification is one of the application areas of text mining. It deals with the automatic prediction of the potential author of an electronic text among predefined author candidates by using author specific writing styles. In this...
This paper presents a novel age function modelling technique on the basis of the fusion of local features obtained by local texture descriptors. Initially, image normalization is performed and a feature extraction process is carried out....
Conditions of extreme neurological disability prevent any form of communication, even to show the emotional state. Brain Computer Interfaces (BCI) often use Electro-encephalography (EEG) measurements of the voluntary brain activity for...
Stress and anxiety act as psycho-physical factors that increase the risk of developing several chronic diseases. Since they appear as early indicators, it is very important to be able to perform their evaluation in a contactless and...
This work shows the development of a lexicon for a poorly resourced language, namely Kokborok. Kokborok is a regional language of North East India and offers an entirely new base for research in the music information retrieval (MIR) field. We...
Emotion classification is essential for understanding human interactions and hence is a vital component of behavioral studies. Although numerous algorithms have been developed, the emotion classification accuracy is still short of what is...
The paper presents a study on music mood categorisation. First, a review of music mood models is presented. Then, the preparation of a set of music excerpts to be used in the experiments and music parametrisation is described. Next, some...
This study reports experimental results on whether the acoustic realization of vocal emotions differs between Mandarin and English. Prosodic cues, spectral cues and articulatory cues generated by electroglottograph (EGG) of five emotions...
In addition to classic motor signs and symptoms, individuals with Parkinson’s disease (PD) are characterized by emotional deficits. Ongoing brain activity can be recorded as electroencephalograph (EEG) to discover the links between...
Editorial on the Research Topic Recent advances in EEG (non-invasive) based BCI applications
by Ed Tan
Abstract. Expressions of emotion abound in user-generated content, whether it be in blogs, reviews, or on social media. Much work has been devoted to detecting and classifying these emotions, but little of it has acknowledged the fact...