Academia.edu

Emotion Recognition from Speech

33 papers
10 followers
About this topic
Emotion recognition from speech is a multidisciplinary field that involves the analysis of vocal characteristics, such as tone, pitch, and rhythm, to identify and interpret the emotional state of a speaker. This research area combines elements of linguistics, psychology, and artificial intelligence to enhance human-computer interaction and improve communication technologies.
Emotional expressions are a fundamental aspect of human communication, with speech being one of the most natural modes of interaction. Speech Emotion Recognition (SER) is a significant research topic in Natural Language Processing (NLP),... more
Recent advancements in voice conversion systems have been largely driven by deep learning techniques, enabling the high-quality synthesis of human speech. However, existing models often fail to generate emotionally expressive speech,... more
This paper presents a zero-shot learning framework for speech-based emotion recognition across languages using contrastive learning. Traditional emotion AI models often depend on large, labeled corpora in a specific language, limiting... more
In today's competitive landscape, businesses grapple with customer retention. Churn prediction models, although beneficial, often lack accuracy due to the reliance on a single data source. The intricate nature of human behavior and... more
Speech Emotion Recognition (SER) affective technology enables the intelligent embedded devices to interact with sensitivity. Similarly, call centre employees recognise customers' emotions from their pitch, energy, and tone of voice so as... more
Knowing the incidence of breast cancer, its diagnosis, and its treatment methods gives a strategic approach for community awareness and rapid management. This study aimed to estimate the demographic pattern, including age, marital status,... more
Speech Emotion Recognition (SER) is a method where computers learn to recognize human emotions from speech to improve communication. In this study, we present an innovative Bangla SER framework, incorporating data augmentations, feature... more
Human speech can be characterized by different components, including semantic content, speaker identity and prosodic information. Significant progress has been made in disentangling representations for semantic content and speaker... more
In the article we evaluate the importance of different HMM states in an HMM-based feature extraction method used to model paralinguistic information. Specifically, we evaluate the distribution of the paralinguistic information across... more
Speech Emotion Recognition is an essential research analysis for utilizing interconnection with embedded systems to introduce human interference technology. In this study, we proposed a model to recognize emotion from speech using the... more
Identity of a person via voice is one of the most interesting techniques used for user identification. Accuracy of identification process depends on the number of feature vectors and the number of speakers. This paper aims to develop a... more
Modern interactive means combined with new digital media processing and representation technologies can provide a robust framework for enhancing user experience in multimedia entertainment systems and audiovisual artistic installations... more
The paper presents our efforts in the Interspeech 2011 Speaker State Challenge. Both systems, for the Intoxication and the Sleepiness Sub-Challenge, are based on a Universal Background Model (UBM) in a form of a Hidden Markov Model (HMM),... more
The last few decades have seen a wide range of research projects focusing on automatic speech-based emotion recognition for human-machine communication. Speech is the most fundamental and natural means of communication while... more
This paper presents a methodology for recognizing human emotion from the speech signal. This speaker-based emotion recognition system recognizes four emotions, namely happiness, sadness, fear, and anger. Basically, the aim of this system is to... more
Obtaining speech samples is an attractive non-invasive method to recognize alcohol intoxication. In this paper, we aim to improve accuracy of speech-based intoxication recognition by decision fusion of utterance-level classifiers. On the... more
Identification of singers is considered an important research area in audio signal processing. It has attracted researchers' interest in two primary branches: 1) recognizing vocal parts of polyphonic music, and 2) classifying the singer.... more
This paper focuses on the automatic detection of a person's blood alcohol level based on automatic speech processing approaches. We compare five different feature types with different ways of modeling. Experiments are based on the ALC corpus... more
In this paper we describe our methodology for automatic detection of speaker alcoholization. Our task is restricted to detection of considerable alcoholization (alcohol blood level ≥ 0.8 per mille), so that a two-class classification... more
In the article we evaluate different techniques of acoustic modeling for speech recognition in the case of limited audio resources. The objective was to build different sets of acoustic models, the first was trained on a small set of... more
Over the past several decades, numerous speech enhancement techniques have been proposed to improve the performance of modern communication devices in noisy environments. Among them, there is a large range of classical algorithms (e.g.... more
Background: Breast cancer is the most commonly identified dangerous cancer in females. While breast cancer has been recorded to be a source of female mortality in many developing countries, studies have shown that bronchogenic carcinoma... more
This paper describes a speech database built from 17 Slovenian radio dramas. The dramas were obtained from the national radio-and-television station (RTV Slovenia) and were placed at the universities' disposal under an academic license for... more
Emotion Speech Recognition (ESR) is recognizing the formation and change of speaker’s emotional state from his/her speech signal. The main purpose of this field is to produce a convenient system that is able to effortlessly communicate... more
Modern deep learning architectures are ordinarily run in high-performance computing facilities due to the large size of their input features and the complexity of their models. This paper proposes traditional multilayer perceptrons... more
This paper presents a novel artificial bandwidth extension (ABE) framework based on deep neural networks (DNNs) with a multi-layer deep architecture. It demonstrates the suitability of DNNs for modeling log power spectra of speech... more
The main aim of this paper is to provide an outline of Speech Emotion Recognition. Emotions can be recognized by extracting many features from speech. In SER systems, numerous methods have been used to extract emotions from waveforms,... more
emotion recognition that can easily be integrated into various systems, such as humanoid robots, smart surveillance systems, and the like.
Humans recognize emotions automatically and subconsciously. It is a vital process for human-to-human communication, and thus, to achieve better human-machine interaction, emotions need to be considered. Emotional speech... more
In the present work we aim at performance optimization of a speaker-independent emotion recognition system through speech feature selection process. Specifically, relying on the speech feature set defined in the Interspeech 2009 Emotion... more
The paper deals with the recording and evaluation of a multimodal (audio/video) database of spontaneous emotions. First, the motivation for this work is given and the different recording strategies used are described. Special attention is... more
Urban sonic ecology represents a major field of research interest for exploiting the relations raised through sound between human populations and a city environment. Recently, the concept of emotional city has boosted the ideas and... more
by Simon Dobrisek and 1 more
In the article we evaluate the importance of different HMM states in an HMM-based feature extraction method used to model paralinguistic information. Specifically, we evaluate the distribution of the paralinguistic information across... more
In this article we present an efficient approach to modeling the acoustic features for the tasks of recognizing various paralinguistic phenomena. Instead of the standard scheme of adapting the Universal Background Model (UBM), represented... more