This paper presents an automated Public Address processing unit, using delay and magnitude response adjustment. The aim is to achieve a flat frequency response and delay adjustment between different physically-placed speakers at the... more
Retrieving a music file from a large database is a non-trivial task. To support this task, many mechanisms have been developed over the years. However, indexing files remains one of the most popular mechanisms. Several algorithms allow... more
Over the last few years, most tasks that apply Deep Learning techniques to audio processing have achieved state-of-the-art results with Conformer-based systems. However, when it comes to sound event detection (SED), it was... more
ABSTRACT This article presents the design, implementation, and comparison of active noise control techniques. These controllers are based on adaptive control techniques using an FIR filter and the LMS algorithm; the principle of... more
Pulse Code Modulation (PCM) is a vital part of an Analog-to-Digital Converter (ADC). PCM comprises sampling and quantization, which digitize the analog input along the time scale and the amplitude scale, respectively. This... more
Based on perceptual and computational attention modeling studies, we formulate measures of saliency for an audiovisual stream. Audio saliency is captured by signal modulations and related multifrequency band features, extracted through... more
This document provides different tools and techniques useful for debugging audio issues in Linux, covering problems from booting to shutdown. This guide is not exhaustive but aims to explain potential audio issues or bugs that can arise... more
This paper investigates the contextual recognition of neutral thirds in music by integrating real-world musical context into the study of categorical perception. Traditionally, categorical perception has been studied using isolated... more
Auscultation is one of the most widely used techniques for detecting cardiovascular diseases, which are among the main causes of death in the world. Heart murmurs are the most common abnormal finding when a patient visits the physician for... more
Speech recognition has become an important task for improving the human-machine interface. Taking into account the limitations of current automatic speech recognition systems, such as non-real-time cloud-based solutions or power demand, recent... more
Sound events possess certain temporal and spectral structure in their time-frequency representations. The spectral content for the samples of the same sound event class may exhibit small shifts due to intra-class acoustic variability.... more
We present an extensive evaluation of a wide variety of promising design patterns for automated deep-learning (AutoDL) methods, organized according to the problem categories of the 2019 AutoDL challenges, which set the task of optimizing... more
In recent years there has been an incredible increase in the number of smartcards produced for the market; it is enough to think of the number of SIM cards in the world to get an idea of the overall sales volume, which in just... more
A graphic equalizer is an adjustable filter in which the command gain of each frequency band is practically independent of the gains of other bands. Designing a graphic equalizer with a high precision requires evaluating a target response... more
We address the problem of blind audio source separation in the under-determined and convolutive case. The contribution of each source to the mixture channels in the time-frequency domain is modeled by a zero-mean Gaussian random vector... more
In this paper we propose a novel time-space ensemble-based approach for the task of automatic music genre classification. Ensemble strategies apply several classifiers to different views of the problem space, and combination rules in... more
This thesis proposes and tests a methodology for supporting grid-aware applications expressed as pure ASSIST parallel modules encapsulated in CCM (CORBA Component Model) components. The methodology that is the subject of the thesis... more
Recently, there has been an increasing tendency to embed functionalities for recognizing emotions from user-generated media content in automated systems such as call-centre operations, recommendations and assistive technologies, providing... more
This contribution proposes an efficient method for the detection of relevant changes in a continuous stream of sound. The detected change-points can then serve for the segmentation of long audio recordings into shorter and more or less... more
Close-microphone techniques are extensively employed in many live music recordings, allowing for interference rejection and reducing the amount of reverberation in the resulting instrument tracks. However, despite the use of directional... more
We address the problem of identifying the constituent sources in a single-sensor mixture signal consisting of contributions from multiple simultaneously active sources. We propose a generic framework for mixture signal analysis based on a... more
Finally, I would like to thank all my friends and family who tried to understand and respect my choice, and who never tried to prevent me from doing this "crazy" jump into the void. I am coming back better and stronger, and this is all... more
The development of the Antescofo software has allowed contemporary musicians to create interactive music pieces with more precise synchronization between human and machine. INRIA's MUTANT team has been developing a version of... more
A recent trend is to use Music Information Retrieval algorithms for creativity. When considering the audio signal as observation, a well-known method of data-driven synthesis is the "concatenative synthesis" also named musaicing (audio... more
A swarm of bees buzzing "Let it be" by the Beatles or the wind gently howling the romantic "Gute Nacht" by Schubert: these are examples of audio mosaics as we want to create them. Given a target and a source recording, the goal of audio... more
Cough efficacy is considered a reliable predictor of the aspiration risk in head and neck cancer patients with radiation-associated dysphagia. Currently, coughing is assessed perceptually or aerodynamically. The goal of our research is to... more
Music genre classification has its own popularity index in the present times. Machine learning can play an important role in the music streaming task. This research article proposes a machine learning based model for the classification of... more
This paper introduces a novel declipping algorithm based on constrained least-squares minimization. Digital speech signals are often sampled at 16 kHz and classic declipping algorithms fail to accurately reconstruct the signal at this... more
Computer-assisted audio recordings provide a new approach for detecting and correcting interviewer coding error. For questions with categorical and other specify responses, it is possible for the interviewer to misinterpret, abbreviate,... more
This paper investigates the use of a physical model template database as the parameter basis for an MPEG-4 Structured Audio (MP4-SA) codec. During analysis, the codec attempts to match the closest corresponding instrument in the database.... more
While humans can act as effective sensors, human input is subject to a high degree of error and highly dependent on the context. Furthermore, extracting the signal from the noise for social sensing is a difficult challenge. One approach... more
Multivariate Analysis (MVA) comprises a family of well-known methods for feature extraction which exploit correlations among input variables representing the data. One important property that is enjoyed by most such methods is... more
Wireless sensor networks (WSNs) have proliferated rapidly as a cost-effective solution for data aggregation and measurements under challenging environments. Sensors in WSNs are cheap, powerful, and consume limited energy. The energy... more
While previous generations of the MPEG multimedia standard have focused primarily on coding and transmission of content digitally sampled from the real world, MPEG-4 contains extensive support for structured, synthetic and... more
We present a new method for the discrimination of explosive cough events, which is based on a combination of spectral content descriptors and pitch-related features. After the removal of near-silent segments, a vector of event boundaries... more
The problem of modeling a signal segment as a sum of exponentially damped sinusoidal components arises in many different application areas, including speech and audio processing. Often, model parameters are estimated using subspace-based... more
In the past few years, several case studies have illustrated that the use of occupancy information in buildings leads to energy-efficient and low-cost HVAC operation. The widely presented techniques for occupancy estimation include... more
In this paper, we describe our multi-resolution mean teacher systems for DCASE 2021 Task 4: Sound event detection and separation in domestic environments. Aiming to take advantage of the different lengths and spectral characteristics of... more
Cough detection and classification are necessary tools for evaluating pathology severity in chronic illnesses. In the literature, several approaches have been proposed for this aim. These have had relative success since... more
A fast iterative Kernel Principal Component Analysis (KPCA) is proposed to extract features from hyperspectral images. The proposed method is a kernel version of the Candid Covariance-Free Incremental Principal Component Analysis, which... more
This abstract describes the tempo extraction algorithm used for the University of Victoria submission to the MIREX (Music Information Retrieval Exchange) 2005. The algorithm is mostly based on self-similarity rather than onset detection.... more