Papers by Anderson Fraiha Machado
Abstract: The crosslingual voice conversion problem refers to the replacement of a speaker’s timbre or vocal identity in a recorded sentence, assuming that the source speaker and target speaker use different languages. This problem differs from typical voice conversion in the sense that the mapping of acoustical features cannot depend on time-aligned recordings of source and target speakers uttering the same sentences. This paper presents an overview of a general crosslingual voice conversion system and discusses the most important techniques used in each step of the conversion process. Keywords: Crosslingual Voice Conversion; Timbre Transformation; Vocal Identity

The aim of this paper is to present a new approach to spectral representation using sums of Gaussian distributions. Sums of Gaussians provide an intuitive representation for frequency bands of a signal spectrum as well as formant regions. The representation of spectral envelopes using Gaussian parameters {a, μ, σ} simplifies the expression of important tasks such as frequency warping and formant manipulation. Marquardt’s algorithm has been extended to estimate parameters of the Gaussian models for each frequency band, allowing each Gaussian parameter to be either optimized for fitting a given spectral sub-band, or else have a fixed value for reducing the number of model parameters. This allows for several choices on the sets of free/fixed parameters and the sizes of models. Experimental results show that the models proposed offer an accurate approximation of spectral envelope, and provide good perceptual results when applied to pitch shifting.
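As a rough illustration of why the {a, μ, σ} parametrization simplifies frequency warping, the sketch below evaluates a sum-of-Gaussians envelope and applies a linear warp directly to the parameters. It is a minimal sketch, not the paper's implementation: the function names `gaussian_sum` and `warp`, the linear warp, and the example parameter values are all assumptions made for illustration, and no parameter estimation (Marquardt's algorithm) is shown.

```python
import math

def gaussian_sum(freq, params):
    """Evaluate a sum-of-Gaussians envelope at frequency `freq`.

    `params` is a list of (a, mu, sigma) triples:
    amplitude, centre frequency, and width of each Gaussian.
    """
    return sum(a * math.exp(-0.5 * ((freq - mu) / sigma) ** 2)
               for a, mu, sigma in params)

def warp(params, alpha):
    """Apply the linear frequency warp f -> alpha * f to the parameters.

    Scaling mu and sigma by alpha shifts and stretches each Gaussian;
    amplitudes are left unchanged.
    """
    return [(a, alpha * mu, alpha * sigma) for a, mu, sigma in params]

# Hypothetical three-band envelope (illustrative values only).
envelope = [(1.0, 500.0, 80.0), (0.6, 1500.0, 120.0), (0.3, 2500.0, 150.0)]
warped = warp(envelope, 1.2)
```

Under this assumed linear warp, the warped envelope at frequency 1.2·f takes the same value as the original envelope at f, so the whole operation reduces to rescaling two parameters per Gaussian.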
A cross-lingual voice conversion system aims at modifying the timbral structure of recorded sentences from a source speaker, in order to obtain processed sentences which are perceived as the same sentences uttered by a target speaker. This work presents the cross-lingual voice conversion problem as a network of related sub-problems and discusses several techniques for solving each of these sub-problems, in the context of a modular implementation that facilitates comparisons between competing techniques. The implemented system aims at high-quality cross-lingual voice conversion in a text-independent setting, i.e., where the training sets of sentences recorded by source and target speakers are not the same. New strategies are introduced, such as artificial phonetic maps, N-likelihood clustering and normalized frequency warping, which are evaluated through numerical experiments.

Voice conversion is an emergent problem in voice and speech processing with increasing commercial interest, due to applications such as Speech-to-Speech Translation (SST) and personalized Text-To-Speech (TTS) systems. A Voice Conversion system should allow the mapping of acoustical features of sentences pronounced by a source speaker to values corresponding to the voice of a target speaker, in such a way that the processed output is perceived as a sentence uttered by the target speaker. In the last two decades the number of scientific contributions to the voice conversion problem has grown considerably, and a solid overview of the historical process as well as of the proposed techniques is indispensable for those willing to contribute to the field. The goal of this work is to provide a critical survey that combines historical presentation with technical discussion while pointing out advantages and drawbacks of each technique, and, from this study, to develop new tools. Contributions proposed in this work include a method for spectral decomposition in terms of radial basis functions, an artificial phonetic map, and warping functions, among others, in order to implement a high-quality text-independent crosslingual voice conversion system.
2009 XXII Brazilian Symposium on Computer Graphics and Image Processing, 2009
This work presents a new and fast algorithm for binary morphological erosions with arbitrarily shaped structuring elements, inspired by preprocessing techniques similar to those used in many fast string matching algorithms (jumps and mismatches). These preprocessing techniques speed up the computation of binary erosions. A time complexity analysis shows that this algorithm has clear advantages over some known implementations. Experimental results confirm this analysis and show that the algorithm performs well and can be a better option for computing erosions.
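For reference, the operation being accelerated can be stated in a few lines. The sketch below is the naive definition of binary erosion with an arbitrarily shaped structuring element, not the fast algorithm described in the abstract; the set-based representation and the function name `erode` are assumptions chosen for brevity.

```python
def erode(image, selem):
    """Naive binary erosion (baseline definition, not the fast algorithm).

    `image` is a set of (row, col) foreground pixels.
    `selem` is a non-empty set of (drow, dcol) offsets relative to the
    structuring element's origin; it may have any shape.
    A pixel p is kept if p + d is foreground for every offset d.
    """
    if not selem:
        raise ValueError("structuring element must be non-empty")
    # Any surviving pixel p must satisfy p + d0 in image for a fixed d0,
    # so the candidates can be derived from the foreground itself.
    d0r, d0c = next(iter(selem))
    candidates = {(r - d0r, c - d0c) for r, c in image}
    return {(r, c) for r, c in candidates
            if all((r + dr, c + dc) in image for dr, dc in selem)}

# A 3x3 foreground square and a cross-shaped structuring element.
square = {(r, c) for r in range(3) for c in range(3)}
cross = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}
eroded = erode(square, cross)
```

This baseline tests every offset at every candidate pixel, which is the cost a jump-based preprocessing scheme would aim to reduce.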
Parametric decomposition of the spectral envelope
2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 2013
International Symposium on Multimedia, 2010
The cross lingual voice conversion problem refers to the replacement of a speaker's timbre or vocal identity in a recorded sentence, assuming that the source speaker and target speaker use different languages. This problem differs from typical voice conversion in the sense that the mapping of acoustical features cannot depend on time-aligned recordings of source and target speakers uttering the