Key research themes
1. How can residual prediction techniques improve spectral detail and naturalness in voice conversion?
This theme investigates methods to predict or reconstruct the residual (excitation) signal in voice conversion frameworks in order to improve the spectral detail and naturalness of converted speech. Because transforming the spectral envelope alone often yields over-smoothed or synthetic-sounding speech, accurate residual prediction is critical. Research focuses on comparing residual prediction techniques, modeling the correlation between spectral features and residuals, and developing methods that better preserve speaker-dependent excitation characteristics.
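As a rough illustration of residual prediction, the sketch below uses nearest-neighbor residual selection: each converted spectral envelope is matched to the closest stored target envelope, and the residual paired with that envelope is reused. This is only one simple option, not the method of any particular study. The function names, array shapes, and the use of Euclidean distance are assumptions made for the example.

```python
import numpy as np

def select_residual(converted_env, target_envs, target_residuals):
    """Return the stored target residual whose paired envelope is closest.

    converted_env:    (d,) converted spectral envelope features (hypothetical layout)
    target_envs:      (n, d) envelope features from target training frames
    target_residuals: (n, r) residual vectors paired row by row with target_envs
    """
    # Euclidean distance is an assumption; any spectral distance could be used.
    dists = np.linalg.norm(target_envs - converted_env, axis=1)
    return target_residuals[np.argmin(dists)]

def predict_residuals(converted_envs, target_envs, target_residuals):
    """Apply residual selection frame by frame to a whole utterance."""
    return np.stack([
        select_residual(env, target_envs, target_residuals)
        for env in converted_envs
    ])
```

Selecting residuals independently per frame can introduce discontinuities between frames, so a practical system would likely need some smoothing or a continuity cost on top of this.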
2. What strategies enable non-parallel voice conversion by establishing frame-level or sequence-level mappings without shared parallel data?
Non-parallel voice conversion (VC) methods build conversion systems without requiring the source and target speakers to record the same utterances, which matters for practical deployment. This research theme explores algorithms that discover correspondences between frames or segments of unaligned source and target speech through clustering, recognition, or iterative alignment. Approaches include DNN-HMM-based frame recognition, iterative nearest-neighbor alignment with temporal context, and latent-space embeddings used to derive mapping functions. The work prioritizes alignment accuracy and the quality of the synthesized speech.
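The iterative nearest-neighbor alignment idea mentioned above can be sketched as follows: source frames are matched to their nearest target frames (compared with neighboring frames stacked as temporal context), an affine mapping is fitted on the matched pairs, the source is converted, and the matching is repeated. This is a minimal sketch under stated assumptions, not a reproduction of a published algorithm. The affine least-squares mapping, the context width, and all function names are hypothetical choices for illustration.

```python
import numpy as np

def add_context(frames, width=1):
    """Concatenate each frame with `width` neighbors on each side (edges repeated)."""
    padded = np.pad(frames, ((width, width), (0, 0)), mode="edge")
    n = len(frames)
    return np.hstack([padded[i:i + n] for i in range(2 * width + 1)])

def nearest_neighbors(src, tgt):
    """Index of the closest target row for every source row."""
    # Pairwise squared Euclidean distances, shape (len(src), len(tgt)).
    d2 = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def iterative_alignment(src, tgt, n_iter=5, width=1):
    """Alternate between nearest-neighbor matching and refitting an affine map.

    src, tgt: (n_src, d) and (n_tgt, d) feature matrices from unaligned speech.
    Assumes n_iter >= 1. Returns the final alignment indices and the
    (d + 1, d) affine mapping (last row is the bias).
    """
    src_aug = np.hstack([src, np.ones((len(src), 1))])  # append bias column
    tgt_ctx = add_context(tgt, width)
    converted = src.copy()
    for _ in range(n_iter):
        # Match converted source frames to target frames using temporal context.
        idx = nearest_neighbors(add_context(converted, width), tgt_ctx)
        # Refit the mapping from the original source frames to matched targets.
        W, *_ = np.linalg.lstsq(src_aug, tgt[idx], rcond=None)
        converted = src_aug @ W
    return idx, W
```

The pairwise distance matrix grows with the product of the two utterance lengths, so longer corpora would need a tree-based or batched neighbor search instead.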
3. How can system fusion and hybrid methods enhance voice conversion performance by leveraging complementary strengths of distinct approaches?
This theme addresses the combination of multiple voice conversion techniques to harness their complementary advantages, such as statistical robustness, spectral detail preservation, and prosodic naturalness. By fusing systems such as Gaussian mixture models (GMM) and frequency warping (FW), or exemplar-based and parametric methods, researchers can create hybrid frameworks that yield better speaker similarity and naturalness than the individual methods. The research evaluates the feasibility of fusion, compares integration designs, and measures empirical gains on objective and subjective metrics.
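One very simple form of fusion is a weighted average of the outputs of two conversion systems, with the weight tuned on held-out frames where the target is known. The sketch below illustrates that idea only. It assumes both systems emit feature matrices of the same shape, and the function names and the squared-error criterion are hypothetical choices rather than the design of any specific hybrid system.

```python
import numpy as np

def fit_fusion_weight(out_a, out_b, target):
    """Weight w in [0, 1] minimizing ||w*out_a + (1-w)*out_b - target||^2.

    out_a, out_b: outputs of two systems (e.g. a GMM-based and an FW-based one)
    target:       reference target features for the same held-out frames
    """
    diff = out_a - out_b
    denom = np.sum(diff * diff)
    if denom == 0.0:
        return 0.5  # both systems agree, so any weight gives the same result
    # Closed-form least-squares solution, clipped to a convex combination.
    w = np.sum(diff * (target - out_b)) / denom
    return float(np.clip(w, 0.0, 1.0))

def fuse(out_a, out_b, w):
    """Convex combination of the two system outputs."""
    return w * out_a + (1.0 - w) * out_b
```

A single global weight is the coarsest possible design; per-dimension or per-frame weights follow the same pattern but need more held-out data to estimate reliably.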