VTLN-based cross-language voice conversion
2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721)
https://doi.org/10.1109/ASRU.2003.1318521
Abstract
In speech recognition, vocal tract length normalization (VTLN) is a well-studied technique for speaker normalization. As cross-language voice conversion aims at the transformation of a source speaker's voice into that of a target speaker using a different language, we want to investigate whether VTLN is an appropriate method to adapt the voice characteristics. After applying several conventional VTLN warping functions, we extend the conventional piece-wise linear function to several segments, allowing a more detailed warping of the source spectrum. Experiments on cross-language voice conversion are performed on three corpora of two languages and both speaker genders.
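The idea of a piecewise linear warping function with several segments can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the parameterization, so the knot-based form, the normalization of frequency to [0, 1], and the names `piecewise_linear_warp`, `src_knots`, and `dst_knots` are all assumptions made for illustration.

```python
from bisect import bisect_right


def piecewise_linear_warp(omega, src_knots, dst_knots):
    """Warp a normalized frequency omega in [0, 1] (1 = Nyquist).

    Hypothetical parameterization: the warping function passes through
    the points (src_knots[i], dst_knots[i]) and is linear in between.
    Two segments correspond to the conventional piecewise linear VTLN
    function; more knots give a more detailed warping.
    """
    if len(src_knots) != len(dst_knots) or len(src_knots) < 2:
        raise ValueError("need two knot lists of equal length >= 2")
    for knots in (src_knots, dst_knots):
        if knots[0] != 0.0 or knots[-1] != 1.0:
            raise ValueError("knots must start at 0.0 and end at 1.0")
        if not all(a < b for a, b in zip(knots, knots[1:])):
            raise ValueError("knots must be strictly increasing")
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")

    # Index of the segment containing omega; clamp so omega == 1.0
    # falls into the last segment.
    i = min(bisect_right(src_knots, omega) - 1, len(src_knots) - 2)
    x0, x1 = src_knots[i], src_knots[i + 1]
    y0, y1 = dst_knots[i], dst_knots[i + 1]
    return y0 + (omega - x0) * (y1 - y0) / (x1 - x0)


# Conventional two-segment case (assumed values): slope alpha up to the
# break frequency omega0, then a straight line to the point (1, 1).
alpha, omega0 = 1.2, 0.7
two_segment = ([0.0, omega0, 1.0], [0.0, alpha * omega0, 1.0])

# Multi-segment extension (assumed knot values): each segment has its
# own slope, so different bands are stretched or compressed separately.
multi_segment = ([0.0, 0.25, 0.5, 0.75, 1.0], [0.0, 0.3, 0.55, 0.8, 1.0])

low = piecewise_linear_warp(0.35, *two_segment)   # inside the first segment
high = piecewise_linear_warp(0.85, *two_segment)  # inside the second segment
mid = piecewise_linear_warp(0.5, *multi_segment)  # exactly on a knot
```

Requiring strictly increasing knots that start at 0 and end at 1 keeps the mapping monotonic and maps the full frequency band onto itself, which is the usual constraint on VTLN warping functions.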