VTLN-based cross-language voice conversion

2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721)

https://doi.org/10.1109/ASRU.2003.1318521

Abstract

In speech recognition, vocal tract length normalization (VTLN) is a well-studied technique for speaker normalization. As cross-language voice conversion aims at the transformation of a source speaker's voice into that of a target speaker using a different language, we want to investigate whether VTLN is an appropriate method to adapt the voice characteristics. After applying several conventional VTLN warping functions, we extend the conventional piece-wise linear function to several segments, allowing a more detailed warping of the source spectrum. Experiments on cross-language voice conversion are performed on three corpora of two languages and both speaker genders.
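To illustrate the multi-segment idea, the following minimal Python sketch evaluates a piecewise-linear warping function defined by matching breakpoints on the source and warped frequency axes. The abstract does not give the parameterization, so the knot-based form, the function names `piecewise_linear_warp` and `two_segment_knots`, and all example values are assumptions made for illustration, not the authors' implementation.

```python
import math
from bisect import bisect_right


def piecewise_linear_warp(omega, src_knots, dst_knots):
    """Warp a normalized frequency omega through a piecewise-linear function.

    src_knots and dst_knots are hypothetical breakpoint lists: the warping
    function passes through (src_knots[i], dst_knots[i]) and is linear in
    between. Both lists must be strictly increasing and of equal length.
    """
    if len(src_knots) != len(dst_knots) or len(src_knots) < 2:
        raise ValueError("need at least two matching breakpoints")
    for knots in (src_knots, dst_knots):
        if any(a >= b for a, b in zip(knots, knots[1:])):
            raise ValueError("breakpoints must be strictly increasing")
    if not src_knots[0] <= omega <= src_knots[-1]:
        raise ValueError("omega lies outside the breakpoint range")

    # Index of the segment containing omega (the last segment includes its end).
    i = min(bisect_right(src_knots, omega), len(src_knots) - 1) - 1
    x0, x1 = src_knots[i], src_knots[i + 1]
    y0, y1 = dst_knots[i], dst_knots[i + 1]
    return y0 + (omega - x0) * (y1 - y0) / (x1 - x0)


def two_segment_knots(alpha, omega0):
    """Breakpoints for a two-segment warp (assumed form).

    Slope alpha from 0 up to omega0, then a straight line to (pi, pi) so the
    full band [0, pi] maps onto itself.
    """
    if not 0.0 < omega0 < math.pi or not 0.0 < alpha * omega0 < math.pi:
        raise ValueError("alpha and omega0 must keep the warp inside (0, pi)")
    return [0.0, omega0, math.pi], [0.0, alpha * omega0, math.pi]


# Two-segment warp with illustrative parameter values.
src2, dst2 = two_segment_knots(alpha=1.1, omega0=0.8 * math.pi)
warped_two = piecewise_linear_warp(0.4 * math.pi, src2, dst2)

# Four-segment warp: more breakpoints allow a more detailed mapping.
src4 = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4, math.pi]
dst4 = [0.0, 0.30 * math.pi, 0.55 * math.pi, 0.80 * math.pi, math.pi]
warped_four = piecewise_linear_warp(3 * math.pi / 8, src4, dst4)
```

Adding a segment here only means adding one more breakpoint pair, so the two-segment function is the special case with a single interior breakpoint.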
