Residual prediction

Proceedings of the Fifth IEEE International Symposium on Signal Processing and Information Technology, 2005.

https://doi.org/10.1109/ISSPIT.2005.1577150

Abstract

Residual prediction is a technique that aims at recovering the spectral details of speech that was encoded using parameterizations such as linear predictive coefficients. Example applications of residual prediction are hidden Markov model-based speech synthesis and voice conversion. Our voice conversion experiments showed that only one of the seven compared techniques was capable of successfully converting the voice while achieving fair speech quality (i.e., a mean opinion score of 3).
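To make the term "residual" concrete, the following is a minimal sketch of standard linear predictive analysis, not any of the seven techniques compared in the paper. It estimates linear predictive coefficients for one frame with the autocorrelation method, obtains the residual by inverse filtering, and shows that the frame is recovered when the true residual drives the all-pole synthesis filter. The function names, the synthetic test frame, and the prediction order of 10 are illustrative assumptions.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC. Returns a with a[0] = 1 (hypothetical helper)."""
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a[1:] = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def inverse_filter(frame, a):
    """Apply A(z) to the frame: e[n] = sum_k a[k] * x[n-k]. This is the residual."""
    return np.convolve(frame, a)[: len(frame)]

def synthesis_filter(residual, a):
    """Apply 1/A(z): x[n] = e[n] - sum_{k>=1} a[k] * x[n-k], zero initial state."""
    x = np.zeros(len(residual))
    for n in range(len(residual)):
        acc = residual[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * x[n - k]
        x[n] = acc
    return x

# Synthetic stand-in for a speech frame (assumption: two sinusoids plus noise).
rng = np.random.default_rng(0)
t = np.arange(400)
frame = (np.sin(2 * np.pi * 0.03 * t)
         + 0.5 * np.sin(2 * np.pi * 0.11 * t)
         + 0.1 * rng.standard_normal(400))

a = lpc_coefficients(frame, order=10)   # spectral envelope parameters
residual = inverse_filter(frame, a)     # detail not captured by the envelope
rebuilt = synthesis_filter(residual, a) # envelope + true residual -> original frame
```

When only the coefficients are available, as in the applications named above, the residual is unknown and has to be predicted; the quality of that prediction determines how much spectral detail the synthesis filter can restore.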
