Data-driven voice source waveform modelling
2009
Abstract
This paper presents a data-driven approach to the modelling of voice source waveforms. The voice source is a signal that is estimated by inverse-filtering speech signals with an estimate of the vocal tract filter. It is used in speech analysis, synthesis, recognition and coding to decompose a speech signal into its source and vocal tract filter components. Existing approaches parameterize the voice source signal with physically or mathematically motivated models.
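The inverse-filtering step described in the abstract can be illustrated with a minimal sketch. It assumes the vocal tract is approximated by an all-pole filter estimated with autocorrelation-method linear prediction, which is one common choice; the abstract does not say which estimator the paper uses. The function names `lpc_autocorrelation`, `all_pole_synthesis` and `inverse_filter` are hypothetical, introduced here only for illustration.

```python
import numpy as np


def lpc_autocorrelation(frame, order):
    """Estimate all-pole coefficients [1, a1, ..., ap] of A(z).

    Assumption: autocorrelation-method linear prediction with a Hamming
    window; the paper's own vocal tract estimator may differ.
    """
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    mid = len(windowed) - 1
    r = full[mid : mid + order + 1]  # autocorrelation at lags 0..order
    # Normal equations: sum_j a_j * r[|i - j|] = -r[i], for i = 1..order
    toeplitz = np.array(
        [[r[abs(i - j)] for j in range(order)] for i in range(order)]
    )
    predictor = np.linalg.solve(toeplitz, -r[1:])
    return np.concatenate(([1.0], predictor))


def all_pole_synthesis(excitation, a):
    """Pass an excitation through 1/A(z) (toy stand-in for a vocal tract)."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out[n] = acc
    return out


def inverse_filter(frame, a):
    """Apply the FIR filter A(z) to the frame to obtain a source estimate."""
    return np.convolve(frame, a)[: len(frame)]


if __name__ == "__main__":
    # Toy example: an impulse train shaped by a known two-pole filter.
    excitation = np.zeros(800)
    excitation[::80] = 1.0
    speech = all_pole_synthesis(excitation, np.array([1.0, -1.3, 0.8]))
    a_hat = lpc_autocorrelation(speech, order=2)
    source_estimate = inverse_filter(speech, a_hat)
    print("estimated A(z) coefficients:", a_hat)
    print("residual energy:", float(np.sum(source_estimate ** 2)))
```

On real speech the residual obtained this way still contains lip-radiation and glottal spectral tilt effects, so it is only a rough stand-in for the voice source signal the paper models.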
References (23)
- A. E. Rosenberg, "Effect of Glottal Pulse shape on the Quality of Natural Vowels," J. Acoust. Soc. Amer., vol. 49, pp. 583-590, Feb. 1971.
- G. Fant, J. Liljencrants, and Q. Lin, "A four-parameter model of glottal flow," STL-QPSR, vol. 26, no. 4, pp. 1-13, 1985.
- D. H. Klatt and L. C. Klatt, "Analysis, synthesis and perception of voice quality variations among female and male talkers," J. Acoust. Soc. Amer., vol. 87, no. 2, pp. 820-857, Feb. 1990.
- M. D. Plumpe, T. F. Quatieri, and D. A. Reynolds, "Modeling of the Glottal Flow Derivative Waveform with Application to Speaker Identification," IEEE Trans. Speech Audio Processing, vol. 7, no. 5, pp. 569-576, Sept. 1999.
- K. Ishizaka and J. Flanagan, "Synthesis of voiced sounds from a two-mass model of the vocal cords," Bell Syst. Tech. Journal, vol. 51, pp. 1233-1268, 1972.
- B. H. Story and I. R. Titze, "Voice Simulation with a Body-Cover Model of the Vocal Folds," J. Acoust. Soc. Amer., vol. 97, pp. 1249-1260, 1994.
- A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1-38, 1977.
- S. B. Davis and P. Mermelstein, "Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences," IEEE Trans Acoustics, Speech and Signal Processing, vol. 28, no. 4, pp. 357-366, August 1980.
- J. Makhoul, "Linear Prediction: A tutorial review," Proc IEEE, vol. 63, no. 4, pp. 561-580, Apr. 1975.
- P. Alku and T. Backstrom, "Normalized Amplitude Quotient for Parametrization of the Glottal Flow," J. Acoust. Soc. Amer., vol. 112, no. 2, pp. 701-710, August 2002.
- T. V. Ananthapadmanabha and G. Fant, "Calculations of True Glottal Volume-Velocity and its Components," Speech Communication, vol. 1, pp. 167-184, 1982.
- D. Childers and C. Wong, "Measuring and Modeling Vocal Source-Tract Interaction," IEEE Transactions on Biomedical Engineering, vol. 41, pp. 663-671, July 1994.
- J. Holmes, "The influence of glottal waveform on the naturalness of speech from a parallel formant synthesizer," IEEE Trans Audio Electroacoust., vol. 21, no. 3, pp. 298-305, 1973.
- P. Chytil and M. Pavel, "Variability of Glottal Pulse Estimation Using Cepstral Method," in Proc. 7th Nordic Signal Processing Symposium NORSIG 2006, 2006, pp. 314-317.
- D. McElroy, B. Murray, and A. Fagan, "Wideband speech coding using multiple codebooks and glottal pulses," in Proc. International Conference on Acoustics, Speech, and Signal Processing ICASSP-95, vol. 1, 1995, pp. 253-256.
- A. Bergstrom and P. Hedelin, "Code-book driven glottal pulse analysis," in Proc. International Conference on Acoustics, Speech, and Signal Processing ICASSP-89, vol. 1, 1989, pp. 53-56.
- G. Lindsey, A. Breen, and S. Nevard, "SPAR's Archivable Actual-Word Databases," University College London, Technical Report, June 1987.
- M. R. P. Thomas and P. A. Naylor, "The SIGMA Algorithm for Estimation of Reference-Quality Glottal Closure Instants from Electroglottograph Signals," in Proc European Signal Processing Conf, Lausanne, Switzerland, August 2008.
- "IEC 61672:2003: Electroacoustics - Sound Level Meters," IEC, Tech. Rep., 2003.
- M. R. P. Thomas, J. Gudnason, and P. A. Naylor, "Application of the DYPSA Algorithm to Segmented Time-Scale Modification of Speech," in Proc European Signal Processing Conf, Lausanne, Switzerland, August 2008.
- R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. John Wiley and Sons, 2001.
- Y. Ting, D. Childers, and J. Principe, "Tracking spectral resonances," in Proc. Fourth Annual ASSP Workshop on Spectrum Estimation and Modeling, 1988, pp. 49-54.