Data-driven voice source waveform modelling

Jon Gudnason

2009

Abstract

This paper presents a data-driven approach to the modelling of voice source waveforms. The voice source is a signal that is estimated by inverse-filtering speech signals with an estimate of the vocal tract filter. It is used in speech analysis, synthesis, recognition and coding to decompose a speech signal into its source and vocal tract filter components. Existing approaches parameterize the voice source signal with physically or mathematically motivated models.
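The inverse-filtering step described above can be illustrated with a minimal sketch. It assumes the vocal tract filter is estimated as an all-pole model by linear prediction (autocorrelation method); the abstract does not say which estimator the paper uses, so this choice, the function names, and the synthetic test signal are illustrative assumptions only.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method linear prediction (an assumed estimator).

    Returns a = [1, a1, ..., ap], the coefficients of the inverse filter A(z).
    """
    windowed = frame * np.hamming(len(frame))
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(windowed) - 1 : len(windowed) + order]  # lags 0..order
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))

def inverse_filter(frame, a):
    """Apply the FIR filter A(z); the output is the voice source estimate."""
    return np.convolve(frame, a)[: len(frame)]

def synthesize(excitation, a):
    """Hypothetical helper: pass an excitation through the all-pole filter 1/A(z)."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out[n] = acc
    return out

# Toy example: an impulse train stands in for the glottal excitation.
true_a = np.array([1.0, -1.3, 0.8])      # stable two-pole "vocal tract"
excitation = np.zeros(800)
excitation[::80] = 1.0
speech = synthesize(excitation, true_a)

estimated_a = lpc_coefficients(speech, order=2)
source_estimate = inverse_filter(speech, estimated_a)
```

With real speech the excitation is not an impulse train, and the quality of the source estimate depends on the analysis window and model order, so the toy signal here only demonstrates the mechanics of the decomposition.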

