
Developing a Text to Speech System for Dzongkha

2021, Computer Engineering and Intelligent Systems

https://doi.org/10.7176/CEIS/12-1-04

Abstract

Text-to-speech plays a vital role in conveying information to people who have difficulty reading text but can understand spoken language. In Bhutan, many people fall into this category with respect to the national language, Dzongkha, and a system of this kind would benefit the community. It would also advance the language's digital evolution and help narrow the digital divide, and it is especially important for people with visual impairment. Text-to-speech systems are widely used in applications ranging from talking bots to news readers and announcement systems. This paper presents an attempt to develop a working text-to-speech model for the Dzongkha language. It also presents the development of a transcription (grapheme) table for phonetic transcription from Dzongkha text to its equivalent phone set. The transcription tables for both consonants and vowels have been prepared so as to facilitate better compatibility in computing. A total of 3,000 sentences have been manually transcribed and recorded with a single male voice. The speech synthesis is based on a statistical method with concatenative speech generation on the Festival platform. The model is built using the two voice-building variants, CLUSTERGEN and CLUNITS, of the Festival voice-building tools FestVox. This system prototype is the first of its kind for the Dzongkha language.
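To illustrate the kind of mapping such a transcription (grapheme) table supports, the sketch below shows a minimal dictionary-based grapheme-to-phone lookup in Python. The characters and phone labels are only examples drawn from common romanizations of the Tibetan script used for Dzongkha; the paper's actual table, its phone set, and its pronunciation rules for stacked and affixed letters are not reproduced here.

# Minimal, hypothetical sketch of grapheme-to-phone transcription for
# Dzongkha (Tibetan script). Covers only plain consonant + optional
# vowel-sign syllables; a real front end needs the full consonant and
# vowel tables plus pronunciation rules for prefixes, superscripts,
# subjoined letters and suffixes.

CONSONANT_PHONES = {
    "\u0F40": "k",   # ཀ (ka)
    "\u0F41": "kh",  # ཁ (kha)
    "\u0F42": "g",   # ག (ga)
    "\u0F44": "ng",  # ང (nga)
}

VOWEL_PHONES = {
    "\u0F72": "i",  # ི
    "\u0F74": "u",  # ུ
    "\u0F7A": "e",  # ེ
    "\u0F7C": "o",  # ོ
}
INHERENT_VOWEL = "a"  # a bare consonant carries the inherent vowel


def transcribe_syllable(syllable: str) -> list[str]:
    """Map a simple consonant(+vowel sign) syllable to phone labels."""
    phones: list[str] = []
    has_vowel_mark = False
    for ch in syllable:
        if ch in CONSONANT_PHONES:
            phones.append(CONSONANT_PHONES[ch])
        elif ch in VOWEL_PHONES:
            phones.append(VOWEL_PHONES[ch])
            has_vowel_mark = True
        # Characters outside the toy tables (stacks, punctuation) are
        # skipped here; the paper's rules would handle them explicitly.
    if phones and not has_vowel_mark:
        phones.append(INHERENT_VOWEL)
    return phones


if __name__ == "__main__":
    print(transcribe_syllable("\u0F41\u0F7C"))  # ཁོ -> ['kh', 'o']
    print(transcribe_syllable("\u0F42"))        # ག  -> ['g', 'a']

A table-driven front end of this shape is what makes the recorded prompts usable by the FestVox build process, since each sentence must be expanded into a consistent phone sequence before CLUSTERGEN or CLUNITS voice building.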
