Generation of Arabic phonetic dictionaries for speech recognition

2008

Abstract

Phonetic dictionaries are essential components of large-vocabulary, speaker-independent natural language speech recognition systems. This paper presents a rule-based technique for generating Arabic phonetic dictionaries for a large-vocabulary speech recognition system. The technique uses Classical Arabic pronunciation rules, common pronunciation rules of Modern Standard Arabic, and morphologically driven rules. The paper explains these rules in detail and gives their formal mathematical presentation. The rules were used to generate a dictionary for a 5.4-hour corpus of broadcast news. The phonetic dictionary contains 23,841 definitions corresponding to about 14,232 words. The generated dictionary was evaluated on an actual Arabic speech recognition system, and the pronunciation rules and the phone set were validated by test cases. The Arabic speech recognition system achieves a word error rate of 11.71% on a fully diacritized transcription of about 1.1 hours of Arabic broadcast news.
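
The abstract reports its result as a word error rate (WER). As a rough illustration of how such a figure is commonly computed, the sketch below uses the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcription and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the toy strings are hypothetical and are not taken from the paper or its corpus; the paper's own scoring tool is not described in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count.

    Illustrative helper only; not the scoring tool used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Toy placeholder tokens: one substitution and one deletion over four reference words.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Under this definition, a WER of 11.71% corresponds to roughly 12 word-level errors per 100 reference words.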
