Novel speech recognition models for Arabic

2002, Johns-Hopkins …

Abstract

This research presents speech recognition models tailored to Arabic, addressing language-specific challenges such as diacritics and lexical ambiguity. Using both supervised and unsupervised learning approaches, the study evaluates the models' recognition accuracy and out-of-vocabulary performance and reports significant improvements in recognition accuracy. The findings suggest that morphologically aware models improve performance, pointing toward more effective speech processing technologies for Arabic.
