Rapid development of spoken language understanding grammars
2006, Speech Communication
https://doi.org/10.1016/J.SPECOM.2005.07.001
Abstract
To facilitate the development of spoken dialog systems and speech-enabled applications, we introduce SGStudio (Semantic Grammar Studio), a grammar authoring tool that enables regular software developers with little speech or linguistic background to rapidly create quality semantic grammars for automatic speech recognition (ASR) and spoken language understanding (SLU). We focus on the underlying technology of SGStudio, including knowledge-assisted example-based grammar learning, grammar controls, and configurable grammar structures. While the focus of SGStudio is to increase productivity, experimental results show that it also improves the quality of the grammars being developed.