Implementation architectures for natural language generation
2004, Natural Language Engineering
https://doi.org/10.1017/S1351324904003511
Abstract
Generic software architectures aim to support re-use of components, focusing of research and development effort, and evaluation and comparison of approaches. In the field of natural language processing, generic frameworks for understanding have been successfully deployed to meet all of these aims, but nothing comparable yet exists for generation. The nature of the task itself, and the current methodologies available to research it, seem to make it more difficult to reach the necessary level of consensus to support generic proposals. Recent work has made progress towards establishing a generic framework for generation at the functional level, but left open the issue of actual implementation. In this paper, we discuss the requirements for such an implementation layer for generation systems, drawing on two initial attempts to implement it. We argue that it is possible and useful to distinguish "functional architecture" from "implementation architecture" for generation systems.

1 The Case for a Generic Software Architecture for NLG

Most natural language generation (NLG) systems have some kind of modular structure. The individual modules may differ in complex ways, according to whether they are based on symbolic or statistical models, what particular linguistic theories they embrace and so on. Ideally, such modules could be reused in other NLG systems. This would avoid duplication of work, allow realistic research specialisation and allow empirical comparison of different approaches. Examples of ideas that might give rise to reusable modules include: