Document structure and multilingual authoring
2000, Proceedings of the first international conference on Natural language generation - INLG '00
https://doi.org/10.3115/1118253.1118258
Abstract
The use of XML-based authoring tools is swiftly becoming a standard in the world of technical documentation. An XML document is a mixture of structure (the tags) and surface (text between the tags). The structure reflects the choices made by the author during the top-down stepwise refinement of the document under control of a DTD grammar. These choices are typically choices of meaning which are independent of the language in which the document is rendered, and can be seen as a kind of interlingua for the class of documents which is modeled by the DTD. Based on this remark, we advocate a radicalization of XML authoring, where the semantic content of the document is accounted for exclusively in terms of choice structures, and where appropriate rendering/realization mechanisms are responsible for producing the surface, possibly in several languages simultaneously. In this view, XML authoring has strong connections to natural language generation and text authoring. We describe the IG (Interaction Grammar) formalism, an extension of DTDs which permits powerful linguistic manipulations, and show its application to the production of multilingual versions of a certain class of pharmaceutical documents.
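The separation between a language-independent choice structure and per-language realization can be illustrated with a toy sketch. This is not the IG formalism, whose details the abstract does not give; the `Choice` class, the `TEMPLATES` table, the rule names, and the example sentences are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Choice:
    """One authoring choice: a rule name plus the sub-choices it expands to."""
    rule: str
    children: tuple = ()


# Hypothetical realization templates: one table per language, keyed by rule name.
# "{0}" marks the slot filled by the realization of the first sub-choice.
TEMPLATES = {
    "en": {
        "warning": "Do not take this drug {0}.",
        "pregnancy": "during pregnancy",
        "driving": "before driving",
    },
    "fr": {
        "warning": "Ne prenez pas ce médicament {0}.",
        "pregnancy": "pendant la grossesse",
        "driving": "avant de conduire",
    },
}


def realize(node: Choice, lang: str) -> str:
    """Render a choice tree as surface text in the given language."""
    parts = [realize(child, lang) for child in node.children]
    return TEMPLATES[lang][node.rule].format(*parts)


# The author makes the choices once; the structure carries no surface text.
doc = Choice("warning", (Choice("pregnancy"),))

for lang in ("en", "fr"):
    print(lang, realize(doc, lang))
```

In this sketch the author only selects rules (`warning`, then `pregnancy`); every word of surface text comes from the per-language tables, so adding a language means adding a table rather than re-authoring the document. Plain string templates are a deliberate simplification and do not handle agreement, word order, or the other linguistic manipulations the abstract attributes to IG.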