An ontology of linguistic annotations

2008, LDV Forum

https://doi.org/10.21248/JLCL.23.2008.98

Abstract

This paper describes the design and development of an ontology of linguistic annotations, primarily word classes and morphosyntactic features, based on existing standardization approaches (e.g. EAGLES), a set of annotation schemes (e.g. for German, STTS and morphological annotations), and existing terminological resources (e.g. GOLD). The ontology is intended as a platform for terminological integration, integrated representation, and ontology-based search across existing linguistic resources with terminologically heterogeneous annotations. It can also be applied to augment the semantic analysis of a given text with an ontological interpretation of its morphosyntactic analysis.

References (19)

  1. Brants, S. and Hansen, S. (2002). Developments in the TIGER annotation scheme and their realization in the corpus. In Proc. 3rd Conference on Language Resources and Evaluation (LREC-02), Las Palmas de Gran Canaria, Spain.
  2. Broschart, J. (1997). Why Tongan does it differently: Categorial distinctions in a language without nouns and verbs. Linguistic Typology, 1-2:123-166.
  3. Chiarcos, C. (2006). An ontology for heterogeneous data collections. In Proc. Corpus Linguistics 2006, pages 373-380, St.-Petersburg. St.-Petersburg University Press.
  4. Cimiano, P. and Reyle, U. (2003). Ontology-based semantic construction, underspecification and disambiguation. In Proc. Prospects and Advances in the Syntax-Semantic Interface Workshop.
  5. de Cea, G. A., Gómez-Pérez, A., Álvarez de Mon, I., and Pareja-Lora, A. (2004). OntoTag's linguistic ontologies. In Proc. Int'l Conference on Information Technology, Coding and Computing (ITCC'04), pages 124-128, Las Vegas, Nevada.
  6. Farrar, S. and Langendoen, D. T. (2003). A linguistic ontology for the semantic web. GLOT International, 7(3):97-100.
  7. Hughes, J., Souter, C., and Atwell, E. (1995). Automatic extraction of tag set mappings from parallel annotated corpora. In From Text to Tags: Issues in Multilingual Language Analysis, Proc. ACL-SIGDAT Workshop, pages 10-17.
  8. ICOM (2006). ICOM code of ethics for museums. In Hoffman, B. T., editor, Art and Cultural Heritage. Law, Policy and Practice. Cambridge University Press.
  9. Ide, N., Romary, L., and de la Clergerie, E. (2005). International standard for a linguistic annotation framework. In Proc. HLT-NAACL'03 Workshop Software Engineering and Architecture of Language Technology.
  10. König, E., Bakker, D., Dahl, Ö., Haspelmath, M., Koptjevskaja-Tamm, M., Lehmann, C., and Siewierska, A. (1993). EUROTYP Guidelines. Technical report, European Science Foundation Programme in Language Typology.
  11. Leech, G. and Wilson, A. (1996). EAGLES recommendations for the morphosyntactic annotation of corpora. Technical report, Expert Advisory Group on Language Engineering Standards.
  12. Lezius, W., Rapp, R., and Wettler, M. (1998). A freely available morphological analyzer, disambiguator, and context sensitive lemmatizer for German. In Proc. COLING-ACL 1998, pages 743-747.
  13. Monachini, M., Soria, C., and Ulivieri, M. (2005). Evaluation of existing standards for NLP lexica. Draft 1.1. Technical report, LIRICS (Linguistic Infrastructure for Interoperable Resources and Systems).
  14. Rehm, G., Eckart, R., and Chiarcos, C. (2007). An OWL- and XQuery-based mechanism for the retrieval of linguistic patterns from XML-corpora. In Proc. RANLP 2007: Recent Advances in Natural Language Processing. Borovets, Bulgaria.
  15. Sampson, G. (1995). English for the Computer. Clarendon Press, Oxford.
  16. Schiller, A., Teufel, S., and Thielen, C. (1995). Guidelines für das Tagging deutscher Textkorpora mit STTS. Technical report, University of Stuttgart and University of Tübingen.
  17. Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. In International Conference on New Methods in Language Processing, pages 44-49, Manchester, UK.
  18. Schneider, R. (2007). A database-driven ontology for German grammar. In Rehm, G., Witt, A., and Lemnitzer, L., editors, Data Structures for Linguistic Resources and Applications, pages 305-314, Tübingen. Narr.
  19. Stede, M. (2004). The Potsdam Commentary Corpus. In Proc. ACL-04 Workshop on Discourse Annotation, pages 96-102, Barcelona.