What can linguistics contribute to event extraction?
2006
Abstract
This paper examines how a linguistic analysis of a written document can contribute to identifying, tracking and populating the "eventualities" that are presented in the document, either directly or indirectly, and to representing degrees of belief concerning them. It is our view that the role of lexical analysis (as exemplified in the research carried out in the FrameNet project) is greater than usually assumed, so this paper is partly an attempt to clarify the boundary between, on the one hand, the information that can be derived on the basis of linguistic knowledge alone (composed of lexical meanings and the meanings of grammatical constructions) and, on the other hand, reasoning based on beliefs about the source of a document, world knowledge, and "common sense". Since the linguistic processes described in this paper apply to eventualities in general (by which we mean acts, happenings, states of affairs, and relations, whether real, proposed, imagined, or denied), our presentation will emphasize the linguistic processes themselves. In particular, we show that the kind of information produced by the lexicon-building project FrameNet can have a special role in contributing to text understanding, starting from the basic facts of the combinatorial properties of frame-bearing words (verbs, nouns, adjectives and prepositions) and arriving at the means of recognizing the anaphoric properties of specific unexpressed event participants, for all parts of speech, thereby defining a new layer of anaphora resolution and text cohesion. Using as a starting point the challenge text presented in the call for this workshop (hereafter referred to as the Hijacking text), we show the points at which a thorough linguistic analysis can articulate with the kind of simulation formalism demonstrated in an X-schema diagram, which itself incorporates a great deal of world knowledge connected with the events introduced in the Hijacking text.
References (4)
- Chang, N.; Narayanan, S.; and Petruck, M. R. 2002. Putting frames in perspective. In Proceedings of the 19th International Conference on Computational Linguistics. Taipei: COLING.
- Fontenelle, T., ed. 2003. International Journal of Lexicography, volume 28. Oxford University Press. (Special issue devoted to FrameNet.)
- Narayanan, S., and McIlraith, S. 2002. Simulation, verification and automatic composition of web services. In Proceedings of the Eleventh International World Wide Web Conference (WWW2002), Honolulu, Hawaii, May 7-10, 2002.
- Sinha, S., and Narayanan, S. 2005. Model based answer selection. In Proceedings of the Workshop on Textual Inference, 18th National Conference on Artificial Intelligence. Pittsburgh, PA: AAAI.