Effective parsing with generalised phrase structure grammar
1985, Proceedings of the Second Conference of the European Chapter of the Association for Computational Linguistics
https://doi.org/10.3115/976931.976939
5 pages
Abstract
Generalised phrase structure grammars (GPSGs) appear to offer a means by which the syntactic properties of natural languages may be described very concisely. The main reason is that the GPSG framework allows a variety of meta-grammatical rules to be stated which generate new rules from old ones, so that rules with a wide variety of realisations can be specified via a very small number of explicit statements. Unfortunately, analysing a piece of text in terms of such rules is a very awkward task, since even a small set of GPSG statements will generate a large number of underlying rules.
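To make the rule blow-up concrete, here is a minimal sketch (the rule format and the simplified passive metarule are illustrative assumptions, not the paper's notation) of how a single metarule multiplies a few explicit statements into many underlying rules:

# A rule is a (lhs, rhs) pair; the toy metarule below is hypothetical.
base_rules = [
    ("VP", ("V", "NP")),
    ("VP", ("V", "NP", "PP")),
    ("VP", ("V", "NP", "S")),
]

def passive_metarule(rule):
    # Toy passive metarule: VP -> V NP ... yields VP[pas] -> V ... (PP[by]).
    lhs, rhs = rule
    if lhs == "VP" and len(rhs) >= 2 and rhs[:2] == ("V", "NP"):
        rest = rhs[2:]
        yield ("VP[pas]", ("V",) + rest)                 # agentless passive
        yield ("VP[pas]", ("V",) + rest + ("PP[by]",))   # with a by-phrase

expanded = list(base_rules)
for r in base_rules:
    expanded.extend(passive_metarule(r))

print(len(base_rules), "explicit rules ->", len(expanded), "underlying rules")
# 3 explicit rules -> 9 underlying rules

With several interacting metarules the expansion compounds, which is exactly why parsing directly with the expanded rule set becomes awkward.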
Related papers
Proceedings of the 19th Annual Meeting of the Association for Computational Linguistics, 1981
SYNTAGMA is a rule-based parsing system structured on two levels: a general grammar and language-specific grammars. The general grammar is implemented in the program; language-specific grammars are resources conceived as text files which contain a lexical database with meaning-related grammatical features, a description of constituent structures, a database of meaning-specific syntactic constraints, and a semantic network. Since its theoretical background is principally Tesnière's Éléments de syntaxe, SYNTAGMA's grammar emphasizes the role of argument structure (valency) in constraint satisfaction, and also allows horizontal bounds, for instance in the treatment of coordination. Notions such as traces and empty categories are derived from Generative Grammar, and some solutions are close to Government & Binding Theory, although they are the result of independent research. These properties allow SYNTAGMA to manage complex syntactic configurations and well-known weak points in parsing engineering. An important resource is the semantic network, which SYNTAGMA uses in disambiguation tasks. In contrast to statistical and data-driven parsers, the system's behavior can be controlled and fine-tuned, since gaps, traces and long-distance relations are structurally set, and its constituent generation process is not a linear left-to-right shift-and-reduce but a bottom-up, rule-driven procedure.
2004
The paper outlines a hybrid architecture for a partial parser based on regular grammars over XML documents. The parser is used to support the annotation process in the BulTreeBank project, so it annotates only the 'sure' cases. To maximize the number of analyzed phrases, the parser applies a set of grammars in a dynamic fashion. Each grammar determines not only the constituent structure (plus some syntactic dependencies internal to the structure) but also a description of the local and global context of the recognized phrase. The grammars available to the parser are arranged in a network. The order in which the grammars are applied depends on the initial ordering in the network and the descriptions associated with the grammars; thus the traversal is not deterministic. Additionally, the application of the grammars can be interleaved with the application of other XML tools such as remove, insert and transform operations. This architecture provides a flexible means for g...
This paper describes work on the linguistic analysis of texts within a project devoted to knowledge acquisition from text. We focus on syntactic processing and present some key elements of the project's parser that allow it to deal successfully with technical texts. The parser is fully implemented and tested on a variety of real texts; improvements and enhancements are in progress. Because our knowledge acquisition method assumes no a priori model of the domain of the source text, the parser relies as much as possible on lexical and syntactic clues. That is why it strives for full syntactic analysis rather than some form of text skimming. We present a practical approach to four acknowledged hard problems that to date have no generally accepted answers: phrase attachment; time constraints for problematic input (how to avoid long and unproductive computation); parsing conjoined structures (how to preserve broad coverage without losing control of the parsing process); and the treatment of fragmentary input or fragments that are a by-product of a fallback parsing strategy. We review recent related work and conclude by listing several future work items.
2010
In the context of natural language processing, the term parsing refers to the process of automatically analyzing a given sentence, viewed as a sequence of words, in order to determine its possible underlying syntactic structures. Parsing requires a mathematical model of the syntax of the language of interest. In this chapter, these mathematical models are assumed to be formal grammars.
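As a concrete illustration of this view of parsing (the tiny grammar and recogniser below are assumptions for the example, not taken from the chapter), a CYK recogniser decides whether a context-free grammar in Chomsky normal form derives a given sentence:

from itertools import product

# Hypothetical toy grammar in Chomsky normal form.
unary = {"she": {"NP"}, "saw": {"V"}, "stars": {"NP"}}
binary = {("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}

def cyk(words):
    n = len(words)
    # table[i][j] holds the categories spanning words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[i][i + 1] = set(unary.get(w, set()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for a, b in product(table[i][k], table[k][j]):
                    table[i][j] |= binary.get((a, b), set())
    return "S" in table[0][n]

print(cyk("she saw stars".split()))  # True: an underlying structure exists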
TAL (Traitement Automatique des Langues), 2005
We investigated the efficacy of beam search and deep parsing techniques in probabilistic HPSG parsing. We first tested beam thresholding and iterative parsing. Next, we tested three techniques originally developed for deep parsing: quick check, large constituent inhibition, and hybrid parsing with a CFG chunk parser. Quick check, iterative parsing and hybrid parsing contributed greatly to total parsing performance. The accuracy and average parsing time for the Penn Treebank were 87.2% and 355 ms. Finally, we tested the robustness and scalability of HPSG parsing on the MEDLINE corpus, consisting of around 1.4 billion words. The entire corpus was parsed in 9 days with 340 CPUs.
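A minimal sketch of the beam-thresholding idea (a reconstruction under simple assumptions, not the paper's implementation): within each chart cell, analyses whose log probability falls too far below the cell's best are pruned before they can propagate to larger constituents:

import math

def beam_threshold(cell, beam=math.log(1e-3)):
    # cell: dict mapping an analysis to its log probability.
    if not cell:
        return cell
    best = max(cell.values())
    return {a: lp for a, lp in cell.items() if lp >= best + beam}

cell = {"NP -> DT NN": -1.2, "NP -> JJ NN": -4.0, "FRAG -> NN": -15.0}
print(beam_threshold(cell))  # the implausible FRAG analysis is pruned

Iterative parsing, in this setting, typically re-runs the parse with a wider beam whenever pruning leaves no complete analysis.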
This paper presents a strategy for syntactic analysis based on the combination of two different parsing techniques: lexical syntactic tagging and phrase structure syntactic parsing. The basic proposal is to take advantage of the good results of lexical syntactic tagging to improve the overall performance of unification-based parsing. The syntactic functions attached to every word by the lexical syntactic tagger are used as head features in the unification-based grammar, and form the basis for the grammar rules.
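A minimal sketch of this combination (the flat feature representation is a hypothetical simplification): the syntactic-function tag assigned to a word becomes a head feature that a unification-based rule must reconcile with the features of its arguments:

def unify(f, g):
    # Unify two flat feature dicts; None means unspecified.
    out = dict(f)
    for k, v in g.items():
        if k in out and out[k] is not None and v is not None and out[k] != v:
            return None          # feature clash: the rule does not apply
        out[k] = out.get(k) if v is None else v
    return out

subj = {"fun": "subject", "num": "sg"}   # from the lexical syntactic tagger
verb = {"fun": None, "num": "sg"}        # head features required by the verb
print(unify(subj, verb))                 # {'fun': 'subject', 'num': 'sg'}
print(unify(subj, {"num": "pl"}))        # None: agreement failure blocks the rule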
Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics, 1983
A central goal of linguistic theory is to explain why natural languages are the way they are. It has often been supposed that computational considerations ought to play a role in this characterization, but rigorous arguments along these lines have been difficult to come by. In this paper we show how a key "axiom" of certain theories of grammar, Subjacency, can be explained by appealing to general restrictions on on-line parsing plus natural constraints on the rule-writing vocabulary of grammars. The explanation avoids the problems with Marcus' [1980] attempt to account for the same constraint. The argument is robust with respect to machine implementation, and thus avoids the problems that often arise when making detailed claims about parsing efficiency. It has the added virtue of unifying in the functional domain of parsing certain grammatically disparate phenomena, as well as making a strong claim about the way in which the grammar is actually embedded into an on-line sentence processor.
Proceedings of the 13th Conference on Computational Linguistics, 1990
In this paper, we propose an optimized strategy, called Bottom-Up Filtering, for parsing GPSGs. This strategy is based on a particular, high-level interpretation of GPSGs. It permits a significant reduction of the non-determinism inherent in the rule selection process.
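A minimal sketch of a bottom-up filter in this spirit (a generic left-corner reconstruction, not the paper's exact strategy): a rule for a category is activated only if the next input category is a possible left corner of the rule's first right-hand-side symbol, which prunes hopeless rules before they feed the non-deterministic selection process:

# Hypothetical toy grammar; the rule format is illustrative.
rules = {
    "S":  [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "VP": [("V", "NP")],
    "PP": [("P", "NP")],
}

def left_corners(grammar):
    # Reflexive-transitive closure of the left-corner relation.
    lc = {a: {a} for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                new = lc.get(rhs[0], {rhs[0]}) - lc[a]
                if new:
                    lc[a] |= new
                    changed = True
    return lc

LC = left_corners(rules)

def candidate_rules(category, next_cat):
    # Keep only rules whose first RHS symbol can start with next_cat.
    return [rhs for rhs in rules[category]
            if next_cat in LC.get(rhs[0], {rhs[0]})]

print(candidate_rules("S", "Det"))  # [('NP', 'VP')]: a determiner can start an S
print(candidate_rules("S", "V"))    # []: no S rule survives the filter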
2001
2 Context-Free Grammars
2.1 Languages
2.2 Grammars
2.2.1 Notational conventions
2.3 The language of a grammar
2.3.1 Some basic languages
2.4 Parse trees
2.4.1 From context-free grammars to datatypes
2.5 Grammar transformations
2.6 Concrete and abstract syntax
2.7 Constructions on grammars
2.7.1 SL: an example
2.8 Parsing
2.9 Exercises
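A minimal sketch of the outline's "from context-free grammars to datatypes" step (the expression grammar is a hypothetical example, and the chapter's own code is presumably in a functional language; this sketch uses Python dataclasses): each nonterminal becomes a type and each production a constructor, so parse trees are ordinary values of the datatype:

from dataclasses import dataclass
from typing import Union

# Grammar: E -> E '+' T | T ;  T -> digit
@dataclass
class Add:              # production E -> E '+' T
    left: "Expr"
    right: "Term"

@dataclass
class Digit:            # production T -> digit
    value: int

Term = Digit
Expr = Union[Add, Term]

def evaluate(e: Expr) -> int:
    # A fold over the parse tree (the abstract syntax).
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    return e.value

tree = Add(Add(Digit(1), Digit(2)), Digit(3))  # parse tree of "1+2+3"
print(evaluate(tree))  # 6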

References (6)
- Becker, J.D., The Phrasal Lexicon. TINLAP, 1975.
- Gazdar, G., Klein, E., Pullum, G.K. & Sag, I.A., Generalised Phrase Structure Grammar. Blackwell, Oxford (in press, 1985).
- Marcus, M., A Theory of Natural Language Processing. PhD thesis, MIT, 1980.
- Shieber, S.M., Direct Parsing of ID/LP Grammars. Linguistics & Philosophy 7/2, 1984.
- Thorne, J.P., Bratley, P. & Dewar, H., The Syntactic Analysis of English by Machine. In Machine Intelligence 3, ed. Michie, Edinburgh UP, 1968.
- Thompson, H., Handling Metarules in a Parser for GPSG. DAI Research Paper 175, University of Edinburgh, 1982.